Publications
This list highlights the publications that make up the bulk of the work carried out during my PhD. You can find my full publication list on my Google Scholar profile.
My thesis topic is Streaming Speech Translation, the task of translating, in real time, an unbounded text stream that is typically the output of a Streaming ASR system. During my PhD, I have worked on four key elements that are required for Streaming Speech Translation:
- A high quality multilingual Speech Translation dataset
- A Streaming Segmenter that processes the output of the ASR system into sentence-like units
- Robust latency evaluation metrics that can be applied to the Streaming scenario
- Context-aware models and methodologies for improved Streaming translation quality
Take a look at each of the following publications to learn more:
Published in ACL, 2022
The Streaming ST scenario presents many challenges, but there are also opportunities that can be used to improve translation quality. This work introduces the concept of Streaming history, which holds the information of the previously translated segments. The proposed MT system is able to leverage this contextual information in order to improve translation quality.
Published in Findings of EMNLP, 2021
A reliable evaluation metric is critical for any technical and scientific task. However, the standard simultaneous MT latency metrics (AP, AL and DAL) are not robust when applied to the Streaming scenario. This paper studies this phenomenon and proposes a re-segmentation solution that provides reliable and interpretable results for the Streaming scenario.
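For readers unfamiliar with these metrics, here is a minimal sketch of Average Lagging (AL) in its usual sentence-level form. It is not code from the paper: the function name and the representation of delays as a plain list are assumptions made for illustration. Here `delays[t]` is the number of source tokens that had been read when target token `t` was written.

```python
def average_lagging(delays, src_len):
    """Sentence-level Average Lagging (AL), illustrative sketch only.

    delays:  list where delays[t] is the number of source tokens read
             before writing target token t (hypothetical representation).
    src_len: number of tokens in the source sentence.
    """
    tgt_len = len(delays)
    gamma = tgt_len / src_len  # target-to-source length ratio

    # tau: 1-based index of the first target token written after the
    # whole source has been read (falls back to the last target token).
    tau = next(
        (t for t, g in enumerate(delays, start=1) if g >= src_len),
        tgt_len,
    )

    # Average, over the first tau target tokens, of how far the system
    # lags behind an ideal translator that keeps pace with the source.
    lag_sum = sum(delays[t - 1] - (t - 1) / gamma for t in range(1, tau + 1))
    return lag_sum / tau


# A wait-3 policy on a 6-token sentence with a 6-token translation:
wait3_delays = [3, 4, 5, 6, 6, 6]
print(average_lagging(wait3_delays, src_len=6))
```

Note that this definition assumes the sentence boundaries are known in advance, which is not the case for an unbounded stream.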
Published in Neural Networks, 2021
This paper extends the EMNLP 2020 paper listed below with additional experiments and by moving from a simulated Streaming scenario, which used an offline MT system, to a real Streaming scenario with a simultaneous MT system.
Published in EMNLP, 2020
Machine Translation systems are trained with full sentences, but in the Cascaded Speech Translation scenario, the output of the ASR system does not necessarily form sentences, which hampers performance. This publication introduces a streaming-ready segmenter applied to the output of the ASR system, in order to maximize downstream translation quality.
Published in ICASSP, 2020
Speech Translation datasets are a scarce resource, and this greatly hampers research in the area. Europarl-ST, first released in 2019, was a game-changer for Speech Translation research, thanks to the wide range of languages covered and a careful filtering pipeline. Currently (early 2023), close to 100 publications have cited Europarl-ST.