Publications

Preprints

SSL-TTS: Leveraging Self-Supervised Embeddings and kNN Retrieval for Zero-Shot Multi-speaker TTS
Karl El Hajal, Ajinkya Kulkarni, Enno Hermann, Mathew Magimai.-Doss. arXiv:2408.10771, 2024. [pdf, samples]

2024

Towards interfacing large language models with ASR systems using confidence measures and prompting
Maryam Naderi, Enno Hermann, Alexandre Nanchen, Sevada Hovsepyan, Mathew Magimai.-Doss. In Proceedings of Interspeech, pp. 2980–2984. 2024. [pdf]

2023

Few-shot Dysarthric Speech Recognition with Text-to-Speech Data Augmentation
Enno Hermann, Mathew Magimai.-Doss. In Proceedings of Interspeech, pp. 156–160. 2023. [pdf]
Using Commercial ASR Solutions to Assess Reading Skills in Children: A Case Report
Timothy Piton, Enno Hermann, Angela Pasqualotto, Marjolaine Cohen, Mathew Magimai.-Doss, Daphné Bavelier. In Proceedings of Interspeech, pp. 4573–4577. 2023. [pdf]
On matching data and model in LF-MMI-based dysarthric speech recognition
Enno Hermann. PhD thesis, École polytechnique fédérale de Lausanne. 2023. [pdf]

2021

An Objective Evaluation Framework for Pathological Speech Synthesis
Bence Mark Halpern, Julian Fritsch, Enno Hermann, Rob van Son, Odette Scharenborg, Mathew Magimai.-Doss. In Proceedings of ITG. 2021. [pdf, blog]
Handling acoustic variation in dysarthric speech recognition systems through model combination
Enno Hermann, Mathew Magimai.-Doss. In Proceedings of Interspeech. 2021. [pdf]

2020

Dysarthric Speech Recognition with Lattice-Free MMI
Enno Hermann, Mathew Magimai.-Doss. In Proceedings of ICASSP, pp. 6109–6113. 2020. [pdf, code]
Multilingual and Unsupervised Subword Modeling for Zero-Resource Languages
Enno Hermann, Herman Kamper, Sharon Goldwater. Computer Speech and Language. 2020. [pdf, code]

2018

Multilingual bottleneck features for subword modeling in zero-resource languages
Enno Hermann, Sharon Goldwater. In Proceedings of Interspeech, pp. 2668–2672. 2018. [pdf, code, slides]

2017

Iteratively improving unsupervised term discovery and unsupervised speech representations
Enno Hermann. MSc dissertation, University of Edinburgh. 2017. [pdf, code]
Highly Commended Dissertation prize

2016

Exploring bilingual text-to-speech conversion for Irish and Irish English using a statistical parametric synthesis system
Enno Hermann. BA dissertation, Trinity College Dublin. 2016. [code]