Publications
Preprints
- SSL-TTS: Leveraging Self-Supervised Embeddings and kNN Retrieval for Zero-Shot Multi-speaker TTS
- Karl El Hajal, Ajinkya Kulkarni, Enno Hermann, Mathew Magimai.-Doss.
arXiv:2408.10771, 2024. [pdf, samples]
2024
- Towards interfacing large language models with ASR systems using confidence measures and prompting
- Maryam Naderi, Enno Hermann, Alexandre Nanchen, Sevada Hovsepyan, Mathew Magimai.-Doss. In Proceedings of Interspeech, pp. 2980–2984. 2024. [pdf]
2023
- Few-shot Dysarthric Speech Recognition with Text-to-Speech Data Augmentation
- Enno Hermann, Mathew Magimai.-Doss. In Proceedings of Interspeech, pp. 156–160. 2023. [pdf]
- Using Commercial ASR Solutions to Assess Reading Skills in Children: A Case Report
- Timothy Piton, Enno Hermann, Angela Pasqualotto, Marjolaine Cohen, Mathew Magimai.-Doss, Daphné Bavelier. In Proceedings of Interspeech, pp. 4573–4577. 2023. [pdf]
- On matching data and model in LF-MMI-based dysarthric speech recognition
- Enno Hermann. PhD thesis, École polytechnique fédérale de Lausanne. 2023. [pdf]
2021
- An Objective Evaluation Framework for Pathological Speech Synthesis
- Bence Mark Halpern, Julian Fritsch, Enno Hermann, Rob van Son, Odette Scharenborg and Mathew Magimai.-Doss. In Proceedings of ITG. 2021. [pdf, blog]
- Handling acoustic variation in dysarthric speech recognition systems through model combination
- Enno Hermann, Mathew Magimai.-Doss. In Proceedings of Interspeech. 2021. [pdf]
2020
- Dysarthric Speech Recognition with Lattice-Free MMI
- Enno Hermann, Mathew Magimai.-Doss. In Proceedings of ICASSP, pp. 6109–6113. 2020. [pdf, code]
- Multilingual and Unsupervised Subword Modeling for Zero-Resource Languages
- Enno Hermann, Herman Kamper and Sharon Goldwater. Computer Speech and Language. 2020. [pdf, code]
2018
- Multilingual bottleneck features for subword modeling in zero-resource languages
- Enno Hermann and Sharon Goldwater. In Proceedings of Interspeech, pp. 2668–2672. 2018. [pdf, code, slides]
2017
- Iteratively improving unsupervised term discovery and unsupervised speech representations
- Enno Hermann. MSc dissertation, University of Edinburgh. 2017. [pdf, code]
- Highly Commended Dissertation prize
2016
- Exploring bilingual text-to-speech conversion for Irish and Irish English using a statistical parametric synthesis system
- Enno Hermann. BA dissertation, Trinity College Dublin. 2016. [code]