Speech-to-text, also known as speech recognition, is a technology that recognizes spoken language and transcribes it into text. This transcription can then be used for a multitude of downstream tasks, such as providing automatic subtitles or parsing voice commands. In recent years, speech-to-text models have improved dramatically, thanks in part to advances in deep learning methods. Starting from the open-source project DeepSpeech, we train speech-to-text models for Dutch using the Corpus Gesproken Nederlands (CGN). First, we contribute a pre-processing pipeline that makes this dataset suitable for the task, yielding a ready-to-use speech-to-text dataset for Dutch. Second, we investigate the performance of Dutch and Flemish models trained from scratch, establishing a baseline for the CGN dataset on this task. Finally, we investigate the transfer of speech-to-text models between related languages: we analyse how a pre-trained English model can be transferred and fine-tuned for Dutch.

Original language: English
Title of host publication: Proceedings of the 31st Benelux Conference on Artificial Intelligence (BNAIC 2019)
Editors: Katrien Beuls, Bart Bogaerts, Gianluca Bontempi, Pierre Geurts, Nick Harley, Bertrand Lebichot, Tom Lenaerts, Gilles Louppe, Paul Van Eecke
Publisher: CEUR Workshop Proceedings
Number of pages: 14
ISBN (Electronic): 1613-0073
Publication status: Published - 8 Nov 2019
Event: 31st Benelux Conference on Artificial Intelligence - Les Ateliers Des Tanneurs, Brussels, Belgium
Duration: 6 Nov 2019 to 8 Nov 2019
Conference number: 2019


Conference: 31st Benelux Conference on Artificial Intelligence
Abbreviated title: BNAIC

ID: 48202248