Abstract
Names pose particular problems for grapheme-to-phoneme (g2p) converters, because their orthography is often non-standard owing to foreign origin or the fossilisation of older spelling forms. In the Autonomata project, a variety of techniques is studied to improve the g2p conversion of Dutch names, more specifically first names, surnames, street names and town names. In Autonomata, a standard g2p converter is augmented with a name-specific phoneme-to-phoneme (p2p) converter that captures the peculiarities of names. The p2p converter is trained on large collections of names with manually verified phonetic transcriptions, which supply the name-specific information it requires. Various inductive and deductive approaches are studied to achieve this goal. We will exemplify our approach by showing results on the g2p conversion of Dutch first names.
Autonomata is carried out in the framework of the STEVIN-programme.
Partners in the project are the Radboud University Nijmegen, Ghent University, Utrecht University, Nuance, and TeleAtlas.
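As an illustration only, the cascade described in the abstract (a general-purpose g2p converter followed by a trained p2p corrector) might be sketched as below. This is a toy under stated assumptions, not the Autonomata implementation: `toy_g2p`, `train_p2p`, `apply_p2p` and `transcribe` are hypothetical names, the symbols and training pairs are invented, and the learner simply counts substitutions conditioned on the preceding symbol instead of using any of the approaches studied in the project.

```python
from collections import Counter, defaultdict


def toy_g2p(name):
    # Hypothetical stand-in for a general-purpose g2p converter:
    # every letter becomes one pseudo-phoneme symbol.
    return list(name.lower())


def train_p2p(pairs):
    # Learn substitutions from (initial, verified) transcription pairs.
    # Assumption: both sequences have equal length, so no alignment is needed;
    # pairs of unequal length are skipped.
    counts = defaultdict(Counter)
    for initial, verified in pairs:
        if len(initial) != len(verified):
            continue
        prev = "#"  # marks the start of the name
        for src, tgt in zip(initial, verified):
            counts[(prev, src)][tgt] += 1
            prev = src
    # Keep the most frequent target for each (previous symbol, symbol) context.
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def apply_p2p(rules, phonemes):
    # Rewrite each symbol according to the learned rule for its context;
    # symbols in unseen contexts pass through unchanged.
    out = []
    prev = "#"
    for p in phonemes:
        out.append(rules.get((prev, p), p))
        prev = p
    return out


def transcribe(name, rules):
    # The cascade: initial g2p transcription, then name-specific correction.
    return apply_p2p(rules, toy_g2p(name))


# Invented toy data: name-initial "x" is corrected to "k", "x" after "a" is kept.
training_pairs = [
    (["x", "a", "n"], ["k", "a", "n"]),
    (["x", "o", "s"], ["k", "o", "s"]),
    (["a", "x", "e"], ["a", "x", "e"]),
]
rules = train_p2p(training_pairs)
```

Calling `transcribe("Xen", rules)` then applies the learned name-initial correction on top of the initial g2p output, while contexts that were never observed in training are left as the g2p converter produced them.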
Publication type
Presentation
Year of publication
2006
Conference location
Nijmegen
Conference name
Summer Meeting on Corpus-based Research 2006
Publisher
Nederlandse Vereniging voor Fonetische Wetenschappen