Large Scale Pronunciation Comparison

Title: Large Scale Pronunciation Comparison
Publication Type: Presentation
Year of Publication: 2008
Conference Name: Symposium Taal- en Spraakvariatie
Authors: Leinonen, Therese, and John Nerbonne
Publisher: Nederlandse Vereniging voor Fonetische Wetenschappen
Conference Location: Amsterdam, The Netherlands

For many reasons it is desirable to be able to measure the phonetic (dis)similarity of two pronunciations automatically. While most dialectology and sociolinguistics focuses on single "shibboleths", large-scale comparison holds the promise of including much more material, of assessing the importance of single differences, and perhaps even allowing the formulation of general laws. But it requires automatic procedures to be feasible.

In this talk we sketch two approaches, and the problems they face, in the research program aiming to measure pronunciation dissimilarity automatically. One approach is to measure dissimilarity based on phonetic transcriptions. While this risks "carrying" transcriber errors, it benefits from the transcribers' implicit focus on phonetic quality. A puzzle at present is how to build more phonetic sensitivity into the measurements. Current attempts fail, perhaps because the large numbers compensate sufficiently for missing sensitivity, but perhaps for other reasons as well.
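The abstract does not name the measure applied to transcriptions. As an illustration only, the sketch below assumes a Levenshtein (edit) distance over transcribed segments, with an optional substitution-cost function standing in for "phonetic sensitivity". The function name `edit_distance`, the `sub_cost` parameter, and the example transcriptions are hypothetical and are not taken from the talk.

```python
def edit_distance(a, b, sub_cost=None):
    """Levenshtein distance between two sequences of phonetic segments.

    sub_cost(x, y) returns the cost of substituting segment x by y.
    By default every mismatch costs 1.0 (no phonetic sensitivity).
    Insertions and deletions always cost 1.0 in this sketch.
    """
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [float(j) for j in range(len(b) + 1)]
    for i, seg_a in enumerate(a, start=1):
        curr = [float(i)]
        for j, seg_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1.0,                         # delete seg_a
                curr[j - 1] + 1.0,                     # insert seg_b
                prev[j - 1] + sub_cost(seg_a, seg_b),  # substitute or match
            ))
        prev = curr
    return prev[-1]


# Hypothetical transcriptions of one word in two dialects.
variant_1 = ["m", "ɛ", "l", "k"]
variant_2 = ["m", "ɛ", "l", "ə", "k"]
schwa_insertion = edit_distance(variant_1, variant_2)

# A graded cost (assumed value 0.5) makes a vowel-for-vowel substitution cheaper.
graded = lambda x, y: 0.0 if x == y else 0.5
vowel_swap = edit_distance(["a"], ["e"], graded)
```

Making the measure more phonetically sensitive would, under this assumption, amount to replacing the flat mismatch cost with segment-specific costs passed in as `sub_cost`.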

The second approach is to try to work on acoustic material directly, obviating the need for transcription. But this approach quickly requires techniques for abstracting phonetic quality from the waveform, which, as phoneticians know, is no trivial task. Formant measurements need to be hand-corrected and are therefore not suitable when large amounts of data are to be analyzed. We use principal component analysis on the Bark-filtered spectra of vowels, an acoustic method that can be fully automated. Normalizing for speaker-dependent variation becomes important when working directly with acoustic data. We average over a number of speakers per dialect in order to even out these speaker-specific differences. Subsequently, Euclidean distance is used to measure the distance between vowels in different dialects.
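The acoustic pipeline described above (principal component analysis, averaging over speakers, Euclidean distance) can be sketched roughly as follows. This is a minimal illustration under several assumptions: the spectra are already Bark-filtered, each dialect is stored as an array of shape (speakers, vowels, bands), the number of bands and components is arbitrary, and the data are random placeholders rather than real measurements. All function and variable names are hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA on rows of X (one spectrum per row) via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]  # rows of vt are principal axes


def dialect_vowel_means(spectra, mean, components):
    """Project spectra (speakers, vowels, bands) and average over speakers."""
    scores = (spectra - mean) @ components.T  # (speakers, vowels, components)
    return scores.mean(axis=0)                # (vowels, components)


def vowel_distances(means_a, means_b):
    """Euclidean distance per vowel between two dialects' averaged scores."""
    return np.linalg.norm(means_a - means_b, axis=1)


# Placeholder data: 2 dialects, 5 speakers, 3 vowels, 18 bands (assumed sizes).
rng = np.random.default_rng(0)
dialect_a = rng.normal(size=(5, 3, 18))
dialect_b = rng.normal(loc=0.5, size=(5, 3, 18))

# Fit PCA on all spectra pooled across dialects, speakers, and vowels.
pooled = np.concatenate([dialect_a, dialect_b]).reshape(-1, 18)
mean, components = pca_fit(pooled, n_components=2)

means_a = dialect_vowel_means(dialect_a, mean, components)
means_b = dialect_vowel_means(dialect_b, mean, components)
distances = vowel_distances(means_a, means_b)  # one distance per vowel
```

Averaging the component scores over speakers before taking distances mirrors the order of steps in the abstract; whether the original work pooled data or chose components this way is not stated there.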