Abstract
Recording and publishing your “own” audiovisual (AV) memories is so easy nowadays that nearly everyone can (and perhaps will) do it. Of course, not all recorded material will be of great historical or social interest, but how do we decide what is valuable and what is not? Most AV-recorded material carries little or no useful metadata, so metadata must be added to make these recordings accessible. One of the most promising technologies for adding metadata is automatic speech recognition (ASR): a technology that transforms speech into a sequence of the most likely spoken words. At present, reliable recognition with 95% of the words correct is not possible, and we have to deal with imperfections: sometimes no more than 40% of the words are recognized correctly.
Nevertheless, ASR is suitable for making spoken memories accessible, and in recent years we have seen an increasing number of such projects. In this talk we will present an overview of two upcoming oral history projects: Sobibor and MATRA.
In the Sobibor project, 35 interviews with “Nebenkläger” (relatives of people killed in Sobibor) and survivors of the Sobibor camp are aligned. Because not all interviewees speak Dutch, multilinguality becomes an issue here.
In the MATRA project, 500 inhabitants of Croatia will be interviewed about their memories of the Yugoslavian civil war (1991-1995). Full speech recognition for Croatian does not exist yet, so other technologies will be used to make these data accessible. Moreover, because relatively few people understand Croatian, full translations into English and automatic term translation into other languages will be produced to make the data as widely accessible as possible.