The Project


This project was carried out between 2015 and 2018. In the project, we made over 2500 hours of radio broadcasts from Omrop Fryslân (Frisian Broadcast) accessible. The radio broadcasts contain spoken Frisian and Dutch and cover the period 1950–2000. We are working on a follow-up.

We used speech technology for spoken document retrieval (speech-to-text conversion) and for speaker tracking (speaker diarization and recognition). This allowed us to locate broadcasts addressing specific topics, and specific speakers in the audio signal. To guarantee relevance in retrieval, the project also developed an enriched Frisian lexicon and a semantic search engine for Frisian and Dutch to search the broadcasts. The non-academic project partners recognize this newly accessible data as a rich source of Frisian cultural heritage.

The project carried out innovative research by investigating the efficiency and performance of:

1. Automatic Speech Recognition of Frisian and Dutch using either two separate recognizers or a hybrid one;
2. the integration of speaker diarization and speaker recognition applied to a large longitudinal data set;
3. a flexible semantic search interface targeted at various user groups.

All of these topics require efficient processing because of the sheer volume of the data.

FAME! stands for Frisian Audio Mining Enterprise. The project is funded by NWO’s Creative Industry Programme under project number 314-99-119.