Kjetil Røa Hauge


2018

Parallel Web Display of Transcribed Spoken Bulgarian with its Normalised Version and an Indexed List of Lemmas
Marina Dzhonova | Kjetil Røa Hauge | Yovka Tisheva
Proceedings of the Third International Conference on Computational Linguistics in Bulgaria (CLIB 2018)

We present and discuss problems in creating a lemmatised index to transcriptions of Bulgarian speech, including the prerequisites for such an index, and explain why we consider an index preferable to a search engine for this particular kind of text.