Transformer versus LSTM Language Models trained on Uncertain ASR Hypotheses in Limited Data Scenarios

Imran Sheikh, Emmanuel Vincent, Irina Illina


Abstract
In several ASR use cases, training and adaptation of domain-specific LMs can rely only on a small amount of manually verified text transcriptions and sometimes a limited amount of in-domain speech. Training of LSTM LMs in such limited data scenarios can benefit from alternate uncertain ASR hypotheses, as observed in our recent work. In this paper, we propose a method to train Transformer LMs on ASR confusion networks. We evaluate whether these self-attention based LMs are better than LSTM LMs at exploiting alternate ASR hypotheses. Evaluation results show that Transformer LMs achieve a 3-6% relative reduction in perplexity on the AMI scenario meetings but perform similarly to LSTM LMs on the smaller Verbmobil conversational corpus. Evaluation on ASR N-best rescoring shows that LSTM and Transformer LMs trained on ASR confusion networks do not bring significant WER reductions. However, a qualitative analysis reveals that they are better at predicting less frequent words.
Anthology ID:
2022.lrec-1.41
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
393–399
URL:
https://aclanthology.org/2022.lrec-1.41
Cite (ACL):
Imran Sheikh, Emmanuel Vincent, and Irina Illina. 2022. Transformer versus LSTM Language Models trained on Uncertain ASR Hypotheses in Limited Data Scenarios. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 393–399, Marseille, France. European Language Resources Association.
Cite (Informal):
Transformer versus LSTM Language Models trained on Uncertain ASR Hypotheses in Limited Data Scenarios (Sheikh et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.41.pdf