2024
Conformer-Based Speech Recognition On Extreme Edge-Computing Devices
Mingbin Xu | Alex Jin | Sicheng Wang | Mu Su | Tim Ng | Henry Mason | Shiyi Han | Zhihong Lei | Yaqiao Deng | Zhen Huang | Mahesh Krishnamoorthy
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
With increasingly powerful compute capabilities and resources in today’s devices, traditionally compute-intensive automatic speech recognition (ASR) has been moving from the cloud to devices to better protect user privacy. However, it is still challenging to implement on-device ASR on resource-constrained devices, such as smartphones, smart wearables, and other small home automation devices. In this paper, we propose a series of model architecture adaptations, neural network graph transformations, and numerical optimizations to fit an advanced Conformer-based end-to-end streaming ASR system on resource-constrained devices without accuracy degradation. We achieve speech recognition more than 5.26 times faster than real time (0.19 RTF) on small wearables while minimizing energy consumption and achieving state-of-the-art accuracy. The proposed methods are widely applicable to other transformer-based server-free AI applications. In addition, we provide a complete theory on optimal pre-normalizers that numerically stabilize layer normalization in any Lp norm using any floating-point precision.
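To make the pre-normalizer idea concrete, here is a minimal NumPy sketch assuming an L∞ pre-normalizer (dividing by the largest absolute value, one member of the Lp family the paper analyzes); it illustrates the general technique and is not the authors' implementation. Because layer normalization is invariant to positive rescaling of its input, dividing by max|x| first keeps every intermediate value in [-1, 1], so the sums of squares cannot overflow in half precision.

```python
import numpy as np

def prenormalized_layer_norm(x, eps=1e-5):
    """Layer norm preceded by an L-infinity pre-normalizer (illustrative).

    Layer norm is invariant to positive rescaling of its input, so
    dividing by max|x| leaves the result unchanged up to rounding
    while keeping every intermediate value in [-1, 1].
    """
    x = np.asarray(x, dtype=np.float16)          # simulate low precision
    scale = np.max(np.abs(x), axis=-1, keepdims=True)
    scale[scale == 0] = 1.0                      # avoid 0/0 on all-zero rows
    y = x / scale                                # pre-normalize: |y| <= 1
    mu = y.mean(axis=-1, keepdims=True)
    var = ((y - mu) ** 2).mean(axis=-1, keepdims=True)
    return (y - mu) / np.sqrt(var + eps)

# Squaring these values directly would overflow float16 (max ~65504);
# with the pre-normalizer the computation stays finite.
x = [60000.0, -60000.0, 30000.0]
print(prenormalized_layer_norm(x))
```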
2018
Dual Fixed-Size Ordinally Forgetting Encoding (FOFE) for Competitive Neural Language Models
Sedtawut Watcharawittayakul | Mingbin Xu | Hui Jiang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
In this paper, we propose a new approach to employing the fixed-size ordinally-forgetting encoding (FOFE) (Zhang et al., 2015b) in neural language modelling, called dual-FOFE. The main idea of dual-FOFE is to use two different forgetting factors, which avoids the trade-off of choosing either a small or a large value for a single forgetting factor. In our experiments, we have compared the dual-FOFE based neural network language models (NNLMs) against the original FOFE counterparts and various traditional NNLMs. Our results on the challenging Google Billion Word corpus show that both FOFE and dual-FOFE yield very strong performance while significantly reducing the computational complexity compared with other NNLMs. Furthermore, the proposed dual-FOFE method gives a further improvement of over 10% in perplexity over the original FOFE model.
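As a rough illustration of the mechanism (not the paper's code), the sketch below computes the FOFE code z_t = α·z_{t-1} + e(w_t) over a word-id sequence and concatenates two codes with different forgetting factors; the vocabulary size and the factor values 0.5 and 0.9 are arbitrary choices for the example.

```python
import numpy as np

def fofe(word_ids, vocab_size, alpha):
    """FOFE code of a word-id sequence: z_t = alpha * z_{t-1} + e(w_t),
    where e(w) is the one-hot vector of word w. Returns the final z_T."""
    z = np.zeros(vocab_size)
    for w in word_ids:
        z = alpha * z
        z[w] += 1.0
    return z

def dual_fofe(word_ids, vocab_size, alpha1=0.5, alpha2=0.9):
    """Concatenate two FOFE codes with different forgetting factors:
    the smaller factor emphasizes recent words, while the larger one
    preserves more of the distant history."""
    return np.concatenate([fofe(word_ids, vocab_size, alpha1),
                           fofe(word_ids, vocab_size, alpha2)])

# Toy example: a 5-word vocabulary and the sequence [0, 2, 2, 4].
print(dual_fofe([0, 2, 2, 4], vocab_size=5))
```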
2017
Word Embeddings based on Fixed-Size Ordinally Forgetting Encoding
Joseph Sanu | Mingbin Xu | Hui Jiang | Quan Liu
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
In this paper, we propose to learn word embeddings based on the recent fixed-size ordinally-forgetting encoding (FOFE) method, which can almost uniquely encode any variable-length sequence into a fixed-size representation. We use FOFE to fully encode the left and right context of each word in a corpus to construct a novel word-context matrix, which is then weighted and factorized using truncated SVD to generate low-dimensional word embedding vectors. We evaluate this alternative way of encoding word-context statistics and show that the FOFE method has a notable effect on the resulting word embeddings. Experimental results on several popular word similarity tasks demonstrate that the proposed method outperforms other SVD models that use canonical count-based techniques to generate word-context matrices.
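The pipeline described above (FOFE-weighted context counts, a weighting step, then truncated SVD) might look roughly like the toy code below; the log1p weighting, the forgetting factor, and the dimensions are illustrative stand-ins, not the paper's exact choices.

```python
import numpy as np

def fofe_context_matrix(sentences, vocab_size, alpha=0.7):
    """Accumulate FOFE-weighted left and right context counts per word:
    the context word adjacent to the target gets weight alpha**0,
    the next-nearest alpha**1, and so on."""
    # Columns 0..V-1 hold left-context counts, V..2V-1 right-context counts.
    M = np.zeros((vocab_size, 2 * vocab_size))
    for sent in sentences:
        for i, w in enumerate(sent):
            for k, c in enumerate(reversed(sent[:i])):   # left context, nearest first
                M[w, c] += alpha ** k
            for k, c in enumerate(sent[i + 1:]):         # right context, nearest first
                M[w, vocab_size + c] += alpha ** k
    return M

def fofe_embeddings(M, dim=2):
    """Weight the matrix (log1p here as a stand-in) and factorize it
    with a truncated SVD to obtain low-dimensional embeddings."""
    U, S, _ = np.linalg.svd(np.log1p(M), full_matrices=False)
    return U[:, :dim] * S[:dim]

# Toy corpus over a 4-word vocabulary.
sents = [[0, 1, 2, 3], [2, 1, 0], [3, 2, 1]]
E = fofe_embeddings(fofe_context_matrix(sents, vocab_size=4))
print(E.shape)  # (4, 2)
```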
A Local Detection Approach for Named Entity Recognition and Mention Detection
Mingbin Xu | Hui Jiang | Sedtawut Watcharawittayakul
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In this paper, we study a novel approach to named entity recognition (NER) and mention detection (MD) in natural language processing. Instead of treating NER as a sequence labeling problem, we propose a new local detection approach, which relies on the recent fixed-size ordinally-forgetting encoding (FOFE) method to fully encode each sentence fragment and its left/right contexts into a fixed-size representation. A simple feedforward neural network (FFNN) is then learned to either reject each individual text fragment or predict an entity label for it. The proposed method has been evaluated on several popular NER and MD tasks, including the CoNLL-2003 NER task and the TAC-KBP 2015 and TAC-KBP 2016 Tri-lingual Entity Discovery and Linking (EDL) tasks. Our method yields strong performance in all of these tasks, and the local detection approach has shown many advantages over traditional sequence labeling methods.
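A minimal sketch of the local detection scheme follows: every candidate fragment is scored independently from the FOFE codes of its left context, the fragment itself, and its right context. The fofe helper, the feature layout, and the untrained linear layer standing in for the trained FFNN are all illustrative assumptions, not the released system.

```python
import numpy as np

def fofe(ids, vocab_size, alpha=0.7):
    """FOFE code of a word-id sequence: z_t = alpha * z_{t-1} + e(w_t)."""
    z = np.zeros(vocab_size)
    for w in ids:
        z = alpha * z
        z[w] += 1.0
    return z

def fragment_features(sent, start, end, vocab_size, alpha=0.7):
    """Concatenate FOFE codes of left context, fragment, and right context.
    The right context is encoded right-to-left so that, as on the left,
    the word nearest the fragment carries the largest weight."""
    left = fofe(sent[:start], vocab_size, alpha)
    frag = fofe(sent[start:end], vocab_size, alpha)
    right = fofe(sent[end:][::-1], vocab_size, alpha)
    return np.concatenate([left, frag, right])

# Enumerate all fragments up to length 3 and score each with a randomly
# initialized linear layer standing in for the trained FFNN that would
# reject a fragment or assign it an entity label.
rng = np.random.default_rng(0)
vocab_size, n_labels = 6, 4                  # label 0 could mean "reject"
W = rng.normal(size=(3 * vocab_size, n_labels))
sent = [0, 3, 5, 2, 1]
for i in range(len(sent)):
    for j in range(i + 1, min(i + 3, len(sent)) + 1):
        label = int(np.argmax(fragment_features(sent, i, j, vocab_size) @ W))
        print(f"fragment {sent[i:j]} -> label {label}")
```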
2015
The Fixed-Size Ordinally-Forgetting Encoding Method for Neural Network Language Models
ShiLiang Zhang | Hui Jiang | MingBin Xu | JunFeng Hou | LiRong Dai
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)