Huizhen Wang


pdf bib
Bridging the Granularity Gap for Acoustic Modeling
Chen Xu | Yuhao Zhang | Chengbo Jiao | Xiaoqian Liu | Chi Hu | Xin Zeng | Tong Xiao | Anxiang Ma | Huizhen Wang | Jingbo Zhu
Findings of the Association for Computational Linguistics: ACL 2023

While Transformer has become the de-facto standard for speech, modeling upon the fine-grained frame-level features remains an open challenge of capturing long-distance dependencies and distributing the attention weights. We propose Progressive Down-Sampling (PDS) which gradually compresses the acoustic features into coarser-grained units containing more complete semantic information, like text-level representation. In addition, we develop a representation fusion method to alleviate information loss that occurs inevitably during high compression. In this way, we compress the acoustic features into 1/32 of the initial length while achieving better or comparable performances on the speech recognition task. And as a bonus, it yields inference speedups ranging from 1.20x to 1.47x.By reducing the modeling burden, we also achieve competitive results when training on the more challenging speech translation task.


pdf bib
利用图像描述与知识图谱增强表示的视觉问答(Exploiting Image Captions and External Knowledge as Representation Enhancement for Visual Question Answering)
Gechao Wang (王屹超) | Muhua Zhu (朱慕华) | Chen Xu (许晨) | Yan Zhang (张琰) | Huizhen Wang (王会珍) | Jingbo Zhu (朱靖波)
Proceedings of the 20th Chinese National Conference on Computational Linguistics



pdf bib
Shallow-to-Deep Training for Neural Machine Translation
Bei Li | Ziyang Wang | Hui Liu | Yufan Jiang | Quan Du | Tong Xiao | Huizhen Wang | Jingbo Zhu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep encoders have been proven to be effective in improving neural machine translation (NMT) systems, but training an extremely deep encoder is time consuming. Moreover, why deep models help NMT is an open question. In this paper, we investigate the behavior of a well-tuned deep Transformer system. We find that stacking layers is helpful in improving the representation ability of NMT models and adjacent layers perform similarly. This inspires us to develop a shallow-to-deep training method that learns deep models by stacking shallow models. In this way, we successfully train a Transformer system with a 54-layer encoder. Experimental results on WMT’16 English-German and WMT’14 English-French translation tasks show that it is 1:4 faster than training from scratch, and achieves a BLEU score of 30:33 and 43:29 on two tasks. The code is publicly available at

pdf bib
A Simple and Effective Approach to Robust Unsupervised Bilingual Dictionary Induction
Yanyang Li | Yingfeng Luo | Ye Lin | Quan Du | Huizhen Wang | Shujian Huang | Tong Xiao | Jingbo Zhu
Proceedings of the 28th International Conference on Computational Linguistics

Unsupervised Bilingual Dictionary Induction methods based on the initialization and the self-learning have achieved great success in similar language pairs, e.g., English-Spanish. But they still fail and have an accuracy of 0% in many distant language pairs, e.g., English-Japanese. In this work, we show that this failure results from the gap between the actual initialization performance and the minimum initialization performance for the self-learning to succeed. We propose Iterative Dimension Reduction to bridge this gap. Our experiments show that this simple method does not hamper the performance of similar language pairs and achieves an accuracy of 13.64 55.53% between English and four distant languages, i.e., Chinese, Japanese, Vietnamese and Thai.


pdf bib
Exploiting Lexical Dependencies from Large-Scale Data for Better Shift-Reduce Constituency Parsing
Muhua Zhu | Jingbo Zhu | Huizhen Wang
Proceedings of COLING 2012


pdf bib
Boosting-Based System Combination for Machine Translation
Tong Xiao | Jingbo Zhu | Muhua Zhu | Huizhen Wang
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

pdf bib
A Multi-stage Clustering Framework for Chinese Personal Name Disambiguation
Huizhen Wang | Haibo Ding | Yingchao Shi | Ji Ma | Xiao Zhou | Jingbo Zhu
CIPS-SIGHAN Joint Conference on Chinese Language Processing


pdf bib
Chinese-English Organization Name Translation Based on Correlative Expansion
Feiliang Ren | Muhua Zhu | Huizhen Wang | Jingbo Zhu
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)


pdf bib
Multi-Criteria-Based Strategy to Stop Active Learning for Data Annotation
Jingbo Zhu | Huizhen Wang | Eduard Hovy
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Active Learning with Sampling by Uncertainty and Density for Word Sense Disambiguation and Text Classification
Jingbo Zhu | Huizhen Wang | Tianshun Yao | Benjamin K Tsou
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Learning a Stopping Criterion for Active Learning for Word Sense Disambiguation and Text Classification
Jingbo Zhu | Huizhen Wang | Eduard Hovy
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I


pdf bib
Designing Special Post-Processing Rules for SVM-Based Chinese Word Segmentation
Muhua Zhu | Yilin Wang | Zhenxing Wang | Huizhen Wang | Jingbo Zhu
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing