Dongfeng Cai

Also published as: DongFeng Cai


2024

A Corpus and Method for Chinese Named Entity Recognition in Manufacturing
Ruiting Li | Peiyan Wang | Libang Wang | Danqingxin Yang | Dongfeng Cai
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Manufacturing specifications are documents describing the different techniques, processes, and components involved in manufacturing. With the development of smart manufacturing, there is a growing demand for named entity recognition (NER) resources and techniques for manufacturing-specific named entities. In this paper, we introduce MS-NERC, a corpus of Chinese manufacturing specifications comprising 4,424 sentences and 16,383 entities. We also propose an entity recognizer named Trainable State Transducer (TST), which is initialized with a finite state transducer describing the morphological patterns of entities. It can recognize entities directly from prior morphological knowledge, without training. Experimental results show that TST achieves an overall F1 score of 82.05% for morphological-specific entities in the zero-shot setting. TST can be further improved through training, after which it outperforms neural methods in the few-shot and rich-resource settings. We believe that our corpus and model will be valuable resources for NER research not only in manufacturing but also in other low-resource domains.

2016

Interactive-Predictive Machine Translation based on Syntactic Constraints of Prefix
Na Ye | Guiping Zhang | Dongfeng Cai
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Interactive-predictive machine translation (IPMT) is a translation mode that combines machine translation technology with human behaviours. In an IPMT system, how the prefix is used greatly affects interaction efficiency. However, state-of-the-art methods filter translation hypotheses mainly by how well they match the prefix at the character level, so the prefix is not fully exploited. To address this problem, this paper mines the deeper constraints of the prefix at the syntactic level to improve the performance of IPMT systems. Two syntactic subtree matching rules based on phrase structure grammar are proposed to filter translation hypotheses more strictly. Experimental results on LDC Chinese-English corpora show that the proposed method outperforms a state-of-the-art phrase-based IPMT system while keeping comparable decoding speed.

2015

Productivity promotion strategies for collaborative translation on huge volume technical documents
Guiping Zhang | Na Ye | Fang Cai | Chuang Wu | Xiangkui Sun | Jinfu Yuan | Dongfeng Cai
Proceedings of Machine Translation Summit XV: User Track

Strategy-Based Technology for Estimating MT Quality
Liugang Shang | Dongfeng Cai | Duo Ji
Proceedings of the Tenth Workshop on Statistical Machine Translation

2012

Zhou qiaoli: A divide-and-conquer strategy for semantic dependency parsing
Qiaoli Zhou | Ling Zhang | Fei Liu | Dongfeng Cai | Guiping Zhang
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

2011

Multi-stage Chinese Dependency Parsing Based on Dependency Direction
Wenjing Lang | Qiaoli Zhou | Guiping Zhang | Dongfeng Cai
Proceedings of Machine Translation Summit XIII: Papers

2010

Bigram HMM with Context Distribution Clustering for Unsupervised Chinese Part-of-Speech tagging
Lidan Zhang | Kwok-Ping Chan | Chunyu Kit | Dongfeng Cai
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Automatic Identification of Predicate Heads in Chinese Sentences
Xiaona Ren | Qiaoli Zhou | Chunyu Kit | Dongfeng Cai
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Active Learning Based Corpus Annotation
Hongyan Song | Tianfang Yao | Chunyu Kit | Dongfeng Cai
CIPS-SIGHAN Joint Conference on Chinese Language Processing

The SAU Report for the 1st CIPS-SIGHAN-ParsEval-2010
Qiaoli Zhou | Wenjing Lang | Yingying Wang | Yan Wang | Dongfeng Cai
CIPS-SIGHAN Joint Conference on Chinese Language Processing

2009

Dependency Grammar Based English Subject-Verb Agreement Evaluation
Dongfeng Cai | Yonghua Hu | Xuelei Miao | Yan Song
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 1

2006

Chinese Word Segmentation Based on an Approach of Maximum Entropy Modeling
Yan Song | Jiaqing Guo | Dongfeng Cai
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing

HowNet Based Chinese Question Classification
Dongfeng Cai | Jingguang Sun | Guiping Zhang | Dexin Lv | Yanju Dong | Yan Song | Chao Yu
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation

Research on concept-sememe tree and semantic relevance computation
GuiPing Zhang | Chao Yu | DongFeng Cai | Yan Song | JingGuang Sun
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation