Xin Ying Qiu

Also published as: Xinying Qiu


2024

Label Confidence Weighted Learning for Target-level Sentence Simplification
Xin Ying Qiu | Jingshen Zhang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Multi-level sentence simplification generates simplified sentences at varying language proficiency levels. We propose Label Confidence Weighted Learning (LCWL), a novel approach that incorporates a label confidence weighting scheme into the training loss of the encoder-decoder model, setting it apart from existing confidence-weighting methods, which are primarily designed for classification. Experiments on an English grade-level simplification dataset show that LCWL outperforms state-of-the-art unsupervised baselines. Fine-tuning the LCWL model on in-domain data and combining it with Symmetric Cross Entropy (SCE) consistently delivers better simplifications than strong supervised methods. Our results highlight the effectiveness of label confidence weighting techniques for text simplification tasks with encoder-decoder architectures.

2021

Learning Syntactic Dense Embedding with Correlation Graph for Automatic Readability Assessment
Xinying Qiu | Yuan Chen | Hanwu Chen | Jian-Yun Nie | Yuming Shen | Dawei Lu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Deep learning models for automatic readability assessment generally discard the linguistic features traditionally used in machine learning models for the task. We propose to incorporate linguistic features into neural network models by learning syntactic dense embeddings based on those features. To account for the relationships between features, we form a correlation graph among them and use it to learn their embeddings, so that similar features are represented by similar embeddings. Experiments with six data sets of two proficiency levels demonstrate that our proposed methodology can complement a BERT-only model to achieve significantly better performance for automatic readability assessment.

2016

Learning Indonesian-Chinese Lexicon with Bilingual Word Embedding Models and Monolingual Signals
Xinying Qiu | Gangqin Zhu
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016)

We present research on learning an Indonesian-Chinese bilingual lexicon using monolingual word embeddings and bilingual seed lexicons to build a shared bilingual word embedding space. We make a first attempt to examine the impact of different monolingual signals for the choice of seed lexicons on model performance. We found that although monolingual signals alone do not seem to outperform signals covering all words, the significant improvement in learning word translation of the same signal types may suggest that linguistic features possess value for further study in distinguishing the semantic margins of the shared word embedding space.

2004

The Language of Bioscience: Facts, Speculations, and Statements In Between
Marc Light | Xin Ying Qiu | Padmini Srinivasan
HLT-NAACL 2004 Workshop: Linking Biological Literature, Ontologies and Databases