Xingyuan Li


2024

pdf bib
Phonetic and Lexical Discovery of Canine Vocalization
Theron S. Wang | Xingyuan Li | Chunhao Zhang | Mengyue Wu | Kenny Q. Zhu
Findings of the Association for Computational Linguistics: EMNLP 2024

This paper attempts to discover communication patterns automatically within dog vocalizations in a data-driven approach, which breaks the barrier previous approaches that rely on human prior knowledge on limited data. We present a self-supervised approach with HuBERT, enabling the accurate classification of phones, and an adaptive grammar induction method that identifies phone sequence patterns that suggest a preliminary vocabulary within dog vocalizations. Our results show that a subset of this vocabulary has substantial causality relations with certain canine activities, suggesting signs of stable semantics associated with these “words”.

2020

pdf bib
CIST@CL-SciSumm 2020, LongSumm 2020: Automatic Scientific Document Summarization
Lei Li | Yang Xie | Wei Liu | Yinan Liu | Yafei Jiang | Siya Qi | Xingyuan Li
Proceedings of the First Workshop on Scholarly Document Processing

Our system participates in two shared tasks, CL-SciSumm 2020 and LongSumm 2020. In the CL-SciSumm shared task, based on our previous work, we apply more machine learning methods on position features and content features for facet classification in Task1B. And GCN is introduced in Task2 to perform extractive summarization. In the LongSumm shared task, we integrate both the extractive and abstractive summarization ways. Three methods were tested which are T5 Fine-tuning, DPPs Sampling, and GRU-GCN/GAT.