Lu Lu


2023

CIF-PT: Bridging Speech and Text Representations for Spoken Language Understanding via Continuous Integrate-and-Fire Pre-Training
Linhao Dong | Zhecheng An | Peihao Wu | Jun Zhang | Lu Lu | Ma Zejun
Findings of the Association for Computational Linguistics: ACL 2023

Speech and text representations generated by pre-trained models contain modality-specific information that can be combined to benefit spoken language understanding (SLU) tasks. In this work, we propose a novel pre-training paradigm termed Continuous Integrate-and-Fire Pre-Training (CIF-PT). It relies on a simple but effective frame-to-token alignment mechanism, continuous integrate-and-fire (CIF), to bridge the representations of speech and text. The pre-training (PT) jointly performs speech-to-text training and language model distillation through CIF. Evaluated on the SLU benchmark dataset SLURP, CIF-PT outperforms the state-of-the-art model by 1.94% in accuracy and 2.71% in SLU-F1 on the tasks of intent classification and slot filling, respectively. We also observe that the cross-modal representation extracted by CIF-PT performs better than other neural interfaces for SLU tasks, including the dominant speech representation learned from self-supervised pre-training.

2022

Inclusion in CSR Reports: The Lens from a Data-Driven Machine Learning Model
Lu Lu | Jinghang Gu | Chu-Ren Huang
Proceedings of the First Computing Social Responsibility Workshop within the 13th Language Resources and Evaluation Conference

Inclusion, as one of the foundations of the diversity, equity, and inclusion initiative, concerns the degree to which one is treated as an ingroup member in a workplace. Despite its importance in a corporate ecosystem, inclusion strategies and their performance are not adequately addressed in corporate social responsibility (CSR) and CSR reporting. This study proposes a machine learning and big data-based model to examine inclusion through the use of stereotype content in actual language use. The distribution of stereotype content in general corpora of a given society is used as a baseline, against which corporate texts are compared. This study not only proposes a model to identify and classify inclusion in language use, but also provides insights for measuring and tracking progress by including inclusion in CSR reports as a strategy to build an inclusive corporate team.

2020

Abstract Meaning Representation for MWE: A study of the mapping of aspectuality based on Mandarin light verb jiayi
Lu Lu | Nianwen Xue | Chu-Ren Huang
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation