Chenchen Xu


2022

pdf bib
Automatic Gloss Dictionary for Sign Language Learners
Chenchen Xu | Dongxu Li | Hongdong Li | Hanna Suominen | Ben Swift
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

A multi-language dictionary is a fundamental tool for language learning, allowing the learner to look up unfamiliar words. Searching an unrecognized word in the dictionary does not usually require deep knowledge of the target language. However, this is not true for sign language, where gestural elements preclude this type of easy lookup. This paper introduces GlossFinder, an online tool supporting 2, 000 signs to assist language learners in determining the meaning of given signs. Unlike alternative systems of complex inputs, our system requires only that learners imitate the sign in front of a standard webcam. A user study conducted among sign language speakers of varying ability compared our system against existing alternatives and the interviews indicated a clear preference for our new system. This implies that GlossFinder can lower the barrier in sign language learning by addressing the common problem of sign finding and make it accessible to the wider community.

2019

pdf bib
PostAc : A Visual Interactive Search, Exploration, and Analysis Platform for PhD Intensive Job Postings
Chenchen Xu | Inger Mewburn | Will J Grant | Hanna Suominen
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Over 60% of Australian PhD graduates land their first job after graduation outside academia, but this job market remains largely hidden to these job seekers. Employers’ low awareness and interest in attracting PhD graduates means that the term “PhD” is rarely used as a keyword in job advertisements; 80% of companies looking to employ similar researchers do not specifically ask for a PhD qualification. As a result, typing in “PhD” to a job search engine tends to return mostly academic jobs. We set out to make the market for advanced research skills more visible to job seekers. In this paper, we present PostAc, an online platform of authentic job postings that helps PhD graduates sharpen their career thinking. The platform is underpinned by research on the key factors that identify what an employer is looking for when they want to hire a highly skilled researcher. Its ranking model leverages the free-form text embedded in the job description to quantify the most sought-after PhD skills and educate information seekers about the Australian job-market appetite for PhD skills. The platform makes visible the geographic location, industry sector, job title, working hours, continuity, and wage of the research intensive jobs. This is the first data-driven exploration in this field. Both empirical results and online platform will be presented in this paper.

pdf bib
ALTER: Auxiliary Text Rewriting Tool for Natural Language Generation
Qiongkai Xu | Chenchen Xu | Lizhen Qu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

In this paper, we describe ALTER, an auxiliary text rewriting tool that facilitates the rewriting process for natural language generation tasks, such as paraphrasing, text simplification, fairness-aware text rewriting, and text style transfer. Our tool is characterized by two features, i) recording of word-level revision histories and ii) flexible auxiliary edit support and feedback to annotators. The text rewriting assist and traceable rewriting history are potentially beneficial to the future research of natural language generation.

pdf bib
Privacy-Aware Text Rewriting
Qiongkai Xu | Lizhen Qu | Chenchen Xu | Ran Cui
Proceedings of the 12th International Conference on Natural Language Generation

Biased decisions made by automatic systems have led to growing concerns in research communities. Recent work from the NLP community focuses on building systems that make fair decisions based on text. Instead of relying on unknown decision systems or human decision-makers, we argue that a better way to protect data providers is to remove the trails of sensitive information before publishing the data. In light of this, we propose a new privacy-aware text rewriting task and explore two privacy-aware back-translation methods for the task, based on adversarial training and approximate fairness risk. Our extensive experiments on three real-world datasets with varying demographical attributes show that our methods are effective in obfuscating sensitive attributes. We have also observed that the fairness risk method retains better semantics and fluency, while the adversarial training method tends to leak less sensitive information.