Xiaosong Qiao


2023

pdf bib
Leveraging Multilingual Knowledge Graph to Boost Domain-specific Entity Translation of ChatGPT
Min Zhang | Limin Liu | Zhao Yanqing | Xiaosong Qiao | Su Chang | Xiaofeng Zhao | Junhao Zhu | Ming Zhu | Song Peng | Yinglu Li | Yilun Liu | Wenbing Ma | Mengyao Piao | Shimin Tao | Hao Yang | Yanfei Jiang
Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track

Recently, ChatGPT has shown promising results for Machine Translation (MT) in general domains and is becoming a new paradigm for translation. In this paper, we focus on how to apply ChatGPT to domain-specific translation and propose to leverage Multilingual Knowledge Graph (MKG) to help ChatGPT improve the domain entity translation quality. To achieve this, we extract the bilingual entity pairs from MKG for the domain entities that are recognized from source sentences. We then introduce these pairs into translation prompts, instructing ChatGPT to use the correct translations of the domain entities. To evaluate the novel MKG method for ChatGPT, we conduct comparative experiments on three Chinese-English (zh-en) test datasets constructed from three specific domains, of which one domain is from biomedical science, and the other two are from the Information and Communications Technology (ICT) industry — Visible Light Communication (VLC) and wireless domains. Experimental results demonstrate that both the overall translation quality of ChatGPT (+6.21, +3.13 and +11.25 in BLEU scores) and the translation accuracy of domain entities (+43.2%, +30.2% and +37.9% absolute points) are significantly improved with MKG on the three test datasets.

pdf bib
SmartSpanNER: Making SpanNER Robust in Low Resource Scenarios
Min Zhang | Xiaosong Qiao | Yanqing Zhao | Shimin Tao | Hao Yang
Findings of the Association for Computational Linguistics: EMNLP 2023

Named Entity Recognition (NER) is one of the most fundamental tasks in natural language processing. Span-level prediction (SpanNER) is more naturally suitable for nested NER than sequence labeling (SeqLab). However, according to our experiments, the SpanNER method is more sensitive to the amount of training data, i.e., the F1 score of SpanNER drops much more than that of SeqLab when the amount of training data drops. In order to improve the robustness of SpanNER in low resource scenarios, we propose a simple and effective method SmartSpanNER, which introduces a Named Entity Head (NEH) prediction task to SpanNER and performs multi-task learning together with the task of span classification. Experimental results demonstrate that the robustness of SpanNER could be greatly improved by SmartSpanNER in low resource scenarios constructed on the CoNLL03, Few-NERD, GENIA and ACE05 standard benchmark datasets.

pdf bib
HW-TSC at SemEval-2023 Task 7: Exploring the Natural Language Inference Capabilities of ChatGPT and Pre-trained Language Model for Clinical Trial
Xiaofeng Zhao | Min Zhang | Miaomiao Ma | Chang Su | Yilun Liu | Minghan Wang | Xiaosong Qiao | Jiaxin Guo | Yinglu Li | Wenbing Ma
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

In this paper, we describe the multi strategy system for SemEval-2022 Task 7, This task aims to determine whether a given statement is supported by one or two Clinical Trial reports, and to identify evidence that supports the statement. This is a task that requires high natural language inference capabilities. In Subtask 1, we compare our strategy based on prompt learning and ChatGPT with a baseline constructed using BERT in zero-shot setting, and validate the effectiveness of our strategy. In Subtask 2, we fine-tune DeBERTaV3 for classification without relying on the results from Subtask 1, and we observe that early stopping can effectively prevent model overfitting, which performs well in Subtask 2. In addition, we did not use any ensemble strategies. Ultimately, we achieved the 10th place in Subtask 1 and the 2nd place in Subtask 2.

pdf bib
Empowering a Metric with LLM-assisted Named Entity Annotation: HW-TSC’s Submission to the WMT23 Metrics Shared Task
Zhanglin Wu | Yilun Liu | Min Zhang | Xiaofeng Zhao | Junhao Zhu | Ming Zhu | Xiaosong Qiao | Jingfei Zhang | Ma Miaomiao | Zhao Yanqing | Song Peng | Shimin Tao | Hao Yang | Yanfei Jiang
Proceedings of the Eighth Conference on Machine Translation

This paper presents the submission of Huawei Translation Service Center (HW-TSC) to the WMT23 metrics shared task, in which we submit two metrics: KG-BERTScore and HWTSC-EE-Metric. Among them, KG-BERTScore is our primary submission for the reference-free metric, which can provide both segment-level and system-level scoring. While HWTSC-EE-Metric is our primary submission for the reference-based metric, which can only provide system-level scoring. Overall, our metrics show relatively high correlations with MQM scores on the metrics tasks of previous years. Especially on system-level scoring tasks, our metrics achieve new state-of-the-art in many language pairs.

2022

pdf bib
HW-TSC at SemEval-2022 Task 3: A Unified Approach Fine-tuned on Multilingual Pretrained Model for PreTENS
Yinglu Li | Min Zhang | Xiaosong Qiao | Minghan Wang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

In the paper, we describe a unified system for task 3 of SemEval-2022. The task aims to recognize the semantic structures of sentences by providing two nominal arguments and to evaluate the degree of taxonomic relations. We utilise the strategy that adding language prefix tag in the training set, which is effective for the model. We split the training set to avoid the translation information to be learnt by the model. For the task, we propose a unified model fine-tuned on the multilingual pretrained model, XLM-RoBERTa. The model performs well in subtask 1 (the binary classification subtask). In order to verify whether our model could also perform better in subtask 2 (the regression subtask), the ranking score is transformed into classification labels by an up-sampling strategy. With the ensemble strategy, the performance of our model can be also improved. As a result, the model obtained the second place for subtask 1 and subtask 2 in the competition evaluation.

pdf bib
HW-TSC at SemEval-2022 Task 7: Ensemble Model Based on Pretrained Models for Identifying Plausible Clarifications
Xiaosong Qiao | Yinglu Li | Min Zhang | Minghan Wang | Hao Yang | Shimin Tao | Qin Ying
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

This paper describes the system for the identifying Plausible Clarifications of Implicit and Underspecified Phrases. This task was set up as an English cloze task, in which clarifications are presented as possible fillers and systems have to score how well each filler plausibly fits in a given context. For this shared task, we propose our own solutions, including supervised proaches, unsupervised approaches with pretrained models, and then we use these models to build an ensemble model. Finally we get the 2nd best result in the subtask1 which is a classification task, and the 3rd best result in the subtask2 which is a regression task.

pdf bib
Partial Could Be Better than Whole. HW-TSC 2022 Submission for the Metrics Shared Task
Yilun Liu | Xiaosong Qiao | Zhanglin Wu | Su Chang | Min Zhang | Yanqing Zhao | Song Peng | Shimin Tao | Hao Yang | Ying Qin | Jiaxin Guo | Minghan Wang | Yinglu Li | Peng Li | Xiaofeng Zhao
Proceedings of the Seventh Conference on Machine Translation (WMT)

In this paper, we present the contribution of HW-TSC to WMT 2022 Metrics Shared Task. We propose one reference-based metric, HWTSC-EE-BERTScore*, and four referencefree metrics including HWTSC-Teacher-Sim, HWTSC-TLM, KG-BERTScore and CROSSQE. Among these metrics, HWTSC-Teacher-Sim and CROSS-QE are supervised, whereas HWTSC-EE-BERTScore*, HWTSC-TLM and KG-BERTScore are unsupervised. We use these metrics in the segment-level and systemlevel tracks. Overall, our systems achieve strong results for all language pairs on previous test sets and a new state-of-the-art in many sys-level case sets.

pdf bib
The HW-TSC’s Offline Speech Translation System for IWSLT 2022 Evaluation
Yinglu Li | Minghan Wang | Jiaxin Guo | Xiaosong Qiao | Yuxia Wang | Daimeng Wei | Chang Su | Yimeng Chen | Min Zhang | Shimin Tao | Hao Yang | Ying Qin
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

This paper describes the HW-TSC’s designation of the Offline Speech Translation System submitted for IWSLT 2022 Evaluation. We explored both cascade and end-to-end system on three language tracks (en-de, en-zh and en-ja), and we chose the cascade one as our primary submission. For the automatic speech recognition (ASR) model of cascade system, there are three ASR models including Conformer, S2T-Transformer and U2 trained on the mixture of five datasets. During inference, transcripts are generated with the help of domain controlled generation strategy. Context-aware reranking and ensemble based anti-interference strategy are proposed to produce better ASR outputs. For machine translation part, we pretrained three translation models on WMT21 dataset and fine-tuned them on in-domain corpora. Our cascade system shows competitive performance than the known offline systems in the industry and academia.

pdf bib
The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation
Minghan Wang | Jiaxin Guo | Yinglu Li | Xiaosong Qiao | Yuxia Wang | Zongyao Li | Chang Su | Yimeng Chen | Min Zhang | Shimin Tao | Hao Yang | Ying Qin
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

This paper presents our work in the participation of IWSLT 2022 simultaneous speech translation evaluation. For the track of text-to-text (T2T), we participate in three language pairs and build wait-k based simultaneous MT (SimulMT) model for the task. The model was pretrained on WMT21 news corpora, and was further improved with in-domain fine-tuning and self-training. For the speech-to-text (S2T) track, we designed both cascade and end-to-end form in three language pairs. The cascade system is composed of a chunking-based streaming ASR model and the SimulMT model used in the T2T track. The end-to-end system is a simultaneous speech translation (SimulST) model based on wait-k strategy, which is directly trained on a synthetic corpus produced by translating all texts of ASR corpora into specific target language with an offline MT model. It also contains a heuristic sentence breaking strategy, preventing it from finishing the translation before the the end of the speech. We evaluate our systems on the MUST-C tst-COMMON dataset and show that the end-to-end system is competitive to the cascade one. Meanwhile, we also demonstrate that the SimulMT model can be efficiently optimized by these approaches, resulting in the improvements of 1-2 BLEU points.

pdf bib
The HW-TSC’s Speech to Speech Translation System for IWSLT 2022 Evaluation
Jiaxin Guo | Yinglu Li | Minghan Wang | Xiaosong Qiao | Yuxia Wang | Hengchao Shang | Chang Su | Yimeng Chen | Min Zhang | Shimin Tao | Hao Yang | Ying Qin
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

The paper presents the HW-TSC’s pipeline and results of Offline Speech to Speech Translation for IWSLT 2022. We design a cascade system consisted of an ASR model, machine translation model and TTS model to convert the speech from one language into another language(en-de). For the ASR part, we find that better performance can be obtained by ensembling multiple heterogeneous ASR models and performing reranking on beam candidates. And we find that the combination of context-aware reranking strategy and MT model fine-tuned on the in-domain dataset is helpful to improve the performance. Because it can mitigate the problem that the inconsistency in transcripts caused by the lack of context. Finally, we use VITS model provided officially to reproduce audio files from the translation hypothesis.