2023
Duplex Diffusion Models Improve Speech-to-Speech Translation
Xianchao Wu
Findings of the Association for Computational Linguistics: ACL 2023
Speech-to-speech translation is a typical sequence-to-sequence learning task that naturally has two directions. How can bidirectional supervision signals be leveraged effectively to produce high-fidelity audio in both directions? Existing approaches either train two separate models or a single multitask model, both with low efficiency and inferior performance. In this paper, we propose a duplex diffusion model that applies diffusion probabilistic models to both sides of a reversible duplex Conformer, so that either end can simultaneously input and output speech in a distinct language. Our model enables reversible speech translation by simply flipping the input and output ends. Experiments show that our model achieves the first success of reversible speech translation, with significant improvements in ASR-BLEU scores over a list of state-of-the-art baselines.
SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF
Yi Dong | Zhilin Wang | Makesh Sreedhar | Xianchao Wu | Oleksii Kuchaiev
Findings of the Association for Computational Linguistics: EMNLP 2023
Model alignment with human preferences is an essential step in making Large Language Models (LLMs) helpful and consistent with human values. It typically consists of supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) stages. However, RLHF faces inherent limitations stemming from a complex training setup and its tendency to align the model with implicit values that end users cannot control at run-time. Moreover, reward models in the RLHF stage commonly rely on single-dimensional feedback, as opposed to explicit, multifaceted signals that indicate attributes such as helpfulness, humor, and toxicity. To address these limitations, we propose SteerLM, a supervised fine-tuning method that empowers end users to control responses during inference. SteerLM conditions responses to conform to an explicitly defined, multi-dimensional set of attributes, thereby yielding a steerable AI capable of generating helpful, high-quality responses while maintaining customizability. Experiments show that SteerLM, trained on open-source datasets, generates responses that human and automatic evaluators prefer over many state-of-the-art baselines trained with RLHF, while being much easier to train. Try SteerLM at https://huggingface.co/nvidia/SteerLM-llama2-13B
2022
Proceedings of the Second Workshop on When Creative AI Meets Conversational AI
Xianchao Wu | Peiying Ruan | Sheng Li | Yi Dong
Proceedings of the Second Workshop on When Creative AI Meets Conversational AI
Creative Painting with Latent Diffusion Models
Xianchao Wu
Proceedings of the Second Workshop on When Creative AI Meets Conversational AI
Artistic painting has achieved significant progress in recent years. Using a variational autoencoder to connect the original images with compressed latent spaces and a cross-attention-enhanced U-Net as the backbone of diffusion, latent diffusion models (LDMs) have achieved stable and high-fidelity image generation. In this paper, we focus on enhancing the creative painting ability of current LDMs in two directions: textual condition extension and model retraining with the Wikiart dataset. Through textual condition extension, users’ input prompts are expanded with rich contextual knowledge for deeper understanding and explanation of the prompts. The Wikiart dataset contains 80K famous artworks, created over the past 400 years by more than 1,000 famous artists in rich styles and genres. Through retraining, we are able to ask these artists to draw artistic and creative paintings on modern topics. Direct comparisons with the original model show that creativity and artistry are enriched.
2021
NVJPFSI at FinCausal 2021 Span-based Causality Extraction Task
Xianchao Wu
Proceedings of the 3rd Financial Narrative Processing Workshop
2020
Event-Driven Learning of Systematic Behaviours in Stock Markets
Xianchao Wu
Findings of the Association for Computational Linguistics: EMNLP 2020
It is reported that financial news, especially financial events expressed in news, informs investors’ long/short decisions and influences the movements of stock markets. Motivated by this, we leverage financial event streams to train a classification neural network that detects latent event-stock linkages and systematic behaviours in the U.S. stock market. Our proposed pipeline includes (1) a combined event extraction method that utilizes Open Information Extraction and neural co-reference resolution, (2) a BERT/ALBERT-enhanced representation of events, and (3) an extended hierarchical attention network with attention at the event, news, and temporal levels. Our pipeline achieves significantly better accuracy and higher simulated annualized returns than state-of-the-art models when applied to predicting the Standard & Poor’s 500, Dow Jones, and Nasdaq indices and 10 individual stocks.
2018
Dialog Generation Using Multi-Turn Reasoning Neural Networks
Xianchao Wu | Ander Martínez | Momo Klyen
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
In this paper, we propose a generalizable dialog generation approach that adapts multi-turn reasoning, a recent advancement in the field of document comprehension, to generate responses (“answers”) by treating the current conversation session context as a “document” and the current query as a “question”. The major idea is to represent a conversation session as memories over which an attention-based memory reading mechanism can be performed multiple times, so that (1) the user’s query is properly extended with contextual clues and (2) optimal responses are generated step by step. Considering that a conversation is not limited to a single speaker, we separate the single memory used for document comprehension into different groups for speaker-specific topic and opinion embedding. Namely, we utilize the queries’ memory, the responses’ memory, and their unified memory, following the time sequence of the conversation session. Experiments on Japanese 10-sentence (5-round) conversation modeling show impressive results on how multi-turn reasoning can produce more diverse and acceptable responses than state-of-the-art single-turn and non-reasoning baselines.
Playing 20 Question Game with Policy-Based Reinforcement Learning
Huang Hu | Xianchao Wu | Bingfeng Luo | Chongyang Tao | Can Xu | Wei Wu | Zhan Chen
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
The 20 Questions (Q20) game is a well-known game that encourages deductive reasoning and creativity. In the game, the answerer first thinks of an object such as a famous person or a kind of animal. Then the questioner tries to guess the object by asking 20 questions. In a Q20 game system, the user is considered the answerer, while the system itself acts as the questioner, which requires a good question-selection strategy to figure out the correct object and win the game. However, the optimal policy for question selection is hard to derive due to the complexity and volatility of the game environment. In this paper, we propose a novel policy-based Reinforcement Learning (RL) method that enables the questioner agent to learn the optimal policy of question selection through continuous interaction with users. To facilitate training, we also propose using a reward network to estimate more informative rewards. Compared to previous methods, our RL method is robust to noisy answers and does not rely on a Knowledge Base of objects. Experimental results show that our RL method clearly outperforms an entropy-based engineering system and achieves competitive performance in a noise-free simulation environment.
2013
Generalization of Words for Chinese Dependency Parsing
Xianchao Wu | Jie Zhou | Yu Sun | Zhanyi Liu | Dianhai Yu | Hua Wu | Haifeng Wang
Proceedings of the 13th International Conference on Parsing Technologies (IWPT 2013)
Mining Japanese Compound Words and Their Pronunciations from Web Pages and Tweets
Xianchao Wu
Proceedings of the Sixth International Joint Conference on Natural Language Processing
Using the Web to Train a Mobile Device Oriented Japanese Input Method Editor
Xianchao Wu | Rixin Xiao | Xiaoxin Chen
Proceedings of the Sixth International Joint Conference on Natural Language Processing
2012
Learning to Translate with Multiple Objectives
Kevin Duh | Katsuhito Sudoh | Xianchao Wu | Hajime Tsukada | Masaaki Nagata
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
A Comparative Study of Target Dependency Structures for Statistical Machine Translation
Xianchao Wu | Katsuhito Sudoh | Kevin Duh | Hajime Tsukada | Masaaki Nagata
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Akamon: An Open Source Toolkit for Tree/Forest-Based Statistical Machine Translation
Xianchao Wu | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the ACL 2012 System Demonstrations
Head Finalization Reordering for Chinese-to-Japanese Machine Translation
Dan Han | Katsuhito Sudoh | Xianchao Wu | Kevin Duh | Hajime Tsukada | Masaaki Nagata
Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation
Using Collocations and K-means Clustering to Improve the N-pos Model for Japanese IME
Long Chen | Xianchao Wu | Jingzhou He
Proceedings of the Second Workshop on Advances in Text Input Methods
2011
Effective Use of Function Words for Rule Generalization in Forest-Based Translation
Xianchao Wu | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
Extracting Pre-ordering Rules from Predicate-Argument Structures
Xianchao Wu | Katsuhito Sudoh | Kevin Duh | Hajime Tsukada | Masaaki Nagata
Proceedings of 5th International Joint Conference on Natural Language Processing
Generalized Minimum Bayes Risk System Combination
Kevin Duh | Katsuhito Sudoh | Xianchao Wu | Hajime Tsukada | Masaaki Nagata
Proceedings of 5th International Joint Conference on Natural Language Processing
Extracting Pre-ordering Rules from Chunk-based Dependency Trees for Japanese-to-English Translation
Xianchao Wu | Katsuhito Sudoh | Kevin Duh | Hajime Tsukada | Masaaki Nagata
Proceedings of Machine Translation Summit XIII: Papers
Post-ordering in Statistical Machine Translation
Katsuhito Sudoh | Xianchao Wu | Kevin Duh | Hajime Tsukada | Masaaki Nagata
Proceedings of Machine Translation Summit XIII: Papers
2010
Fine-Grained Tree-to-String Translation Rule Extraction
Xianchao Wu | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
2009
Semi-Supervised Lexicon Mining from Parenthetical Expressions in Monolingual Web Pages
Xianchao Wu | Naoaki Okazaki | Jun’ichi Tsujii
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
The UOT system
Xianchao Wu | Takuya Matsuzaki | Naoaki Okazaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign
We present the UOT Machine Translation System that was used in the IWSLT-09 evaluation campaign. This year, we participated in the BTEC track for Chinese-to-English translation. Our system is based on a string-to-tree framework. To integrate deep syntactic information, we propose the use of parse trees and semantic dependencies on English sentences, described respectively by Head-driven Phrase Structure Grammar and Predicate-Argument Structures. We report the results of our system on both the development and test sets.
2008
Improving English-to-Chinese Translation for Technical Terms using Morphological Information
Xianchao Wu | Naoaki Okazaki | Takashi Tsunakawa | Jun’ichi Tsujii
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers
The continuous emergence of new technical terms and the difficulty of keeping up with neologisms in parallel corpora deteriorate the performance of statistical machine translation (SMT) systems. This paper explores the use of morphological information to improve English-to-Chinese translation of technical terms. To reduce morpheme-level translation ambiguity, we group morphemes into morpheme phrases and propose the use of domain information for translation candidate selection. To find correspondences of morpheme phrases between the source and target languages, we propose an algorithm for mining morpheme phrase translation pairs from a bilingual lexicon. We also build a cascaded translation model that dynamically shifts translation units from the phrase level to the word and morpheme phrase levels. The experimental results show significant improvements over current phrase-based SMT systems.