Sébastien Jean

Also published as: Sebastien Jean


pdf bib
Measuring and Mitigating Constraint Violations of In-Context Learning for Utterance-to-API Semantic Parsing
Shufan Wang | Sébastien Jean | Sailik Sengupta | James Gung | Nikolaos Pappas | Yi Zhang
Findings of the Association for Computational Linguistics: EMNLP 2023

In executable task-oriented semantic parsing, the system aims to translate users’ utterances in natural language to machine-interpretable programs (API calls) that can be executed according to pre-defined API specifications. With the popularity of Large Language Models (LLMs), in-context learning offers a strong baseline for such scenarios, especially in data-limited regimes. However, LLMs are known to hallucinate and therefore pose a formidable challenge in constraining generated content. Thus, it remains uncertain if LLMs can effectively perform task-oriented utterance-to-API generation, where respecting the API’s structural and task-specific constraints is crucial. In this work, we seek to measure, analyze and mitigate such constraints violations. First, we identify the categories of various constraints in obtaining API-semantics from task-oriented utterances, and define fine-grained metrics that complement traditional ones. Second, we leverage these metrics to conduct a detailed error analysis of constraints violations seen in state-of-the-art LLMs, which motivates us to investigate two popular mitigation strategies– Semantic-Retrieval of Demonstrations (SRD) and API-aware Constrained Decoding (API-CD). Our experiments show that these strategies are effective at reducing constraints violations and improving the quality of the generated API calls, but require careful consideration given their implementation complexity and latency.


pdf bib
Log-Linear Reformulation of the Noisy Channel Model for Document-Level Neural Machine Translation
Sébastien Jean | Kyunghyun Cho
Proceedings of the Fourth Workshop on Structured Prediction for NLP

We seek to maximally use various data sources, such as parallel and monolingual data, to build an effective and efficient document-level translation system. In particular, we start by considering a noisy channel approach (CITATION) that combines a target-to-source translation model and a language model. By applying Bayes’ rule strategically, we reformulate this approach as a log-linear combination of translation, sentence-level and document-level language model probabilities. In addition to using static coefficients for each term, this formulation alternatively allows for the learning of dynamic per-token weights to more finely control the impact of the language models. Using both static or dynamic coefficients leads to improvements over a context-agnostic baseline and a context-aware concatenation model.


pdf bib
Adversarial Learning for Neural Dialogue Generation
Jiwei Li | Will Monroe | Tianlin Shi | Sébastien Jean | Alan Ritter | Dan Jurafsky
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We apply adversarial training to open-domain dialogue generation, training a system to produce sequences that are indistinguishable from human-generated dialogue utterances. We cast the task as a reinforcement learning problem where we jointly train two systems: a generative model to produce response sequences, and a discriminator—analagous to the human evaluator in the Turing test— to distinguish between the human-generated dialogues and the machine-generated ones. In this generative adversarial network approach, the outputs from the discriminator are used to encourage the system towards more human-like dialogue. Further, we investigate models for adversarial evaluation that uses success in fooling an adversary as a dialogue evaluation metric, while avoiding a number of potential pitfalls. Experimental results on several metrics, including adversarial evaluation, demonstrate that the adversarially-trained system generates higher-quality responses than previous baselines

pdf bib
Neural Machine Translation for Cross-Lingual Pronoun Prediction
Sebastien Jean | Stanislas Lauly | Orhan Firat | Kyunghyun Cho
Proceedings of the Third Workshop on Discourse in Machine Translation

In this paper we present our systems for the DiscoMT 2017 cross-lingual pronoun prediction shared task. For all four language pairs, we trained a standard attention-based neural machine translation system as well as three variants that incorporate information from the preceding source sentence. We show that our systems, which are not specifically designed for pronoun prediction and may be used to generate complete sentence translations, generally achieve competitive results on this task.


pdf bib
Montreal Neural Machine Translation Systems for WMT’15
Sébastien Jean | Orhan Firat | Kyunghyun Cho | Roland Memisevic | Yoshua Bengio
Proceedings of the Tenth Workshop on Statistical Machine Translation

pdf bib
On Using Very Large Target Vocabulary for Neural Machine Translation
Sébastien Jean | Kyunghyun Cho | Roland Memisevic | Yoshua Bengio
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)