Hisami Suzuki


2024

Analysis of LLM’s “Spurious” Correct Answers Using Evidence Information of Multi-hop QA Datasets
Ai Ishii | Naoya Inoue | Hisami Suzuki | Satoshi Sekine
Proceedings of the 1st Workshop on Knowledge Graphs and Large Language Models (KaLLM 2024)

Recent LLMs show an impressive accuracy on one of the hallmark tasks of language understanding, namely Question Answering (QA). However, it is not clear if the correct answers provided by LLMs are actually grounded on the correct knowledge related to the question. In this paper, we use multi-hop QA datasets to evaluate the accuracy of the knowledge LLMs use to answer questions, and show that as much as 31% of the correct answers by the LLMs are in fact spurious, i.e., the knowledge LLMs used to ground the answer is wrong while the answer is correct. We present an analysis of these spurious correct answers by GPT-4 using three datasets in two languages, while suggesting future pathways to correct the grounding information using existing external knowledge bases.

JEMHopQA: Dataset for Japanese Explainable Multi-Hop Question Answering
Ai Ishii | Naoya Inoue | Hisami Suzuki | Satoshi Sekine
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We present JEMHopQA, a multi-hop QA dataset for the development of explainable QA systems. The dataset consists not only of question-answer pairs, but also of supporting evidence in the form of derivation triples, which contributes to making the QA task more realistic and difficult. It is created based on Japanese Wikipedia using both crowd-sourced human annotation as well as prompting a large language model (LLM), and contains a diverse set of question, answer and topic categories as compared with similar datasets released previously. We describe the details of how we built the dataset as well as the evaluation of the QA task presented by this dataset using GPT-4, and show that the dataset is sufficiently challenging for the state-of-the-art LLM while showing promise for combining such a model with existing knowledge resources to achieve better performance.

2012

MSR SPLAT, a language analysis toolkit
Chris Quirk | Pallavi Choudhury | Jianfeng Gao | Hisami Suzuki | Kristina Toutanova | Michael Gamon | Wen-tau Yih | Colin Cherry | Lucy Vanderwende
Proceedings of the Demonstration Session at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

How Are Spelling Errors Generated and Corrected? A Study of Corrected and Uncorrected Spelling Errors Using Keystroke Logs
Yukino Baba | Hisami Suzuki
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Unified Approach to Transliteration-based Text Input with Online Spelling Correction
Hisami Suzuki | Jianfeng Gao
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Proceedings of the Workshop on Advances in Text Input Methods (WTIM 2011)
Hideto Kazawa | Hisami Suzuki | Taku Kudo
Proceedings of the Workshop on Advances in Text Input Methods (WTIM 2011)

From pecher to pêcher... or pécher: Simplifying French Input by Accent Prediction
Pallavi Choudhury | Chris Quirk | Hisami Suzuki
Proceedings of the Workshop on Advances in Text Input Methods (WTIM 2011)

Japanese Pronunciation Prediction as Phrasal Statistical Machine Translation
Jun Hatori | Hisami Suzuki
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

A Discriminative Lexicon Model for Complex Morphology
Minwoo Jeong | Kristina Toutanova | Hisami Suzuki | Chris Quirk
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers

This paper describes successful applications of discriminative lexicon models to the statistical machine translation (SMT) systems into morphologically complex languages. We extend the previous work on discriminatively trained lexicon models to include more contextual information in making lexical selection decisions by building a single global log-linear model of translation selection. In offline experiments, we show that the use of the expanded contextual information, including morphological and syntactic features, helps better predict words in three target languages with complex morphology (Bulgarian, Czech and Korean). We also show that these improved lexical prediction models make a positive impact in the end-to-end SMT scenario from English to these languages.

2009

Japanese Query Alteration Based on Lexical Semantic Similarity
Masato Hagiwara | Hisami Suzuki
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Discriminative Substring Decoding for Transliteration
Colin Cherry | Hisami Suzuki
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Discovery of Term Variation in Japanese Web Search Queries
Hisami Suzuki | Xiao Li | Jianfeng Gao
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

NEWS 2009 Machine Transliteration Shared Task System Description: Transliteration with Letter-to-Phoneme Technology
Colin Cherry | Hisami Suzuki
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

2008

Applying Morphology Generation Models to Machine Translation
Kristina Toutanova | Hisami Suzuki | Achim Ruopp
Proceedings of ACL-08: HLT

Minimally Supervised Learning of Semantic Knowledge from Query Logs
Mamoru Komachi | Hisami Suzuki
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

2007

Generating Case Markers in Machine Translation
Kristina Toutanova | Hisami Suzuki
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Generating Complex Morphology for Machine Translation
Einat Minkov | Kristina Toutanova | Hisami Suzuki
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

RefRef: A Tool for Viewing and Exploring Coreference Space
Hisami Suzuki | Gary Kacmarcik
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We present RefRef, a tool for viewing and exploring coreference space, which is publicly available for research purposes. Unlike similar tools currently available whose main goal is to assist the annotation process of coreference links, RefRef is dedicated to viewing and exploring coreference-annotated data, whether manually tagged or automatically resolved. RefRef is also highly customizable, as the tool is being made available with the source code. In this paper we describe the main functionalities of RefRef as well as some possibilities for customization to meet the specific needs of the users of such coreference-annotated text.

Approximation Lasso Methods for Language Modeling
Jianfeng Gao | Hisami Suzuki | Bin Yu
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

Learning to Predict Case Markers in Japanese
Hisami Suzuki | Kristina Toutanova
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

An Empirical Study on Language Model Adaptation Using a Metric of Domain Similarity
Wei Yuan | Jianfeng Gao | Hisami Suzuki
Second International Joint Conference on Natural Language Processing: Full Papers

A Comparative Study on Language Model Adaptation Techniques Using New Evaluation Metrics
Hisami Suzuki | Jianfeng Gao
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

MindNet: An Automatically-Created Lexical Resource
Lucy Vanderwende | Gary Kacmarcik | Hisami Suzuki | Arul Menezes
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations

2004

Using the Penn Treebank to Evaluate Non-Treebank Parsers
Eric K. Ringger | Robert C. Moore | Eugene Charniak | Lucy Vanderwende | Hisami Suzuki
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Phrase-Based Dependency Evaluation of a Japanese Parser
Hisami Suzuki
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Unsupervised Learning of Dependency Structure for Language Modeling
Jianfeng Gao | Hisami Suzuki
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

2002

Exploiting Headword Dependency and Predictive Clustering for Language Modeling
Jianfeng Gao | Hisami Suzuki | Yang Wen
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

A Development Environment for Large-scale Multi-lingual Parsing Systems
Hisami Suzuki
COLING-02: Grammar Engineering and Evaluation

English-Japanese Example-Based Machine Translation Using Abstract Linguistic Representations
Chris Brockett | Takako Aikawa | Anthony Aue | Arul Menezes | Chris Quirk | Hisami Suzuki
COLING-02: Machine Translation in Asia

2001

Using machine learning for system-internal evaluation of transferred linguistic representations
Michael Gamon | Hisami Suzuki | Simon Corston-Oliver
Proceedings of Machine Translation Summit VIII

We present an automated, system-internal evaluation technique for linguistic representations in a large-scale, multilingual MT system. We use machine-learned classifiers to recognize the differences between linguistic representations generated from transfer in an MT context from representations that are produced by "native" analysis of the target language. In the MT scenario, convergence of the two is the desired result. Holding the feature set and the learning algorithm constant, the accuracy of the classifiers provides a measure of the overall difference between the two sets of linguistic representations: classifiers with higher accuracy correspond to more pronounced differences between representations. More importantly, the classifiers yield the basis for error-analysis by providing a ranking of the importance of linguistic features. The more salient a linguistic criterion is in discriminating transferred representations from "native" representations, the more work will be needed in order to get closer to the goal of producing native-like MT. We present results from using this approach on the Microsoft MT system and discuss its advantages and possible extensions.

2000

Tools for Large-Scale Parser Development
Hisami Suzuki | Jessie Pinkham
Proceedings of the COLING-2000 Workshop on Efficiency In Large-Scale Parsing Systems

Robust Segmentation of Japanese Text into a Lattice for Parsing
Gary Kacmarcik | Chris Brockett | Hisami Suzuki
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

Using a Broad-Coverage Parser for Word-Breaking in Japanese
Hisami Suzuki | Chris Brockett | Gary Kacmarcik
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics