Jun’ichi Tsujii

Also published as: J. Tsujii, Jun-Ichi Tsujii, Jun-ich Tsujii, Jun-ichi Tsujii, Junichi Tsujii


2024

Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Dina Demner-Fushman | Sophia Ananiadou | Makoto Miwa | Kirk Roberts | Junichi Tsujii
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing

2022

Proceedings of the 21st Workshop on Biomedical Language Processing
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the 21st Workshop on Biomedical Language Processing

2021

Proceedings of the 20th Workshop on Biomedical Language Processing
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the 20th Workshop on Biomedical Language Processing

Natural Language Processing and Computational Linguistics
Junichi Tsujii
Computational Linguistics, Volume 47, Issue 4 - December 2021

2020

CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
Hicham El Boukkouri | Olivier Ferret | Thomas Lavergne | Hiroshi Noji | Pierre Zweigenbaum | Jun’ichi Tsujii
Proceedings of the 28th International Conference on Computational Linguistics

Due to the compelling improvements brought by BERT, many recent representation models adopted the Transformer architecture as their main building block, consequently inheriting the wordpiece tokenization system despite it not being intrinsically linked to the notion of Transformers. While this system is thought to achieve a good balance between the flexibility of characters and the efficiency of full words, using predefined wordpiece vocabularies from the general domain is not always suitable, especially when building models for specialized domains (e.g., the medical domain). Moreover, adopting a wordpiece tokenization shifts the focus from the word level to the subword level, making the models conceptually more complex and arguably less convenient in practice. For these reasons, we propose CharacterBERT, a new variant of BERT that drops the wordpiece system altogether and uses a Character-CNN module instead to represent entire words by consulting their characters. We show that this new model improves the performance of BERT on a variety of medical domain tasks while at the same time producing robust, word-level, and open-vocabulary representations.

Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing

Compositional Phrase Alignment and Beyond
Yuki Arase | Jun’ichi Tsujii
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Phrase alignment is the basis for modelling sentence pair interactions, such as paraphrase and textual entailment recognition. Most phrase alignments are compositional processes such that an alignment of a phrase pair is constructed based on the alignments of their child phrases. Nonetheless, studies have revealed that non-compositional alignments involving long-distance phrase reordering are prevalent in practice. We address the phrase alignment problem by combining an unordered tree mapping algorithm and phrase representation modelling that explicitly embeds the similarity distribution in the sentences onto powerful contextualized representations. Experimental results demonstrate that our method effectively handles compositional and non-compositional global phrase alignments. Our method significantly outperforms that used in a previous study and achieves a performance competitive with that of experienced human annotators.

2019

Transfer Fine-Tuning: A BERT Case Study
Yuki Arase | Jun’ichi Tsujii
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

A semantic equivalence assessment is defined as a task that assesses semantic equivalence in a sentence pair by binary judgment (i.e., paraphrase identification) or grading (i.e., semantic textual similarity measurement). It constitutes a set of tasks crucial for research on natural language understanding. Recently, BERT realized a breakthrough in sentence representation learning (Devlin et al., 2019), which is broadly transferable to various NLP tasks. While BERT’s performance improves by increasing its model size, the required computational power is an obstacle preventing practical applications from adopting the technology. Herein, we propose to inject phrasal paraphrase relations into BERT in order to generate suitable representations for semantic equivalence assessment instead of increasing the model size. Experiments on standard natural language understanding tasks confirm that our method effectively improves a smaller BERT model while maintaining the model size. The generated model exhibits superior performance compared to a larger BERT model on semantic equivalence assessment tasks. Furthermore, it achieves larger performance gains on tasks with limited training datasets for fine-tuning, which is a property desirable for transfer learning.

Proceedings of the 18th BioNLP Workshop and Shared Task
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the 18th BioNLP Workshop and Shared Task

2018

SPADE: Evaluation Dataset for Monolingual Phrase Alignment
Yuki Arase | Junichi Tsujii
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Proceedings of the BioNLP 2018 workshop
Dina Demner-Fushman | Kevin Bretonnel Cohen | Sophia Ananiadou | Junichi Tsujii
Proceedings of the BioNLP 2018 workshop

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Ellen Riloff | David Chiang | Julia Hockenmaier | Jun’ichi Tsujii
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

2017

Monolingual Phrase Alignment on Parse Forests
Yuki Arase | Junichi Tsujii
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We propose an efficient method to conduct phrase alignment on parse forests for paraphrase detection. Unlike previous studies, our method identifies syntactic paraphrases under linguistically motivated grammar. In addition, it allows phrases to non-compositionally align to handle paraphrases with non-homographic phrase correspondences. A dataset that provides gold parse trees and their phrase alignments is created. The experimental results confirm that the proposed method conducts highly accurate phrase alignment compared to human performance.

Distributed Document and Phrase Co-embeddings for Descriptive Clustering
Motoki Sato | Austin J. Brockmeier | Georgios Kontonatsios | Tingting Mu | John Y. Goulermas | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Descriptive document clustering aims to automatically discover groups of semantically related documents and to assign a meaningful label to characterise the content of each cluster. In this paper, we present a descriptive clustering approach that employs a distributed representation model, namely the paragraph vector model, to capture semantic similarities between documents and phrases. The proposed method uses a joint representation of phrases and documents (i.e., a co-embedding) to automatically select a descriptive phrase that best represents each document cluster. We evaluate our method by comparing its performance to an existing state-of-the-art descriptive clustering method that also uses co-embedding but relies on a bag-of-words representation. Results obtained on benchmark datasets demonstrate that the paragraph vector-based method obtains superior performance over the existing approach in both identifying clusters and assigning appropriate descriptive labels to them.

BioNLP 2017
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | Junichi Tsujii
BioNLP 2017

2016

Proceedings of the 15th Workshop on Biomedical Natural Language Processing
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | Jun-ichi Tsujii
Proceedings of the 15th Workshop on Biomedical Natural Language Processing

A Latent Concept Topic Model for Robust Topic Inference Using Word Embeddings
Weihua Hu | Jun’ichi Tsujii
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

Estimating Numerical Attributes by Bringing Together Fragmentary Clues
Hiroya Takamura | Jun’ichi Tsujii
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Proceedings of BioNLP 15
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | Jun-ichi Tsujii
Proceedings of BioNLP 15

2014

Proceedings of BioNLP 2014
Kevin Cohen | Dina Demner-Fushman | Sophia Ananiadou | Jun-ichi Tsujii
Proceedings of BioNLP 2014

Combining String and Context Similarity for Bilingual Term Alignment from Comparable Corpora
Georgios Kontonatsios | Ioannis Korkontzelos | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Junichi Tsujii | Jan Hajic
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Common Space Embedding of Primal-Dual Relation Semantic Spaces
Hidekazu Oiwa | Jun’ichi Tsujii
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Using a Random Forest Classifier to Compile Bilingual Dictionaries of Technical Terms from Comparable Corpora
Georgios Kontonatsios | Ioannis Korkontzelos | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

2013

Proceedings of the 2013 Workshop on Biomedical Natural Language Processing
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing

Overview of the Pathway Curation (PC) task of BioNLP Shared Task 2013
Tomoko Ohta | Sampo Pyysalo | Rafal Rak | Andrew Rowley | Hong-Woo Chun | Sung-Jae Jung | Sung-Pil Choi | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the BioNLP Shared Task 2013 Workshop

Using a Random Forest Classifier to recognise translations of biomedical terms across languages
Georgios Kontonatsios | Ioannis Korkontzelos | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Sixth Workshop on Building and Using Comparable Corpora

Deep Context-Free Grammar for Chinese with Broad-Coverage
Xiangli Wang | Yi Zhang | Yusuke Miyao | Takuya Matsuzaki | Junichi Tsujii
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing

2012

Incremental Joint Approach to Word Segmentation, POS Tagging, and Dependency Parsing in Chinese
Jun Hatori | Takuya Matsuzaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Akamon: An Open Source Toolkit for Tree/Forest-Based Statistical Machine Translation
Xianchao Wu | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the ACL 2012 System Demonstrations

BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
Kevin B. Cohen | Dina Demner-Fushman | Sophia Ananiadou | Bonnie Webber | Jun’ichi Tsujii | John Pestian
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing

Bridging the Gap Between Scope-based and Event-based Negation/Speculation Annotations: A Bridge Not Too Far
Pontus Stenetorp | Sampo Pyysalo | Tomoko Ohta | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics

Open-domain Anatomical Entity Mention Detection
Tomoko Ohta | Sampo Pyysalo | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the Workshop on Detecting Structure in Scholarly Discourse

Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries
Xinkai Wang | Paul Thompson | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Cross-lingual information retrieval (CLIR) involving the Chinese language has been thoroughly studied in the general language domain, but rarely in the biomedical domain, due to the lack of suitable linguistic resources and parsing tools. In this paper, we describe a Chinese-English CLIR system for biomedical literature, which exploits a bilingual ontology, the "eCMeSH Tree". This is an extension of the Chinese Medical Subject Headings (CMeSH) Tree, based on Medical Subject Headings (MeSH). Using the 2006 and 2007 TREC Genomics track data, we have evaluated the performance of the eCMeSH Tree in expanding queries. We have compared our results to those obtained using two other approaches, i.e. pseudo-relevance feedback (PRF) and document translation (DT). Subsequently, we evaluate the performance of different combinations of these three retrieval methods. Our results show that our method of expanding queries using the eCMeSH Tree can outperform the PRF method. Furthermore, combining this method with PRF and DT helps to smooth the differences in query expansion, and consequently results in the best performance amongst all experiments reported. All experiments compare the use of two different retrieval models, i.e. Okapi BM25 and a query likelihood language model. In general, the former performs slightly better.

Coordination Structure Analysis using Dual Decomposition
Atsushi Hanamoto | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

brat: a Web-based Tool for NLP-Assisted Text Annotation
Pontus Stenetorp | Sampo Pyysalo | Goran Topić | Tomoko Ohta | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Jun’ichi Tsujii | James Henderson | Marius Paşca
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

2011

Resource-rich research on natural language processing and understanding
Junichi Tsujii
Proceedings of the 8th International Workshop on Spoken Language Translation: Keynotes

Effective Use of Function Words for Rule Generalization in Forest-Based Translation
Xianchao Wu | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Proceedings of BioNLP 2011 Workshop
Kevin Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of BioNLP 2011 Workshop

Automatic Acquisition of Huge Training Data for Bio-Medical Named Entity Recognition
Yu Usami | Han-Cheol Cho | Naoaki Okazaki | Jun’ichi Tsujii
Proceedings of BioNLP 2011 Workshop

From Pathways to Biomolecular Events: Opportunities and Challenges
Tomoko Ohta | Sampo Pyysalo | Jun’ichi Tsujii
Proceedings of BioNLP 2011 Workshop

Towards Exhaustive Event Extraction for Protein Modifications
Sampo Pyysalo | Tomoko Ohta | Makoto Miwa | Jun’ichi Tsujii
Proceedings of BioNLP 2011 Workshop

SimSem: Fast Approximate String Matching in Relation to Semantic Category Disambiguation
Pontus Stenetorp | Sampo Pyysalo | Jun’ichi Tsujii
Proceedings of BioNLP 2011 Workshop

A Collaborative Annotation between Human Annotators and a Statistical Parser
Shun’ya Iwasawa | Hiroki Hanaoka | Takuya Matsuzaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 5th Linguistic Annotation Workshop

Learning the Optimal Use of Dependency-parsing Information for Finding Translations with Comparable Corpora
Daniel Andrade | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 4th Workshop on Building and Using Comparable Corpora: Comparable Corpora and the Web

Proceedings of BioNLP Shared Task 2011 Workshop
Jun’ichi Tsujii | Jin-Dong Kim | Sampo Pyysalo
Proceedings of BioNLP Shared Task 2011 Workshop

Overview of BioNLP Shared Task 2011
Jin-Dong Kim | Sampo Pyysalo | Tomoko Ohta | Robert Bossy | Ngan Nguyen | Jun’ichi Tsujii
Proceedings of BioNLP Shared Task 2011 Workshop

Overview of the Epigenetics and Post-translational Modifications (EPI) task of BioNLP Shared Task 2011
Tomoko Ohta | Sampo Pyysalo | Jun’ichi Tsujii
Proceedings of BioNLP Shared Task 2011 Workshop

Overview of the Infectious Diseases (ID) task of BioNLP Shared Task 2011
Sampo Pyysalo | Tomoko Ohta | Rafal Rak | Dan Sullivan | Chunhong Mao | Chunxia Wang | Bruno Sobral | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of BioNLP Shared Task 2011 Workshop

Overview of BioNLP 2011 Protein Coreference Shared Task
Ngan Nguyen | Jin-Dong Kim | Jun’ichi Tsujii
Proceedings of BioNLP Shared Task 2011 Workshop

Overview of the Entity Relations (REL) supporting task of BioNLP Shared Task 2011
Sampo Pyysalo | Tomoko Ohta | Jun’ichi Tsujii
Proceedings of BioNLP Shared Task 2011 Workshop

BioNLP Shared Task 2011: Supporting Resources
Pontus Stenetorp | Goran Topić | Sampo Pyysalo | Tomoko Ohta | Jin-Dong Kim | Jun’ichi Tsujii
Proceedings of BioNLP Shared Task 2011 Workshop

Analysis of the Difficulties in Chinese Deep Parsing
Kun Yu | Yusuke Miyao | Takuya Matsuzaki | Xiangli Wang | Junichi Tsujii
Proceedings of the 12th International Conference on Parsing Technologies

Exploring Difficulties in Parsing Imperatives and Questions
Tadayoshi Hara | Takuya Matsuzaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of 5th International Joint Conference on Natural Language Processing

Incremental Joint POS Tagging and Dependency Parsing in Chinese
Jun Hatori | Takuya Matsuzaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of 5th International Joint Conference on Natural Language Processing

Challenges of Patent MT – Term and Structure Translation
Jun’ichi Tsujii
Proceedings of Machine Translation Summit XIII: Plenaries

2010

Fine-Grained Tree-to-String Translation Rule Extraction
Xianchao Wu | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

A Simple Approach for HPSG Supertagging Using Dependency Information
Yao-zhong Zhang | Takuya Matsuzaki | Jun’ichi Tsujii
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

A Modular Architecture for the Wide-Coverage Translation of Natural Language Texts into Predicate Logic Formulas
Yusuke Miyao | Alastair Butler | Kei Yoshimoto | Jun’ichi Tsujii
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

The Deep Re-Annotation in a Chinese Scientific Treebank
Kun Yu | Xiangli Wang | Yusuke Miyao | Takuya Matsuzaki | Junichi Tsujii
Proceedings of the Fourth Linguistic Annotation Workshop

Proceedings of the 2010 Workshop on Biomedical Natural Language Processing
K. Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

Event Extraction for Post-Translational Modifications
Tomoko Ohta | Sampo Pyysalo | Makoto Miwa | Jin-Dong Kim | Jun’ichi Tsujii
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

Scaling up Biomedical Event Extraction to the Entire PubMed
Jari Björne | Filip Ginter | Sampo Pyysalo | Jun’ichi Tsujii | Tapio Salakoski
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

A Comparative Study of Syntactic Parsers for Event Extraction
Makoto Miwa | Sampo Pyysalo | Tadayoshi Hara | Jun’ichi Tsujii
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

Towards Event Extraction from Full Texts on Infectious Diseases
Sampo Pyysalo | Tomoko Ohta | Han-Cheol Cho | Dan Sullivan | Chunhong Mao | Bruno Sobral | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

Robust Measurement and Comparison of Context Similarity for Finding Translation Pairs
Daniel Andrade | Tetsuya Nasukawa | Junichi Tsujii
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Evaluating Dependency Representations for Event Extraction
Makoto Miwa | Sampo Pyysalo | Tadayoshi Hara | Jun’ichi Tsujii
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Entity-Focused Sentence Simplification for Relation Extraction
Makoto Miwa | Rune Sætre | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Simple and Efficient Algorithm for Approximate Dictionary Matching
Naoaki Okazaki | Jun’ichi Tsujii
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Forest-guided Supertagger Training
Yao-zhong Zhang | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Imbalanced Classification Using Dictionary-based Prototypes and Hierarchical Decision Rules for Entity Sense Disambiguation
Tingting Mu | Xinglong Wang | Jun’ichi Tsujii | Sophia Ananiadou
Coling 2010: Posters

Semi-automatically Developing Chinese HPSG Grammar from the Penn Chinese Treebank for Deep Parsing
Kun Yu | Yusuke Miyao | Xiangli Wang | Takuya Matsuzaki | Junichi Tsujii
Coling 2010: Posters

U-Compare: An Integrated Language Resource Evaluation Platform Including a Comprehensive UIMA Resource Library
Yoshinobu Kano | Ruben Dorado | Luke McCrohon | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Language resources, including corpora and tools, normally need to be combined to accomplish a user’s specific task. However, resources tend to be developed independently in different, incompatible formats. In this paper we describe U-Compare, which consists of the U-Compare component repository and the U-Compare platform. We have been building a highly interoperable resource library, providing the world’s largest ready-to-use UIMA component repository, including a wide variety of corpus readers and state-of-the-art language tools. These resources can be deployed as local services or web services, and can even be hosted on clustered machines to increase performance, while users do not need to be aware of such differences. In addition to the resource library, an integrated language processing platform is provided, allowing workflow creation, comparison, evaluation and visualization, using the resources in the library or any UIMA component, without any programming, via graphical user interfaces; a command line launcher is also available for use without GUIs. The evaluation itself is processed in a UIMA component, so users can create and plug in their own evaluation metrics in addition to the predefined metrics. U-Compare has been successfully used in many projects including BioCreative, CoNLL and the BioNLP shared task.

A Japanese Particle Corpus Built by Example-Based Annotation
Hiroki Hanaoka | Hideki Mima | Jun’ichi Tsujii
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper is a report on an ongoing project to create a new corpus focusing on Japanese particles. The corpus will provide deeper syntactic/semantic information than the existing resources. The initial target particle is "to", which occurs 22,006 times in 38,400 sentences of the existing corpus: the Kyoto Text Corpus. In this annotation task, an "example-based" methodology is adopted for the corpus annotation, which is different from the traditional annotation style. This approach provides the annotators with an example sentence rather than a linguistic category label. By avoiding linguistic technical terms, it is expected that any native speaker, with no special knowledge of linguistic analysis, can be an annotator without long training, and hence it can reduce the annotation cost. So far, 10,475 occurrences have already been annotated, with an inter-annotator agreement of 0.66 calculated by Cohen's kappa. The initial disagreement analyses and future directions are discussed in the paper.

2009

A Discriminative Latent Variable Chinese Segmenter with Hybrid Word/Character Information
Xu Sun | Yaozhong Zhang | Takuya Matsuzaki | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Semi-Supervised Lexicon Mining from Parenthetical Expressions in Monolingual Web Pages
Xianchao Wu | Naoaki Okazaki | Jun’ichi Tsujii
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Learning Combination Features with L1 Regularization
Daisuke Okanohara | Jun’ichi Tsujii
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Extracting Bilingual Dictionary from Comparable Corpora with Dependency Heterogeneity
Kun Yu | Junichi Tsujii
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Obituaries: Hozumi Tanaka
Timothy Baldwin | Takenobu Tokunaga | Jun’ichi Tsujii
Computational Linguistics, Volume 35, Number 4, December 2009

The UOT system
Xianchao Wu | Takuya Matsuzaki | Naoaki Okazaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

We present the UOT Machine Translation System that was used in the IWSLT-09 evaluation campaign. This year, we participated in the BTEC track for Chinese-to-English translation. Our system is based on a string-to-tree framework. To integrate deep syntactic information, we propose the use of parse trees and semantic dependencies on English sentences described respectively by Head-driven Phrase Structure Grammar and Predicate-Argument Structures. We report the results of our system on both the development and test sets.

Sequential Labeling with Latent Variables: An Exact Inference Algorithm and its Efficient Approximation
Xu Sun | Jun’ichi Tsujii
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Fast Full Parsing by Linear-Chain Conditional Random Fields
Yoshimasa Tsuruoka | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

A Rich Feature Vector for Protein-Protein Interaction Extraction from Multiple Corpora
Makoto Miwa | Rune Sætre | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Descriptive and Empirical Approaches to Capturing Underlying Dependencies among Parsing Errors
Tadayoshi Hara | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Supervised Learning of a Probabilistic Lexicon of Verb Semantic Classes
Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Classifying Relations for Biomedical Named Entity Disambiguation
Xinglong Wang | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

GuideLink: A Corpus Annotation System that Integrates the Management of Annotation Guidelines
Kenta Oouchida | Jin-Dong Kim | Toshihisa Takagi | Jun’ichi Tsujii
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

Design of Chinese HPSG Framework for Data-Driven Parsing
Xiangli Wang | Shunya Iwasawa | Yusuke Miyao | Takuya Matsuzaki | Kun Yu | Jun’ichi Tsujii
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

Proceedings of the BioNLP 2009 Workshop
K. Bretonnel Cohen | Dina Demner-Fushman | Sophia Ananiadou | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of the BioNLP 2009 Workshop

Static Relations: a Piece in the Biomedical Information Extraction Puzzle
Sampo Pyysalo | Tomoko Ohta | Jin-Dong Kim | Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop

Incorporating GENETAG-style annotation to GENIA corpus
Tomoko Ohta | Jin-Dong Kim | Sampo Pyysalo | Yue Wang | Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop

Bridging the Gap between Domain-Oriented and Linguistically-Oriented Semantics
Sumire Uematsu | Jin-Dong Kim | Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop

Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task
Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

Overview of BioNLP’09 Shared Task on Event Extraction
Jin-Dong Kim | Tomoko Ohta | Sampo Pyysalo | Yoshinobu Kano | Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

A Markov Logic Approach to Bio-Molecular Event Extraction
Sebastian Riedel | Hong-Woo Chun | Toshihisa Takagi | Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

From Protein-Protein Interaction to Molecular Event Extraction
Rune Sætre | Makoto Miwa | Kazuhiro Yoshida | Jun’ichi Tsujii
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task

Integrated NLP Evaluation System for Pluggable Evaluation Metrics with Extensive Interoperable Toolkit
Yoshinobu Kano | Luke McCrohon | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)

Evaluating Contribution of Deep Syntactic Information to Shallow Semantic Analysis
Sumire Uematsu | Jun’ichi Tsujii
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

Effective Analysis of Causes and Inter-dependencies of Parsing Errors
Tadayoshi Hara | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

HPSG Supertagging: A Sequence Labeling View
Yao-zhong Zhang | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

A Comparative Study on Generalization of Semantic Roles in FrameNet
Yuichiroh Matsubayashi | Naoaki Okazaki | Jun’ichi Tsujii
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty
Yoshimasa Tsuruoka | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Robust Approach to Abbreviating Terms: A Discriminative Latent Variable Model with Global Information
Xu Sun | Naoaki Okazaki | Jun’ichi Tsujii
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
A Novel Word Segmentation Approach for Written Languages with Word Boundary Markers
Han-Cheol Cho | Do-Gil Lee | Jung-Tae Lee | Pontus Stenetorp | Jun’ichi Tsujii | Hae-Chang Rim
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf bib
Bilingual Dictionary Extraction from Wikipedia
Kun Yu | Junichi Tsujii
Proceedings of Machine Translation Summit XII: Posters

2008

pdf bib
Evaluating the Effects of Treebank Size in a Practical Application for Parsing
Kenji Sagae | Yusuke Miyao | Rune Saetre | Jun’ichi Tsujii
Software Engineering, Testing, and Quality Assurance for Natural Language Processing

pdf bib
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Dina Demner-Fushman | Sophia Ananiadou | Kevin Bretonnel Cohen | John Pestian | Jun’ichi Tsujii | Bonnie Webber
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

pdf bib
Accelerating the Annotation of Sparse Named Entities by Dynamic Sentence Selection
Yoshimasa Tsuruoka | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

pdf bib
Prediction of Protein Sub-cellular Localization using Information from Texts and Sequences.
Hong-Woo Chun | Chisato Yamasaki | Naomi Saichi | Masayuki Tanaka | Teruyoshi Hishiki | Tadashi Imanishi | Takashi Gojobori | Jin-Dong Kim | Jun’ichi Tsujii | Toshihisa Takagi
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

pdf bib
Raising the Compatibility of Heterogeneous Annotations: A Case Study on
Yue Wang | Kazuhiro Yoshida | Jin-Dong Kim | Rune Saetre | Jun’ichi Tsujii
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

pdf bib
Parser Evaluation Across Frameworks without Format Conversion
Wai Lok Tam | Yo Sato | Yusuke Miyao | Junichi Tsujii
Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation

pdf bib
Task-oriented Evaluation of Syntactic Parsers and Their Representations
Yusuke Miyao | Rune Sætre | Kenji Sagae | Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of ACL-08: HLT

pdf bib
Bilingual Synonym Identification with Spelling Variations
Takashi Tsunakawa | Jun’ichi Tsujii
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib
Towards Data and Goal Oriented Analysis: Tool Inter-operability and Combinatorial Comparison
Yoshinobu Kano | Ngan Nguyen | Rune Sætre | Kazuhiro Yoshida | Keiichiro Fukamachi | Yusuke Miyao | Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

pdf bib
A Discriminative Approach to Japanese Abbreviation Extraction
Naoaki Okazaki | Mitsuru Ishizuka | Jun’ichi Tsujii
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

pdf bib
Challenges in Pronoun Resolution System for Biomedical Text
Ngan Nguyen | Jin-Dong Kim | Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper presents our findings on the feasibility of doing pronoun resolution for biomedical texts, in comparison with conducting pronoun resolution for the newswire domain. In our experiments, we built a simple machine learning-based pronoun resolution system, and evaluated the system on three different corpora: MUC, ACE, and GENIA. Comparative statistics not only reveal the noticeable issues in constructing an effective pronoun resolution system for a new domain, but also provide a comprehensive view of those corpora often used for this task.

pdf bib
GENIA-GR: a Grammatical Relation Corpus for Parser Evaluation in the Biomedical Domain
Yuka Tateisi | Yusuke Miyao | Kenji Sagae | Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We report the construction of a corpus for parser evaluation in the biomedical domain. A 50-abstract subset (492 sentences) of the GENIA corpus (Kim et al., 2003) is annotated with labeled head-dependent relations using the grammatical relations (GR) evaluation scheme (Carroll et al., 1998), which has been used for parser evaluation in the newswire domain.

pdf bib
Building Bilingual Lexicons using Lexical Translation Probabilities via Pivot Languages
Takashi Tsunakawa | Naoaki Okazaki | Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper proposes a method of increasing the size of a bilingual lexicon obtained from two other bilingual lexicons via a pivot language. When we apply this approach, there are two main challenges, “ambiguity” and “mismatch” of terms; we target the latter problem by improving the utilization ratio of the bilingual lexicons. Given two bilingual lexicons between language pairs Lf-Lp and Lp-Le, we compute lexical translation probabilities of word pairs by using a statistical word-alignment model, and term decomposition/composition techniques. We compare three approaches to generate the bilingual lexicon: “exact merging”, “word-based merging”, and our proposed “alignment-based merging”. In our method, we combine lexical translation probabilities and a simple language model for estimating the probabilities of translation pairs. The experimental results show that our method could drastically increase the number of translation terms compared to the two methods mentioned above. Additionally, we evaluated and discussed the quality of the translation outputs.

pdf bib
Connecting Text Mining and Pathways using the PathText Resource
Rune Sætre | Brian Kemper | Kanae Oda | Naoaki Okazaki | Yukiko Matsuoka | Norihiro Kikuchi | Hiroaki Kitano | Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Many systems have been developed in the past few years to assist researchers in the discovery of knowledge published as English text, for example in the PubMed database. At the same time, higher level collective knowledge is often published using a graphical notation representing all the entities in a pathway and their interactions. We believe that these pathway visualizations could serve as an effective user interface for knowledge discovery if they can be linked to the text in publications. Since the graphical elements in a Pathway are of a very different nature than their corresponding descriptions in English text, we developed a prototype system called PathText. The goal of PathText is to serve as a bridge between these two different representations. In this paper, we first describe the overall architecture and the interfaces of the PathText system, and then provide some details about the core Text Mining components.

pdf bib
Comparative Parser Performance Analysis across Grammar Frameworks through Automatic Tree Conversion using Synchronous Grammars
Takuya Matsuzaki | Jun’ichi Tsujii
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
A Discriminative Alignment Model for Abbreviation Recognition
Naoaki Okazaki | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Shift-Reduce Dependency DAG Parsing
Kenji Sagae | Jun’ichi Tsujii
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Modeling Latent-Dynamic in Shallow Parsing: A Latent Conditional Model with Improved Inference
Xu Sun | Louis-Philippe Morency | Daisuke Okanohara | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Word Sense Disambiguation for All Words using Tree-Structured Conditional Random Fields
Jun Hatori | Yusuke Miyao | Jun’ichi Tsujii
Coling 2008: Companion volume: Posters

pdf bib
Exact Inference for Multi-label Classification using Sparse Graphical Models
Yusuke Miyao | Jun’ichi Tsujii
Coling 2008: Companion volume: Posters

pdf bib
Building a Bilingual Lexicon Using Phrase-based Statistical Machine Translation via a Pivot Language
Takashi Tsunakawa | Naoaki Okazaki | Jun’ichi Tsujii
Coling 2008: Companion volume: Posters

pdf bib
A Discriminative Candidate Generator for String Transformations
Naoaki Okazaki | Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Feature Forest Models for Probabilistic HPSG Parsing
Yusuke Miyao | Jun’ichi Tsujii
Computational Linguistics, Volume 34, Number 1, March 2008

pdf bib
Improving English-to-Chinese Translation for Technical Terms using Morphological Information
Xianchao Wu | Naoaki Okazaki | Takashi Tsunakawa | Jun’ichi Tsujii
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers

The continuous emergence of new technical terms and the difficulty of keeping up with neologisms in parallel corpora deteriorate the performance of statistical machine translation (SMT) systems. This paper explores the use of morphological information to improve English-to-Chinese translation for technical terms. To reduce the morpheme-level translation ambiguity, we group the morphemes into morpheme phrases and propose the use of domain information for translation candidate selection. In order to find correspondences of morpheme phrases between the source and target languages, we propose an algorithm to mine morpheme phrase translation pairs from a bilingual lexicon. We also build a cascaded translation model that dynamically shifts translation units from phrase level to word and morpheme phrase levels. The experimental results show significant improvements over the current phrase-based SMT systems.

2007

pdf bib
A discriminative language model with pseudo-negative samples
Daisuke Okanohara | Jun’ichi Tsujii
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
HPSG Parsing with Shallow Dependency Constraints
Kenji Sagae | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
Development of a Japanese-Chinese machine translation system
Hitoshi Isahara | Sadao Kurohashi | Jun’ichi Tsujii | Kiyotaka Uchimoto | Hiroshi Nakagawa | Hiroyuki Kaji | Shun’ichi Kikuchi
Proceedings of Machine Translation Summit XI: Papers

bib
Proceedings of the Workshop on Patent translation
Jun’ichi Tsujii | Shoichi Yokoyama
Proceedings of the Workshop on Patent translation

pdf bib
Reranking for Biomedical Named-Entity Recognition
Kazuhiro Yoshida | Jun’ichi Tsujii
Biological, translational, and clinical language processing

pdf bib
Evaluating Impact of Re-training a Lexical Disambiguation Model on Domain Adaptation of an HPSG Parser
Tadayoshi Hara | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Tenth International Conference on Parsing Technologies

pdf bib
A log-linear model with an n-gram reference distribution for accurate HPSG parsing
Takashi Ninomiya | Takuya Matsuzaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Tenth International Conference on Parsing Technologies

pdf bib
Dependency Parsing and Domain Adaptation with LR Models and Parser Ensembles
Kenji Sagae | Jun’ichi Tsujii
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Linguistic and Biological Annotations of Biological Interaction Events
Tomoko Ohta | Yuka Tateisi | Jin-Dong Kim | Akane Yakushiji | Jun-ichi Tsujii
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper discusses an augmentation of a corpus of research abstracts in the biomedical domain (the GENIA corpus) with two kinds of annotations: tree annotation and event annotation. The tree annotation identifies the linguistic structure that encodes the relations among entities. The event annotation reveals the semantic structure of the biological interaction events encoded in the text. With these annotations we aim to provide a link between the clue and the target of biological event information extraction.

pdf bib
Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition
Daisuke Okanohara | Yusuke Miyao | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Semantic Retrieval for the Accurate Identification of Relational Concepts in Massive Textbases
Yusuke Miyao | Tomoko Ohta | Katsuya Masuda | Yoshimasa Tsuruoka | Kazuhiro Yoshida | Takashi Ninomiya | Jun’ichi Tsujii
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Translating HPSG-Style Outputs of a Robust Parser into Typed Dynamic Logic
Manabu Sato | Daisuke Bekki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
Trimming CFG Parse Trees for Sentence Compression Using Machine Learning Approaches
Yuya Unno | Takashi Ninomiya | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
An Intelligent Search Engine and GUI-based Efficient MEDLINE Search Tool Based on Deep Syntactic Parsing
Tomoko Ohta | Yusuke Miyao | Takashi Ninomiya | Yoshimasa Tsuruoka | Akane Yakushiji | Katsuya Masuda | Jumpei Takeuchi | Kazuhiro Yoshida | Tadayoshi Hara | Jin-Dong Kim | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions

pdf bib
Extremely Lexicalized Models for Accurate and Fast HPSG Parsing
Takashi Ninomiya | Takuya Matsuzaki | Yoshimasa Tsuruoka | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Automatic Construction of Predicate-argument Structure Patterns for Biomedical Information Extraction
Akane Yakushiji | Yusuke Miyao | Tomoko Ohta | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Subdomain adaptation of a POS tagger with a small corpus
Yuka Tateisi | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology

2005

pdf bib
Adapting a Probabilistic Disambiguation Model of an HPSG Parser to a New Domain
Tadayoshi Hara | Yusuke Miyao | Jun’ichi Tsujii
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Assigning Polarity Scores to Reviews Using Machine Learning Techniques
Daisuke Okanohara | Jun’ichi Tsujii
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Syntax Annotation for the GENIA Corpus
Yuka Tateisi | Akane Yakushiji | Tomoko Ohta | Jun’ichi Tsujii
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

pdf bib
A Machine Learning Approach to Acronym Generation
Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics

pdf bib
Probabilistic Models for Disambiguation of an HPSG-Based Chart Generator
Hiroko Nakanishi | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Ninth International Workshop on Parsing Technology

pdf bib
Efficacy of Beam Thresholding, Unification Filtering and Hybrid Parsing in Probabilistic HPSG Parsing
Takashi Ninomiya | Yoshimasa Tsuruoka | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Ninth International Workshop on Parsing Technology

pdf bib
Chunk Parsing Revisited
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the Ninth International Workshop on Parsing Technology

pdf bib
Bidirectional Inference with the Easiest-First Strategy for Tagging Sequence Data
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf bib
Probabilistic CFG with Latent Annotations
Takuya Matsuzaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
Probabilistic Disambiguation Models for Wide-Coverage HPSG Parsing
Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

pdf bib
Finding Anchor Verbs for Biomedical IE Using Predicate-Argument Structures
Akane Yakushiji | Yuka Tateisi | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the ACL Interactive Poster and Demonstration Sessions

pdf bib
Overview of the IWSLT evaluation campaign
Yasuhiro Akiba | Marcello Federico | Noriko Kando | Hiromi Nakaiwa | Michael Paul | Jun’ichi Tsujii
Proceedings of the First International Workshop on Spoken Language Translation: Evaluation Campaign

pdf bib
Generalizing Subcategorization Frames Acquired from Corpora Using Lexicalized Grammars
Naoki Yoshinaga | Jun’ichi Tsujii
Proceedings of the 7th International Workshop on Tree Adjoining Grammar and Related Formalisms

pdf bib
Context-free Approximation of LTAG towards CFG Filtering
Kenta Oouchida | Naoki Yoshinaga | Jun’ichi Tsujii
Proceedings of the 7th International Workshop on Tree Adjoining Grammar and Related Formalisms

pdf bib
Deep Linguistic Analysis for the Accurate Identification of Predicate-Argument Relations
Yusuke Miyao | Jun’ichi Tsujii
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

pdf bib
Thesaurus or Logical Ontology, Which do we Need for Mining Text?
Junichi Tsujii
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
Part-of-Speech Annotation of Biology Research Abstracts
Yuka Tateisi | Jun-ichi Tsujii
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

pdf bib
A Robust Retrieval Engine for Proximal and Structural Search
Katsuya Masuda | Takashi Ninomiya | Yusuke Miyao | Tomoko Ohta | Jun’ichi Tsujii
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

pdf bib
Self-Organizing Markov Models and Their Application to Part-of-Speech Tagging
Jin-Dong Kim | Hae-Chang Rim | Jun’ichi Tsujii
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics

pdf bib
A Debug Tool for Practical Grammar Development
Akane Yakushiji | Yuka Tateisi | Yusuke Miyao | Naoki Yoshinaga | Jun’ichi Tsujii
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

pdf bib
Comparison between CFG Filtering Techniques for LTAG and HPSG
Naoki Yoshinaga | Kentaro Torisawa | Jun’ichi Tsujii
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

pdf bib
Lexicalized Grammar Acquisition
Yusuke Miyao | Takashi Ninomiya | Jun’ichi Tsujii
10th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
A model of syntactic disambiguation based on lexicalized grammars
Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
An efficient clustering algorithm for class-based language models
Takuya Matsuzaki | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
Training a Naive Bayes Classifier via the EM Algorithm with a Class Distribution Constraint
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
Evaluation and Extension of Maximum Entropy Models with Inequality Constraints
Jun’ichi Kazama | Jun’ichi Tsujii
Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing

pdf bib
Boosting Precision and Recall of Dictionary-Based Protein Name Recognition
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine

pdf bib
Encoding Biomedical Resources in TEI: The Case of the GENIA Corpus
Tomaz Erjavec | Jin-Dong Kim | Tomoko Ohta | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine

pdf bib
Stretching TEI: Converting the Genia Corpus
Tomaz Erjavec | Jin-Dong Kim | Tomoko Ohta | Yuka Tateisi | Jun-ichi Tsujii
Proceedings of 4th International Workshop on Linguistically Interpreted Corpora (LINC-03) at EACL 2003

2002

pdf bib
Tuning support vector machines for biomedical named entity recognition
Jun’ichi Kazama | Takaki Makino | Yoshihiro Ohta | Jun’ichi Tsujii
Proceedings of the ACL-02 Workshop on Natural Language Processing in the Biomedical Domain

pdf bib
A Formal Proof of Strong Equivalence for a Grammar Conversion from LTAG to HPSG-style
Naoki Yoshinaga | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)

pdf bib
Clustering for obtaining syntactic classes of words from automatically extracted LTAG grammars
Tadayoshi Hara | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)

pdf bib
A Methodology for Terminology-based Knowledge Acquisition and Integration
Hideki Mima | Sophia Ananiadou | Goran Nenadic | Jun-Ichi Tsujii
COLING 2002: The 19th International Conference on Computational Linguistics

pdf bib
Lenient Default Unification for Robust Processing within Unification Based Grammar Formalisms
Takashi Ninomiya | Yusuke Miyao | Jun-Ichi Tsujii
COLING 2002: The 19th International Conference on Computational Linguistics

pdf bib
An Indexing Scheme for Typed Feature Structures
Takashi Ninomiya | Takaki Makino | Jun-Ichi Tsujii
COLING 2002: The 19th International Conference on Computational Linguistics: Project Notes

2001

pdf bib
Resource Sharing Amongst HPSG and LTAG Communities by a Method of Grammar Conversion between FB-LTAG and HPSG
Naoki Yoshinaga | Yusuke Miyao | Kentaro Torisawa | Jun’ichi Tsujii
Proceedings of the ACL 2001 Workshop on Sharing Tools and Resources

2000

pdf bib
Comparison between Tagged Corpora for the Named Entity Task
Chikashi Nobata | Nigel Collier | Jun’ichi Tsujii
The Workshop on Comparing Corpora

pdf bib
Building an Annotated Corpus in the Molecular-Biology Domain
Yuka Tateisi | Tomoko Ohta | Nigel Collier | Chikashi Nobata | Jun-ichi Tsujii
Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content

pdf bib
Invited Talk: Generic NLP Technologies: Language, Knowledge and Information Extraction
Jun’ichi Tsujii
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

pdf bib
Part-of-Speech Tagging Based on Hidden Markov Model Assuming Joint Independence
Sang-Zoo Lee | Jun’ichi Tsujii | Hae-Chang Rim
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

pdf bib
Difficulty Indices for the Named Entity Task in Japanese
Chikashi Nobata | Satoshi Sekine | Jun’ichi Tsujii
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

pdf bib
Hidden Markov Model-Based Korean Part-of-Speech Tagging Considering High Agglutinativity, Word-Spacing, and Lexical Correlativity
Sang-Zoo Lee | Jun’ichi Tsujii | Hae-Chang Rim
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

pdf bib
Extracting the Names of Genes and Gene Products with a Hidden Markov Model
Nigel Collier | Chikashi Nobata | Jun-ichi Tsujii
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

pdf bib
A Method of Measuring Term Representativeness - Baseline Method Using Co-occurrence Distribution
Tom Hisamitsu | Yoshiki Niwa | Jun-ichi Tsujii
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

pdf bib
A Hybrid Japanese Parser with Hand-crafted Grammar and Statistics
Hiroshi Kanayama | Kentaro Torisawa | Yutaka Mitsuishi | Jun’ichi Tsujii
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

pdf bib
Lexicalized Hidden Markov Models for Part-of-Speech Tagging
Sang-Zoo Lee | Jun-ichi Tsujii | Hae-Chang Rim
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

1999

pdf bib
Machine translation for the next century
Jun-ichi Tsujii
Proceedings of Machine Translation Summit VII

The panel intends to pick up some of the issues discussed in the Summit and discuss them further in the final session from broader perspectives. Since the Summit has not even started yet, I will just enumerate in this paper a list of possible perspectives on MT that I hope are relevant to our discussion.

pdf bib
Transfer in experience-guided machine translation
Gang Zhao | Junichi Tsujii
Proceedings of Machine Translation Summit VII

Experience-Guided Machine Translation (EGMT) seeks to represent the translators' knowledge of translation as experiences and translates by analogy. The transfer in EGMT finds the experiences most similar to a new text and its parts, segments it into units of translation and translates them by analogy to the experiences and then assembles them into a whole. A research prototype of analogical transfer from Chinese to English is built to prove the viability of the approach in the exploration of new architecture of machine translation. The paper discusses how the experiences are represented and selected with respect to a new text. It describes how units of translation are defined, partial translation is derived and composed into a whole.

pdf bib
The GENIA project: corpus-based knowledge acquisition and information extraction from genome research papers
Nigel Collier | Hyun Seok Park | Norihiro Ogata | Yuka Tateishi | Chikashi Nobata | Tomoko Ohta | Tateshi Sekimizu | Hisao Imai | Katsutoshi Ibushi | Jun-ichi Tsujii
Ninth Conference of the European Chapter of the Association for Computational Linguistics

1998

pdf bib
LiLFeS - Towards a Practical HPSG Parser
Takaki Makino | Minoru Yoshida | Kentaro Torisawa | Jun’ichi Tsujii
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf bib
HPSG-Style Underspecified Japanese Grammar with Wide Coverage
Yutaka Mitsuishi | Kentaro Torisawa | Jun’ichi Tsujii
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf bib
An Efficient Parallel Substrate for Typed Feature Structures on Shared Memory Parallel Machines
Takashi Ninomiya | Kentaro Torisawa | Jun’ichi Tsujii
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf bib
LiLFeS - Towards a Practical HPSG Parser
Takaki Makino | Minoru Yoshida | Kentaro Torisawa | Jun’ichi Tsujii
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

pdf bib
HPSG-Style Underspecified Japanese Grammar with Wide Coverage
Yutaka Mitsuishi | Kentaro Torisawa | Jun’ichi Tsujii
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

pdf bib
An Efficient Parallel Substrate for Typed Feature Structures on Shared Memory Parallel Machines
Takashi Ninomiya | Kentaro Torisawa | Jun’ichi Tsujii
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

pdf bib
Packing of feature structures for optimizing the HPSG-style grammar translated from TAG
Yusuke Miyao | Kentaro Torisawa | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

pdf bib
Translating the XTAG English grammar to HPSG
Yuka Tateisi | Kentaro Torisawa | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

1996

pdf bib
Computing Phrasal-signs in HPSG prior to Parsing
Kentaro Torisawa | Jun’ichi Tsujii
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics

1995

pdf bib
An HPSG-based Parser for Automatic Knowledge Acquisition
Kentaro Torisawa | Jun’ichi Tsujii
Proceedings of the Fourth International Workshop on Parsing Technologies

1994

pdf bib
Automatic Recognition of Verbal Polysemy
Fumiyo Fukumoto | Jun’ichi Tsujii
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

pdf bib
Hypothesis Selection in Grammar Acquisition
Masaki Kiyono | Jun’ichi Tsujii
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

pdf bib
Breaking Down Rhetorical Relations for the purpose of Analysing Discourse Structures
Jun’ichi Fukumoto | Jun’ichi Tsujii
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

pdf bib
Combination of Symbolic and Statistical Approaches for Grammatical Knowledge Acquisition
Masaki Kiyono | Jun’ichi Tsujii
Fourth Conference on Applied Natural Language Processing

pdf bib
A Computational View of the Cognitive Semantics of Spatial Prepositions
Patrick Olivier | Jun-ichi Tsujii
32nd Annual Meeting of the Association for Computational Linguistics

1993

pdf bib
Treatment of Tense and Aspect in Translation from Italian to Greek — An Example of Treatment of Implicit Information in Knowledge-based Transfer MT
Margherita Antona | Jun-ichi Tsujii
Proceedings of the Fifth Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

pdf bib
After Linguistics-based MT
Junichi Tsujii
Proceedings of Machine Translation Summit IV

pdf bib
Linguistic Knowledge Acquisition from Parsing Failures
Masaki Kiyono | Jun-ichi Tsujii
Sixth Conference of the European Chapter of the Association for Computational Linguistics

1992

pdf bib
Automatic Learning for Semantic Collocation
Satoshi Sekine | Jeremy J. Carroll | Sofia Ananiadou | Jun’ichi Tsujii
Third Conference on Applied Natural Language Processing

pdf bib
Interaction between Structural Changes in Machine Translation
Satoshi Kinoshita | John Phillips | Jun-ichi Tsujii
Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992

pdf bib
Linguistic Knowledge Generator
Satoshi Sekine | Sofia Ananiadou | Jeremy J. Carroll | Jun’ichi Tsujii
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics

pdf bib
Interaction between Structural Changes in Machine Translation
Satoshi Kinoshita | John Phillips | Jun-ichi Tsujii
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics

1991

pdf bib
Lexical Transfer based on bilingual signs: Towards interaction during transfer
Jun-ich Tsujii | Kimikazu Fujita
Fifth Conference of the European Chapter of the Association for Computational Linguistics

1990

pdf bib
Machine Translation without a source text
Harold L. Somers | Jun-ichi Tsujii | Danny Jones
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics

pdf bib
Machine Translation and Machine-Aided Translation - What’s going on
Jun-ichi Tsujii
Proceedings of Translating and the Computer 12: Applying technology to the translation process

1988

pdf bib
Reasons why I do not care grammar formalism
Jun-ichi Tsujii
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

pdf bib
How to Get Preferred Readings in Natural Language Analysis
Jun-ichi Tsujii | Yukiyoshi Muto | Yuuji Ikeda | Makoto Nagao
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

pdf bib
Dialogue Translation vs. Text Translation
Jun-ichi Tsujii | Makoto Nagao
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

1987

pdf bib
What is ‘PIVOT’?
Jun-ichi Tsujii
Proceedings of Machine Translation Summit I

pdf bib
The Current Stage of the Mu-Project
Jun-ichi Tsujii
Proceedings of Machine Translation Summit I

1986

pdf bib
The Transfer Phase of the Mu Machine Translation System
Makoto Nagao | Jun-ichi Tsujii
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics

pdf bib
Solutions for Problems of MT Parser - Methods Used in Mu-Machine Translation Project -
Jun-ichi Nakamura | Jun-ichi Tsujii | Makoto Nagao
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics

pdf bib
Future Directions of Machine Translation
Jun-ichi Tsujii
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics

1985

pdf bib
The Japanese Government Project for Machine Translation
Makoto Nagao | Jun-ichi Tsujii | Jun-ichi Nakamura
Computational Linguistics (Formerly the American Journal of Computational Linguistics), Volume 11, Number 2-3, April-September 1985

1984

pdf bib
Analysis Grammar of Japanese in the Mu-project - A Procedural Approach to Analysis Grammar
Jun-ichi Tsujii | Jun-ichi Nakamura | Makoto Nagao
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

pdf bib
Grammar Writing System (GRADE) of Mu-Machine Translation Project and its Characteristics
Jun-ichi Nakamura | Jun-ichi Tsujii | Makoto Nagao
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

pdf bib
Dealing With Incompleteness of Linguistic Knowledge in Language Translation – Transfer and Generation Stage of Mu Machine Translation Project
Makoto Nagao | Toyoaki Nishida | Jun-ichi Tsujii
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

1982

pdf bib
An English Japanese Machine Translation System of the Titles of Scientific and Engineering Papers
Makoto Nagao | Jun-ichi Tsujii | Koji Yada | Toshihiro Kakimoto
Coling 1982: Proceedings of the Ninth International Conference on Computational Linguistics

pdf bib
The Transfer Phase In an English-Japanese Translation System
Jun-ichi Tsujii
Coling 1982: Proceedings of the Ninth International Conference on Computational Linguistics

1980

pdf bib
A Machine Translation System From Japanese Into English - Another Perspective of MT Systems -
M. Nagao | J. Tsujii | K. Mitamura | H. Hirakawa | M. Kume
COLING 1980 Volume 1: The 8th International Conference on Computational Linguistics

pdf bib
An Attempt to Computerized Dictionary Data Bases
M. Nagao | J. Tsujii | Y. Ueda | M. Takiyama
COLING 1980 Volume 1: The 8th International Conference on Computational Linguistics

1976

pdf bib
PLATON--A New Programming Language for Natural Language Analysis
Makoto Nagao | Jun-Ichi Tsujii
American Journal of Computational Linguistics (February 1976)

pdf bib
Analysis of Japanese Sentences
Makoto Nagao | Jun-Ichi Tsujii
American Journal of Computational Linguistics (February 1976)
