Kyoko Kanzaki - ACL Anthology

2020

Improving Semantic Similarity Calculation of Japanese Text for MT Evaluation
Yuki Tanahashi | Kyoko Kanzaki | Eiko Yamamoto | Hitoshi Isahara
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

2019

Towards linking synonymous expressions of compound verbs to Japanese WordNet
Kyoko Kanzaki | Hitoshi Isahara
Proceedings of the 10th Global Wordnet Conference

This paper describes our project on Japanese compound verbs. Japanese “Verb (adnominal form) + Verb” compounds, which are treated as single verbs, frequently appear in daily communication. They are not sufficiently registered in Japanese dictionaries or thesauri. We are now compiling a list of the synonymous expressions of compound verbs in “compound verb lexicon” built by the National Institute of Japanese Language and Linguistics. We extracted synonymous words and phrases of compound verbs from five hundred million Japanese web corpora. As a result, synonymous expressions of 1800 compound verbs were obtained automatically among 2700 in the “compound verb lexicon”. From our data, we observed that some compound verbs represent not only motion but also additional nuances such as an emotional one. In order to reflect the abundant meanings that compound verbs own, we will try to think of a link of synonymous expressions to Japanese wordnet. Concretely, in the case of synonymous phrases, we try to link adverbial expressions which are a part of phrases to the adverbial synset in Japanese wordnet.

2018

Building a List of Synonymous Words and Phrases of Japanese Compound Verbs
Kyoko Kanzaki | Hitoshi Isahara
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2014

Fusion of Multiple Semantic Networks and Human Association
Hitoshi Isahara | Kyoko Kanzaki | Eiko Yamamoto | Takayuki Kuribayashi | Michinaga Otsuka
Proceedings of the Seventh Global Wordnet Conference

2009

Word Segmentation Standard in Chinese, Japanese and Korean
Key-Sun Choi | Hitoshi Isahara | Kyoko Kanzaki | Hansaem Kim | Seok Mun Pak | Maosong Sun
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)

Enhancing the Japanese WordNet
Francis Bond | Hitoshi Isahara | Sanae Fujita | Kiyotaka Uchimoto | Takayuki Kuribayashi | Kyoko Kanzaki
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)

2008

The “Close-Distant” Relation of Adjectival Concepts Based on Self-Organizing Map
Kyoko Kanzaki | Noriko Tomuro | Hitoshi Isahara
Coling 2008: Proceedings of the Workshop on Cognitive Aspects of the Lexicon (COGALEX 2008)

The 2008 MedSLT System
Manny Rayner | Pierrette Bouillon | Jane Brotanek | Glenn Flores | Sonia Halimi | Beth Ann Hockey | Hitoshi Isahara | Kyoko Kanzaki | Elisabeth Kron | Yukie Nakao | Marianne Santaholma | Marianne Starlander | Nikos Tsourakis
Coling 2008: Proceedings of the workshop on Speech Processing for Safety Critical Translation and Pervasive Applications

Extraction of Attribute Concepts from Japanese Adjectives
Kyoko Kanzaki | Francis Bond | Noriko Tomuro | Hitoshi Isahara
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe various syntactic and semantic conditions for finding abstract nouns which refer to concepts of adjectives from a text, in an attempt to explore the creation of a thesaurus from text. Depending on usages, six kinds of syntactic patterns are shown. In the syntactic and semantic conditions an omission of an abstract noun is mainly used, but in addition, various linguistic clues are needed. We then compare our results with synsets of Japanese WordNet. From a viewpoint of Japanese WordNet, the degree of agreement of “Attribute” between our data and Japanese WordNet was 22%. On the other hand, the total number of differences of obtained abstract nouns was 267. From a viewpoint of our data, the degree of agreement of abstract nouns between our data and Japanese WordNet was 54%.

Developing Non-European Translation Pairs in a Medium-Vocabulary Medical Speech Translation System
Pierrette Bouillon | Sonia Halimi | Yukie Nakao | Kyoko Kanzaki | Hitoshi Isahara | Nikos Tsourakis | Marianne Starlander | Beth Ann Hockey | Manny Rayner
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe recent work on MedSLT, a medium-vocabulary interlingua-based medical speech translation system, focussing on issues that arise when handling languages of which the grammar engineer has little or no knowledge. We show how we can systematically create and maintain multiple forms of grammars, lexica and interlingual representations, with some versions being used by language informants, and some by grammar engineers. In particular, we describe the advantages of structuring the interlingua definition as a simple semantic grammar, which includes a human-readable surface form. We show how this allows us to rationalise the process of evaluating translations between languages lacking common speakers, and also makes it possible to create a simple generic tool for debugging to-interlingua translation rules. Examples presented focus on the concrete case of translation between Japanese and Arabic in both directions.

Development of the Japanese WordNet
Hitoshi Isahara | Francis Bond | Kiyotaka Uchimoto | Masao Utiyama | Kyoko Kanzaki
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

After a long history of compilation of our own lexical resources, EDR Japanese/English Electronic Dictionary, and discussions with major players on development of various WordNets, Japanese National Institute of Information and Communications Technology started developing the Japanese WordNet in 2006 and will publicly release the first version, which includes both the synset in Japanese and the annotated Japanese corpus of SemCor, in June 2008. As the first step in compiling the Japanese WordNet, we added Japanese equivalents to synsets of the Princeton WordNet. Of course, we must also add some synsets which do not exist in the Princeton WordNet, and must modify synsets in the Princeton WordNet, in order to make the hierarchical structure of Princeton synsets represent thesaurus-like information found in the Japanese language; however, we will address these tasks in a future study. We then translated English sentences which are used in the SemCor annotation into Japanese and annotated them using our Japanese WordNet. This article describes the overview of our project to compile Japanese WordNet and other resources which relate to our Japanese WordNet.

We outline work performed within the framework of a current EC project. The goal is to construct a language-independent information system for a specific domain (environment/ecology/biodiversity) anchored in a language-independent ontology that is linked to wordnets in seven languages. For each language, information extraction and identification of lexicalized concepts with ontological entries is carried out by text miners (“Kybots”). The mapping of language-specific lexemes to the ontology allows for crosslinguistic identification and translation of equivalent terms. The infrastructure developed within this project enables long-range knowledge sharing and transfer across many languages and cultures, addressing the need for global and uniform transition of knowledge beyond the specific domains addressed here.

Boot-Strapping a WordNet Using Multiple Existing WordNets
Francis Bond | Hitoshi Isahara | Kyoko Kanzaki | Kiyotaka Uchimoto
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we describe the construction of an illustrated Japanese Wordnet. We bootstrap the Wordnet using multiple existing wordnets in order to deal with the ambiguity inherent in translation. We illustrate it with pictures from the Open Clip Art Library.

Many-to-Many Multilingual Medical Speech Translation on a PDA
Kyoko Kanzaki | Yukie Nakao | Manny Rayner | Marianne Santaholma | Marianne Starlander | Nikos Tsourakis
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Government and Commercial Uses of MT

Particularly considering the requirement of high reliability, we argue that the most appropriate architecture for a medical speech translator that can be realised using today’s technology combines unidirectional (doctor to patient) translation, medium-vocabulary controlled language coverage, interlingua-based translation, an embedded help component, and deployability on a hand-held hardware platform. We present an overview of the Open Source MedSLT prototype, which has been developed in accordance with these design principles. The system is implemented on top of the Regulus and Nuance 8.5 platforms, translates patient examination questions for all language pairs in the set {English, French, Japanese, Arabic, Catalan}, using vocabularies of about 400 to 1,100 words, and can be run in a distributed client/server environment, where the client application is hosted on a Nokia Internet Tablet device.

2006

MedSLT: A Limited-Domain Unidirectional Grammar-Based Medical Speech Translator
Manny Rayner | Pierrette Bouillon | Nikos Chatzichrisafis | Marianne Santaholma | Marianne Starlander | Beth Ann Hockey | Yukie Nakao | Hitoshi Isahara | Kyoko Kanzaki
Proceedings of the First International Workshop on Medical Speech Translation

Detection of inconsistencies in concept classifications in a large dictionary — Toward an improvement of the EDR electronic dictionary —
Eiko Yamamoto | Kyoko Kanzaki | Hitoshi Isahara
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

The EDR electronic dictionary is a machine-tractable dictionary developed for advanced computer-based processing of natural language. This dictionary comprises eleven sub-dictionaries, including a concept dictionary, word dictionaries, bilingual dictionaries, co-occurrence dictionaries, and a technical terminology dictionary. In this study, we focus on the concept dictionary and aim to revise the arrangement of concepts for improving the EDR electronic dictionary. We believe that unsuitable concepts in a class differ from other concepts in the same class from an abstract perspective. From this notion, we first try to automatically extract those concepts unsuited to the class. We then try semi-automatically to amend the concept explications used to explain the meanings to human users and rearrange them in suitable classes. In the experiment, we try to revise those concepts that are the lower-concepts of the concept “human” in the concept hierarchy and that are directly arranged under concepts with concept explications such as “person as defined by –” and “person viewed from –.” We analyze the result and evaluate our approach.

Semantic Analysis of Abstract Nouns to Compile a Thesaurus of Adjectives
Kyoko Kanzaki | Qing Ma | Eiko Yamamoto | Hitoshi Isahara
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Aiming to compile a thesaurus of adjectives, we discuss how to extract abstract nouns categorizing adjectives, clarify the semantic and syntactic functions of these abstract nouns, and manually evaluate the capability to extract the “instance-category” relations. We focused on some Japanese syntactic structures and utilized the possibility of omitting an abstract noun to decide whether or not a semantic relation between an adjective and an abstract noun is an “instance-category” relation. For 63% of the adjectives (57 groups/90 groups) in our experiments, our extracted categories were found to be most suitable. For 22% of the adjectives (20/90), the categories in the EDR lexicon were found to be most suitable. For 14% of the adjectives (13/90), neither our extracted categories nor those in EDR were found to be suitable, or examinees’ own categories were considered to be more suitable. From our experimental results, we found that the correspondence between a group of adjectives and their category name was more suitable in our method than in the EDR lexicon.

2005

Japanese Speech Understanding using Grammar Specialization
Manny Rayner | Nikos Chatzichrisafis | Pierrette Bouillon | Yukie Nakao | Hitoshi Isahara | Kyoko Kanzaki | Beth Ann Hockey | Marianne Santaholma | Marianne Starlander
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations

Practicing Controlled Language through a Help System integrated into the Medical Speech Translation System (MedSLT)
Marianne Starlander | Pierrette Bouillon | Nikos Chatzichrisafis | Marianne Santaholma | Manny Rayner | Beth Ann Hockey | Hitoshi Isahara | Kyoko Kanzaki | Yukie Nakao
Proceedings of Machine Translation Summit X: Papers

In this paper, we present evidence that providing users of a speech to speech translation system for emergency diagnosis (MedSLT) with a tool that helps them to learn the coverage greatly improves their success in using the system. In MedSLT, the system uses a grammar-based recogniser that provides more predictable results to the translation component. The help module aims at addressing the lack of robustness inherent in this type of approach. It takes as input the result of a robust statistical recogniser that performs better for out-of-coverage data and produces a list of in-coverage example sentences. These examples are selected from a defined list using a heuristic that prioritises sentences maximising the number of N-grams shared with those extracted from the recognition result.

A generic multi-lingual open source platform for limited-domain medical speech translation
Pierrette Bouillon | Manny Rayner | Nikos Chatzichrisafis | Beth Ann Hockey | Marianne Santaholma | Marianne Starlander | Yukie Nakao | Kyoko Kanzaki | Hitoshi Isahara
Proceedings of the 10th EAMT Conference: Practical applications of machine translation

2004

Hierarchy Extraction based on Inclusion of Appearance
Eiko Yamamoto | Kyoko Kanzaki | Hitoshi Isahara
Proceedings of the ACL Interactive Poster and Demonstration Sessions

Extraction of Hyperonymy of Adjectives from Large Corpora by Using the Neural Network Model
Kyoko Kanzaki | Qing Ma | Eiko Yamamoto | Masaki Murata | Hitoshi Isahara
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Construction of an Objective Hierarchy of Abstract Concepts via Directional Similarity
Kyoko Kanzaki | Eiko Yamamoto | Qing Ma | Hitoshi Isahara
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

Extraction and Verification of KO-OU Expressions from Large Corpora
Atsuko Kida | Eiko Yamamoto | Kyoko Kanzaki | Hitoshi Isahara
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

A Limited-Domain English to Japanese Medical Speech Translator Built Using REGULUS 2
Manny Rayner | Pierrette Bouillon | Vol Van Dalsem III | Hitoshi Isahara | Kyoko Kanzaki | Beth Ann Hockey
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

2002

Classification of Adjectival and Non-adjectival Nouns Based on their Semantic Behavior by Using a Self-Organizing Semantic Map
Kyoko Kanzaki | Qing Ma | Masaki Murata | Hitoshi Isahara
COLING-02: SEMANET: Building and Using Semantic Networks

2000

Similarities and Differences among Semantic Behaviors of Japanese Adnominal Constituents
Kyoko Kanzaki | Qing Ma | Hitoshi Isahara
NAACL-ANLP 2000 Workshop: Syntactic and Semantic Complexity in Natural Language Processing Systems

1999

Lexical Semantics to Disambiguate Polysemous Phenomena of Japanese Adnominal Constituents
Hitoshi Isahara | Kyoko Kanzaki
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics
