Manabu Okumura

Also published as: Manabu Okumara


2021

Generating Weather Comments from Meteorological Simulations
Soichiro Murakami | Sora Tanaka | Masatsugu Hangyo | Hidetaka Kamigaito | Kotaro Funakoshi | Hiroya Takamura | Manabu Okumura
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The task of generating weather-forecast comments from meteorological simulations has the following requirements: (i) the changes in numerical values for various physical quantities need to be considered, (ii) the weather comments should be dependent on delivery time and area information, and (iii) the comments should provide useful information for users. To meet these requirements, we propose a data-to-text model that incorporates three types of encoders for numerical forecast maps, observation data, and meta-data. We also introduce weather labels representing weather information, such as sunny and rain, for our model to explicitly describe useful information. We conducted automatic and human evaluations. The results indicate that our model performed best against baselines in terms of informativeness. We make our code and data publicly available.

Metric-Type Identification for Multi-Level Header Numerical Tables in Scientific Papers
Lya Hulliyyatus Suadaa | Hidetaka Kamigaito | Manabu Okumura | Hiroya Takamura
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Numerical tables are widely used to present experimental results in scientific papers. For table understanding, a metric-type is essential to discriminate numbers in the tables. We introduce a new information extraction task, metric-type identification from multi-level header numerical tables, and provide a dataset extracted from scientific papers consisting of header tables, captions, and metric-types. We then propose two joint-learning neural classification and generation schemes featuring pointer-generator-based and BERT-based models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification problems.

One-class Text Classification with Multi-modal Deep Support Vector Data Description
Chenlong Hu | Yukun Feng | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

This work presents multi-modal deep SVDD (mSVDD) for one-class text classification. By extending the uni-modal SVDD to a multi-modal one, we build mSVDD with multiple hyperspheres, which enable us to build a much better description for target one-class data. Additionally, the end-to-end architecture of mSVDD can jointly handle neural feature learning and one-class text learning. We also introduce a mechanism for incorporating negative supervision in the absence of real negative data, which can be beneficial to the mSVDD model. We conduct experiments on the Reuters and 20 Newsgroups datasets, and the experimental results demonstrate that mSVDD outperforms uni-modal SVDD and that mSVDD improves further when negative supervision is incorporated.

A New Surprise Measure for Extracting Interesting Relationships between Persons
Hidetaka Kamigaito | Jingun Kwon | Young-In Song | Manabu Okumura
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

One way to enhance user engagement in search engines is to suggest interesting facts to the user. Although relationships between persons are important as a target for text mining, there are few effective approaches for extracting the interesting relationships between persons. We therefore propose a method for extracting interesting relationships between persons from natural language texts by focusing on their surprisingness. Our method first extracts all personal relationships from dependency trees for the texts and then calculates surprise scores for distributed representations of the extracted relationships in an unsupervised manner. The unique point of our method is that it does not require any labeled dataset with annotation for the surprising personal relationships. The results of the human evaluation show that the proposed method could extract more interesting relationships between persons from Japanese Wikipedia articles than a popularity-based baseline method. We demonstrate our proposed method as a Chrome plugin on Google Search.

Improving Neural RST Parsing Model with Silver Agreement Subtrees
Naoki Kobayashi | Tsutomu Hirao | Hidetaka Kamigaito | Manabu Okumura | Masaaki Nagata
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Most of the previous Rhetorical Structure Theory (RST) parsing methods are based on supervised learning such as neural networks, which requires an annotated corpus of sufficient size and quality. However, the RST Discourse Treebank (RST-DT), the benchmark corpus for RST parsing in English, is small due to the costly annotation of RST trees. The lack of large annotated training data causes poor performance, especially in relation labeling. Therefore, we propose a method for improving neural RST parsing models by exploiting silver data, i.e., automatically annotated data. We create large-scale silver data from an unlabeled corpus by using a state-of-the-art RST parser. To obtain high-quality silver data, we extract agreement subtrees from RST trees for documents built using the RST parsers. We then pre-train a neural RST parser with the obtained silver data and fine-tune it on the RST-DT. Experimental results show that our method achieved the best micro-F1 scores for Nuclearity and Relation at 75.0 and 63.2, respectively. Furthermore, we obtained a remarkable gain in the Relation score, 3.0 points, against the previous state-of-the-art parser.

An Empirical Study of Generating Texts for Search Engine Advertising
Hidetaka Kamigaito | Peinan Zhang | Hiroya Takamura | Manabu Okumura
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

Although there are many studies on neural language generation (NLG), few trials are put into the real world, especially in the advertising domain. Generating ads with NLG models can help copywriters in their creation. However, few studies have adequately evaluated the effect of generated ads with actual serving included because it requires a large amount of training data and a particular environment. In this paper, we demonstrate a practical use case of generating ad-text with an NLG model. Specifically, we show how to improve the ads’ impact, deploy models to a product, and evaluate the generated ads.

Towards Table-to-Text Generation with Numerical Reasoning
Lya Hulliyyatus Suadaa | Hidetaka Kamigaito | Kotaro Funakoshi | Manabu Okumura | Hiroya Takamura
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Recent neural text generation models have shown significant improvement in generating descriptive text from structured data such as table formats. One of the remaining important challenges is generating more analytical descriptions that can be inferred from facts in a data source. The use of a template-based generator and a pointer-generator is among the potential alternatives for table-to-text generators. In this paper, we propose a framework consisting of a pre-trained model and a copy mechanism. The pre-trained models are fine-tuned to produce fluent text that is enriched with numerical reasoning. However, it still lacks fidelity to the table contents. The copy mechanism is incorporated in the fine-tuning step by using general placeholders to avoid producing hallucinated phrases that are not supported by a table while preserving high fluency. In summary, our contributions are (1) a new dataset for numerical table-to-text generation using pairs of a table and a paragraph of a table description with richer inference from scientific papers, and (2) a table-to-text generation framework enriched with numerical reasoning.

A Case Study of In-House Competition for Ranking Constructive Comments in a News Service
Hayato Kobayashi | Hiroaki Taguchi | Yoshimune Tabuchi | Chahine Koleejan | Ken Kobayashi | Soichiro Fujita | Kazuma Murao | Takeshi Masuyama | Taichi Yatsuka | Manabu Okumura | Satoshi Sekine
Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media

Ranking the user comments posted on a news article is important for online news services because comment visibility directly affects the user experience. Research on ranking comments with different metrics to measure the comment quality has shown that “constructiveness” used in argument analysis is promising from a practical standpoint. In this paper, we report a case study in which this constructiveness is examined in the real world. Specifically, we examine an in-house competition to improve the performance of ranking constructive comments and demonstrate the effectiveness of the best obtained model for a commercial service.

Fusing Label Embedding into BERT: An Efficient Improvement for Text Classification
Yijin Xiong | Yukun Feng | Hao Wu | Hidetaka Kamigaito | Manabu Okumura
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Pointing to Subwords for Generating Function Names in Source Code
Shogo Fujita | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 28th International Conference on Computational Linguistics

We tackle the task of automatically generating a function name from source code. Existing generators face difficulties in generating low-frequency or out-of-vocabulary subwords. In this paper, we propose two strategies for copying low-frequency or out-of-vocabulary subwords in inputs. Our best performing model showed an improvement over the conventional method in terms of our modified F1 and accuracy on the Java-small and Java-large datasets.

Neural text normalization leveraging similarities of strings and sounds
Riku Kawamura | Tatsuya Aoki | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 28th International Conference on Computational Linguistics

We propose neural models that can normalize text by considering the similarities of word strings and sounds. We experimentally compared a model that considers the similarities of both word strings and sounds, a model that considers only the similarity of word strings or of sounds, and a model without the similarities as a baseline. Results showed that leveraging the word string similarity succeeded in dealing with misspellings and abbreviations, and taking into account the sound similarity succeeded in dealing with phonetic substitutions and emphasized characters. As a result, the proposed models achieved higher F1 scores than the baseline.

Hierarchical Trivia Fact Extraction from Wikipedia Articles
Jingun Kwon | Hidetaka Kamigaito | Young-In Song | Manabu Okumura
Proceedings of the 28th International Conference on Computational Linguistics

Recently, automatic trivia fact extraction has attracted much research interest. Modern search engines have begun to provide trivia facts as the information for entities because they can motivate more user engagement. In this paper, we propose a new unsupervised algorithm that automatically mines trivia facts for a given entity. Unlike previous studies, the proposed algorithm targets a single Wikipedia article and leverages its hierarchical structure via top-down processing. Thus, the proposed algorithm offers two distinctive advantages: it does not incur high computation time, and it provides a domain-independent approach for extracting trivia facts. Experimental results demonstrate that the proposed algorithm is over 100 times faster than the existing method, which considers Wikipedia categories. Human evaluation demonstrates that the proposed algorithm can mine better trivia facts regardless of the target entity domain and outperforms the existing methods.

Diverse and Non-redundant Answer Set Extraction on Community QA based on DPPs
Shogo Fujita | Tomohide Shibata | Manabu Okumura
Proceedings of the 28th International Conference on Computational Linguistics

In community-based question answering (CQA) platforms, it takes time for a user to get useful information from among many answers. Although one solution is an answer ranking method, the user still needs to read through the top-ranked answers carefully. This paper proposes a new task of selecting a diverse and non-redundant answer set rather than ranking the answers. Our method is based on determinantal point processes (DPPs), and it calculates the answer importance and similarity between answers by using BERT. We built a dataset focusing on a Japanese CQA site, and the experiments on this dataset demonstrated that the proposed method outperformed several baseline methods.
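A minimal sketch of selecting a diverse, non-redundant subset with a DPP kernel. The abstract only states that the method is based on DPPs with importance and similarity computed by BERT; the greedy selection below is a common approximation and is an assumption here, not the authors' stated inference procedure. The quality scores, the similarity matrix, and the function names are hypothetical stand-ins for BERT-derived values.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def greedy_dpp(quality, similarity, k):
    """Greedily pick up to k items that maximize det of the kernel submatrix.

    The kernel is L[i][j] = quality[i] * quality[j] * similarity[i][j],
    so important items raise the determinant and similar pairs lower it.
    """
    n = len(quality)
    kernel = [[quality[i] * quality[j] * similarity[i][j] for j in range(n)]
              for i in range(n)]
    selected = []
    for _ in range(min(k, n)):
        best_item, best_det = None, 0.0
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sub = [[kernel[a][b] for b in idx] for a in idx]
            d = det(sub)
            if d > best_det:
                best_item, best_det = i, d
        if best_item is None:  # no remaining item adds volume
            break
        selected.append(best_item)
    return selected


# Hypothetical scores: answers 0 and 1 are near-duplicates, answer 2 differs.
quality = [1.0, 0.9, 0.8]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
selected = greedy_dpp(quality, similarity, k=2)
print(selected)
```

In this toy input, a pure ranking by quality would return the two near-duplicate answers, whereas the determinant penalizes their similarity and favors the less similar pair.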

A Simple and Effective Usage of Word Clusters for CBOW Model
Yukun Feng | Chenlong Hu | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

We propose a simple and effective method for incorporating word clusters into the Continuous Bag-of-Words (CBOW) model. Specifically, we propose to replace infrequent input and output words in the CBOW model with their clusters. The resulting cluster-incorporated CBOW model produces embeddings of frequent words and a small number of cluster embeddings, which will be fine-tuned in downstream tasks. We empirically show our replacing method works well on several downstream tasks. Through our analysis, we show that our method might also be useful for other similar models which produce word embeddings.
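The replacement step described above can be sketched as a preprocessing pass over the training corpus. This is an illustration under assumptions: the frequency threshold `min_count`, the `word2cluster` mapping, and the `<C…>` token format are hypothetical, and the abstract does not say how the clusters are obtained.

```python
from collections import Counter


def replace_rare_words(sentences, word2cluster, min_count=2):
    """Replace words seen fewer than min_count times with a cluster token.

    Words missing from word2cluster fall back to cluster 0 (an assumption).
    """
    counts = Counter(w for sent in sentences for w in sent)
    replaced = []
    for sent in sentences:
        replaced.append([
            w if counts[w] >= min_count else f"<C{word2cluster.get(w, 0)}>"
            for w in sent
        ])
    return replaced


# Toy corpus and a hypothetical word-to-cluster mapping.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "ocelot", "slept"],
]
clusters = {"cat": 1, "dog": 1, "ocelot": 1, "slept": 2}
processed = replace_rare_words(corpus, clusters, min_count=2)
print(processed)
```

A CBOW trainer run on `processed` would then learn one embedding per frequent word and one per cluster token, since the rare words no longer appear as separate vocabulary items.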

2019

Split or Merge: Which is Better for Unsupervised RST Parsing?
Naoki Kobayashi | Tsutomu Hirao | Kengo Nakamura | Hidetaka Kamigaito | Manabu Okumura | Masaaki Nagata
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Rhetorical Structure Theory (RST) parsing is crucial for many downstream NLP tasks that require a discourse structure for a text. Most of the previous RST parsers have been based on supervised learning approaches. That is, they require an annotated corpus of sufficient size and quality, and rely heavily on a language- and domain-dependent corpus. In this paper, we present two language-independent unsupervised RST parsing methods based on dynamic programming. The first one builds the optimal tree in terms of a dissimilarity score function that is defined for splitting a text span into smaller ones. The second builds the optimal tree in terms of a similarity score function that is defined for merging two adjacent spans into a larger one. Experimental results on English and German RST treebanks showed that our parser based on span merging achieved the best score, an F1 of around 0.8, which is close to the scores of the previous supervised parsers.
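A minimal sketch of the merge-based dynamic program, in the style of CKY. The abstract does not give the score function, so the Jaccard word overlap used here is a stand-in, and treating the tree score as the sum of merge similarities is likewise an assumption. All names are hypothetical.

```python
from functools import lru_cache


def jaccard(left, right):
    """Stand-in similarity: word overlap between two adjacent spans."""
    a = {w for unit in left for w in unit}
    b = {w for unit in right for w in unit}
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_tree(units, sim):
    """Return (score, tree) maximizing the summed merge similarity.

    units: list of elementary units (token lists here).
    sim: function scoring the merge of two adjacent spans of units.
    Leaves are unit indices; internal nodes are (left, right) pairs.
    """
    n = len(units)

    @lru_cache(maxsize=None)
    def best(i, j):
        # The span covers units[i:j]; a single unit is a leaf with score 0.
        if j - i == 1:
            return 0.0, i
        candidates = []
        for k in range(i + 1, j):
            left_score, left_tree = best(i, k)
            right_score, right_tree = best(k, j)
            merge = sim(units[i:k], units[k:j])
            candidates.append(
                (left_score + right_score + merge, (left_tree, right_tree))
            )
        return max(candidates, key=lambda c: c[0])

    return best(0, n)


# Toy elementary units: the first two share a word, the third does not.
units = [["rain", "fell"], ["rain", "stopped"], ["we", "left"]]
score, tree = build_tree(units, jaccard)
print(score, tree)
```

The split-based variant mentioned in the abstract would use the same recursion with a dissimilarity score evaluated at each split point instead.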

Context-aware Neural Machine Translation with Coreference Information
Takumi Ohtani | Hidetaka Kamigaito | Masaaki Nagata | Manabu Okumura
Proceedings of the Fourth Workshop on Discourse in Machine Translation (DiscoMT 2019)

We present neural machine translation models for translating a sentence in a text by using a graph-based encoder which can explicitly consider coreference relations provided within the text. The graph-based encoder can dynamically encode the source text without attending to all tokens in the text. In experiments, our proposed models provide a statistically significant improvement of up to 0.9 BLEU points over the previous approach on the OpenSubtitle2018 English-to-Japanese data set. Experimental results also show that the graph-based encoder can handle a longer text well, compared with the previous approach.

A Simple and Effective Method for Injecting Word-Level Information into Character-Aware Neural Language Models
Yukun Feng | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

We propose a simple and effective method to inject word-level information into character-aware neural language models. Unlike previous approaches which usually inject word-level information at the input of a long short-term memory (LSTM) network, we inject it into the softmax function. The resultant model can be seen as a combination of character-aware language model and simple word-level language model. Our injection method can also be used together with previous methods. Through the experiments on 14 typologically diverse languages, we empirically show that our injection method, when used together with the previous methods, works better than the previous methods, including a gating mechanism, averaging, and concatenation of word vectors. We also provide a comprehensive comparison of these injection methods.

A Large-Scale Multi-Length Headline Corpus for Analyzing Length-Constrained Headline Generation Model Evaluation
Yuta Hitomi | Yuya Taguchi | Hideaki Tamori | Ko Kikuta | Jiro Nishitoba | Naoaki Okazaki | Kentaro Inui | Manabu Okumura
Proceedings of the 12th International Conference on Natural Language Generation

Browsing news articles on multiple devices is now possible. The lengths of news article headlines have precise upper bounds, dictated by the size of the display of the relevant device or interface. Therefore, controlling the length of headlines is essential when applying the task of headline generation to news production. However, because there is no corpus of headlines of multiple lengths for a given article, previous research on controlling output length in headline generation has not discussed whether the system outputs could be adequately evaluated without multiple references of different lengths. In this paper, we introduce two corpora, which are Japanese News Corpus (JNC) and JApanese MUlti-Length Headline Corpus (JAMUL), to confirm the validity of previous evaluation settings. The JNC provides common supervision data for headline generation. The JAMUL is a large-scale evaluation dataset for headlines of three different lengths composed by professional editors. We report new findings on these corpora; for example, although the longest length reference summary can appropriately evaluate the existing methods controlling output length, this evaluation setting has several problems.

Global Optimization under Length Constraint for Neural Text Summarization
Takuya Makino | Tomoya Iwakura | Hiroya Takamura | Manabu Okumura
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We propose a global optimization method under length constraint (GOLC) for neural text summarization models. GOLC increases the probabilities of generating summaries that have high evaluation scores, ROUGE in this paper, within a desired length. We compared GOLC with two optimization methods, a maximum log-likelihood and a minimum risk training, on CNN/Daily Mail and a Japanese single document summarization data set of The Mainichi Shimbun Newspapers. The experimental results show that a state-of-the-art neural summarization model optimized with GOLC generates fewer overlength summaries while maintaining the fastest processing speed; only 6.70% overlength summaries on CNN/Daily Mail and 7.8% on long summary of Mainichi, compared to the approximately 20% to 50% on CNN/Daily Mail and 10% to 30% on Mainichi with the other optimization methods. We also demonstrate the importance of the generation of in-length summaries for post-editing with the Mainichi dataset, which is created with strict length constraints. The experimental results show approximately 30% to 40% improved post-editing time by use of in-length summaries.

Dataset Creation for Ranking Constructive News Comments
Soichiro Fujita | Hayato Kobayashi | Manabu Okumura
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Ranking comments on an online news service is a practically important task for the service provider, and thus there have been many studies on this task. However, most of them considered users’ positive feedback, such as “Like”-button clicks, as a quality measure. In this paper, we address directly evaluating the quality of comments on the basis of “constructiveness,” separately from user feedback. To this end, we create a new dataset including 100K+ Japanese comments with constructiveness scores (C-scores). Our experiments clarify that C-scores are not always related to users’ positive feedback, and the performance of pairwise ranking models tends to be enhanced by the variation of comments rather than articles.

Discourse-Aware Hierarchical Attention Network for Extractive Single-Document Summarization
Tatsuya Ishigaki | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

Discourse relations between sentences are often represented as a tree, and the tree structure provides important information for summarizers to create a short and coherent summary. However, current neural network-based summarizers treat the source document as just a sequence of sentences and ignore the tree-like discourse structure inherent in the document. To incorporate the information of a discourse tree structure into the neural network-based summarizers, we propose a discourse-aware neural extractive summarizer which can explicitly take into account the discourse dependency tree structure of the source document. Our discourse-aware summarizer can jointly learn the discourse structure and the salience score of a sentence by using novel hierarchical attention modules, which can be trained on automatically parsed discourse dependency trees. Experimental results showed that our model achieved competitive or better performances against state-of-the-art models in terms of ROUGE scores on the DailyMail dataset. We further conducted manual evaluations. The results showed that our approach also improved the coherence of the output summaries.

2018

Neural Machine Translation Incorporating Named Entity
Arata Ugawa | Akihiro Tamura | Takashi Ninomiya | Hiroya Takamura | Manabu Okumura
Proceedings of the 27th International Conference on Computational Linguistics

This study proposes a new neural machine translation (NMT) model based on the encoder-decoder model that incorporates named entity (NE) tags of source-language sentences. Conventional NMT models have two problems enumerated as follows: (i) they tend to have difficulty in translating words with multiple meanings because of the high ambiguity, and (ii) these models’ ability to translate compound words seems challenging because the encoder receives a word, a part of the compound word, at each time step. To alleviate these problems, the encoder of the proposed model encodes the input word on the basis of its NE tag at each time step, which could reduce the ambiguity of the input word. Furthermore, the encoder introduces a chunk-level LSTM layer over a word-level LSTM layer and hierarchically encodes a source-language sentence to capture a compound NE as a chunk on the basis of the NE tags. We evaluate the proposed model on an English-to-Japanese translation task with the ASPEC, and English-to-Bulgarian and English-to-Romanian translation tasks with the Europarl corpus. The evaluation results show that the proposed model achieves up to 3.11 point improvement in BLEU.

Stylistically User-Specific Generation
Abdurrisyad Fikri | Hiroya Takamura | Manabu Okumura
Proceedings of the 11th International Conference on Natural Language Generation

Recent neural models for response generation show good results in terms of general responses. In real conversations, however, depending on the speaker/responder, similar utterances should require different responses. In this study, we attempt to consider individual user’s information in adjusting the notable sequence-to-sequence (seq2seq) model for more diverse, user-specific responses. We assume that we need user-specific features to adjust the response and we argue that some selected representative words from the users are suitable for this task. Furthermore, we prove that even for unseen or unknown users, our model can provide more diverse and interesting responses, while maintaining correlation with input utterances. Experimental results with human evaluation show that our model can generate more interesting responses than the popular seq2seq model and achieve higher relevance with input utterances than our baseline.

2017

Japanese Sentence Compression with a Large Training Dataset
Shun Hasegawa | Yuta Kikuchi | Hiroya Takamura | Manabu Okumura
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

In English, high-quality sentence compression models that delete words have been trained on automatically created large training datasets. We work on Japanese sentence compression by a similar approach. To create a large Japanese training dataset, a method of creating an English training dataset is modified based on the characteristics of the Japanese language. The created dataset is used to train Japanese sentence compression models based on the recurrent neural network.

Distinguishing Japanese Non-standard Usages from Standard Ones
Tatsuya Aoki | Ryohei Sasano | Hiroya Takamura | Manabu Okumura
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We focus on non-standard usages of common words on social media. In the context of social media, words sometimes have other usages that are totally different from their original. In this study, we attempt to distinguish non-standard usages on social media from standard ones in an unsupervised manner. Our basic idea is that non-standardness can be measured by the inconsistency between the expected meaning of the target word and the given context. For this purpose, we use context embeddings derived from word embeddings. Our experimental results show that the model leveraging the context embedding outperforms other methods and provides us with findings, for example, on how to construct context embeddings and which corpus to use.
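The idea of measuring inconsistency between a target word and its context can be sketched with cosine similarity. This is an illustration only: averaging the context word vectors is an assumed way to form the context embedding, the two-dimensional vectors are invented toy values, and the English words stand in for the Japanese data the study actually uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def context_embedding(context, vectors):
    """Average the vectors of the context words that have embeddings."""
    known = [vectors[w] for w in context if w in vectors]
    dim = len(next(iter(vectors.values())))
    if not known:
        return [0.0] * dim
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


def standardness(target, context, vectors):
    """Higher values mean the context agrees with the target's usual meaning."""
    return cosine(vectors[target], context_embedding(context, vectors))


# Invented toy embeddings: one food-like dimension, one computing-like one.
vectors = {
    "potato": [1.0, 0.0],
    "eat": [0.9, 0.1],
    "fry": [0.8, 0.2],
    "server": [0.0, 1.0],
    "crash": [0.1, 0.9],
}
standard_score = standardness("potato", ["eat", "fry"], vectors)
nonstandard_score = standardness("potato", ["server", "crash"], vectors)
print(standard_score, nonstandard_score)
```

A low score flags a usage whose context is inconsistent with the word's expected meaning; any decision threshold would have to be chosen empirically.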

Summarizing Lengthy Questions
Tatsuya Ishigaki | Hiroya Takamura | Manabu Okumura
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this research, we propose the task of question summarization. We first analyzed question-summary pairs extracted from a Community Question Answering (CQA) site, and found that a proportion of questions cannot be summarized by extractive approaches but require abstractive approaches. We created a dataset by regarding the question-title pairs posted on the CQA site as question-summary pairs. By using the data, we trained extractive and abstractive summarization models, and compared them based on ROUGE scores and manual evaluations. Our experimental results show that an abstractive method using an encoder-decoder model with a copying mechanism achieves better scores for both ROUGE-2 F-measure and the evaluations by human judges.

Supervised Attention for Sequence-to-Sequence Constituency Parsing
Hidetaka Kamigaito | Katsuhiko Hayashi | Tsutomu Hirao | Hiroya Takamura | Manabu Okumura | Masaaki Nagata
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

The sequence-to-sequence (Seq2Seq) model has been successfully applied to machine translation (MT). Recently, MT performances were improved by incorporating supervised attention into the model. In this paper, we introduce supervised attention to constituency parsing that can be regarded as another translation task. Evaluation results on the PTB corpus showed that the bracketing F-measure was improved by supervised attention.

2016

Controlling Output Length in Neural Encoder-Decoders
Yuta Kikuchi | Graham Neubig | Ryohei Sasano | Hiroya Takamura | Manabu Okumura
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Unsupervised Word Alignment by Agreement Under ITG Constraint
Hidetaka Kamigaito | Akihiro Tamura | Hiroya Takamura | Manabu Okumura | Eiichiro Sumita
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

A Corpus-Based Analysis of Canonical Word Order of Japanese Double Object Constructions
Ryohei Sasano | Manabu Okumura
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

Hierarchical Back-off Modeling of Hiero Grammar based on Non-parametric Bayesian Model
Hidetaka Kamigaito | Taro Watanabe | Hiroya Takamura | Manabu Okumura | Eiichiro Sumita
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Context-Dependent Automatic Response Generation Using Statistical Machine Translation Techniques
Andrew Shin | Ryohei Sasano | Hiroya Takamura | Manabu Okumura
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Unsupervised Word Alignment Using Frequency Constraint in Posterior Regularized EM
Hidetaka Kamigaito | Taro Watanabe | Hiroya Takamura | Manabu Okumura
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Single Document Summarization based on Nested Tree Structure
Yuta Kikuchi | Tsutomu Hirao | Hiroya Takamura | Manabu Okumura | Masaaki Nagata
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Automatic Knowledge Acquisition for Case Alternation between the Passive and Active Voices in Japanese
Ryohei Sasano | Daisuke Kawahara | Sadao Kurohashi | Manabu Okumura
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

A Simple Approach to Unknown Word Processing in Japanese Morphological Analysis
Ryohei Sasano | Sadao Kurohashi | Manabu Okumura
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Construction of Emotional Lexicon Using Potts Model
Braja Gopal Patra | Hiroya Takamura | Dipankar Das | Manabu Okumura | Sivaji Bandyopadhyay
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Proceedings of the 3rd Workshop on Sentiment Analysis where AI meets Psychology
Sivaji Bandyopadhyay | Manabu Okumura
Proceedings of the 3rd Workshop on Sentiment Analysis where AI meets Psychology

Part-of-Speech Induction in Dependency Trees for Statistical Machine Translation
Akihiro Tamura | Taro Watanabe | Eiichiro Sumita | Hiroya Takamura | Manabu Okumura
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Subtree Extractive Summarization via Submodular Maximization
Hajime Morita | Ryohei Sasano | Hiroya Takamura | Manabu Okumura
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

pdf bib
Proceedings of the 2nd Workshop on Sentiment Analysis where AI meets Psychology
Sivaji Bandyopadhyay | Manabu Okumura
Proceedings of the 2nd Workshop on Sentiment Analysis where AI meets Psychology

pdf bib
Generating “A for Alpha” When There Are Thousands of Characters
Hiroaki Kawasaki | Ryohei Sasano | Hiroya Takamura | Manabu Okumura
Proceedings of COLING 2012

pdf bib
Automatic Domain Adaptation for Word Sense Disambiguation Based on Comparison of Multiple Classifiers
Kanako Komiya | Manabu Okumura
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation

pdf bib
Sentence Compression with Semantic Role Constraints
Katsumasa Yoshikawa | Ryu Iida | Tsutomu Hirao | Manabu Okumura
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

pdf bib
Developing Japanese WordNet Affect for Analyzing Emotions
Yoshimitsu Torii | Dipankar Das | Sivaji Bandyopadhyay | Manabu Okumura
Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2.011)

pdf bib
Proceedings of the Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2011)
Sivaji Bandyopadhyay | Manabu Okumura
Proceedings of the Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2011)

pdf bib
A Named Entity Recognition Method based on Decomposition and Concatenation of Word Chunks
Tomoya Iwakura | Hiroya Takamura | Manabu Okumura
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
Identification of relations between answers with global constraints for Community-based Question Answering services
Hikaru Yokono | Takaaki Hasegawa | Genichiro Kikui | Manabu Okumura
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
Automatic Determination of a Domain Adaptation Method for Word Sense Disambiguation Using Decision Tree Learning
Kanako Komiya | Manabu Okumura
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
Potts Model on the Case Fillers for Word Sense Disambiguation
Hiroya Takamura | Manabu Okumura
Proceedings of 5th International Joint Conference on Natural Language Processing

pdf bib
Query Snowball: A Co-occurrence-based Approach to Multi-document Summarization for Question Answering
Hajime Morita | Tetsuya Sakai | Manabu Okumura
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
SemEval-2010 Task: Japanese WSD
Manabu Okumura | Kiyoaki Shirai | Kanako Komiya | Hikaru Yokono
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib
An Approach toward Register Classification of Book Samples in the Balanced Corpus of Contemporary Written Japanese
Wakako Kashino | Manabu Okumura
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

pdf bib
An Efficient Algorithm for Unsupervised Word Segmentation with Branching Entropy and MDL
Valentin Zhikov | Hiroya Takamura | Manabu Okumura
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

pdf bib
Text Summarization Model Based on Maximum Coverage Problem and its Variant
Hiroya Takamura | Manabu Okumura
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf bib
Structured Output Learning with Polynomial Kernel
Hajime Morita | Hiroya Takamura | Manabu Okumura
Proceedings of the International Conference RANLP-2009

2008

pdf bib
Identifying Cross-Document Relations between Sentences
Yasunari Miyabe | Hiroya Takamura | Manabu Okumura
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib
Learning to Shift the Polarity of Words for Sentiment Classification
Daisuke Ikeda | Hiroya Takamura | Lev-Arie Ratinov | Manabu Okumura
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

2007

pdf bib
Extracting Semantic Orientations of Phrases from Dictionary
Hiroya Takamura | Takashi Inui | Manabu Okumura
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

pdf bib
TITPI: Web People Search Task Using Semi-Supervised Clustering Approach
Kazunari Sugiyama | Manabu Okumura
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf bib
Japanese Dependency Analysis Using the Ancestor-Descendant Relation
Akihiro Tamura | Hiroya Takamura | Manabu Okumura
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Japanese Dependency Parsing Using Co-Occurrence Information and a Combination of Case Elements
Takeshi Abekawa | Manabu Okumura
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Time Period Identification of Events in Text
Taichi Noro | Takashi Inui | Hiroya Takamura | Manabu Okumura
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
A Rote Extractor with Edit Distance-Based Generalisation and Multi-Corpora Precision Calculation
Enrique Alfonseca | Pablo Castells | Manabu Okumura | Maria Ruiz-Casado
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
An Automatic Method for Summary Evaluation Using Multiple Evaluation Results by a Manual Method
Hidetsugu Nanba | Manabu Okumura
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
Towards Large-scale Non-taxonomic Relation Extraction: Estimating the Precision of Rote Extractors
Enrique Alfonseca | Maria Ruiz-Casado | Manabu Okumura | Pablo Castells
Proceedings of the 2nd Workshop on Ontology Learning and Population: Bridging the Gap between Text and Knowledge

pdf bib
Automatic Terminology Intelligibility Estimation for Readership-oriented Technical Writing
Yasuko Senda | Yasusi Sinohara | Manabu Okumura
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes automatic terminology intelligibility estimation for readership-oriented technical writing. We assume that the term frequency weighted by the types of documents can be an indicator of the term intelligibility for a certain readership. From this standpoint, we analyzed the relationship between the following: average intelligibility levels of 46 technical terms that were rated by about 120 laymen; numbers of documents that an Internet search

pdf bib
Latent Variable Models for Semantic Orientations of Phrases
Hiroya Takamura | Takashi Inui | Manabu Okumura
11th Conference of the European Chapter of the Association for Computational Linguistics

2005

pdf bib
Kernel-based Approach for Automatic Evaluation of Natural Language Generation Technologies: Application to Automatic Summarization
Tsutomu Hirao | Manabu Okumura | Hideki Isozaki
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf bib
Investigating the Characteristics of Causal Relations in Japanese Text
Takashi Inui | Manabu Okumura
Proceedings of the Workshop on Frontiers in Corpus Annotations II: Pie in the Sky

pdf bib
Extracting Semantic Orientations of Words using Spin Model
Hiroya Takamura | Takashi Inui | Manabu Okumura
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

pdf bib
Corpus-Based Analysis of Japanese Relative Clause Constructions
Takeshi Abekawa | Manabu Okumura
Second International Joint Conference on Natural Language Processing: Full Papers

pdf bib
Classification of Multiple-Sentence Questions
Akihiro Tamura | Hiroya Takamura | Manabu Okumura
Second International Joint Conference on Natural Language Processing: Full Papers

2004

pdf bib
Comparison of Some Automatic and Manual Methods for Summary Evaluation Based on the Text Summarization Challenge 2
Hidetsugu Nanba | Manabu Okumura
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
A Support System for Revising Titles to Stimulate the Lay Reader’s Interest in Technical Achievements
Yasuko Senda | Yasusi Sinohara | Manabu Okumura
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

pdf bib
Corpus and Evaluation Measures for Multiple Document Summarization with Multiple Sources
Tsutomu Hirao | Takahiro Fukusima | Manabu Okumura | Chikashi Nobata | Hidetsugu Nanba
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

pdf bib
Text Summarization Challenge 2 - Text summarization evaluation at NTCIR Workshop 3
Manabu Okumura | Takahiro Fukusima | Hidetsugu Nanba
Proceedings of the HLT-NAACL 03 Text Summarization Workshop

pdf bib
Patent Claim Processing for Readability - Structure Analysis and Term Explanation
Akihiro Shinmori | Manabu Okumura | Yuzo Marukawa | Makoto Iwayama
Proceedings of the ACL-2003 Workshop on Patent Corpus Processing

pdf bib
Automatic Acquisition of Script Knowledge from a Text Collection
Toshiaki Fujiki | Hidetsugu Nanba | Manabu Okumura
10th Conference of the European Chapter of the Association for Computational Linguistics

2002

pdf bib
Constructing a lexicon of action
Takenobu Tokunaga | Manabu Okumura | Suguru Saitô | Hozumi Tanaka
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf bib
Some Examinations of Intrinsic Methods for Summary Evaluation Based on the Text Summarization Challenge (TSC)
Hidetsugu Nanba | Manabu Okumura
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2000

pdf bib
A Comparison of Summarization Methods Based on Task-based Evaluation
Hajime Mochizuki | Manabu Okumura
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

pdf bib
Producing More Readable Extracts by Revising Them
Manabu Okumura
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

1998

pdf bib
Text Segmentation with Multiple Surface Linguistic Cues
Hajime Mochizuki | Takeo Honda | Manabu Okumura
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

pdf bib
Text Segmentation with Multiple Surface Linguistic Cues
Hajime Mochizuki | Takeo Honda | Manabu Okumura
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

1997

pdf bib
Grammar Acquisition Based on Clustering Analysis and Its Application to Statistical Parsing
Thanaruk Theeramunkong | Manabu Okumura
Fifth Workshop on Very Large Corpora

pdf bib
Exploiting Contextual Information in Hypothesis Selection for Grammar Refinement
Thanaruk Theeramunkong | Yasunobu Kawaguchi | Manabu Okumura
Computational Environments for Grammar Development and Linguistic Engineering

1996

pdf bib
Towards Automatic Grammar Acquisition from a Bracketed Corpus
Thanaruk Theeramunkong | Manabu Okumara
Fourth Workshop on Very Large Corpora

pdf bib
Zero Pronoun Resolution in Japanese Discourse Based on Centering Theory
Manabu Okumura | Kouji Tamura
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics

1994

pdf bib
Word Sense Disambiguation and Text Segmentation Based on Lexical Cohesion
Manabu Okumura | Takeo Honda
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

1992

pdf bib
A Chart-based Method of ID/LP Parsing with Generalized Discrimination Networks
Surapant Meknavin | Manabu Okumura | Hozumi Tanaka
COLING 1992 Volume 1: The 14th International Conference on Computational Linguistics
