<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W16">
  <paper id="4900">
    <title>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</title>
    <editor>Hsin-Hsi Chen</editor>
    <editor>Yuen-Hsien Tseng</editor>
    <editor>Vincent Ng</editor>
    <editor>Xiaofei Lu</editor>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <url>http://aclweb.org/anthology/W16-49</url>
    <bibtype>book</bibtype>
    <bibkey>NLPTEA2016:2016</bibkey>
  </paper>

  <paper id="4901">
    <title>Simplification of Example Sentences for Learners of Japanese Functional Expressions</title>
    <author><first>Jun</first><last>Liu</last></author>
    <author><first>Yuji</first><last>Matsumoto</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>1&#8211;5</pages>
    <url>http://aclweb.org/anthology/W16-4901</url>
<abstract>Learning functional expressions is one of the difficulties for language
	learners, since functional expressions tend to have multiple meanings and
	complicated usages in various situations. In this paper, we report on an
	experiment in simplifying example sentences of Japanese functional expressions,
	especially for Chinese-speaking learners. For this purpose, we developed a
	“Japanese Functional Expressions List” and a “Simple Japanese Replacement
	List”. To evaluate the method, we conducted a small-scale experiment with
	Chinese-speaking learners on the effectiveness of the simplified example
	sentences. The experimental results indicate that simplified sentences are
	helpful in learning Japanese functional expressions.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>liu-matsumoto:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4902">
    <title>Effectiveness of Linguistic and Learner Features to Listenability Measurement Using a Decision Tree Classifier</title>
    <author><first>Katsunori</first><last>Kotani</last></author>
    <author><first>Takehiko</first><last>Yoshimi</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>6&#8211;10</pages>
    <url>http://aclweb.org/anthology/W16-4902</url>
<abstract>In learning Asian languages, learners encounter the problem of character types
	that differ from those in their first language, for instance, the difference
	between Chinese characters and the Latin alphabet. This problem also affects listening
	because learners reconstruct letters from speech sounds. Hence, special
	attention should be paid to listening practice for learners of Asian languages.
	However, to our knowledge, few studies have evaluated the ease of listening
	comprehension (listenability) in Asian languages. Therefore, as a pilot study
	of listenability in Asian languages, we developed a measurement method for
	learners of English in order to examine the discriminability of linguistic and
	learner features. The results showed that the accuracy of our method
	outperformed a simple majority vote, which suggests that a combination of
	linguistic and learner features should be used to measure listenability in
	Asian languages as well as in English.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kotani-yoshimi:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4903">
    <title>A Two-Phase Approach Towards Identifying Argument Structure in Natural Language</title>
    <author><first>Arkanath</first><last>Pathak</last></author>
    <author><first>Pawan</first><last>Goyal</last></author>
    <author><first>Plaban</first><last>Bhowmick</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>11&#8211;19</pages>
    <url>http://aclweb.org/anthology/W16-4903</url>
<abstract>We propose a new approach for extracting argument structure from natural
	language texts that contain an underlying argument. Our approach comprises
	two phases: Score Assignment and Structure Prediction. The Score Assignment
	phase trains models to classify relations between argument units (Support,
	Attack or Neutral). To that end, different training strategies have been
	explored. We identify different linguistic and lexical features for training
	the classifiers. Through an ablation study, we observe that our novel use of
	word-embedding features is most effective for this task. The Structure
	Prediction phase makes use of the scores from the Score Assignment phase to
	arrive at the optimal structure. We perform experiments on three argumentation
	datasets, namely, AraucariaDB, Debatepedia and Wikipedia. We also propose two
	baselines and observe that the proposed approach outperforms the baseline
	systems for the final task of Structure Prediction.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>pathak-goyal-bhowmick:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4904">
    <title>Distributed Vector Representations for Unsupervised Automatic Short Answer Grading</title>
    <author><first>Oliver</first><last>Adams</last></author>
    <author><first>Shourya</first><last>Roy</last></author>
    <author><first>Raghuram</first><last>Krishnapuram</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>20&#8211;29</pages>
    <url>http://aclweb.org/anthology/W16-4904</url>
<abstract>We address the problem of automatic short answer grading, evaluating a
	collection of approaches inspired by recent advances in distributional text
	representations. In addition, we propose an unsupervised approach for
	determining text similarity using one-to-many alignment of word vectors.
	We evaluate the proposed technique across two datasets from different domains,
	namely, computer science and English reading comprehension, that additionally
	vary between high school level and undergraduate students. Experiments
	demonstrate that the proposed technique often outperforms other compositional
	distributional semantics approaches as well as vector space methods such as
	latent semantic analysis. When combined with a scoring scheme, the proposed
	technique provides a powerful tool for tackling the complex problem of short
	answer grading. We also discuss a number of other key points worthy of
	consideration in preparing viable, easy-to-deploy automatic short-answer
	grading systems for the real world.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>adams-roy-krishnapuram:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4905">
    <title>A Comparison of Word Embeddings for English and Cross-Lingual Chinese Word Sense Disambiguation</title>
    <author><first>Hong Jin</first><last>Kang</last></author>
    <author><first>Tao</first><last>Chen</last></author>
    <author><first>Muthu Kumar</first><last>Chandrasekaran</last></author>
    <author><first>Min-Yen</first><last>Kan</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>30&#8211;39</pages>
    <url>http://aclweb.org/anthology/W16-4905</url>
<abstract>Word embeddings are now ubiquitous forms of word representation in
	natural language processing.  There have been applications of
	word embeddings for monolingual word sense disambiguation (WSD) in English,
	but few comparisons have been done.  This paper attempts to bridge
	that gap by examining popular embeddings for the task of monolingual
	English WSD.  Our simplified method leads to performance comparable to the
	state of the art without expensive retraining.
	Cross-lingual WSD -- where the word senses of a word in a source
	language come from a separate target translation language --
	can also assist in language learning, for example, by providing
	translations of target vocabulary for learners.  Thus we have also
	applied word embeddings to the novel task of cross-lingual WSD for
	Chinese and provide a public dataset for further benchmarking.
	We have also experimented with using word embeddings in LSTM networks
	and surprisingly found that a basic LSTM network does not work well.
	We discuss the ramifications of this outcome.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kang-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4906">
    <title>Overview of NLP-TEA 2016 Shared Task for Chinese Grammatical Error Diagnosis</title>
    <author><first>Lung-Hao</first><last>Lee</last></author>
    <author><first>Gaoqi</first><last>RAO</last></author>
    <author><first>Liang-Chih</first><last>Yu</last></author>
    <author><first>Endong</first><last>XUN</last></author>
    <author><first>Baolin</first><last>Zhang</last></author>
    <author><first>Li-Ping</first><last>Chang</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>40&#8211;48</pages>
    <url>http://aclweb.org/anthology/W16-4906</url>
<abstract>This paper presents the NLP-TEA 2016 shared task for Chinese grammatical error
	diagnosis, which seeks to identify grammatical error types and their range of
	occurrence within sentences written by learners of Chinese as a foreign language.
	We describe the task definition, data preparation, performance metrics, and
	evaluation results. Of the 15 teams registered for this shared task, 9 teams
	developed systems and submitted a total of 36 runs. We expect this
	evaluation campaign to lead to the development of more advanced NLP
	techniques for educational applications, especially for Chinese error
	detection. All data sets with gold standards and scoring scripts are made
	publicly available to researchers.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>lee-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4907">
    <title>Chinese Grammatical Error Diagnosis with Long Short-Term Memory Networks</title>
    <author><first>Bo</first><last>Zheng</last></author>
    <author><first>Wanxiang</first><last>Che</last></author>
    <author><first>Jiang</first><last>Guo</last></author>
    <author><first>Ting</first><last>Liu</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>49&#8211;56</pages>
    <url>http://aclweb.org/anthology/W16-4907</url>
<abstract>Grammatical error diagnosis is an important task in natural language
	processing. This paper introduces our Chinese Grammatical Error Diagnosis
	(CGED) system for the NLP-TEA-3 shared task on CGED. The system can
	diagnose four types of grammatical errors: redundant words (R),
	missing words (M), bad word selection (S) and disordered words (W). We treat
	the CGED task as a sequence labeling task and describe three models: a
	CRF-based model, an LSTM-based model and an ensemble model using stacking. We
	also show in detail how we build and train the models. Evaluation covers
	three levels: detection level, identification level and position
	level. On the CGED-HSK dataset of the NLP-TEA-3 shared task, our system achieves
	the best F1-scores at all three levels and also the best recall at the last
	two levels.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>zheng-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4908">
    <title>Automatic Grammatical Error Detection for Chinese based on Conditional Random Field</title>
    <author><first>Yajun</first><last>Liu</last></author>
    <author><first>Yingjie</first><last>Han</last></author>
    <author><first>Liyan</first><last>Zhuo</last></author>
    <author><first>Hongying</first><last>Zan</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>57&#8211;62</pages>
    <url>http://aclweb.org/anthology/W16-4908</url>
<abstract>In the process of learning and using Chinese, foreigners may make grammatical
	errors due to negative transfer from their native languages. Currently,
	computer-oriented automatic detection of grammatical errors is not mature
	enough. Based on the CGED2016 evaluation task, we select and analyze a
	classification model and design a feature extraction method to detect the
	grammatical errors Missing (M), Disorder (W), Selection (S) and
	Redundant (R) automatically. Experimental results on the dynamic
	HSK corpus show that our Chinese grammatical error automatic detection
	method, which uses a CRF as the classification model and n-grams for feature
	extraction, is simple and efficient. It has a positive effect on research
	into automatic Chinese grammatical error detection and also plays a supporting
	and guiding role in the teaching of Chinese as a foreign language.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>liu-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4909">
    <title>CYUT-III System at Chinese Grammatical Error Diagnosis Task</title>
    <author><first>CHEN</first><last>PO-LIN</last></author>
    <author><first>Shih-Hung</first><last>Wu</last></author>
    <author><first>Liang-Pu</first><last>Chen</last></author>
    <author><first>ping-che</first><last>yang</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>63&#8211;72</pages>
    <url>http://aclweb.org/anthology/W16-4909</url>
<abstract>This paper describes the CYUT-III system for grammatical error detection in the 2016
	NLP-TEA Chinese Grammatical Error Detection shared task (CGED). In this task a system
	has to detect four types of errors: redundant word errors, missing
	word errors, word selection errors and word ordering errors. Based on the
	conditional random field (CRF) model, our system is a linear tagger that can
	detect the errors in learners’ essays. Since system performance depends
	heavily on the features, in this paper we report how to integrate
	a collocation feature into the CRF model. Our system achieves the best
	detection accuracy and identification accuracy on the TOCFL dataset, which is
	in traditional Chinese. The same system also works well on the simplified
	Chinese HSK dataset.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>polin-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4910">
    <title>Word Order Sensitive Embedding Features/Conditional Random Field-based Chinese Grammatical Error Detection</title>
    <author><first>Wei-Chieh</first><last>Chou</last></author>
    <author><first>Chin-Kui</first><last>Lin</last></author>
    <author><first>Yuan-Fu</first><last>Liao</last></author>
    <author><first>Yih-Ru</first><last>Wang</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>73&#8211;81</pages>
    <url>http://aclweb.org/anthology/W16-4910</url>
<abstract>This paper discusses how to adapt two new word embedding features to build a
	more efficient Chinese Grammatical Error Diagnosis (CGED) system to assist
	Chinese foreign learners (CFLs) in improving their written essays. The main
	idea is to apply word order sensitive Word2Vec approaches, including the (1)
	structured skip-gram and (2) continuous window (CWindow) models, because they
	are more suitable for solving syntax-based problems. The proposed new features
	were evaluated on the Test of Chinese as a Foreign Language (TOCFL) learner
	database provided by the NLP-TEA-3 CGED shared task. Experimental results showed
	that the new features work better than the traditional word order
	insensitive Word2Vec approaches. Moreover, according to the official evaluation
	results, our system achieved the lowest false positive rate (FA, 0.1362) and the
	highest precision in all three measurements.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>chou-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4911">
    <title>A Fluctuation Smoothing Approach for Unsupervised Automatic Short Answer Grading</title>
    <author><first>Shourya</first><last>Roy</last></author>
    <author><first>Sandipan</first><last>Dandapat</last></author>
    <author><first>Y.</first><last>Narahari</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>82&#8211;91</pages>
    <url>http://aclweb.org/anthology/W16-4911</url>
<abstract>We offer a fluctuation smoothing computational approach for unsupervised
	automatic short answer grading (ASAG) techniques in the educational ecosystem.
	A major drawback of existing techniques is the significant effect that
	variations in model answers can have on their performance. The proposed
	fluctuation smoothing approach, based on classical sequential pattern mining,
	exploits lexical overlap in students’ answers to any typical question. We
	empirically demonstrate using multiple datasets that the proposed approach
	improves the overall performance and significantly reduces (by up to 63%) the
	variation in performance (standard deviation) of unsupervised ASAG techniques.
	We bring in additional benchmarks, such as (a) paraphrasing of model answers
	and (b) using answers by the k top performing students as model answers, to
	amplify the benefits of the proposed approach.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>roy-dandapat-narahari:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4912">
    <title>Japanese Lexical Simplification for Non-Native Speakers</title>
    <author><first>Muhaimin</first><last>Hading</last></author>
    <author><first>Yuji</first><last>Matsumoto</last></author>
    <author><first>Maki</first><last>Sakamoto</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>92&#8211;96</pages>
    <url>http://aclweb.org/anthology/W16-4912</url>
<abstract>This paper introduces Japanese lexical simplification, the task of replacing
	difficult words in a given sentence to produce a new sentence with simple
	words without changing the original meaning of the sentence. We propose a
	method of supervised regression learning to estimate the difficulty ordering
	of words with statistical features obtained from two types of Japanese
	corpora. For the similarity of words, we use a Japanese thesaurus and
	dependency-based word embeddings. Evaluation of the proposed method is
	performed by comparing the difficulty ordering of the words.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hading-matsumoto-sakamoto:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4913">
    <title>A Corpus-based Approach for Spanish-Chinese Language Learning</title>
    <author><first>Shuyuan</first><last>Cao</last></author>
    <author><first>Iria</first><last>da Cunha</last></author>
    <author><first>Mikel</first><last>Iruskieta</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>97&#8211;106</pages>
    <url>http://aclweb.org/anthology/W16-4913</url>
<abstract>Because of the huge populations that speak Spanish and Chinese, these languages
	occupy an important position in language learning studies. Although there
	are some automatic translation systems that benefit the learning of both
	languages, there is still room to create resources that help language
	learners. As a quick and effective resource that can provide a large amount of
	language information, corpus-based learning is becoming more and more popular.
	In this paper we enrich a Spanish-Chinese parallel corpus automatically with
	part-of-speech (POS) information and manually with discourse segmentation
	(following Rhetorical Structure Theory (RST) (Mann and Thompson, 1988)). Two
	search tools allow Spanish-Chinese language learners to carry out different
	queries based on tokens and lemmas. The parallel corpus and the research tools
	are available to the academic community. We propose some examples to illustrate
	how learners can use the corpus to learn Spanish and Chinese.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>cao-dacunha-iruskieta:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4914">
    <title>Syntactic Well-Formedness Diagnosis and Error-Based Coaching in Computer Assisted Language Learning using Machine Translation</title>
    <author><first>Lu&#237;s</first><last>Morgado da Costa</last></author>
    <author><first>Francis</first><last>Bond</last></author>
    <author><first>Xiaoling</first><last>He</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>107&#8211;116</pages>
    <url>http://aclweb.org/anthology/W16-4914</url>
    <abstract>We present a novel approach to Computer Assisted Language Learning (CALL),
	using deep syntactic parsers and semantic based machine translation (MT) in
	diagnosing and providing explicit feedback on language learners’ errors. We
	are currently developing a proof-of-concept system showing how semantic-based
	machine translation can, in conjunction with robust computational grammars, be
	used to interact with students, better understand their language errors, and
	help students correct their grammar through a series of useful feedback
	messages and guided language drills. Ultimately, we aim to prove the viability
	of a new integrated rule-based MT approach to disambiguate students’ intended
	meaning in a CALL system. This is a necessary step to provide accurate coaching
	on how to correct ungrammatical input, and it will allow us to overcome a
	current bottleneck in the field: an exponential burst of ambiguity caused
	by ambiguous lexical items (Flickinger, 2010). From the users’ interaction
	with the system, we will also produce a richly annotated Learner Corpus,
	annotated automatically with both syntactic and semantic information.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>morgadodacosta-bond-he:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4915">
    <title>An Aligned French-Chinese corpus of 10K segments from university educational material</title>
    <author><first>Ruslan</first><last>Kalitvianski</last></author>
    <author><first>Lingxiao</first><last>Wang</last></author>
    <author><first>Val&#233;rie</first><last>Bellynck</last></author>
    <author><first>Christian</first><last>Boitet</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>117&#8211;121</pages>
    <url>http://aclweb.org/anthology/W16-4915</url>
    <abstract>This paper describes a corpus of nearly 10K French-Chinese aligned segments,
	produced by post-editing machine translated computer science courseware. This
	corpus was built from 2013 to 2016 within the PROJECT\_NAME project, by native
	Chinese students. The quality, as judged by native speakers, is adequate for
	understanding (far better than by reading only the original French) and for
	getting better marks. This corpus is annotated at segment-level by a
	self-assessed quality score. It has been directly used as supplemental training
	data to build a statistical machine translation system dedicated to that
	sublanguage, and can be used to extract the specific bilingual terminology. To
	our knowledge, it is the first corpus of this kind to be released.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kalitvianski-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4916">
    <title>Analysis of Foreign Language Teaching Methods: An Automatic Readability Approach</title>
    <author><first>Nasser</first><last>Zalmout</last></author>
    <author><first>Hind</first><last>Saddiki</last></author>
    <author><first>Nizar</first><last>Habash</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>122&#8211;130</pages>
    <url>http://aclweb.org/anthology/W16-4916</url>
    <abstract>Much research in education has been done on the study of different language
	teaching methods. However, there has been little investigation using
	computational analysis to compare such methods in terms of readability or
	complexity progression. In this paper, we make use of existing readability
	scoring techniques and our own classifiers to analyze the textbooks used in two
	very different teaching methods for English as a Second Language -- the
	grammar-based and the communicative methods. Our analysis indicates that the
	grammar-based curriculum shows a more coherent readability progression compared
	to the communicative curriculum. This finding corroborates the
	expectations about the differences between these two methods and validates our
	approach’s value in comparing different teaching methods quantitatively.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>zalmout-saddiki-habash:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4917">
    <title>Generating and Scoring Correction Candidates in Chinese Grammatical Error Diagnosis</title>
    <author><first>Shao-Heng</first><last>Chen</last></author>
    <author><first>Yu-Lin</first><last>Tsai</last></author>
    <author><first>Chuan-Jie</first><last>Lin</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>131&#8211;139</pages>
    <url>http://aclweb.org/anthology/W16-4917</url>
    <abstract>Grammatical error diagnosis is an essential part in a language-learning
	tutoring system.  Based on the data sets of Chinese grammatical error detection
	tasks, we proposed a system which measures the likelihood of correction
	candidates generated by deleting or inserting characters or words, moving
	substrings to different positions, substituting prepositions with other
	prepositions, or substituting words with their synonyms or similar strings. 
	Sentence likelihood is measured based on the frequencies of substrings from the
	space-removed version of Google n-grams.  The evaluation on the training set
	shows that Missing-related and Selection-related candidate generation methods
	have promising performance.  Our final system achieved a precision of 30.28%
	and a recall of 62.85% in the identification level evaluated on the test set.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>chen-tsai-lin:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4918">
    <title>Grammatical Error Detection Based on Machine Learning for Mandarin as Second Language Learning</title>
    <author><first>Jui-Feng</first><last>Yeh</last></author>
    <author><first>Tsung-Wei</first><last>Hsu</last></author>
    <author><first>Chan-Kun</first><last>Yeh</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>140&#8211;147</pages>
    <url>http://aclweb.org/anthology/W16-4918</url>
<abstract>Mandarin is not a simple language for foreigners. Even those using Mandarin as
	their mother tongue had to spend much time learning it as children. The
	following issues are the reasons for the learning problems. First, Mandarin
	words evolved from hieroglyphic characters, so a character can express a
	meaning independently, yet combined into a word it can take on another
	semantics. Second, Mandarin grammar has flexible rules and special usages.
	The common grammatical errors can therefore be classified as missing,
	redundant, selection and disorder. In this paper, we propose a structure based
	on Recurrent Neural Networks with Long Short-Term Memory (RNN-LSTM) that can
	detect the error types in foreign learners' writing. The features are based on
	word vectors and part-of-speech vectors. On the test data, we found that our
	method obtains better recall at the detection level than the others, as high
	as 0.9755, because it allows a greater range of choices when detecting errors.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>yeh-hsu-yeh:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4919">
    <title>Bi-LSTM Neural Networks for Chinese Grammatical Error Diagnosis</title>
    <author><first>Shen</first><last>Huang</last></author>
    <author><first>Houfeng</first><last>WANG</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>148&#8211;154</pages>
    <url>http://aclweb.org/anthology/W16-4919</url>
<abstract>Grammatical error diagnosis for Chinese has always been a challenge for both
	foreign learners and NLP researchers, owing to the variety of its grammar and
	the flexibility of expression. In this paper, we present a model based on
	Bidirectional Long Short-Term Memory (Bi-LSTM) neural networks, which treats the
	task as a sequence labeling problem, so as to detect Chinese grammatical
	errors, identify the error types and locate the error positions. In the
	corpora of this year's shared task, there can be multiple errors at a single
	offset of a sentence; to address this, we simultaneously train three Bi-LSTM
	models sharing word embeddings, which label Missing, Redundant and Selection
	errors respectively. We regard word ordering errors as a special kind of word
	selection error that is longer during the training phase, and then separate
	them by length during the testing phase.
	  In the NLP-TEA 3 shared task for Chinese Grammatical Error Diagnosis (CGED), our
	system achieved relatively high F1 at all three levels in the traditional
	Chinese track and at the detection level in the simplified Chinese track.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>huang-wang:2016:NLPTEA2016</bibkey>
  </paper>

  <paper id="4920">
    <title>Chinese Grammatical Error Diagnosis Using Single Word Embedding</title>
    <author><first>Jinnan</first><last>Yang</last></author>
    <author><first>Bo</first><last>Peng</last></author>
    <author><first>Jin</first><last>Wang</last></author>
    <author><first>Jixian</first><last>Zhang</last></author>
    <author><first>Xuejie</first><last>Zhang</last></author>
    <booktitle>Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)</booktitle>
    <month>December</month>
    <year>2016</year>
    <address>Osaka, Japan</address>
    <publisher>The COLING 2016 Organizing Committee</publisher>
    <pages>155&#8211;161</pages>
    <url>http://aclweb.org/anthology/W16-4920</url>
<abstract>Automatic grammatical error detection for Chinese has been a big challenge for
	NLP researchers. Due to the formal and strict grammar rules of Chinese, it is
	hard for foreign students to master the language. A computer-assisted learning
	tool which can automatically detect and correct Chinese grammatical errors is
	necessary for those foreign students. Some previous works have sought to
	identify Chinese grammatical errors using template- and learning-based methods.
	In contrast, this study introduces a convolutional neural network (CNN) and
	long short-term memory (LSTM) for the shared task of Chinese Grammatical Error
	Diagnosis (CGED). Different from traditional word-based embeddings, single-word
	embeddings were used as input to the CNN and LSTM. The proposed single-word
	embedding can capture both semantic and syntactic information to detect the
	four types of grammatical errors. In the experimental evaluation, the recall
	and F1-score of our submitted Run1 on the TOCFL test data ranked fourth among
	all submissions at the detection level.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>yang-EtAl:2016:NLPTEA2016</bibkey>
  </paper>

</volume>

