<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="6000">
    <title>Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing</title>
    <editor>Yue Zhang</editor>
    <editor>Zhifang Sui</editor>
    <month>December</month>
    <year>2017</year>
    <address>Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://www.aclweb.org/anthology/W17-60</url>
    <bibtype>book</bibtype>
    <bibkey>SIGHAN-9:2017</bibkey>
  </paper>

  <paper id="6001">
    <title>Group Linguistic Bias Aware Neural Response Generation</title>
    <author><first>Jianan</first><last>Wang</last></author>
    <author><first>Xin</first><last>Wang</last></author>
    <author><first>Fang</first><last>Li</last></author>
    <author><first>Zhen</first><last>Xu</last></author>
    <author><first>Zhuoran</first><last>Wang</last></author>
    <author><first>Baoxun</first><last>Wang</last></author>
    <booktitle>Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing</booktitle>
    <month>December</month>
    <year>2017</year>
    <address>Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;10</pages>
    <url>http://www.aclweb.org/anthology/W17-6001</url>
    <abstract>For practical chatbots, one of the essential factors for improving user
	experience is the capability of customizing the talking style of the agents,
	that is, making chatbots provide responses that meet users' preferences on
	language style, topics, etc. To address this issue, this paper proposes to
	incorporate linguistic biases, which are implicitly involved in the conversation
	corpora generated by human groups in Social Network Services (SNS), into
	the encoder-decoder based response generator. By attaching a specially designed
	neural component that dynamically controls the impact of linguistic biases in
	response generation, a Group Linguistic Bias Aware Neural Response Generation
	(GLBA-NRG) model is eventually presented. The experimental results on the
	dataset from a Chinese SNS show that the proposed architecture outperforms
	current response generation models by producing meaningful and vivid
	responses with customized styles.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wang-EtAl:2017:SIGHAN-9</bibkey>
  </paper>

  <paper id="6002">
    <title>Neural Regularized Domain Adaptation for Chinese Word Segmentation</title>
    <author><first>Zuyi</first><last>Bao</last></author>
    <author><first>Si</first><last>Li</last></author>
    <author><first>Weiran</first><last>XU</last></author>
    <author><first>Sheng</first><last>GAO</last></author>
    <booktitle>Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing</booktitle>
    <month>December</month>
    <year>2017</year>
    <address>Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>11&#8211;20</pages>
    <url>http://www.aclweb.org/anthology/W17-6002</url>
    <abstract>For Chinese word segmentation, large-scale annotated corpora mainly cover
	newswire, and only a handful of annotated data is available in other domains
	such as patents and literature. Given the limited amount of annotated
	target-domain data, it is a challenge for segmenters to learn domain-specific
	information while avoiding over-fitting. In this paper, we
	propose a neural regularized domain adaptation method for Chinese word
	segmentation. Teacher networks trained in the source domain are employed to
	regularize the training of the student network by preserving
	general knowledge. In our experiments, the neural regularized domain adaptation
	method achieves better performance than previous methods.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>bao-EtAl:2017:SIGHAN-9</bibkey>
  </paper>

  <paper id="6003">
    <title>The Sentimental Value of Chinese Sub-Character Components</title>
    <author><first>Yassine</first><last>Benajiba</last></author>
    <author><first>Or</first><last>Biran</last></author>
    <author><first>Zhiliang</first><last>Weng</last></author>
    <author><first>Yong</first><last>Zhang</last></author>
    <author><first>Jin</first><last>Sun</last></author>
    <booktitle>Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing</booktitle>
    <month>December</month>
    <year>2017</year>
    <address>Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>21&#8211;29</pages>
    <url>http://www.aclweb.org/anthology/W17-6003</url>
    <abstract>Sub-character components of Chinese characters carry important semantic
	information, and recent studies have shown that utilizing this information can
	improve performance on core semantic tasks. In this paper, we hypothesize that
	in addition to semantic information, sub-character components may also carry
	emotional information, and that utilizing it should improve performance on
	sentiment analysis tasks. We conduct a series of experiments on four Chinese
	sentiment data sets and show that we can significantly improve performance
	on various tasks over a character-level embedding baseline. We then
	qualitatively assess multiple examples, explaining how
	the sub-character components affect the results in each case.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>benajiba-EtAl:2017:SIGHAN-9</bibkey>
  </paper>

  <paper id="6004">
    <title>Chinese Answer Extraction Based on POS Tree and Genetic Algorithm</title>
    <author><first>Shuihua</first><last>Li</last></author>
    <author><first>Xiaoming</first><last>Zhang</last></author>
    <author><first>Zhoujun</first><last>Li</last></author>
    <booktitle>Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing</booktitle>
    <month>December</month>
    <year>2017</year>
    <address>Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>30&#8211;36</pages>
    <url>http://www.aclweb.org/anthology/W17-6004</url>
    <abstract>Answer extraction is the most important part of a Chinese web-based question
	answering system. In order to enhance the robustness and adaptability of answer
	extraction to new domains and eliminate the influence of incomplete and
	noisy search snippets, we propose two new answer extraction methods. We utilize
	text patterns to generate Part-of-Speech (POS) patterns. In addition, a method
	is proposed to construct a POS tree from these POS patterns. The POS tree
	is useful for candidate answer extraction in web-based question answering. To
	retrieve an efficient POS tree, the similarities between questions are used to
	select the question-answer pairs whose questions are similar to the unanswered
	question. Then, the POS tree is improved based on these question-answer pairs.
	To rank the candidate answers, the weights of the leaf nodes of the
	POS tree are calculated using a heuristic method. Moreover, a Genetic
	Algorithm (GA) is used to train the weights. The experimental results of
	10-fold cross-validation show that the weighted POS tree trained by GA can
	improve the accuracy of answer extraction.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>li-zhang-li:2017:SIGHAN-9</bibkey>
  </paper>

  <paper id="6005">
    <title>Learning from Parenthetical Sentences for Term Translation in Machine Translation</title>
    <author><first>Guoping</first><last>Huang</last></author>
    <author><first>Jiajun</first><last>Zhang</last></author>
    <author><first>Yu</first><last>Zhou</last></author>
    <author><first>Chengqing</first><last>Zong</last></author>
    <booktitle>Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing</booktitle>
    <month>December</month>
    <year>2017</year>
    <address>Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>37&#8211;45</pages>
    <url>http://www.aclweb.org/anthology/W17-6005</url>
    <abstract>Terms are pervasive in specific domains, and term translation plays a
	critical role in domain-specific machine translation (MT) tasks. However,
	translating terms correctly is challenging, given the huge number of
	pre-existing terms and the endless stream of new terms. To achieve better term
	translation quality, it is necessary to inject external term knowledge into the
	underlying MT system. Fortunately, there is plenty of term translation
	knowledge in parenthetical sentences on the Internet. In this paper, we propose
	a simple, straightforward and effective framework to improve term translation
	by learning from parenthetical sentences. The framework includes: (1) a
	focused web crawler; (2) a parenthetical sentence filter, which acquires
	parenthetical sentences containing bilingual term pairs; (3) a term translation
	knowledge extractor, which extracts bilingual term translation candidates; and
	(4) a probability learner, which generates the term translation table for MT
	decoders. Extensive experiments demonstrate that the proposed framework
	significantly improves the translation quality of both terms and sentences.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>huang-EtAl:2017:SIGHAN-9</bibkey>
  </paper>

</volume>

