<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="2400">
    <title>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</title>
    <editor>Martin Riedl</editor>
    <editor>Swapna Somasundaran</editor>
    <editor>Goran Glavaš</editor>
    <editor>Eduard Hovy</editor>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://www.aclweb.org/anthology/W17-24</url>
    <bibtype>book</bibtype>
    <bibkey>TextGraphs-11:2017</bibkey>
  </paper>

  <paper id="2401">
    <title>On the "Calligraphy" of Books</title>
    <author><first>Vanessa Queiroz</first><last>Marinho</last></author>
    <author><first>Henrique Ferraz</first><last>de Arruda</last></author>
    <author><first>Thales</first><last>Sinelli</last></author>
    <author><first>Luciano da Fontoura</first><last>Costa</last></author>
    <author><first>Diego Raphael</first><last>Amancio</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;10</pages>
    <url>http://www.aclweb.org/anthology/W17-2401</url>
    <abstract>Authorship attribution is a natural language processing task that has been
	widely studied, often by considering small order statistics. In this paper, we
	explore a complex network approach to assign the authorship of texts based on
	their mesoscopic representation, in an attempt to capture the flow of the
	narrative.  Indeed, as reported in this work, such an approach allowed the
	identification of the dominant narrative structure of the studied authors.
	This has been achieved thanks to the ability of the mesoscopic approach to
	take into account relationships between different, not necessarily adjacent,
	parts of the text, thereby capturing the story flow. The potential of the
	proposed approach has been illustrated through principal component analysis, a
	comparison with the chance baseline method, and network visualization. Such
	visualizations reveal individual characteristics of the authors, which can be
	understood as a kind of calligraphy.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>marinho-EtAl:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2402">
    <title>Adapting predominant and novel sense discovery algorithms for identifying corpus-specific sense differences</title>
    <author><first>Binny</first><last>Mathew</last></author>
    <author><first>Suman Kalyan</first><last>Maity</last></author>
    <author><first>Pratip</first><last>Sarkar</last></author>
    <author><first>Animesh</first><last>Mukherjee</last></author>
    <author><first>Pawan</first><last>Goyal</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>11&#8211;20</pages>
    <url>http://www.aclweb.org/anthology/W17-2402</url>
    <abstract>Word senses are not static and may have temporal, spatial or corpus-specific
	scopes. Identifying such scopes could substantially benefit existing WSD systems.
	In this paper, while studying corpus specific word senses, we adapt three
	existing predominant and novel-sense discovery algorithms to identify these
	corpus-specific senses. We make use of text data available in the form of
	millions of digitized books and newspaper archives as two different sources of
	corpora and propose automated methods to identify corpus-specific word senses
	at various time points. We conduct an extensive and thorough human judgement
	experiment to rigorously evaluate and compare the performance of these
	approaches. Post adaptation, the outputs of the three algorithms are in the same
	format and the accuracy results are also comparable, with roughly 45-60% of the
	reported corpus-specific senses being judged as genuine.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>mathew-EtAl:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2403">
    <title>Merging knowledge bases in different languages</title>
    <author><first>Jer&#243;nimo</first><last>Hern&#225;ndez-Gonz&#225;lez</last></author>
    <author><first>Estevam R.</first><last>Hruschka Jr.</last></author>
    <author><first>Tom M.</first><last>Mitchell</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>21&#8211;29</pages>
    <url>http://www.aclweb.org/anthology/W17-2403</url>
    <abstract>Recently, different systems which learn to populate and extend a knowledge base
	(KB) from the web in different languages have been presented. Although a large
	set of concepts should be learnt independently of the language used to read,
	there are facts which are expected to be more easily gathered in the local language
	(e.g., culture or geography). A system that merges KBs learnt in different
	languages will benefit from the complementary information as long as common
	beliefs are identified, as well as from redundancy present in web pages written
	in different languages. In this paper, we deal with the problem of identifying
	equivalent beliefs (or concepts) across language specific KBs, assuming that
	they share the same ontology of categories and relations. In a case study with
	two KBs independently learnt from different inputs, namely web pages written in
	English and web pages written in Portuguese respectively, we report on the
	results of two methodologies: an approach based on personalized PageRank and an
	inference technique to find out common relevant paths through the KBs. The
	proposed inference technique efficiently identifies relevant paths,
	outperforming the baseline (a dictionary-based classifier) in the vast majority
	of tested categories.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hernandezgonzalez-hruschkajr-mitchell:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2404">
    <title>Parameter Free Hierarchical Graph-Based Clustering for Analyzing Continuous Word Embeddings</title>
    <author><first>Thomas Alexander</first><last>Trost</last></author>
    <author><first>Dietrich</first><last>Klakow</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>30&#8211;38</pages>
    <url>http://www.aclweb.org/anthology/W17-2404</url>
    <abstract>Word embeddings are high-dimensional vector representations of words and are
	thus difficult to interpret. In order to deal with this, we introduce an
	unsupervised, parameter-free method for creating a hierarchical graphical
	clustering of the full ensemble of word vectors and show that this structure is
	a geometrically meaningful representation of the original relations between the
	words. This newly obtained representation can be used to better understand,
	and thus improve, the embedding algorithm. Since it also exhibits semantic
	meaning, it can be utilized in a variety of language processing tasks such as
	categorization or measuring similarity.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>trost-klakow:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2405">
    <title>Spectral Graph-Based Method of Multimodal Word Embedding</title>
    <author><first>Kazuki</first><last>Fukui</last></author>
    <author><first>Takamasa</first><last>Oshikiri</last></author>
    <author><first>Hidetoshi</first><last>Shimodaira</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>39&#8211;44</pages>
    <url>http://www.aclweb.org/anthology/W17-2405</url>
    <abstract>In this paper, we propose a novel method for multimodal word embedding, which
	exploits a generalized framework of multi-view spectral graph embedding to take
	into account visual appearances or scenes denoted by words in a corpus.  
	We evaluated our method through word similarity tasks and
	a concept-to-image search task, and found that it provides word
	representations that reflect visual information, while somewhat trading off
	performance on the word similarity tasks. Moreover, we demonstrate that our
	method captures multimodal linguistic regularities, which enable the recovery of
	relational similarities between words and images by vector arithmetic.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>fukui-oshikiri-shimodaira:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2406">
    <title>Graph Methods for Multilingual FrameNets</title>
    <author><first>Collin</first><last>Baker</last></author>
    <author><first>Michael</first><last>Ellsworth</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>45&#8211;50</pages>
    <url>http://www.aclweb.org/anthology/W17-2406</url>
    <attachment type="presentation">W17-2406.Presentation.pdf</attachment>
    <abstract>This paper introduces a new, graph-based view of the data of the FrameNet
	project, which we hope will make it easier to understand the mixture of
	semantic and syntactic information contained in FrameNet annotation.  We show
	how English FrameNet and other Frame Semantic resources can be represented as
	sets of interconnected graphs of frames, frame elements, semantic types, and
	annotated instances of them in text.  We display examples of the new graphical
	representation based on the annotations, which combine Frame Semantics and
	Construction Grammar, thus capturing most of the syntax and semantics of each
	sentence.  We consider how graph theory could help researchers to make better
	use of FrameNet data for tasks such as automatic Frame Semantic role labeling,
	paraphrasing, and translation.  Finally, we describe the development of
	FrameNet-like lexical resources for other languages in the current Multilingual
	FrameNet project, which seeks to discover cross-lingual alignments, both in
	the lexicon (for frames and lexical units within frames) and across parallel or
	comparable texts.  We conclude with an example showing graphically the semantic
	and syntactic similarities and differences between parallel sentences in
	English and Japanese.  We will release software for displaying such graphs from
	the current data releases.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>baker-ellsworth:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2407">
    <title>Extract with Order for Coherent Multi-Document Summarization</title>
    <author><first>Mir Tafseer</first><last>Nayeem</last></author>
    <author><first>Yllias</first><last>Chali</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>51&#8211;56</pages>
    <url>http://www.aclweb.org/anthology/W17-2407</url>
    <abstract>In this work, we aim at developing an extractive summarizer in the
	multi-document setting. We implement a rank based sentence selection using
	continuous vector representations along with key-phrases. Furthermore, we
	propose a model to tackle summary coherence for increasing readability.  We
	conduct experiments on the Document Understanding Conference (DUC) 2004
	datasets using the ROUGE toolkit. Our experiments demonstrate that the methods
	bring significant improvements over state-of-the-art methods in terms of
	informativity and coherence.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>nayeem-chali:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2408">
    <title>Work Hard, Play Hard: Email Classification on the Avocado and Enron Corpora</title>
    <author><first>Sakhar</first><last>Alkhereyf</last></author>
    <author><first>Owen</first><last>Rambow</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>57&#8211;65</pages>
    <url>http://www.aclweb.org/anthology/W17-2408</url>
    <abstract>In this paper, we present an empirical study of email classification into
	  two main categories &#x201c;Business&#x201d; and &#x201c;Personal&#x201d;.  We train on the Enron
	  email corpus, and test on the Enron and Avocado email corpora. We show
	  that information from the email exchange networks improves the
	  performance of classification. We represent the email exchange networks
	  as social networks with
	  graph structures. For this classification task, we extract social
	  network features from the graphs in addition to lexical features from
	  email content and we compare the performance of SVM and Extra-Trees
	  classifiers using these features.  Combining graph features with lexical
	  features improves the performance on both classifiers. We also provide
	  manually annotated sets of the Avocado and Enron email corpora as
	  a supplementary contribution.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>alkhereyf-rambow:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2409">
    <title>A Graph Based Semi-Supervised Approach for Analysis of Derivational Nouns in Sanskrit</title>
    <author><first>Amrith</first><last>Krishna</last></author>
    <author><first>Pavankumar</first><last>Satuluri</last></author>
    <author><first>Harshavardhan</first><last>Ponnada</last></author>
    <author><first>Muneeb</first><last>Ahmed</last></author>
    <author><first>Gulab</first><last>Arora</last></author>
    <author><first>Kaustubh</first><last>Hiware</last></author>
    <author><first>Pawan</first><last>Goyal</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>66&#8211;75</pages>
    <url>http://www.aclweb.org/anthology/W17-2409</url>
    <abstract>Derivational nouns are widely used in Sanskrit corpora and represent an
	important cornerstone of productivity in the language. Currently there exists
	no analyser that identifies the derivational nouns. We propose a
	semi-supervised approach for identification of derivational nouns in Sanskrit. We
	not only identify the derivational words, but also link them to their
	corresponding source words. Our novelty comes in the design of the network
	structure for the task. The edge weights are featurised based on the phonetic,
	morphological, syntactic and the semantic similarity shared between the words
	to be identified. We find that our model is effective for the task, even when
	we employ a labelled dataset that is only 5% of the size of the entire dataset.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>krishna-EtAl:2017:TextGraphs-11</bibkey>
  </paper>

  <paper id="2410">
    <title>Evaluating text coherence based on semantic similarity graph</title>
    <author><first>Jan Wira Gotama</first><last>Putra</last></author>
    <author><first>Takenobu</first><last>Tokunaga</last></author>
    <booktitle>Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing</booktitle>
    <month>August</month>
    <year>2017</year>
    <address>Vancouver, Canada</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>76&#8211;85</pages>
    <url>http://www.aclweb.org/anthology/W17-2410</url>
    <attachment type="presentation">W17-2410.Presentation.pdf</attachment>
    <abstract>Coherence is a crucial feature of text because it is indispensable for
	conveying its communication purpose and meaning to its readers. In this paper,
	we propose an unsupervised text coherence scoring based on graph construction
	in which edges are established between semantically similar sentences
	represented by vertices. The sentence similarity is calculated based on the
	cosine similarity of semantic vectors representing sentences. We provide three
	graph construction methods establishing an edge from a given vertex to a
	preceding adjacent vertex, to a single similar vertex, or to multiple similar
	vertices. We evaluated our methods in the document discrimination task and the
	insertion task by comparing our proposed methods to the supervised (Entity
	Grid) and unsupervised (Entity Graph) baselines. In the document discrimination
	task, our method outperformed the unsupervised baseline but not the supervised
	baseline, while in the insertion task, our method outperformed both
	baselines.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>putra-tokunaga:2017:TextGraphs-11</bibkey>
  </paper>

</volume>

