<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="1700">
    <title>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</title>
    <editor>Stella Markantonatou</editor>
    <editor>Carlos Ramisch</editor>
    <editor>Agata Savary</editor>
    <editor>Veronika Vincze</editor>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://www.aclweb.org/anthology/W17-17</url>
    <bibtype>book</bibtype>
    <bibkey>MWE2017:2017</bibkey>
  </paper>

  <paper id="1701">
    <title>ParaDi: Dictionary of Paraphrases of Czech Complex Predicates with Light Verbs</title>
    <author><first>Petra</first><last>Barancikova</last></author>
    <author><first>V&#225;clava</first><last>Kettnerov&#225;</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;10</pages>
    <url>http://www.aclweb.org/anthology/W17-1701</url>
    <abstract>We present a new freely available dictionary of paraphrases of Czech complex
	predicates with light verbs, ParaDi. Candidates for single predicative
	paraphrases of selected complex predicates have been extracted automatically
	from large monolingual data using word2vec. They have been manually verified
	and further refined. We demonstrate one of many possible applications of ParaDi
	in an experiment with improving machine translation quality.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>barancikova-kettnerova:2017:MWE2017</bibkey>
  </paper>

  <paper id="1702">
    <title>Multi-word Entity Classification in a Highly Multilingual Environment</title>
    <author><first>Sophie</first><last>Chesney</last></author>
    <author><first>Guillaume</first><last>Jacquet</last></author>
    <author><first>Ralf</first><last>Steinberger</last></author>
    <author><first>Jakub</first><last>Piskorski</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>11&#8211;20</pages>
    <url>http://www.aclweb.org/anthology/W17-1702</url>
    <abstract>This paper describes an approach for the classification of millions of existing
	multi-word entities (MWEntities), such as organisation or event names, into
	thirteen category types, based only on the tokens they contain. 
	In order to classify our very large in-house collection of multilingual
	MWEntities into an application-oriented set of entity categories, we trained
	and tested distantly-supervised classifiers in 43 languages based on MWEntities
	extracted from BabelNet. The best-performing classifier was the multi-class SVM
	using a TF.IDF-weighted data representation. Interestingly, a single
	classifier trained on a mix of all languages consistently performed better than
	classifiers trained for individual languages, reaching an averaged F1-value of
	88.8%. In this paper, we present the training and test data, including a human
	evaluation of its accuracy, describe the methods used to train the classifiers,
	and discuss the results.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>chesney-EtAl:2017:MWE2017</bibkey>
  </paper>

  <paper id="1703">
    <title>Using bilingual word-embeddings for multilingual collocation extraction</title>
    <author><first>Marcos</first><last>Garcia</last></author>
    <author><first>Marcos</first><last>Garc&#237;a-Salido</last></author>
    <author><first>Margarita</first><last>Alonso-Ramos</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>21&#8211;30</pages>
    <url>http://www.aclweb.org/anthology/W17-1703</url>
    <abstract>This paper presents a new strategy for multilingual collocation extraction
	which takes advantage of parallel corpora to learn bilingual word-embeddings.
	Monolingual collocation candidates are retrieved using Universal Dependencies,
	while the distributional models are then applied to search for equivalents of
	the elements of each collocation in the target languages. The proposed method
	extracts not only collocation equivalents with direct translation between
	languages, but also other cases where the collocations in the two languages are
	not literal translations of each other.
	Several experiments (evaluating collocations with three syntactic patterns) in
	English, Spanish, and Portuguese show that our approach can effectively extract
	large pairs of bilingual equivalents with an average precision of about 90%.
	Moreover, preliminary results on comparable corpora suggest that the
	distributional models can be applied for identifying new bilingual collocations
	in different domains.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>garcia-garciasalido-alonsoramos:2017:MWE2017</bibkey>
  </paper>

  <paper id="1704">
    <title>The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions</title>
    <author><first>Agata</first><last>Savary</last></author>
    <author><first>Carlos</first><last>Ramisch</last></author>
    <author><first>Silvio</first><last>Cordeiro</last></author>
    <author><first>Federico</first><last>Sangati</last></author>
    <author><first>Veronika</first><last>Vincze</last></author>
    <author><first>Behrang</first><last>QasemiZadeh</last></author>
    <author><first>Marie</first><last>Candito</last></author>
    <author><first>Fabienne</first><last>Cap</last></author>
    <author><first>Voula</first><last>Giouli</last></author>
    <author><first>Ivelina</first><last>Stoyanova</last></author>
    <author><first>Antoine</first><last>Doucet</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>31&#8211;47</pages>
    <url>http://www.aclweb.org/anthology/W17-1704</url>
    <abstract>Multiword expressions (MWEs) are known as a "pain in the neck" for NLP due to
	their idiosyncratic behaviour. While some categories of MWEs have been
	addressed by many studies, verbal MWEs (VMWEs), such as to take a decision, to
	break one’s heart or to turn off, have been rarely modelled. This is notably
	due to their syntactic variability, which hinders treating them as "words with
	spaces". We describe an initiative meant to bring about substantial progress in
	understanding, modelling and processing VMWEs. It is a joint effort, carried
	out within a European research network, to elaborate universal terminologies
	and annotation guidelines for 18 languages. Its main outcome is a multilingual
	5-million-word annotated corpus which underlies a shared task on automatic
	identification of VMWEs. This paper presents the corpus annotation methodology
	and outcome, the shared task organisation and the results of the participating
	systems.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>savary-EtAl:2017:MWE2017</bibkey>
  </paper>

  <paper id="1705">
    <title>USzeged: Identifying Verbal Multiword Expressions with POS Tagging and Parsing Techniques</title>
    <author><first>Katalin Ilona</first><last>Simk&#243;</last></author>
    <author><first>Vikt&#243;ria</first><last>Kov&#225;cs</last></author>
    <author><first>Veronika</first><last>Vincze</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>48&#8211;53</pages>
    <url>http://www.aclweb.org/anthology/W17-1705</url>
    <abstract>The paper describes our system submitted for the Workshop on Multiword
	Expressions’ shared task on automatic identification of verbal multiword
	expressions. It uses POS tagging and dependency parsing to identify single- and
	multi-token verbal MWEs in text. Our system is language independent and
	competed on nine of the eighteen languages. Our paper describes how our system
	works and gives its error analysis for the languages it was submitted for.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>simko-kovacs-vincze:2017:MWE2017</bibkey>
  </paper>

  <paper id="1706">
    <title>Parsing and MWE Detection: Fips at the PARSEME Shared Task</title>
    <author><first>Luka</first><last>Nerima</last></author>
    <author><first>Vasiliki</first><last>Foufi</last></author>
    <author><first>Eric</first><last>Wehrli</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>54&#8211;59</pages>
    <url>http://www.aclweb.org/anthology/W17-1706</url>
    <abstract>Identifying multiword expressions (MWEs) in a sentence in order to ensure their
	proper processing in subsequent applications, like machine translation, and
	performing the syntactic analysis of the sentence are interrelated processes.
	In our approach, priority is given to parsing alternatives involving
	collocations, and hence collocational information helps the parser through the
	maze of alternatives, with the aim of substantially improving the performance
	of both tasks (collocation identification and parsing) and of a subsequent
	task (machine translation). In this paper, we present our system and the
	procedure that we followed to participate in the open track of the PARSEME
	shared task on automatic identification of verbal multiword expressions
	(VMWEs) in running texts.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>nerima-foufi-wehrli:2017:MWE2017</bibkey>
  </paper>

  <paper id="1707">
    <title>Neural Networks for Multi-Word Expression Detection</title>
    <author><first>Natalia</first><last>Klyueva</last></author>
    <author><first>Antoine</first><last>Doucet</last></author>
    <author><first>Milan</first><last>Straka</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>60&#8211;65</pages>
    <url>http://www.aclweb.org/anthology/W17-1707</url>
    <abstract>In this paper we describe the MUMULS system that participated in the 2017
	shared task on automatic identification of verbal multiword expressions
	(VMWEs). The MUMULS system was implemented using a supervised approach based on
	recurrent neural networks using the open source library TensorFlow. The model
	was trained on a data set containing annotated VMWEs as well as morphological
	and syntactic information. The MUMULS system performed the identification of
	VMWEs in 15 languages; it was one of the few systems that could categorize
	VMWE types in nearly all languages.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>klyueva-doucet-straka:2017:MWE2017</bibkey>
  </paper>

  <paper id="1708">
    <title>Factoring Ambiguity out of the Prediction of Compositionality for German Multi-Word Expressions</title>
    <author><first>Stefan</first><last>Bott</last></author>
    <author><first>Sabine</first><last>Schulte im Walde</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>66&#8211;72</pages>
    <url>http://www.aclweb.org/anthology/W17-1708</url>
    <abstract>Ambiguity represents an obstacle for distributional semantic models (DSMs),
	which typically subsume the contexts of all word senses within one vector.
	While individual vector space approaches have been concerned with sense
	discrimination (e.g., Sch&#252;tze 1998, Erk 2009, Erk and Pado 2010), such
	discrimination has rarely been integrated into DSMs across semantic tasks. This
	paper presents a soft-clustering approach to sense discrimination that filters
	sense-irrelevant features when predicting the degrees of compositionality for
	German noun-noun compounds and German particle verbs.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>bott-schulteimwalde:2017:MWE2017</bibkey>
  </paper>

  <paper id="1709">
    <title>Multiword expressions and lexicalism: the view from LFG</title>
    <author><first>Jamie Y.</first><last>Findlay</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>73&#8211;79</pages>
    <url>http://www.aclweb.org/anthology/W17-1709</url>
    <abstract>Multiword expressions (MWEs) pose a problem for lexicalist theories like
	Lexical Functional Grammar (LFG), since they are prima facie counterexamples to
	a strong form of the lexical integrity principle, which entails that a lexical
	item can only be realised as a single, syntactically atomic word. In this
	paper, I demonstrate some of the problems facing any strongly lexicalist
	account of MWEs, and argue that the lexical integrity principle must be
	weakened. I conclude by sketching a formalism which integrates a Tree Adjoining
	Grammar into the LFG architecture, taking advantage of this relaxation.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>findlay:2017:MWE2017</bibkey>
  </paper>

  <paper id="1710">
    <title>Understanding Idiomatic Variation</title>
    <author><first>Kristina</first><last>Geeraert</last></author>
    <author><first>R. Harald</first><last>Baayen</last></author>
    <author><first>John</first><last>Newman</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>80&#8211;90</pages>
    <url>http://www.aclweb.org/anthology/W17-1710</url>
    <abstract>This study investigates the processing of idiomatic variants through an
	eye-tracking experiment. Four types of idiom variants were included, in
	addition to the canonical form and the literal meaning. Results suggest that
	modifications to idioms, modulo obvious effects of length differences, are not
	more difficult to process than the canonical forms themselves. This fits with
	recent corpus findings.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>geeraert-baayen-newman:2017:MWE2017</bibkey>
  </paper>

  <paper id="1711">
    <title>Discovering Light Verb Constructions and their Translations from Parallel Corpora without Word Alignment</title>
    <author><first>Natalie</first><last>Vargas</last></author>
    <author><first>Carlos</first><last>Ramisch</last></author>
    <author><first>Helena</first><last>Caseli</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>91&#8211;96</pages>
    <url>http://www.aclweb.org/anthology/W17-1711</url>
    <abstract>We propose a method for joint unsupervised discovery of multiword expressions
	(MWEs) and their translations from parallel corpora. First, we apply
	independent monolingual MWE extraction in source and target languages
	simultaneously. Then, we calculate translation probability, association score
	and distributional similarity of co-occurring pairs. Finally, we rank all
	translations of a given MWE using a linear combination of these features.
	Preliminary experiments on light verb constructions show promising results.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>vargas-ramisch-caseli:2017:MWE2017</bibkey>
  </paper>

  <paper id="1712">
    <title>Identification of Multiword Expressions for Latvian and Lithuanian: Hybrid Approach</title>
    <author><first>Justina</first><last>Mandravickaite</last></author>
    <author><first>Tomas</first><last>Krilavi&#x10D;ius</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>97&#8211;101</pages>
    <url>http://www.aclweb.org/anthology/W17-1712</url>
    <abstract>We discuss an experiment on automatic identification of bi-gram multi-word
	expressions in parallel Latvian and Lithuanian corpora. Raw corpora, lexical
	association measures (LAMs) and supervised machine learning (ML) are used
	because lexical resources and tools (e.g., POS-tagger, parser) for these
	languages are scarce and of limited quality. Combining LAMs with ML, which is
	rather effective for other languages, also gives good results for Lithuanian
	and Latvian: we achieved 92.4% precision and 52.2% recall for Latvian and
	95.1% precision and 77.8% recall for Lithuanian.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>mandravickaite-krilavivcius:2017:MWE2017</bibkey>
  </paper>

  <paper id="1713">
    <title>Show Me Your Variance and I Tell You Who You Are - Deriving Compound Compositionality from Word Alignments</title>
    <author><first>Fabienne</first><last>Cap</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>102&#8211;107</pages>
    <url>http://www.aclweb.org/anthology/W17-1713</url>
    <abstract>We use word alignment variance as an indicator for the non-compositionality of
	German and English noun compounds. Our work-in-progress results are on their
	own not competitive with state-of-the-art approaches, but they show that
	alignment variance is correlated with compositionality and thus worth a
	closer look in the future.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>cap:2017:MWE2017</bibkey>
  </paper>

  <paper id="1714">
    <title>Semantic annotation to characterize contextual variation in terminological noun compounds: a pilot study</title>
    <author><first>Melania</first><last>Cabezas-Garc&#237;a</last></author>
    <author><first>Antonio</first><last>San Mart&#237;n</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>108&#8211;113</pages>
    <url>http://www.aclweb.org/anthology/W17-1714</url>
    <abstract>Noun compounds (NCs) are semantically complex and not fully compositional, as
	is often assumed. This paper presents a pilot study regarding the semantic
	annotation of environmental NCs with a view to accessing their semantics and
	exploring their domain-based contextual variation. Our results showed that the
	semantic annotation of NCs afforded important insights into how context impacts
	their conceptualization.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>cabezasgarcia-sanmartin:2017:MWE2017</bibkey>
  </paper>

  <paper id="1715">
    <title>Detection of Verbal Multi-Word Expressions via Conditional Random Fields with Syntactic Dependency Features and Semantic Re-Ranking</title>
    <author><first>Alfredo</first><last>Maldonado</last></author>
    <author><first>Lifeng</first><last>Han</last></author>
    <author><first>Erwan</first><last>Moreau</last></author>
    <author><first>Ashjan</first><last>Alsulaimani</last></author>
    <author><first>Koel Dutta</first><last>Chowdhury</last></author>
    <author><first>Carl</first><last>Vogel</last></author>
    <author><first>Qun</first><last>Liu</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>114&#8211;120</pages>
    <url>http://www.aclweb.org/anthology/W17-1715</url>
    <abstract>A description of a system for identifying Verbal Multi-Word Expressions (VMWEs)
	in running text is presented. The system mainly exploits universal syntactic
	dependency features through a Conditional Random Fields (CRF) sequence model.
	The system competed in the Closed Track at the PARSEME VMWE Shared Task 2017,
	ranking 2nd place in most languages on full VMWE-based evaluation and 1st in
	three languages on token-based evaluation. In addition, this paper presents an
	option to re-rank the 10 best CRF-predicted sequences via semantic vectors,
	boosting its scores above other systems in the competition. We also show that
	all systems in the competition would struggle to beat a simple lookup baseline
	system and argue for a more purpose-specific evaluation scheme.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>maldonado-EtAl:2017:MWE2017</bibkey>
  </paper>

  <paper id="1716">
    <title>A data-driven approach to verbal multiword expression detection. PARSEME Shared Task system description paper</title>
    <author><first>Tiberiu</first><last>Boro&#x15F;</last></author>
    <author><first>Sonia</first><last>Pipa</last></author>
    <author><first>Verginica</first><last>Barbu Mititelu</last></author>
    <author><first>Dan</first><last>Tufi&#x15F;</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>121&#8211;126</pages>
    <url>http://www.aclweb.org/anthology/W17-1716</url>
    <abstract>"Multiword expressions" are groups of words acting as a morphologic, syntactic
	and semantic unit in linguistic analysis. Verbal multiword expressions
	are the subgroup of multiword expressions in which a verb is
	the syntactic head of the group considered in its canonical (or dictionary)
	form. All multiword expressions are a great challenge for natural language
	processing, but the verbal ones are particularly interesting for tasks such as
	parsing, as the verb is the central element in the syntactic organization of a
	sentence. In this paper we introduce our data-driven approach to verbal
	multiword expressions, which was objectively validated during the PARSEME
	shared task on verbal multiword expression identification. We tested our approach on
	12 languages, and we provide detailed information about corpora composition,
	feature selection process, validation procedure and performance on all
	languages.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>borocs-EtAl:2017:MWE2017</bibkey>
  </paper>

  <paper id="1717">
    <title>The ATILF-LLF System for Parseme Shared Task: a Transition-based Verbal Multiword Expression Tagger</title>
    <author><first>Hazem</first><last>Al Saied</last></author>
    <author><first>Matthieu</first><last>Constant</last></author>
    <author><first>Marie</first><last>Candito</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>127&#8211;132</pages>
    <url>http://www.aclweb.org/anthology/W17-1717</url>
    <abstract>We describe the ATILF-LLF system built for the MWE 2017 Shared Task on
	automatic identification of verbal multiword expressions. We participated in
	the closed track only, for all 18 available languages. Our system is a
	robust greedy transition-based system, in which MWEs are identified through a
	MERGE transition. The system was meant to accommodate the variety of linguistic
	resources provided for each language, in terms of accompanying morphological
	and syntactic information. Using per-MWE F-score, the system was ranked first
	for all but two languages (Hungarian and Romanian).</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>alsaied-constant-candito:2017:MWE2017</bibkey>
  </paper>

  <paper id="1718">
    <title>Investigating the Opacity of Verb-Noun Multiword Expression Usages in Context</title>
    <author><first>Shiva</first><last>Taslimipoor</last></author>
    <author><first>Omid</first><last>Rohanian</last></author>
    <author><first>Ruslan</first><last>Mitkov</last></author>
    <author><first>Afsaneh</first><last>Fazly</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>133&#8211;138</pages>
    <url>http://www.aclweb.org/anthology/W17-1718</url>
    <abstract>This study investigates the supervised token-based identification of
	Multiword Expressions (MWEs). This ongoing research exploits the information
	contained in the contexts in which different instances of an expression can
	occur. This information is used to investigate whether an expression is used
	literally or as an MWE. Lexical and syntactic context features derived from
	vector representations are shown to be more effective than traditional
	statistical measures for identifying tokens of MWEs.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>taslimipoor-EtAl:2017:MWE2017</bibkey>
  </paper>

  <paper id="1719">
    <title>Compositionality in Verb-Particle Constructions</title>
    <author><first>Archna</first><last>Bhatia</last></author>
    <author><first>Choh Man</first><last>Teng</last></author>
    <author><first>James</first><last>Allen</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>139&#8211;148</pages>
    <url>http://www.aclweb.org/anthology/W17-1719</url>
    <abstract>We are developing a broad-coverage deep semantic lexicon for a system that
	parses sentences into a logical form expressed in a rich ontology that supports
	reasoning. In this paper we look at verb-particle constructions (VPCs), and the
	extent to which they can be treated compositionally vs idiomatically. First we
	distinguish between the different types of VPCs based on their compositionality
	and then present a set of heuristics for classifying specific instances as
	compositional or not. We then identify a small set of general sense classes for
	particles when used compositionally and discuss the resulting lexical
	representations that are being added to the lexicon. By treating VPCs as
	compositional whenever possible, we attain broad coverage in a compact way, and
	also enable interpretations of novel VPC usages not explicitly present in the
	lexicon.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>bhatia-teng-allen:2017:MWE2017</bibkey>
  </paper>

  <paper id="1720">
    <title>Rule-Based Translation of Spanish Verb-Noun Combinations into Basque</title>
    <author><first>Uxoa</first><last>I&#241;urrieta</last></author>
    <author><first>Itziar</first><last>Aduriz</last></author>
    <author><first>Arantza</first><last>Diaz de Ilarraza</last></author>
    <author><first>Gorka</first><last>Labaka</last></author>
    <author><first>Kepa</first><last>Sarasola</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>149&#8211;154</pages>
    <url>http://www.aclweb.org/anthology/W17-1720</url>
    <abstract>This paper presents a method to improve the translation of Verb-Noun
	Combinations (VNCs) in a rule-based Machine Translation (MT) system for
	Spanish-Basque. Linguistic information about a set of VNCs is gathered from the
	public database Konbitzul and integrated into the MT system, leading to
	an improvement in BLEU, NIST and TER scores, as well as to results that
	human evaluators judged to be clearly better.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>inurrieta-EtAl:2017:MWE2017</bibkey>
  </paper>

  <paper id="1721">
    <title>Verb-Particle Constructions in Questions</title>
    <author><first>Veronika</first><last>Vincze</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>155&#8211;160</pages>
    <url>http://www.aclweb.org/anthology/W17-1721</url>
    <abstract>In this paper, we investigate the behavior of verb-particle constructions in
	English questions. We present a small dataset that contains questions and
	verb-particle construction candidates. We demonstrate, both by statistical
	means and in machine learning experiments, that there are significant
	differences in the distribution of WH-words, verbs and prepositions/particles
	between sentences that contain VPCs and sentences that contain only verb +
	prepositional phrase combinations. Hence, VPCs and non-VPCs can be effectively
	separated from each other by using a rich feature set, containing several
	novel features.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>vincze:2017:MWE2017</bibkey>
  </paper>

  <paper id="1722">
    <title>Simple Compound Splitting for German</title>
    <author><first>Marion</first><last>Weller-Di Marco</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>161&#8211;166</pages>
    <url>http://www.aclweb.org/anthology/W17-1722</url>
    <abstract>This paper presents a simple method for
	German compound splitting that combines
	a basic frequency-based approach with a
	form-to-lemma mapping to approximate
	morphological operations. With the exception 
	of a small set of hand-crafted rules
	for modeling transitional elements, this 
	approach is resource-poor. In our evaluation,
	the simple splitter outperforms a splitter
	relying on rich morphological resources.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wellerdimarco:2017:MWE2017</bibkey>
  </paper>

  <paper id="1723">
    <title>Identification of Ambiguous Multiword Expressions Using Sequence Models and Lexical Resources</title>
    <author><first>Manon</first><last>Scholivet</last></author>
    <author><first>Carlos</first><last>Ramisch</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>167&#8211;175</pages>
    <url>http://www.aclweb.org/anthology/W17-1723</url>
    <abstract>We present a simple and efficient tagger capable of identifying highly
	ambiguous multiword expressions (MWEs) in French texts. It is based on
	conditional random fields (CRF), using local context information as features.
	We show that this approach can obtain results that, in some cases, approach
	more sophisticated parser-based MWE identification methods without requiring
	syntactic trees from a treebank. Moreover, we study how well the CRF can take
	into account external information coming from a lexicon.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>scholivet-ramisch:2017:MWE2017</bibkey>
  </paper>

  <paper id="1724">
    <title>Comparing Recurring Lexico-Syntactic Trees (RLTs) and Ngram Techniques for Extended Phraseology Extraction</title>
    <author><first>Agn&#232;s</first><last>Tutin</last></author>
    <author><first>Olivier</first><last>Kraif</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>176&#8211;180</pages>
    <url>http://www.aclweb.org/anthology/W17-1724</url>
    <abstract>This paper aims at assessing to what extent a syntax-based method (Recurring
	Lexico-syntactic Trees (RLT) extraction) allows us to extract large
	phraseological units such as prefabricated routines, e.g. "as previously said"
	or "as far as we/I know" in scientific writing. In order to evaluate this
	method, we compare it to the classical ngram extraction technique, on a subset
	of recurring segments including speech verbs in a French corpus of scientific
	writing. Results show that the RLT extraction technique is far more efficient
	for extended MWEs such as routines or collocations but performs more poorly for
	surface phenomena such as syntactic constructions or fully frozen expressions.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>tutin-kraif:2017:MWE2017</bibkey>
  </paper>

  <paper id="1725">
    <title>Benchmarking Joint Lexical and Syntactic Analysis on Multiword-Rich Data</title>
    <author><first>Matthieu</first><last>Constant</last></author>
    <author><first>H&#233;ctor</first><last>Mart&#237;nez Alonso</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>181&#8211;186</pages>
    <url>http://www.aclweb.org/anthology/W17-1725</url>
    <abstract>This article evaluates the extension of a dependency parser that performs joint
	syntactic analysis and multiword expression identification. We show that, given
	sufficient training data, the parser benefits from explicit multiword
	information and improves overall labeled accuracy score in eight of the ten
	evaluation cases.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>constant-martinezalonso:2017:MWE2017</bibkey>
  </paper>

  <paper id="1726">
    <title>Semi-Automated Resolution of Inconsistency for a Harmonized Multiword Expression and Dependency Parse Annotation</title>
    <author><first>King</first><last>Chan</last></author>
    <author><first>Julian</first><last>Brooke</last></author>
    <author><first>Timothy</first><last>Baldwin</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>187&#8211;193</pages>
    <url>http://www.aclweb.org/anthology/W17-1726</url>
    <abstract>This paper presents a methodology for identifying and resolving various kinds
	of inconsistency in the context of merging dependency and multiword expression
	(MWE) annotations, to generate a dependency treebank with comprehensive MWE
	annotations. Candidates for correction are identified using a variety of
	heuristics, including an entirely novel one which identifies violations of MWE
	constituency in the dependency tree, and resolved by arbitration with minimal
	human intervention. Using this technique, we identified and corrected several
	hundred errors across both parse and MWE annotations, representing changes to a
	significant percentage (well over 10%) of the MWE instances in the joint
	corpus.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>chan-brooke-baldwin:2017:MWE2017</bibkey>
  </paper>

  <paper id="1727">
    <title>Combining Linguistic Features for the Detection of Croatian Multiword Expressions</title>
    <author><first>Maja</first><last>Buljan</last></author>
    <author><first>Jan</first><last>&#x160;najder</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>194&#8211;199</pages>
    <url>http://www.aclweb.org/anthology/W17-1727</url>
    <abstract>As multiword expressions (MWEs) exhibit a range of idiosyncrasies, their
	automatic detection warrants the use of many different features. Tsvetkov and
	Wintner (2014) proposed a Bayesian network model that combines linguistically
	motivated features and also models their interactions. In this paper, we extend
	their model with new features and apply it to Croatian, a morphologically
	complex language with relatively free word order, achieving a satisfactory
	F1-score of 0.823. Furthermore, by comparing against (semi)naive
	Bayes models, we demonstrate that manually modeling feature interactions is
	indeed important. We make our annotated dataset of Croatian MWEs freely
	available.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>buljan-vsnajder:2017:MWE2017</bibkey>
  </paper>

  <paper id="1728">
    <title>Complex Verbs are Different: Exploring the Visual Modality in Multi-Modal Models to Predict Compositionality</title>
    <author><first>Maximilian</first><last>K&#246;per</last></author>
    <author><first>Sabine</first><last>Schulte im Walde</last></author>
    <booktitle>Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)</booktitle>
    <month>April</month>
    <year>2017</year>
    <address>Valencia, Spain</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>200&#8211;206</pages>
    <url>http://www.aclweb.org/anthology/W17-1728</url>
    <abstract>This paper compares a neural network DSM relying on textual co-occurrences with
	a multi-modal model integrating visual information. We focus on nominal vs.
	verbal compounds, and zoom into lexical, empirical and perceptual target
	properties to explore the contribution of the visual modality. Our experiments
	show that (i) visual features contribute differently for verbs than for nouns,
	and (ii) images complement textual information, if (a) the textual modality by
	itself is poor and appropriate image subsets are used, or (b) the textual
	modality by itself is rich and large (potentially noisy) images are added.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>koper-schulteimwalde:2017:MWE2017</bibkey>
  </paper>

</volume>

