<?xml version="1.0" encoding="UTF-8" ?>
 <volume id="W18">
   <paper id="2900">
        <title>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</title>
        <editor><first>Georgiana</first><last>Dinu</last></editor>
        <editor><first>Miguel</first><last>Ballesteros</last></editor>
        <editor><first>Avirup</first><last>Sil</last></editor>
        <editor><first>Sam</first><last>Bowman</last></editor>
        <editor><first>Wael</first><last>Hamza</last></editor>
        <editor><first>Anders</first><last>S&oslash;gaard</last></editor>
        <editor><first>Tahira</first><last>Naseem</last></editor>
        <editor><first>Yoav</first><last>Goldberg</last></editor>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <url>http://www.aclweb.org/anthology/W18-29</url>
        <bibtype>book</bibtype>
        <bibkey>W18-29:2018</bibkey>
   </paper>
   <paper id="2901">
        <title>Compositional Morpheme Embeddings with Affixes as Functions and Stems as Arguments</title>
        <author><first>Daniel</first><last>Edmiston</last></author>
        <author><first>Karl</first><last>Stratos</last></author>
        <booktitle>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</booktitle>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <pages>1&#8211;5</pages>
        <abstract>This work introduces a novel, linguistically motivated architecture for composing morphemes to derive word embeddings. The principal novelty in the work is to treat stems as vectors and affixes as functions over vectors. In this way, our model&#8217;s architecture more closely resembles the compositionality of morphemes in natural language. Such a model stands in opposition to models which treat morphemes uniformly, making no distinction between stem and affix. We run this new architecture on a dependency parsing task in Korean&#8212;a language rich in derivational morphology&#8212;and compare it against a lexical baseline, along with other sub-word architectures. StAffNet, the name of our architecture, shows competitive performance with the state-of-the-art on this task.</abstract>
        <url>http://www.aclweb.org/anthology/W18-2901</url>
        <bibtype>inproceedings</bibtype>
        <bibkey>edmiston-stratos:2018:W18-29</bibkey>
   </paper>
   <paper id="2902">
        <title>Unsupervised Source Hierarchies for Low-Resource Neural Machine Translation</title>
        <author><first>Anna</first><last>Currey</last></author>
        <author><first>Kenneth</first><last>Heafield</last></author>
        <booktitle>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</booktitle>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <pages>6&#8211;12</pages>
        <abstract>Incorporating source syntactic information into neural machine translation (NMT) has recently proven successful (Eriguchi et al., 2016; Luong et al., 2016). However, this is generally done using an outside parser to syntactically annotate the training data, making this technique difficult to use for languages or domains for which a reliable parser is not available. In this paper, we introduce an unsupervised tree-to-sequence (tree2seq) model for neural machine translation; this model is able to induce an unsupervised hierarchical structure on the source sentence based on the downstream task of neural machine translation. We adapt the Gumbel tree-LSTM of Choi et al. (2018) to NMT in order to create the encoder. We evaluate our model against sequential and supervised parsing baselines on three low- and medium-resource language pairs. For low-resource cases, the unsupervised tree2seq encoder significantly outperforms the baselines; no improvements are seen for medium-resource translation.</abstract>
        <url>http://www.aclweb.org/anthology/W18-2902</url>
        <bibtype>inproceedings</bibtype>
        <bibkey>currey-heafield:2018:W18-29</bibkey>
   </paper>
   <paper id="2903">
        <title>Latent Tree Learning with Differentiable Parsers: Shift-Reduce Parsing and Chart Parsing</title>
        <author><first>Jean</first><last>Maillard</last></author>
        <author><first>Stephen</first><last>Clark</last></author>
        <booktitle>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</booktitle>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <pages>13&#8211;18</pages>
        <abstract>Latent tree learning models represent sentences by composing their words according to an induced parse tree, all based on a downstream task. These models often outperform baselines which use (externally provided) syntax trees to drive the composition order. This work contributes (a) a new latent tree learning model based on shift-reduce parsing, with competitive downstream performance and non-trivial induced trees, and (b) an analysis of the trees learned by our shift-reduce model and by a chart-based model.</abstract>
        <url>http://www.aclweb.org/anthology/W18-2903</url>
        <bibtype>inproceedings</bibtype>
        <bibkey>maillard-clark:2018:W18-29</bibkey>
   </paper>
   <paper id="2904">
        <title>Syntax Helps ELMo Understand Semantics: Is Syntax Still Relevant in a Deep Neural Architecture for SRL?</title>
        <author><first>Emma</first><last>Strubell</last></author>
        <author><first>Andrew</first><last>McCallum</last></author>
        <booktitle>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</booktitle>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <pages>19&#8211;27</pages>
        <abstract>Do unsupervised methods for learning rich, contextualized token representations obviate the need for explicit modeling of linguistic structure in neural network models for semantic role labeling (SRL)? We address this question by incorporating the massively successful ELMo embeddings (Peters et al., 2018) into LISA (Strubell et al., 2018), a strong, linguistically-informed neural network architecture for SRL. In experiments on the CoNLL-2005 shared task we find that though ELMo out-performs typical word embeddings, beginning to close the gap in F1 between LISA with predicted and gold syntactic parses, syntactically-informed models still out-perform syntax-free models when both use ELMo, especially on out-of-domain data. Our results suggest that linguistic structures are indeed still relevant in this golden age of deep learning for NLP.</abstract>
        <url>http://www.aclweb.org/anthology/W18-2904</url>
        <bibtype>inproceedings</bibtype>
        <bibkey>strubell-mccallum:2018:W18-29</bibkey>
   </paper>
   <paper id="2905">
        <title>Subcharacter Information in Japanese Embeddings: When Is It Worth It?</title>
        <author><first>Marzena</first><last>Karpinska</last></author>
        <author><first>Bofang</first><last>Li</last></author>
        <author><first>Anna</first><last>Rogers</last></author>
        <author><first>Aleksandr</first><last>Drozd</last></author>
        <booktitle>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</booktitle>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <pages>28&#8211;37</pages>
        <abstract>Languages with logographic writing systems present a difficulty for traditional character-level models. Leveraging the subcharacter information was recently shown to be beneficial for a number of intrinsic and extrinsic tasks in Chinese. We examine whether the same strategies could be applied for Japanese, and contribute a new analogy dataset for this language.</abstract>
        <url>http://www.aclweb.org/anthology/W18-2905</url>
        <bibtype>inproceedings</bibtype>
        <bibkey>karpinska-EtAl:2018:W18-29</bibkey>
   </paper>
   <paper id="2906">
        <title>A neural parser as a direct classifier for head-final languages</title>
        <author><first>Hiroshi</first><last>Kanayama</last></author>
        <author><first>Masayasu</first><last>Muraoka</last></author>
        <author><first>Ryosuke</first><last>Kohita</last></author>
        <booktitle>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</booktitle>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <pages>38&#8211;46</pages>
        <abstract>This paper demonstrates a neural parser implementation suitable for consistently head-final languages such as Japanese. Unlike the transition- and graph-based algorithms in most state-of-the-art parsers, our parser directly selects the head word of a dependent from a limited number of candidates. This method drastically simplifies the model so that we can easily interpret the output of the neural model. Moreover, by exploiting grammatical knowledge to restrict possible modification types, we can control the output of the parser to reduce specific errors without adding annotated corpora. The neural parser performed well on both conventional Japanese corpora and the Japanese version of the Universal Dependencies corpus, and the advantages of distributed representations were observed in comparison with the conventional non-neural model.</abstract>
        <url>http://www.aclweb.org/anthology/W18-2906</url>
        <bibtype>inproceedings</bibtype>
        <bibkey>kanayama-muraoka-kohita:2018:W18-29</bibkey>
   </paper>
   <paper id="2907">
        <title>Syntactic Dependency Representations in Neural Relation Classification</title>
        <author><first>Farhad</first><last>Nooralahzadeh</last></author>
        <author><first>Lilja</first><last>&Oslash;vrelid</last></author>
        <booktitle>Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP</booktitle>
        <month>July</month>
        <year>2018</year>
        <address>Melbourne, Australia</address>
        <publisher>Association for Computational Linguistics</publisher>
        <pages>47&#8211;53</pages>
        <abstract>We investigate the use of different syntactic dependency representations in a neural relation classification task and compare the CoNLL, Stanford Basic and Universal Dependencies schemes. We further compare with a syntax-agnostic approach and perform an error analysis in order to gain a better understanding of the results.</abstract>
        <url>http://www.aclweb.org/anthology/W18-2907</url>
        <bibtype>inproceedings</bibtype>
        <bibkey>nooralahzadeh-vrelid:2018:W18-29</bibkey>
   </paper>
 </volume>
