Abstract
In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture. We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate our methods on tasks of multimodal translation and translation with multiple source languages. The experiments show that the models are able to use multiple sources and improve over single source baselines.

- Anthology ID: W18-6326
- Volume: Proceedings of the Third Conference on Machine Translation: Research Papers
- Month: October
- Year: 2018
- Address: Brussels, Belgium
- Editors: Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
- Venue: WMT
- SIG: SIGMT
- Publisher: Association for Computational Linguistics
- Pages: 253–260
- URL: https://aclanthology.org/W18-6326
- DOI: 10.18653/v1/W18-6326
- Bibkey: libovicky-etal-2018-input
- Cite (ACL): Jindřich Libovický, Jindřich Helcl, and David Mareček. 2018. Input Combination Strategies for Multi-Source Transformer Decoder. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 253–260, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal): Input Combination Strategies for Multi-Source Transformer Decoder (Libovický et al., WMT 2018)
- PDF: https://aclanthology.org/W18-6326.pdf
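The abstract names four input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. The following is a minimal, hedged sketch of what those strategies can look like for a single attention head over two encoder memories. It is illustrative only: it uses plain numpy, omits multi-head projections, layer normalization, and masking, and the function and variable names are assumptions rather than the authors' implementation.

```python
# Hedged sketch (not the paper's code): single-head scaled dot-product attention
# and the four multi-source combination strategies named in the abstract.
import numpy as np

def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values

def serial(queries, sources):
    # Serial: one cross-attention per source, applied in sequence; the output of
    # attending to one source is used to query the next (residual connections kept).
    out = queries
    for src in sources:
        out = out + attention(out, src, src)
    return out

def parallel(queries, sources):
    # Parallel: attend to every source with the same queries and sum the contexts.
    return queries + sum(attention(queries, src, src) for src in sources)

def flat(queries, sources):
    # Flat: concatenate all source states along the time axis and attend once.
    joint = np.concatenate(sources, axis=0)
    return queries + attention(queries, joint, joint)

def hierarchical(queries, sources):
    # Hierarchical: per-source contexts first, then a second attention over the
    # per-source context vectors for each decoder position.
    contexts = np.stack([attention(queries, src, src) for src in sources], axis=1)
    combined = np.stack([attention(q[None, :], ctx, ctx)[0]
                         for q, ctx in zip(queries, contexts)])
    return queries + combined

# Toy usage: a decoder with 3 positions and two encoder memories (e.g. a
# source-language encoder and an image-region encoder), model size 8.
rng = np.random.default_rng(0)
dec = rng.normal(size=(3, 8))
enc_text, enc_image = rng.normal(size=(5, 8)), rng.normal(size=(4, 8))
for combine in (serial, parallel, flat, hierarchical):
    print(combine.__name__, combine(dec, [enc_text, enc_image]).shape)
```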
BibTeX:
@inproceedings{libovicky-etal-2018-input,
    title = "Input Combination Strategies for Multi-Source Transformer Decoder",
    author = "Libovick{\'y}, Jind{\v{r}}ich and
      Helcl, Jind{\v{r}}ich and
      Mare{\v{c}}ek, David",
    editor = "Bojar, Ond{\v{r}}ej and
      Chatterjee, Rajen and
      Federmann, Christian and
      Fishel, Mark and
      Graham, Yvette and
      Haddow, Barry and
      Huck, Matthias and
      Yepes, Antonio Jimeno and
      Koehn, Philipp and
      Monz, Christof and
      Negri, Matteo and
      N{\'e}v{\'e}ol, Aur{\'e}lie and
      Neves, Mariana and
      Post, Matt and
      Specia, Lucia and
      Turchi, Marco and
      Verspoor, Karin",
    booktitle = "Proceedings of the Third Conference on Machine Translation: Research Papers",
    month = oct,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-6326",
    doi = "10.18653/v1/W18-6326",
    pages = "253--260",
    abstract = "In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture. We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate our methods on tasks of multimodal translation and translation with multiple source languages. The experiments show that the models are able to use multiple sources and improve over single source baselines.",
}