Machine Translation Archive
Index of data, corpora and resources
Publications since 2010
Bilingual corpora [see also Comparable corpora, Example-based methods, Multilingual corpora]
(2015) Vishwajeet Kumar, Ashish Kulkarni, Pankaj
Singh, Ganesh Ramakrishnan, & Ganesh Arnaal: A machine-assisted
human translation system for technical documents. MT Summit XV, October 30 – November 3, 2015, Miami, Florida, USA.
Proceedings of MT Summit XV: vol.2: MT Users’ Track; pp.259-272. [PDF, 1,275KB]
(2015) Chaochao Wang, Deyi Xiong, Min Zhang, &
Chunyu Kit: Learning
bilingual distributed phrase representations for statistical machine translation.
MT Summit XV, October 30 – November 3,
2015, Miami, Florida, USA. Proceedings of MT Summit XV: vol.1: MT
Researchers’ Track; pp.32-43. [PDF, 668KB]
(2015) Alex Yanishevsky: How much cake is
enough: the case for domain-specific engines. MT Summit XV, October 30 – November 3, 2015, Miami, Florida, USA.
Proceedings of MT Summit XV: vol.2: MT Users’ Track; pp.224-247. [PDF, 1,682KB]
(2015) Dong Zhan & Hiromi Nakaiwa: Automatic detection of
antecedents of Japanese zero pronouns using a Japanese-English bilingual corpus.
MT Summit XV, October 30 – November 3,
2015, Miami, Florida, USA. Proceedings of MT Summit XV: vol.1: MT
Researchers’ Track; pp.66-79. [PDF, 863KB]
(2014) Burak Aydın & Arzucan Özgür: Expanding machine
translation training data with an out-of-domain corpus using language modeling
based vocabulary saturation. AMTA
2014: proceedings of the eleventh conference of the Association for Machine
Translation in the Americas, Vancouver, BC, October 22-26; pp.180-192. [PDF,
523KB]
(2014) Fabrizio Gotti, Philippe Langlais, & Atefeh
Farzindar: Hashtag occurrences, layout and
translation: a corpus-driven analysis of tweets published by the Canadian
government. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.2254-2261. [PDF, 486KB]
(2014) Adam Kilgarriff: Terminology finding in the Sketch Engine: an
evaluation. Translating and the
Computer 36: proceedings. Asling: International Society for Advancement in
Language Technology, 27-28 November 2014; pp.130-132. [PDF, 286KB]
(2014) Shachar Mirkin & Laurent Besacier: Data selection for
compact adapted SMT models. AMTA
2014: proceedings of the eleventh conference of the Association for Machine
Translation in the Americas, Vancouver, BC, October 22-26; pp.301-314. [PDF,
610KB]
(2014) Xingyi
Song, Lucia Specia, & Trevor Cohn: Data
selection for discriminative training in statistical machine translation.
Proceedings of the 17th annual conference of the European Association for
Machine Translation, EAMT 2014, Dubrovnik, Croatia, 16th-18th June 2014;
pp.45-52. [PDF, 415KB]
(2013) Wanxiang Che, Mengqiu Wang, Christopher
D.Manning, & Ting Liu: Named entity
recognition with bilingual constraints. [NAACL-HLT
2013] The 2013 conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 9-14 June 2013,
(2013) Lei Cui, Dongdong Zhang, Shujie Liu, Mu Li,
& Ming Zhou: Bilingual data cleaning for SMT
using graph-based random walk. ACL-2013: Proceedings of the 51st Meeting of
the Association for Computational Linguistics, Short papers, Sofia,
Bulgaria, August 4-9 2013; pp.340-345. [PDF, 259KB]
(2013) Manaal Faruqui & Chris Dyer: An information theoretic approach to bilingual word
clustering. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013;
pp.777-783. [PDF, 263KB]
(2013) Francisco Guzman, Hassan Sajjad, Stephan Vogel,
& Ahmed Abdelali: The AMARA corpus:
building resources for translating the web’s educational content. [IWSLT 2013] Proceedings of the 10th International Workshop on Spoken
Language Translation,
(2013) Ann Irvine & Chris Callison-Burch: Supervised bilingual lexicon induction with
multiple monolingual signals. [NAACL-HLT
2013] The 2013 conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 9-14 June 2013,
(2013) Adam Kilgarriff: Terminology finding, parallel corpora and
bilingual word sketches in the Sketch Engine. [Aslib 2013] Translating and
the Computer 35, 28-29 November 2013, etc.venues, Paddington,
(2013) Anoop Kunchukuttan, Rajen Chatterjee, Shourya
Roy, Abhijit Mishra, & Pushpak Bhattacharyya: TransDoop: a map-reduce based crowdsourced
translation for complex domains. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, System demonstrations,
(2013) Oscar Mendoza Rivera, Ruslan Mitkov, &
Gloria Corpas Pastor: A flexible framework
for collocation retrieval and translation from parallel and comparable corpora.
[MT
(2013) Vassilis Papavassiliou, Prokopis Prokopidis,
& Gregor Thurmair: A modular
open-source focused crawler for mining monolingual and bilingual corpora from
the web. Proceedings of the 6th Workshop on Building and Using Comparable
Corpora,
(2013) Karl Pichotta & John DeNero:
Identifying phrasal verbs using many
bilingual corpora. [EMNLP 2013]
Proceedings of the 2013 Conference on Empirical Methods in Natural Language
Processing, Seattle, Washington, USA, 18-21 October 2013; pp.636-646. [PDF,
267KB]
(2013) Jason R.Smith, Herve Saint-Amand, Magdalena
Plamada, Philipp Koehn, Chris Callison-Burch, & Adam Lopez: Dirt cheap web-scale parallel text from the Common
Crawl. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics,
(2013) Ivan Vulić &
Marie-Francine Moens: A study on bootstrapping
bilingual vector spaces from non-parallel data (and nothing else). [EMNLP 2013] Proceedings of the 2013
Conference on Empirical Methods in Natural Language Processing, Seattle,
Washington, USA, 18-21 October 2013; pp.1044-1054. [PDF, 261KB]
(2013) Lingxiao Wang & Christian Boitet: Online production of HQ parallel corpora and
permanent task-based evaluation of multiple MT systems: both can be obtained
through iMAGs with no added cost. Proceedings
of MT
(2012) Walid Aransa, Holger Schwenk, & Loic
Barrault: Semi-supervised transliteration
mining from parallel and comparable corpora. IWSLT-2012: 9th International Workshop on Spoken Language Translation,
(2012) Mihael Arcan, Paul
Buitelaar, & Christian Federmann: Using
domain-specific and collaborative resources for term translation. SSST-6, Sixth Workshop on Syntax, Semantics
and Structure in Statistical Translation, Jeju,
(2012) Núria Bel, Vassilis
Papavassiliou, Prokopis Prokopidis, Antonio Toral, & Victoria Arranz: Mining and exploiting domain-specific corpora in the
PANACEA platform. [BUCC 2012] The 5th
Workshop on Building and Using Comparable Corpora: “Language Resources for
Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Sergey Block, Michael Bloodgood, Petra Bradley,
Ryan Corbett, Michael Maxwell, Erica Michael, Peter Osthus, Paul Rodrigues,
& Benjamin Strauss: Evaluating parallel corpora: assessing utility for use
with translation memory systems in government settings [abstract]. AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the
(2012) Ondřej Bojar, Zdeněk Žabokrtský,
Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček,
Jiří Maršík, Michal Novák, Martin Popel, & Aleš Tamchyna: The joy of parallelism with CzEng 1.0. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Houda Bouamor, Aurélien Max, & Anne Vilnat:
Validation of sub-sentential paraphrases
acquired from parallel monolingual corpora. [EACL 2012] Proceedings of the 13th Conference of the European Chapter
of the Association for Computational Linguistics,
(2012) Mauro Cettolo, Christian Girardi, &
Marcello Federico: WIT3: web
inventory of transcribed and translated talks. EAMT 2012: Proceedings of the 16th Annual Conference of the European
Association for Machine Translation, Trento, Italy, May 28-30 2012, ed.
Mauro Cettolo, Marcello
Federico, Lucia Specia, Andy Way; pp.261-268. [PDF, 197KB]
(2012) Sherri Condon, Luis Hernandez, Dan Parvaz,
Mohammad S.Khan, & Hazrat Jahed: Producing
data for under-resourced languages: a Dari-English parallel corpus of
multi-genre text. AMTA-2012: the
Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Ângela Costa, Tiago Luís, Joana Ribeiro, Ana
Cristina Mendes, & Luísa Coheur: An
English-Portuguese parallel corpus of questions: translation guidelines and
application in statistical machine translation. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Mark Fishel, Yota Georgakopoulou, Sergio
Penkale, Volha Petukhova, Matej Rojc, Martin Volk, & Andy Way: From subtitles to parallel corpora. EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.3-6. [
(2012) Guillem Gascó, Martha-Alicia Rocha, Germán
Sanchis-Trilles, Jesús Andrés-Ferrer, & Francisco Casacuberta: Does more data always yield better translations? [EACL
2012] Proceedings of the 13th Conference of the European Chapter of the
Association for Computational Linguistics,
(2012) Monica Gavrila, Walther v.Hahn, & Cristina
Vertan: Same domain different discourse style:
a case study on language resources for data-driven machine translation. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Cyril Goutte, Marine Carpuat, & George
Foster: The impact of sentence alignment errors
on phrase-based machine translation performance. AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the
(2012) Stephen Grimes, Katherine Peterson, &
Xuansong Li: Automatic word alignment tools to
scale production of manually aligned parallel texts. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Eva Hajičová & Petr Sgall:
Formal models and practice of annotation [abstract]. In: Crosslingual Language Technology in service
of an integrated multilingual Europe, 4-5 May 2012,
(2012) Petter Haugereid &
Francis Bond: Extracting semantic transfer
rules from parallel corpora with SMT phrase aligners. SSST-6, Sixth Workshop on Syntax, Semantics and Structure in
Statistical Translation, Jeju,
(2012) Quoc Hung-Ngo &
Werner Winiwarter: A visualizing annotation
tool for semi-automatically building a bilingual corpus. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Georgi Iliev & Angel Genov: Expanding parallel resources for medium-density
languages for free. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Fattaneh Jabbari, Somayeh Bakhshaei, Seyed
Mohammad Mohammadzadeh Ziabary, & Shahram Khadivi: Developing an open-domain English-Farsi
translation system using AFEC, Amirkabir bilingual Farsi-English corpus. AMTA-2012: Fourth workshop on computational
approaches to Arabic script-based languages. Proceedings,
(2012) Weimin Jiang: Engine-specific
Chinese-English user parallel corpora. AMTA-2012:
the Tenth Biennial Conference of the Association for Machine Translation in the
Americas. Proceedings,
(2012) J.Howard Johnson: Conditional significance pruning: discarding more
of huge phrase tables. AMTA-2012: the
Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Adam Kilgarriff &
George Tambouratzis: The PRESEMT project.
[BUCC 2012] The 5th Workshop on Building
and Using Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Gerhard Kremer, Matthias Hartung, Sebastian
Padó & Stefan Riezler: Statistical machine
translation support improves human adjective translation. Translation: Computation, Corpora, Cognition 2 (1), July 2012; pp.103-126.
[PDF, 266KB]
(2012) Cvetana
Krsteva & Duško Vitas: Construction and exploitation of X-Serbian bitexts
[abstract]. In: Crosslingual Language Technology in service of an
integrated multilingual Europe, 4-5 May 2012,
(2012) Septina Dian Larasati: IDENTIC corpus: morphologically enriched
Indonesian-English parallel corpus. LREC
2012: Eighth international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Marianna J.Martindale: Can statistical post-editing with a small
parallel corpus save a weak MT engine?
LREC 2012: Eighth international conference on Language
Resources and Evaluation,
21-27 May 2012,
(2012) Mohammed Mediani, Jan
Niehues, & Alex Waibel: Parallel phrase
scoring for extra-large corpora. Prague
Bulletin of Mathematical Linguistics 98, October 2012; pp.87-98. [PDF,
142KB]
(2012) Robert Munro &
Christopher D.Manning: Accurate unsupervised
joint named-entity extraction from unaligned parallel text. [ACL 2012] Proceedings of NEWS 2012 Named
Entities Workshop, July 12, 2012, Jeju,
(2012) Preslav Nakov & Hwee Tou Ng: Improving statistical machine translation for a
resource-poor language using related resource-rich languages. Journal of Artificial Intelligence Research
44 (2012); pp.179-222. [PDF, 421KB]
(2012) Carla Parra Escartin: Design and compilation of a specialized
Spanish-German parallel corpus. LREC
2012: Eighth international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Matt Post, Chris
Callison-Burch, & Miles Osborne: Constructing
parallel corpora for six Indian languages via crowdsourcing. WMT 2012: 7th Workshop on Statistical Machine
Translation. Proceedings of the workshop, June 7-8, 2012,
(2012) Bruno Pouliquen, Christophe Mazenc, Cecilia
Elizalde, & Jose Garcia-Verdugo: Statistical
machine translation prototype using UN parallel documents. EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.12-19. [PDF, 251KB]
(2012) Felipe Sánchez-Martínez, Rafael C.Carrasco,
Miguel A.Martínez-Prieto, & Joaquín Adiego: Generalized biwords for bitext
compression and translation spotting. Journal
of Artificial Intelligence Research 43; pp.389-418. [PDF, 418KB]
(2012) Rico Sennrich: Perplexity
minimization for translation model domain adaptation in statistical machine
translation. [EACL 2012] Proceedings of the 13th Conference of the European Chapter
of the Association for Computational Linguistics,
(2012) Iñaki San Vicente &
Iker Manterola: PaCo2: a fully
automated tool for gathering parallel corpora from the Web. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Martina Katalin Szabó,
Veronika Vincze, & István Nagy T.: HunOr: a
Hungarian-Russian parallel corpus. LREC
2012: Eighth international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) George Tambouratzis, Marina Vassiliou, &
Sokratis Sofianopoulos: PRESEMT: pattern
recognition-based statistically enhanced MT. EACL Joint Workshop on Exploiting Synergies between Information
Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine
Translation (HyTra): Proceedings of the workshop, 23-24 April 2012,
Avignon, France; pp.65-68. [PDF, 170KB]
(2012) George Tambouratzis,
(2012) Aleš Tamchyna,
(2012) Veronika Vincze: Light verb constructions in the SzegedParallelFX
English-Hungarian parallel corpus. LREC 2012: Eighth
international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Dominikus Wetzel &
Francis Bond: Enriching parallel corpora for
statistical machine translation with semantic negation rephrasing. SSST-6, Sixth Workshop on Syntax, Semantics
and Structure in Statistical Translation, Jeju,
(2012) Qian Yu, Aurélien Max,
& François Yvon: Aligning bilingual
literary works: a pilot study. NAACL-HLT
Workshop on Computational Linguistics for Literature,
(2012) Daniel Zeman: Data issues of the multilingual translation matrix.
WMT 2012: 7th Workshop on Statistical
Machine Translation. Proceedings of the workshop, June 7-8, 2012,
(2011) Takeshi Abekawa & Kyo Kageura: Using seed terms for crawling bilingual
terminology lists on the Web. Translating and the Computer 33, 17-18 November 2011,
(2011) Marilisa Amoia, Kerstin Kunz, & Ekaterina
Lapshinova-Koltunski: Discontinuous constituents:
a problematic case for parallel corpora annotation and querying. AEPC 2011: proceedings of the Second
Workshop on Annotation and Exploitation of Parallel Corpora, associated
with the 8th International Conference on Recent Advances in Natural Language
Processing (RANLP 2011), 15th September 2011,
(2011) Alexandra Antonova & Alexey Misyurev: Building a web-based parallel corpus and
filtering out machine-translated text. ACL
2011: Proceedings of the Fourth Workshop on Building and Using Comparable
Corpora,
(2011) Elizabeth Baran &
Nianwen Xue: Singular or plural? Exploiting
parallel corpora for Chinese number prediction. MT Summit XIII: the Thirteenth Machine Translation Summit
[organized by the] Asia-Pacific Association for Machine Translation (AAMT),
19-23 September 2011,
(2011) Luciano Barbosa, Srinivas Bangalore, &
Vivek Kumar Sridhar Rangarajan: Crawling back
and forth: using back and out links to locate bilingual sites. [IJCNLP
2011] Proceedings of the 5th
International Joint Conference on Natural Language Processing,
(2011) Caroline Barrière & Pierre Isabelle: Searching parallel corpora for contextually
equivalent terms. [EAMT 2011]:
proceedings of the 15th conference of the European Association for Machine
Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi
Depraetere, Vincent Vandeghinste; pp.105-112. [PDF, 149KB]
(2011) Shane Bergsma, David Yarowsky, & Kenneth Church: Using large monolingual and bilingual corpora to
improve coordination disambiguation. ACL-HLT
2011: Proceedings of the 49th Annual Meeting of the Association for
Computational Linguistics,
(2011) Wenliang Chen, Jun’ichi Kazama, Min Zhang, Yoshimasa Tsuruoka,
Yujie Zhang, Yiou Wang, Kentaro Torisawa, & Haizhou Li: SMT helps bitext dependency parsing. [EMNLP
2011] Proceedings of the 2011 Conference
on Empirical Methods in Natural Language Processing, Edinburgh, Scotland,
UK, July 27-31, 2011; pp.73-83. [PDF, 583KB]
(2011) Oliver Čulo, Silvia Hansen-Schirra, Karin
Maksymski, & Stella Neumann: Empty links and
crossing lines: querying multi-layer annotation and alignment in parallel corpora.
Translation: Computation, Corpora, Cognition 1 (1), December 2011;
pp.75-104. [PDF, 923KB]
(2011) Guy De Pauw, Peter Waiganjo Wagacha, & Gilles-Maurice de
Schryver: Towards English-Swahili machine
translation. Machine Translation and Morphologically-rich
Languages: Research Workshop of the
Israel Science Foundation, University of Haifa, Israel, 23-27 January, 2011;
2pp. [PDF, 77KB]
(2011) Ali El-Kahky, Kareem Darwish, Ahmed Saad Aldein, Mohamed Abd
El-Wahab, Ahmed Hefny, & Waleed Ammar: Improved
transliteration mining using graph reinforcement. [EMNLP 2011] Proceedings of the 2011 Conference on
Empirical Methods in Natural Language Processing, Edinburgh, Scotland, UK,
July 27-31, 2011; pp.1384-1393. [PDF, 939KB]
(2011) Ruiji Fu, Bing Qin, & Ting Liu: Generating Chinese named entity data from a parallel
corpus. [IJCNLP 2011] Proceedings of
the 5th International Joint Conference on Natural Language Processing,
(2011) Souhir Gahbiche-Braham, Hélène Bonneau-Maynard, & François
Yvon: Two ways to use a noisy parallel news
corpus for improving statistical machine translation. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Juri Ganitkevitch, Chris Callison-Burch, Courtney Napoles, &
Benjamin Van Durme: Learning sentential
paraphrases from bilingual parallel corpora for text-to-text generation.
[EMNLP 2011] Proceedings of the 2011
Conference on Empirical Methods in Natural Language Processing, Edinburgh,
Scotland, UK, July 27-31, 2011; pp.1168-1179. [PDF, 311KB]
(2011) Qin Gao & Stephan Vogel: Corpus
expansion for statistical machine translation with semantic role label
substitution rules. ACL-HLT 2011:
Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics: Short papers,
(2011) Petter Haugereid & Francis Bond: Extracting transfer rules for multiword
expressions from parallel corpora. Proceedings
of the Workshop on Multiword Expressions: from Parsing and Generation to the
Real World (MWE 2011),
(2011) Carlos A.Henríquez Q., José B.Mariño, &
Rafael E.Banchs: Deriving translation units
using small additional corpora. [EAMT
2011]: proceedings of the 15th conference of the European Association for
Machine Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada,
Heidi Depraetere, Vincent Vandeghinste; pp.121-128. [PDF, 438KB]
(2011) Masamichi Ideue,
Kazuhide Yamamoto, Masao Utiyama, & Eiichiro Sumita: A comparison of unsupervised bilingual term extraction
methods using phrase tables. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) Kriste Krstovski & David A.Smith: A minimally supervised approach for detecting and
ranking document translation pairs.
[WMT 2011] Proceedings of the 6th
Workshop on Statistical Machine Translation,
(2011) Sergey Kulikov: What is web-based machine translation up to?
Tralogy,
(2011) Emeline Lecuit, Denis Maurel, & Duško
Vitas: A tagged and aligned corpus for the study
of proper names in translation. AEPC
2011: proceedings of the Second Workshop on Annotation and Exploitation of
Parallel Corpora, associated with the 8th International Conference on Recent
Advances in Natural Language Processing (RANLP 2011), 15th September 2011,
(2011) Els Lefever, Véronique Hoste, & Martine De Cock: ParaSense or how to use parallel corpora for word
sense disambiguation. ACL-HLT 2011:
Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics: Short papers,
(2011) Feifan Liu, Fei Liu, & Yang Liu: Learning from Chinese-English parallel data for
Chinese tense prediction. [IJCNLP 2011] Proceedings
of the 5th International Joint Conference on Natural Language Processing,
(2011) Yashar Mehdad, Matteo Negri, & Marcello Federico: Using bilingual parallel corpora for cross-lingual
textual entailment. ACL-HLT 2011:
Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics,
(2011) Sara Morrissey: Body at work: using corpora in sign language
machine translation. International
Workshop on Sign Language Translation and Avatar Technology (SLTAT), 10-11
January 2011, Federal Ministry of Labour and Social Affairs,
(2011) Preslav Nakov: Reusing parallel corpora between
related languages [abstract]. AEPC 2011: proceedings of the Second
Workshop on Annotation and Exploitation of Parallel Corpora, associated
with the 8th International Conference on Recent Advances in Natural Language
Processing (RANLP 2011), 15th September 2011,
(2011) Alexandre Patry & Philippe Langlais: Identifying parallel documents from a large
bilingual collection of texts: application to parallel article extraction in
Wikipedia. ACL 2011: Proceedings of
the Fourth Workshop on Building and Using Comparable Corpora,
(2011) Marion Potet, Raphaël Rubino, Benjamin Lecouteux, Stéphane Huet,
Hervé Blanchon, Laurent Besacier, & Fabrice Lefèvre: The LIGA (LIG/LIA) machine translation system for WMT
2011. [WMT 2011] Proceedings of the
6th Workshop on Statistical Machine Translation,
(2011) Spencer Rarrick,
Chris Quirk, & Will Lewis: MT detection in
web-scraped parallel corpora. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) Mohammed Rushdi-Saleh, M.Teresa
Martín-Valdivia, L.Alfonso Ureña-López, & José M.Perea-Ortega: Bilingual experiments with an Arabic-English
corpus for opinion mining. [RANLP 2011] Proceedings of Recent Advances
in Natural Language Processing, Hissar, Bulgaria, 12-14 September 2011;
pp.740-745. [PDF, 390KB]
(2011) Markus Saers & Dekai Wu: Principled induction of phrasal bilexica. [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; pp.313-320. [PDF, 373KB]
(2011) Hassan Sajjad, Nadir Durrani, Helmut Schmid,
& Alexander Fraser: Comparing two
techniques for learning transliteration models using a parallel corpus.
[IJCNLP 2011] Proceedings of the 5th
International Joint Conference on Natural Language Processing,
(2011) Kaveh Taghipour,
Shahram Khadivi, & Jia Xu: Parallel corpus
refinement as an outlier detection algorithm. MT Summit XIII: the Thirteenth Machine Translation Summit [organized
by the] Asia-Pacific Association for Machine Translation (AAMT), 19-23
September 2011,
(2011) Mara Tsoumari & Georgios Petasis: A new annotation tool for aligned bilingual
corpora. AEPC 2011: proceedings of
the Second Workshop on Annotation and Exploitation of Parallel Corpora,
associated with the 8th International Conference on Recent Advances in Natural
Language Processing (RANLP 2011), 15th September 2011,
(2011) Cristina Vertan & Monica Gavrila: Using manual and parallel aligned corpora for
machine translation services within an on-line content management system. AEPC 2011: proceedings of the Second
Workshop on Annotation and Exploitation of Parallel Corpora, associated
with the 8th International Conference on Recent Advances in Natural Language
Processing (RANLP 2011), 15th September 2011,
(2011) Špela Vintar & Darja Fišer: Enriching Slovene WordNet with domain-specific terms.
Translation: Computation, Corpora, Cognition 1 (1), December 2011; pp.29-44.
[PDF, 631KB]
(2011) Jia Xu & Weiwei
Sun: Generating virtual parallel corpus: a
compatibility centric method. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) Cesare Zanca: Developing translation strategies and cultural
awareness using corpora and the web.
Tralogy,
(2010) Takeshi Abekawa,
Masao Utiyama, Eiichiro Sumita, & Kyo Kageura: Community-based construction of draft and final
translation corpus through a translation hosting site Minna no Hon’yaku (MNH). LREC
2010: proceedings of the seventh international conference on Language Resources
and Evaluation, 17-23 May 2010,
(2010) Lars Ahrenberg: Alignment-based profiling of Europarl data in an
English-Swedish parallel corpus. LREC
2010: proceedings of the seventh international conference on Language Resources
and Evaluation, 17-23 May 2010,
(2010) José João Almeida & Alberto Simões: Automatic parallel corpora and bilingual
terminology extraction from parallel websites. [LREC 2010] Proceedings
of the 3rd Workshop on Building and Using Comparable Corpora,
(2010) Vamshi Ambati,
Stephan Vogel, & Jaime Carbonell: Active
learning and crowd-sourcing for machine translation. LREC 2010: proceedings of the seventh international conference on
Language Resources and Evaluation, 17-23 May 2010,
(2010) Vamshi Ambati
& Stephan Vogel: Can crowds build
parallel corpora for machine translation systems? Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and
Language Data with Amazon’s Mechanical Turk,
(2010) Marianna Apidianaki & Yifan He: An algorithm for cross-lingual
sense-clustering tested in a MT evaluation setting. Proceedings of the 7th International Workshop on Spoken Language
Translation, 2-3 December 2010,
(2010) Ondřej Bojar, Adam Liška, & Zdeněk Žabokrtský: Evaluating utility of data sources in a large
parallel Czech-English corpus CzEng 0.9. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
(2010) Ondřej Bojar, Pavel Straňák, & Daniel Zeman: Data issues in English-to-Hindi machine
translation. LREC 2010: proceedings
of the seventh international conference
on Language Resources and Evaluation, 17-23 May 2010,
(2010) Fabienne Braune
& Alexander Fraser: Improved unsupervised
sentence alignment for symmetrical and asymmetrical parallel corpora. Coling 2010: 23rd International Conference
on Computational Linguistics, 23-27 August 2010, Beijing International
Convention Center, Beijing, China, Posters
volume; pp.81-89. [PDF, 228KB]
(2010) David Burkett,
Slav Petrov, John Blitzer, & Dan Klein: Learning
better monolingual models with unannotated bilingual text. CoNLL-2010: Fourteenth Conference on
Computational Natural Language Learning, Proceedings of the conference,
15-16 July 2010, Uppsala University, Uppsala, Sweden; pp.46-54. [PDF, 431KB]
(2010) Chen Yuncong
& Pascale Fung: Unsupervised synthesis of
multilingual Wikipedia articles. Coling
2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010) Hercules
Dalianis, Hao-chun Xing, & Xin Zhang: Creating
a reusable English-Chinese parallel corpus for bilingual dictionary
construction. LREC 2010: proceedings
of the seventh international conference
on Language Resources and Evaluation, 17-23 May 2010,
(2010) Yanhui Feng, Yu
Hong, Zhenxiang Yan, Jianmin Yao, & Qiaoming Zhu: A novel method for bilingual web page acquisition
from search engine web records. Coling
2010: 23rd International Conference on Computational Linguistics, 23-27
August 2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.294-302. [PDF, 184KB]
(2010) Mark
Fishel & Heiki-Jaan Kaalep: CorporAl: a
method and tool for handling overlapping parallel corpora. Fifth Machine Translation
Marathon, 13-18
September,
(2010) J.González-Rubio, J.Civera, A.Juan,
& F.Casacuberta: Saturnalia: a
Latin-Catalan parallel corpus for statistical MT. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
(2010) Philipp Koehn & Jean Senellart: Fast approximate string matching with suffix arrays
and A* parsing. AMTA 2010: the Ninth
conference of the Association for Machine Translation in the Americas,
Denver, Colorado, October 31 – November 4, 2010; 9pp. [PDF, 178KB]
(2010) Audrey Laroche
& Philippe Langlais: Revisiting
context-based projection methods for term-translation spotting in comparable
corpora. Coling 2010: 23rd
International Conference on Computational Linguistics. Proceedings of the
conference, 23-27 August 2010,
(2010) Sara Morrissey, Harold Somers, Robert Smith, Shane Gilchrist
& Sandipan Dandapat: Building a sign
language corpus for use in machine translation. [LREC 2010] 4th Workshop on the Representation and
Processing of Sign Languages: Corpora and Sign Language Technologies,
(2010) Smruthi Mukund,
Debanjan Ghosh, & Rohini K.Srihari: Using
cross-lingual projections to generate semantic role labeled corpus for Urdu – a
resource poor language. Coling 2010:
23rd International Conference on Computational Linguistics. Proceedings of
the conference, 23-27 August 2010,
(2010) Masaki Murata, Tomohiro Ohno, Shigeki
Matsubara, & Yasuyoshi Inagaki: Construction
of chunk-aligned bilingual lecture corpus for simultaneous machine translation.
LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Matteo Negri
& Yashar Mehdad: Creating bi-lingual
entailment corpus through translations with Mechanical Turk: $100 for a 10-day
rush. Proceedings of the NAACL HLT 2010
Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk,
(2010) Mike
O’Malley: The challenges of distributed parallel corpora. AMTA 2010: the Ninth conference of the Association for Machine
Translation in the Americas,
(2010) Paula Paiva: Corpus representativeness in the selection of medical
terms to be used in translation memory tools [abstract].
UCCTS 2010: Using Corpora in Contrastive
and Translation Studies,
(2010) Jocelyn Phillips, Carol Van Ess-Dykema, Timothy
Allison & Laurie Gerber: Parallel corpus
development at NVTC. AMTA 2010: the
Ninth conference of the Association for Machine Translation in the Americas,
Denver, Colorado, October 31 – November 4, 2010; 7pp. [PDF, 173KB]; abstract
(2010) John C.Platt, Kristina Toutanova, & Wen-tau
Yih: Translingual document representations from
discriminative projections. [EMNLP 2010] Proceedings of the 2010 Conference on Empirical Methods in Natural
Language Processing, MIT, Massachusetts, USA, 9-11 October 2010;
pp.251-261. [PDF, 319KB]
(2010) Reinhard Rapp
& Michael Zock: Utilizing citations of foreign
words in corpus-based dictionary generation. [Coling 2010] Proceedings of the
Second Workshop on NLP Challenges in the Information Explosion Era,
(2010)
Gudrun Rawoens: Multilingual corpora in
cross-linguistic research: focus on the compilation of a Dutch-Swedish parallel
corpus. JADT 2010: 10th International
Conference on Statistical Analysis of Textual Data, 9-11 juin 2010,
(2010) Gábor Recski, András Rung, Attila Zséder, &
András Kornai: NP alignment in bilingual corpora.
LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Yulia Tsvetkov & Shuly Wintner: Automatic acquisition of parallel corpora from
websites with dynamic content. LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Yulia Tsvetkov
& Shuly Wintner: Extraction of
multi-word expressions from small parallel corpora. Coling 2010: 23rd International Conference on Computational Linguistics,
23-27 August 2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.1256-1264. [PDF,
257KB]
(2010) Jakob Uszkoreit,
Jay M.Ponte, Ashok C.Popat, & Moshe Dubiner: Large scale parallel document mining for
machine translation. Coling 2010:
23rd International Conference on Computational Linguistics. Proceedings of
the conference, 23-27 August 2010,
(2010) Tom Vanallemeersch: Belgisch Staatsblad corpus: retrieving
French-Dutch sentences from official documents. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
(2010) Sina Zarrieß, Aoife Cahill, Jonas Kuhn, & Christian Rohrer: Cross-lingual induction of deep broad-coverage
syntax: a case study on German participles. Coling 2010: 23rd International Conference on Computational Linguistics,
23-27 August 2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.1426-1434. [PDF,
106KB]
Bi-text see Bilingual corpora
Cleaning and filtering
(2014) Michel Simard: Clean data for
training statistical MT: the case of MT contamination. AMTA 2014: proceedings of the eleventh conference of
the Association for Machine Translation in the Americas, Vancouver, BC, October
22-26; pp.69-82. [PDF, 533KB]
(2014) Raivis Skadiņš, Jörg Tiedemann, Roberts
Rozis & Daiga Deksne: Billions of parallel
words for free: building and using the EU Bookshop corpus. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1850-1855. [PDF, 521KB]
(2013) Alexey
Borisov, Jacob Dlougach & Irina Galinskaya: Yandex School of Data Analysis machine
translation systems for WMT13. WMT 2013: 8th Workshop on Statistical Machine Translation,
Proceedings of the Workshop, August 8-9, 2013,
(2013) Lei Cui, Dongdong Zhang, Shujie Liu, Mu Li,
& Ming Zhou: Bilingual data cleaning for SMT
using graph-based random walk. ACL-2013: Proceedings of the 51st Meeting of
the Association for Computational Linguistics, Short papers, Sofia,
Bulgaria, August 4-9 2013; pp.340-345. [PDF, 259KB]
(2013) Lluís
Formiga, Marta R. Costa-jussà, José B. Mariño, José A. R. Fonollosa, Alberto
Barrón-Cedeño & Lluis Marquez: The TALP-UPC phrase-based translation systems for
WMT13: system combination with morphology generation, domain adaptation and
corpus filtering. WMT 2013:
8th Workshop on Statistical Machine Translation, Proceedings of the Workshop,
August 8-9, 2013,
(2013) Manuel Herranz, Alex Helle, Elia Yuste, Ruslan
Mitkov, & Lucia Specia: Pangeanic in the
EXPERT project: EXPloiting Empirical approaches to Translation. Proceedings of the XIV Machine Translation
(2013) William
Lewis & Sauleh Eetemadi: Dramatically reducing training data size through
vocabulary saturation. WMT 2013: 8th Workshop on Statistical
Machine Translation, Proceedings of the Workshop, August 8-9, 2013,
(2013) Sara
Stymne, Christian Hardmeier, Jörg Tiedemann & Joakim Nivre: Tunable
distortion limits and corpus cleaning for SMT. WMT 2013: 8th Workshop on Statistical Machine Translation, Proceedings
of the Workshop, August 8-9, 2013,
(2013) Samira Tofighi Zahabi, Somayeh Bakhshaei, &
Shahram Khadivi: Using context vectors in
improving a machine translation system with bridge language. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.318-322.
[PDF, 193KB]
(2012) Colin Cherry: Decoding. Machine Translation Marathon 2012 September 3-8,
(2012) Jie Jiang,
(2012) J.Howard Johnson: Conditional significance pruning: discarding more
of huge phrase tables. AMTA-2012: the
Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Saab Mansour & Hermann Ney: A simple and effective weighted phrase extraction
for machine translation adaptation. IWSLT-2012:
9th International Workshop on Spoken Language Translation,
(2012) Juan Pino, Aurelien
Waite, & William Byrne: Simple and efficient
model filtering in statistical machine translation. Prague Bulletin of Mathematical Linguistics 98, October 2012;
pp.5-24. [PDF, 172KB]
(2012) Richard Zens, Daisy
Stanton, & Peng Xu: A systematic comparison
of phrase table pruning techniques. EMNLP-CoNLL
2012: Joint Conference on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning, Proceedings of the conference,
July 12-14, Jeju Island, Korea; pp.972-983. [PDF, 203KB]
(2011) Alexandra Antonova & Alexey Misyurev: Building a web-based parallel corpus and
filtering out machine-translated text. ACL
2011: Proceedings of the Fourth Workshop on Building and Using Comparable
Corpora,
(2011) Saab Mansour, Joern Wuebker, & Hermann Ney:
Combining translation and language model
scoring for domain-specific data filtering. IWSLT 2011: Proceedings of the International Workshop on Spoken
Language Translation,
(2011) Česlav Przywara &
Ondřej Bojar: eppex: epochal phrase table
extraction for statistical machine translation. Sixth Machine Translation
Marathon, 5-10
September 2011,
(2011) Spencer Rarrick,
Chris Quirk, & Will Lewis: MT detection in
web-scraped parallel corpora. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2010) Hailong Cao &
Eiichiro Sumita: Filtering syntactic constraints for
statistical machine translation. ACL
2010: the 48th Annual Meeting of the Association for Computational Linguistics,
(2010) Jie Jiang,
(2011) Sara Stymne: Spell checking
techniques for replacement of unknown words and data cleaning for Haitian Creole
SMS translation. [WMT 2011] Proceedings
of the 6th Workshop on Statistical Machine Translation,
Comparable corpora
(2015) Krzysztof Wolk & Krzysztof Marasek: PJAIT systems for the
IWSLT 2015 evaluation campaign enhanced by comparable corpora. [IWSLT 2015] Proceedings of the
International Workshop on Spoken Language Translation, December 3-4, 2015,
Da Nang, Vietnam; pp.101-104. [PDF, 2.9MB]
(2015) Krzysztof Wolk & Krzysztof Marasek: Unsupervised
comparable corpora preparation and exploration for bi-lingual translation
equivalents. [IWSLT 2015] Proceedings
of the International Workshop on Spoken Language Translation, December 3-4,
2015, Da Nang, Vietnam; pp.118-125. [PDF, 5.3MB]
(2014) Ondřej Bojar, Vojtěch Diatka, Pavel
Rychlý, Pavel Straňák, Vit Suchomel, Aleš Tamchyna, & Daniel Zeman: HindEnCorp – Hindi-English and Hindi-only corpus for
machine translation. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.3550-3555. [PDF, 107KB]
(2014) Chenhui
(2014) Hernani Costa, Gloria Corpas Pastor, &
Miriam Seghiri: iCompileCorpora: a web-based
application to semi-automatically compile multilingual comparable corpora. Translating and the Computer 36: proceedings.
Asling: International Society for Advancement in Language Technology, 27-28
November 2014; pp.51-55. [PDF, 119KB]
(2014) Sandipan Dandapat & Declan Groves: MTWatch: a tool for the analysis of noisy
parallel data. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.41-45. [PDF, 190KB]
(2014) Jennifer Drexler, Pushpendre Rastogi,
Jacqueline Aguilar, Benjamin Van Durme, & Matt Post: A Wikipedia-based corpus for contextualized
machine translation. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.3593-3596. [PDF, 72KB]
(2014) Miquel Esplà-Gomis, Filip Klubička, Nikola
Ljubešić, Sergio Ortiz-Rojas, Vassilis Papavassiliou, & Prokopis
Prokopidis: Comparing two acquisition
systems for automatically building an English-Croatian parallel corpus from
multilingual websites. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1252-1258. [PDF, 158KB]
(2014) Najeh Hajlaoui, David Kolovratnik, Jaakko
Väyrynen, Ralf Steinberger, & Daniel Varga: DCEP
– digital corpus of the European Parliament. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.3164-3171. [PDF, 252KB]
(2014) Ann Irvine & Chris Callison-Burch: Using comparable corpora to adapt MT models to new
domains. [WMT 2014] Proceedings of
the Ninth Workshop on Statistical Machine Translation,
(2014) B.R.Laranjeira, V.P.Moreira, A.Villavicencio,
C.Ramisch, & M.J.Finatto: Comparing the
quality of focused crawlers and of the translation resources obtained from them.
LREC 2014: Ninth International Conference
on Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall
and Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.3572-3578. [PDF, 803KB]
(2014) Wang Ling, Luís Marujo, Chris Dyer, Alan Black
& Isabel Trancoso: Crowdsourcing high-quality
parallel data extraction from Twitter. [WMT 2014] Proceedings of the Ninth Workshop on Statistical Machine Translation,
(2014) Thomas Mayer & Michael Cysouw: Creating a massively parallel Bible corpus. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.3158-3163. [PDF, 575KB]
(2014) Mircea Petic & Daniela Gîfu: Transliteration and alignment of parallel texts from
Cyrillic to Latin. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.1819-1823. [PDF, 462KB]
(2014) Anita Rácz, István Nagy T., & Veronika Vincze: 4FX: light verb constructions in a multilingual
parallel corpus. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.710-715. [PDF, 140KB]
(2014) Jayendra
Rakesh Yeka, Prasanth Kolachina, & Dipti Misra Sharma: Benchmarking of English-Hindi parallel corpora.
LREC 2014: Ninth International Conference
on Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall
and Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1812-1818. [PDF, 627KB]
(2014) Lise
Rebout & Philippe Langlais: An iterative
approach for mining parallel sentences in a comparable corpus. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.648-655. [PDF, 201KB]
(2014) Michael Rosner & Kurt Sultana: Automatic methods for the extension of a bilingual
dictionary using comparable corpora. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.3790-3797. [PDF, 195KB]
(2014) Raphael Rubino, Antonio Toral, Nikola
Ljubešić, & Gema Ramírez-Sánchez: Quality
estimation for synthetic parallel data generation. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.1843-1849. [PDF, 248KB]
(2014) Raivis Skadiņš, Jörg Tiedemann, Roberts
Rozis & Daiga Deksne: Billions of parallel
words for free: building and using the EU Bookshop corpus. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1850-1855. [PDF, 521KB]
(2014) Liang Tian, Derek F.Wong, Lidia S.Chao, Paula
Quaresma, Francisco Oliveira, Yi Lu, Shuo Li, Yiming Wang, & Longyue Wang: UM-Corpus: a large English-Chinese parallel corpus for
statistical machine translation. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1837-1842. [PDF, 605KB]
(2014) Dan Tufiş, Radu Ion, Ştefan
Dumitrescu, & Dan Ştefănescu: Large
SMT data-sets extracted from Wikipedia. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.656-663. [PDF, 232KB]
(2014) Pavel Vondřička: Aligning parallel texts with InterText. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1875-1879. [PDF, 476KB]
(2014) Shikun Zhang, Wang Ling, & Chris Dyer: Dual subtitles as parallel corpora. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1869-1874. [PDF, 254KB]
(2013) Proceedings
of the 6th Workshop on Building and Using
Comparable Corpora,
(2013) Haithem Afli, Loïc Barrault
& Holger Schwenk: Multimodal comparable
corpora as resources for extracting parallel data: parallel phrases extraction.
International Joint Conference on Natural
Language Processing,
(2013) Ahmet Aker, Monica Paramita, & Robert
Gaizauskas: Extracting bilingual terminologies from
comparable corpora. ACL-2013: Proceedings of the 51st Meeting of
the Association for Computational Linguistics,
(2013) Daniel Andrade, Masaaki
Tsuchida, Takashi Onishi, & Kai Ishikawa: Synonym
acquisition using bilingual comparable corpora. International Joint Conference on Natural Language Processing,
(2013) Daniel Andrade, Masaaki Tsuchida, Takashi
Onishi, & Kai Ishikawa: Translation
acquisition using synonym sets. [NAACL-HLT
2013] The 2013 conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 9-14 June 2013,
(2013) Dhouha Bouamor, Nasredine
Semmar, & Pierre Zweigenbaum: Building
specialized bilingual lexicons using word sense disambiguation. International Joint Conference on Natural
Language Processing,
(2013) Dhouha Bouamor, Nasredine Semmar, & Pierre
Zweigenbaum: Context vector disambiguation for
bilingual lexicon extraction from comparable corpora. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.759-764.
[PDF, 199KB]
(2013) Dhouha Bouamor, Nasredine Semmar, & Pierre
Zweigenbaum: Towards a generic approach for
bilingual lexicon extraction from comparable corpora. Proceedings of the XIV Machine Translation
(2013) Chenhui
(2013) Béatrice
Daille: TTC: terminology extraction,
translation tools and comparable corpora. Proceedings of
the XIV Machine Translation
(2013) Rima Harastani, Béatrice Daille
& Emmanuel Morin: Ranking translation
candidates acquired from comparable corpora. International Joint Conference on Natural Language Processing,
(2013) Amir Hazem & Emmanuel Morin:
Word co-occurrence counts prediction for
bilingual terminology extraction from comparable corpora. International Joint Conference on Natural
Language Processing,
(2013) Felix Hieber, Laura Jehl, & Stefan Riezler:
Task alternation in parallel sentence retrieval
for Twitter translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.323-327.
[PDF, 208KB]
(2013) Ann Irvine
& Chris Callison-Burch: Combining bilingual and comparable corpora for low
resource machine translation. WMT
2013: 8th Workshop on Statistical Machine Translation, Proceedings of the
Workshop, August 8-9, 2013,
(2013) Ann Irvine, Chris Quirk, &
Hal Daumé III: Monolingual marginal matching
for translation model adaptation. [EMNLP
2013] Proceedings of the 2013 Conference on Empirical Methods in Natural
Language Processing, Seattle, Washington, USA, 18-21 October 2013;
pp.1077-1088. [PDF, 247KB]
(2013) Ann Irvine: Statistical machine translation in low
resource settings. [NAACL-HLT 2013]
Proceedings of the NAACL HLT 2013 Student Research Workshop, 13 June 2013,
(2013) Ekaterina Lapshinova-Koltunski: VARTRA: a comparable corpus for analysis of
translation variation. Proceedings of
the 6th Workshop on Building and Using Comparable Corpora,
(2013) Taesung Lee & Seung-won Hwang: Bootstrapping entity translation on weakly comparable
corpora. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics,
(2013) Lian Tze Lim, Lay-Ki Soon, Tek Yong Lim, Enya
Kong Tang, & Bali Ranaivo-Malançon: Context-dependent
multilingual lexical lookup for under-resourced languages. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.294-299.
[PDF, 373KB]
(2013) Wang Ling, Guang Xiang, Chris Dyer, Alan Black,
& Isabel Trancoso: Microblogs as parallel
corpora. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics,
(2013) Xiaodong Liu, Kevin Duh, & Yuji Matsumoto: Topic models + word alignment = a flexible framework
for extracting bilingual dictionary from comparable corpus. Proceedings of the Seventeenth Conference on
Computational Natural Language Learning, Sofia, Bulgaria, 8-9 August 2013;
pp.212-221. [PDF, 487KB]
(2013) Oscar Mendoza Rivera, Ruslan Mitkov, &
Gloria Corpas Pastor: A flexible framework
for collocation retrieval and translation from parallel and comparable corpora.
[MT
(2013) John Richardson, Toshiaki
Nakazawa, & Sadao Kurohashi: Robust
transliteration mining from comparable corpora with bilingual topic models.
International Joint Conference on Natural
Language Processing,
(2013) Itsuki
(2013) Dan Tufiş, Radu Ion,
Ştefan Daniel Dumitrescu, & Dan Ştefănescu: Wikipedia as an SMT training corpus. Proceedings of Recent Advances in
Natural Language Processing, Hissar,
Bulgaria, 7-13 September 2013; pp.702-709. [PDF, 220KB]
(2013) Zede Zhu, Miao Li, Lei Chen, & Zhenxin
Yang: Building comparable corpora based on
bilingual LDA model. ACL-2013: Proceedings of the 51st Meeting of
the Association for Computational Linguistics, Short papers, Sofia,
Bulgaria, August 4-9 2013; pp.278-281. [PDF, 127KB]
(2012) [BUCC 2012] The 5th Workshop
on Building and Using Comparable Corpora: “Language Resources for Machine
Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Ahmet Aker, Evangelos Kanoulas, & Robert
Gaizauskas: A light way to collect comparable
corpora from the Web. LREC 2012:
Eighth international conference on Language Resources and Evaluation, 21-27
May 2012,
(2012) Walid Aransa, Holger Schwenk, & Loic
Barrault: Semi-supervised transliteration
mining from parallel and comparable corpora. IWSLT-2012: 9th International Workshop on Spoken Language Translation,
(2012) Emma Barker & Rob Gaizauskas: Assessing the comparability of news texts. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Julien Bourdaillet & Philippe Langlais: Identifying infrequent translations by
aligning non parallel sentences. AMTA-2012:
the Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Bruno Cartoni & Thomas Meyer: Extracting directional and comparable corpora from
a multilingual corpus for translation studies. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Béatrice Daille: Building bilingual terminologies from comparable
corpora: the TTC TermSuite. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Estelle Delpech, Béatrice Daille, Emmanuel
Morin, & Claire Lemaire: Extraction of
domain-specific bilingual lexicon from comparable corpora: compositional
translation and ranking. Proceedings
of COLING 2012: Technical Papers, Mumbai, December 2012; pp.745-761. [PDF,
319KB]
(2012) Estelle Delpech, Béatrice Daille, Emmanuel
Morin, & Claire Lemaire: Identification of
fertile translations in medical comparable corpora: a morpho-compositional
approach. AMTA-2012: the Tenth
Biennial Conference of the Association for Machine Translation in the
(2012) Roger Granada, Lucelene Lopes, Carlos Ramisch,
Cassia Trojahn, Renata Vieira, & Aline Villavicencio: A comparable corpus based on aligned multilingual
ontologies. [ACL 2012] Proceedings of the
First Workshop on Multilingual Modeling, Jeju,
(2012) Amir Hazem &
Emmanuel Morin: ICA for bilingual lexicon
extraction from comparable corpora.
[BUCC 2012] The 5th Workshop on
Building and Using Comparable Corpora: “Language Resources for Machine
Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Iustina Ilisei, Diana
Inkpen, Gloria Corpas, & Ruslan Mitkov: Romanian
translational corpora: building comparable corpora for translation studies. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Radu Ion: PEXACC: a
parallel sentence mining algorithm from comparable corpora. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Elena Irimia: Experimenting with extracting lexical dictionaries
from comparable corpora for English-Romanian language pair. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Hiroyuki Kaji, Takashi
Tsunakawa, & Yoshihiro Komatsubara: Improving
compositional translation with comparable corpora. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Mahdi Khademian, Kaveh Taghipour, Saab Mansour,
& Shahram Khadivi: A holistic approach to
bilingual sentence fragment extraction from comparable corpora. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012)
(2012) Aimée Lahaussois &
Séverine Guillaume: A viewing and processing
tool for the analysis of a comparable corpus of Kiranti mythology. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Chunyang Liu, Qi Liu, Yang Liu, & Maosong
Sun: THUTR: a translation retrieval system.
Proceedings of COLING 2012: Demonstration
Papers, Mumbai, December 2012; pp.321-328. [PDF, 306KB]
(2012) Nikola Ljubešić,
Špela Vintar, & Darja Fišer: Multi-word
term extraction from comparable corpora by combining contextual and constituent
clues. [BUCC 2012] The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Philipp Petrenz &
Bonnie Webber: Robust cross-lingual genre
classification through comparable corpora. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language Resources
for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Mārcis Pinnis,
Radu Ion, Dan Ştefănescu, Fangzhong Su,
(2012) Magdalena Plamada &
Martin Volk: Towards a Wikipedia-extracted
Alpine corpus. [BUCC 2012] The 5th Workshop on Building and Using Comparable
Corpora: “Language Resources for Machine Translation in Less-Resourced
Languages and Domains”, LREC 2012
Workshop, 26 May 2012,
(2012) Reinhard Rapp, Serge
Sharoff, & Bogdan Babych: Identifying word
translations from comparable documents without a seed lexicon. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Robert Remus &
Mathias Bank: Textual characteristics of
different-sized corpora. [BUCC 2012]
The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Hervé Saint-Amand, Jason Smith, & Magdalena
Plamada: Parallel corpus
extraction from CommonCrawl. Machine
Translation Marathon 2012 September 3-8,
(2012) Rahma Sellami, Fatiha Sadat, & Lamia
Hadrich Belguith: Exploiting Wikipedia as a
knowledge base for the extraction of linguistic resources: application on
Arabic-French comparable corpora and bilingual lexicons. AMTA-2012: Fourth workshop on computational
approaches to Arabic script-based languages. Proceedings,
(2012) Serge Sharoff: Beyond
translation memories: finding similar documents in comparable corpora.
[Aslib 2012] Translating and the Computer
34, 29-30 November 2012, One Birdcage Walk, London, UK; 7pp. [PDF, 145KB], presentation: 47 slides [PDF, 849KB]
(2012) Inguna Skadiņa: Analysis and evaluation
of comparable corpora for under-resourced areas of machine translation. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Inguna Skadiņa,
Ahmet Aker, Nikos Mastropavlos, Fangzhong Su, Dan Tufiş, Mateja Verlic, Andrejs
Vasiļjevs, Bogdan Babych, Paul Clough, Robert Gaizauskas, Nikos Glaros,
Monica Lestari Paramita, & Mārcis Pinnis: Collecting and using comparable corpora for
statistical machine translation. LREC
2012: Eighth international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Sanja Štajner &
Ruslan Mitkov: Using comparable corpora to
track diachronic and synchronic changes in lexical density and lexical richness. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Dan Ştefănescu, Radu Ion, &
Sabine Hunsicker: Hybrid parallel sentence
mining from comparable corpora. EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.137-144. [PDF, 493KB]
(2012) Dan
Ştefănescu: Mining for term
translations in comparable corpora.
[BUCC 2012] The 5th Workshop on
Building and Using Comparable Corpora: “Language Resources for Machine
Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Fangzhong Su &
Bogdan Babych: Development and application of a
cross-language document comparability metric. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Fangzhong Su & Bogdan Babych: Measuring comparability of documents in non-parallel
corpora for efficient extraction of (semi-)parallel translation equivalents.
EACL Joint Workshop on Exploiting
Synergies between Information Retrieval and Machine Translation (ESIRMT) and
Hybrid Approaches to Machine Translation (HyTra): Proceedings of the
workshop, 23-24 April 2012, Avignon, France; pp.10-19. [PDF, 188KB]
(2012) Akihiro Tamura,
(2012) Ivan Vulić & Marie-Francine Moens: Detecting highly confident word translations from
comparable corpora without any prior knowledge. [EACL 2012] Proceedings of the 13th Conference of the European Chapter
of the Association for Computational Linguistics,
(2012) Yunqing Xia, Guoyu
Tang, Peng Jin, & Xia Yang: CLTC: a
Chinese-English cross-lingual topic corpus.
LREC 2012: Eighth international conference on Language
Resources and Evaluation,
21-27 May 2012,
(2012) Manuela Yapomo, Gloria
Corpas, & Ruslan Mitkov: CLIR- and
ontology-based approach for bilingual extraction of comparable documents. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) ACCURAT:
Analysis and evaluation of comparable corpora for under resourced areas of
machine translation. [Project paper at] EAMT
2012: Proceedings of the 16th Annual Conference of the European Association for
Machine Translation, Trento, Italy, May 28-30 2012, ed. Mauro Cettolo,
Marcello Federico, Lucia
Specia, Andy Way;
p.205. [PDF, 72KB]
(2011) Vamshi Ambati, Sanjika Hewavitharana, Stephan Vogel, & Jaime
Carbonell: Active learning with multiple
annotations for comparable data classification task. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Anja Belz & Eric Kow: Unsupervised
alignment of comparable data and text resources. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Abhijit Bhole, Goutham Tholpadi, &
Raghavendra Udupa: Mining multi-word named entity
equivalents from comparable corpora. [IJCNLP
2011] Proceedings of the 2011 Named Entities Workshop,
(2011) Bruno Cartoni, Sandrine Zufferey, Thomas Meyer, & Andrei
Popescu-Belis: How comparable are parallel
corpora? Measuring the distribution of general vocabulary and connectives. ACL 2011: Proceedings of the Fourth Workshop
on Building and Using Comparable Corpora,
(2011) Mauro Cettolo, Nicola Bertoldi, & Marcello
Federico: Bootstrapping Arabic-Italian SMT
through comparable texts and pivot translation. [EAMT 2011]: proceedings of the 15th conference of the European
Association for Machine Translation, 30-31 May 2011, Leuven, Belgium; eds.
Mikel L.Forcada, Heidi Depraetere, Vincent Vandeghinste; pp.249-256. [PDF,
354KB]; presentation, 12 slides [PDF]
(2011) Darja Fišer & Nikola Ljubešić: Bilingual lexicon extraction from comparable
corpora for closely related languages.
[RANLP 2011] Proceedings of Recent
Advances in Natural Language Processing, Hissar, Bulgaria, 12-14 September
2011; pp.125-131. [PDF, 95KB]
(2011) Darja Fišer, Nikola Ljubešić, Špela Vintar, & Senja
Pollak: Building and using comparable corpora for
domain-specific bilingual lexicon extraction. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Amir Hazem, Emmanuel Morin & Sebastian Peña Saldarriaga: Bilingual lexicon extraction from comparable corpora
as metasearch. ACL 2011: Proceedings
of the Fourth Workshop on Building and Using Comparable Corpora,
(2011) Sanjika Hewavitharana & Stephan Vogel: Extracting parallel phrases from comparable
data. ACL 2011: Proceedings of the
Fourth Workshop on Building and Using Comparable Corpora,
(2011) Miguel A.Jiménez-Crespo: To adapt or not to adapt in web localization: a
contrastive genre-based study of original and localised legal sections in
corporate websites. Journal of Specialised Translation 15 (January
2011); pp.2-27. [PDF, 237KB]
(2011) Kevin Knight: Putting a value
on comparable data [abstract]. ACL
2011: Proceedings of the Fourth Workshop on Building and Using Comparable
Corpora,
(2011) Bo Li, Eric Gaussier, & Akiko Aizawa: Clustering comparable corpora for bilingual lexicon
extraction. ACL-HLT 2011: Proceedings
of the 49th Annual Meeting of the Association for Computational Linguistics:
Short papers,
(2011) Emmanuel Morin & Emmanuel Prochasson: Bilingual lexicon extraction from comparable corpora
enhanced with parallel corpora. ACL
2011: Proceedings of the Fourth Workshop on Building and Using Comparable
Corpora,
(2011) Emmanuel Prochasson & Pascale Fung: Rare word translation extraction from aligned
comparable documents. ACL-HLT 2011:
Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics,
(2011) Matthew Snover, Xiang Li,
Wen-Pin Lin, Zheng Chen, Suzanne Tamang, Mingmin Ge, Adam Lee, Qi Li, Hao Li,
Sam Anzaroot, & Heng Ji: Cross-lingual slot
filling from comparable corpora. ACL
2011: Proceedings of the Fourth Workshop on Building and Using Comparable
Corpora,
(2011) Ivan Vulić, Wim De Smet, & Marie-Francine Moens: Identifying word translations from comparable corpora
using latent topic models. ACL-HLT
2011: Proceedings of the 49th Annual Meeting of the Association for
Computational Linguistics: Short papers,
(2010) Mauro Cettolo, Marcello Federico, & Nicola Bertoldi: Mining parallel fragments from comparable texts.
Proceedings of the 7th International
Workshop on Spoken Language Translation, 2-3 December 2010,
(2010) Diptesh
Chatterjee, Sudeshna Sarkar, & Arpit Mishra: Co-occurrence graph based iterative bilingual
lexicon extraction from comparable corpora. [Coling 2010] Proceedings of the 4th Workshop on Cross Lingual
Information Access,
(2010) Do Thi Ngoc Diep,
Laurent Besacier, & Eric Castelli: A fully
unsupervised approach for mining parallel data from comparable corpora. EAMT 2010: Proceedings of the 14th Annual
conference of the European Association for Machine Translation, 27-28 May
2010,
(2010) Andreas Eisele & Jia Xu: Improving
machine translation performance using comparable corpora. [LREC 2010] Proceedings of the 3rd
Workshop on Building and Using Comparable Corpora,
(2010) Pascale Fung, Emmanuel Prochasson, & Simon Shi: Trillions of comparable documents. [LREC 2010] Proceedings
of the 3rd Workshop on Building and Using Comparable Corpora,
(2010) Pablo Gamallo Otero & Isaac González López: Wikipedia as multilingual source of comparable
corpora. [LREC 2010] Proceedings of the 3rd Workshop on Building and
Using Comparable Corpora,
(2010) Degen Huang, Lian
Zhao, Lishuang Li, & Haitao Yu: Mining
large-scale comparable corpora from Chinese-English news collections. Coling 2010: 23rd International Conference
on Computational Linguistics, 23-27 August 2010, Beijing International
Convention Center, Beijing, China, Posters
volume; pp.472-480. [PDF, 255KB]
(2010) Hiroyuki Kaji, Takashi Tsunakawa, &
Daisuke Okada: Using comparable corpora to adapt a
translation model to domains. LREC
2010: proceedings of the seventh
international conference on Language Resources and Evaluation, 17-23 May
2010,
(2010) Lianhau Lee, Aiti
Aw, Min Zhang, & Haizhou Li: EM-based
hybrid model for bilingual terminology extraction from comparable corpora. Coling 2010: 23rd International Conference
on Computational Linguistics, 23-27 August 2010, Beijing International
Convention Center, Beijing, China, Posters
volume; pp.639-646. [PDF, 114KB]
(2010) Bo Li & Eric
Gaussier: Improving corpus comparability for
bilingual lexicon extraction from comparable corpora. Coling 2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010) Bin Lu, Tao Jiang, Kapo Chow, & Benjamin K. Tsou: Building a large English-Chinese parallel corpus from
comparable patents and its experimental application to SMT. [LREC 2010] Proceedings of the 3rd
Workshop on Building and Using Comparable Corpora,
(2010) Inguna Skadiņa, Andrejs Vasiļjevs, Raivis
Skadiņš, Robert Gaizauskas, Dan Tufiş, & Tatiana Gornostay: Analysis and evaluation
of comparable corpora for under resourced areas of machine translation.
[LREC 2010] Proceedings of the 3rd Workshop on Building and Using Comparable
Corpora,
(2010) Jason R.Smith,
Chris Quirk, & Kristina Toutanova: Extracting
parallel sentences from comparable corpora using document level alignment. NAACL HLT 2010: Human Language Technologies:
the 2010 annual conference of the North American Chapter of the Association for
Computational Linguistics. Proceedings… June 2-4, 2010,
Concordances
(2013) Adam Kilgarriff: Terminology finding, parallel corpora and
bilingual word sketches in the Sketch Engine. [Aslib 2013] Translating and
the Computer 35, 28-29 November 2013, etc.venues, Paddington,
(2012) Ming-Hong Bai, Yu-Ming
Hsieh, Keh-Jiann Chen, & Jason S.Chang: DOMCAT:
a bilingual concordancer for domain-specific computer assisted translation.
[ACL 2012] Proceedings of the 50th Annual
Meeting of the Association for Computational Linguistics, Jeju,
(2012) Paola Valli: How
long is a piece of string? Concordance searches and user behavior
investigated. [Aslib 2012] Translating
and the Computer 34, 29-30 November 2012, One Birdcage Walk, London, UK;
11pp. [PDF, 359KB], presentation: 20
slides [PDF, 2909KB]
(2010) Alain Désilets: WeBiText: multilingual
concordancer built from public high quality web content. AMTA 2010: the Ninth conference of the Association for Machine
Translation in the Americas,
Corpora see Bilingual corpora, Comparable corpora, Monolingual corpora,
Multilingual corpora
Crowd sourcing
(2014) Shinsuke Goto, Donghui Lin, & Toru Ishida: Crowdsourcing for evaluating machine translation
quality. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.3456-3463. [PDF, 214KB]
(2014) Miguel A.Jiménez-Crespo: Beyond prescription: what empirical
studies are telling us about localization crowdsourcing. Translating and the Computer 36: proceedings.
Asling: International Society for Advancement in Language Technology, 27-28
November 2014; pp.27-35. [PDF, 142KB]
(2014) Mitesh M.Khapra, Ananthakrishnan Ramanathan,
Anoop Kunchukuttan, Karthik Visweswariah, & Pushpak Bhattacharyya: When transliteration met crowdsourcing: an
empirical study of transliteration via crowdsourcing using efficient,
non-redundant and fair quality control. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.196-202. [PDF, 180KB]
(2014) Wang Ling, Luís Marujo, Chris Dyer, Alan Black
& Isabel Trancoso: Crowdsourcing high-quality
parallel data extraction from Twitter. [WMT 2014] Proceedings of the Ninth Workshop on Statistical Machine Translation,
(2014) Erin Lyons: Far
from the maddening crowd: integrating collaborative translation technologies
into healthcare services in the developing world. Translating and the Computer 36: proceedings. Asling: International
Society for Advancement in Language Technology, 27-28 November 2014;
pp.165-173. [PDF, 291KB]
(2014) Eduard Šubert & Ondřej Bojar: Twitter Crowd Translation – design and objectives.
Translating and the Computer 36:
proceedings. Asling: International Society for Advancement in Language
Technology, 27-28 November 2014; pp.217-227. [PDF, 324KB]
(2014) XLike: cross-lingual
knowledge extraction. Project duration: January 2012 – December 2014.
Proceedings of the 17th annual conference of the European Association for
Machine Translation, EAMT 2014, Dubrovnik, Croatia, 16th-18th June 2014, edited
by Marko Tadić, Philipp Koehn, Johann Roturier, Andy Way; p.131. [PDF,
417KB]
(2013) Anoop Kunchukuttan, Rajen Chatterjee, Shourya
Roy, Abhijit Mishra, & Pushpak Bhattacharyya: TransDoop: a map-reduce based crowdsourced
translation for complex domains. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, System demonstrations,
(2013) Michael Matuschek, Christian M.Meyer, &
Iryna Gurevych: Multilingual knowledge in
aligned Wiktionary and OmegaWiki for translation applications. Translation: Computation, Corpora, Cognition 3 (1), June 2013; pp.87-118.
[PDF, 2898KB]
(2013) Aram Morera-Mesa, J.J.Collins, & David
Filip: Selected crowdsourced translation
practices. [Aslib 2013] Translating and the Computer 35, 28-29
November 2013, etc.venues, Paddington,
(2013) Rabih Zbib, Gretchen Markiewicz, Spyros
Matsoukas, Richard Schwartz, & John Makhoul: Systematic comparison of professional and
crowdsourced reference translations for machine translation. [NAACL-HLT 2013] The 2013 conference of the
North American Chapter of the Association for Computational Linguistics: Human
Language Technologies, 9-14 June 2013,
(2013) XLIKE:
cross-lingual knowledge extraction (XLike). Proceedings of the XIV Machine Translation Summit, Nice, September
2-6, 2013; ed. K.Sima’an, M.L.Forcada, D.Grasmick, H.Depraetere, A.Way; p.451.
[PDF, 191KB]
(2012) Anoop Kunchukuttan, Shourya Roy, Pratik Patel,
Kushal Ladha, Somya Gupta, Mitesh Khapra, & Pushpak Bhattacharyya: Experiences in resource generation for
machine translation through crowdsourcing.
LREC 2012: Eighth international conference on Language
Resources and Evaluation,
21-27 May 2012,
(2012) Dawn Lawrie, James Mayfield, Paul McNamee,
& Douglas W.Oard: Creating and curating a
cross-language person-entity linking collection. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Victor Muntés-Mulero, Patricia Paladini, Marc
Solé, & Jawad Manzoor: Multiplying
the potential of crowdsourcing with machine translation. AMTA-2012: the Tenth Biennial Conference of
the Association for Machine Translation in the
(2012) Michael Paul, Eiichiro Sumita, Luisa
Bentivogli, & Marcello Federico: Crowd-based
MT evaluation for non-English target languages. EAMT 2012: Proceedings of the 16th Annual Conference of the European
Association for Machine Translation, Trento, Italy, May 28-30 2012, ed.
Mauro Cettolo, Marcello
Federico, Lucia Specia, Andy Way; pp.229-237. [PDF, 260KB]
(2012) Matt Post, Chris
Callison-Burch, & Miles Osborne: Constructing
parallel corpora for six Indian languages via crowdsourcing. WMT 2012: 7th Workshop on Statistical
Machine Translation. Proceedings of the workshop, June 7-8, 2012,
(2012) Marion Potet,
Emmanuelle Esperança-Rodier, Laurent Besacier, & Hervé Blanchon: Collection of a large database of French-English SMT
output corrections. LREC
2012: Eighth international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Midori Tatsumi, Takako Aikawa, Kentaro
Yamamoto, & Hitoshi Isahara: How good is
crowd post-editing? Its potential and limitations. AMTA-2012: Workshop on post-editing technology and practice.
Proceedings,
(2012) ACCEPT:
Automated Community Content Editing porTal. [Project paper at] EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; p.89. [PDF, 74KB]
(2012) Confident
MT: estimating translation quality for improved statistical machine
translation. [Project paper at] EAMT
2012: Proceedings of the 16th Annual Conference of the European Association for
Machine Translation, Trento, Italy, May 28-30 2012, ed. Mauro Cettolo,
Marcello Federico, Lucia
Specia, Andy Way;
p.98. [PDF, 424KB]
(2011) Luisa Bentivogli, Marcello Federico, Giovanni
Moretti, & Michael Paul: Getting expert
quality from the crowd for machine translation evaluation. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011, Xiamen, China; pp.521-528. [PDF,
337KB]
(2011) Karën Fort, Gilles Adda, &
K.Bretonnel Cohen: Amazon Mechanical Turk: gold mine
or coal mine? Computational Linguistics 37 (2), pp. 413-420 [PDF, 96KB]
(2011) Chang Hu, Philip Resnik, Yakov Kronrod, Vladimir Eidelman,
Olivia Buzek, & Benjamin B.Bederson: The value of
monolingual crowdsourcing in a real-world translation scenario: simulation using
Haitian Creole emergency SMS messages. [WMT 2011] Proceedings of the 6th Workshop on Statistical Machine Translation,
(2011) Shasha Liao, Cheng Wu, & Juan Huerta: Evaluating human correction quality for machine
translation from crowdsourcing. [RANLP 2011] Proceedings of Recent Advances
in Natural Language Processing, Hissar, Bulgaria, 12-14 September 2011;
pp.598-603. [PDF, 487KB]
(2011) Matteo Negri, Luisa Bentivogli, Yashar Mehdad, Danilo
Giampiccolo, & Alessandro Marchetti: Divide and
conquer: crowdsourcing the creation of cross-lingual textual entailment corpora. [EMNLP 2011] Proceedings of the 2011 Conference on Empirical Methods in Natural
Language Processing, Edinburgh, Scotland, UK, July 27-31, 2011; pp.670-679.
[PDF, 493KB]
(2011) Omar F.Zaidan & Chris Callison-Burch: Crowdsourcing translation: professional quality from
non-professionals. ACL-HLT 2011:
Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics,
(2010) Gilles Adda
& Joseph Mariani: Language resources &
Amazon Mechanical Turk: ethical, legal and other issues. LREC 2010: Legal Issues for Sharing
Language Resources - LISLR2010 Workshop, 17 May 2010,
(2010) Vamshi Ambati,
Stephan Vogel, & Jaime Carbonell: Active
learning and crowd-sourcing for machine translation. LREC 2010: proceedings of the seventh international conference on
Language Resources and Evaluation, 17-23 May 2010,
(2010) Vamshi Ambati
& Stephan Vogel: Can crowds build
parallel corpora for machine translation systems? Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and
Language Data with Amazon’s Mechanical Turk,
(2010) Michael
Denkowski, Hasan Al-Haj, & Alon Lavie: Turker-assisted paraphrasing for
English-Arabic machine translation. Proceedings
of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with
Amazon’s Mechanical Turk,
(2010) Michael Denkowski
& Alon Lavie: Exploring
normalization techniques for human judgments of machine translation adequacy
collected using Amazon Mechanical Turk. Proceedings
of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with
Amazon’s Mechanical Turk,
(2010)
Alain Désilets: Collaborative translation: technology,
crowdsourcing, and the translator perspective. Introduction to workshop at AMTA 2010: the Ninth conference
of the Association for Machine Translation in the
(2010) Bill Dolan: Building
partnerships with language communities: the importance of shared technology and
shared data. META-FORUM 2010:
Challenges for multilingual Europe, November 17/18 2010,
(2010) Qin Gao &
Stephan Vogel: Consensus versus expertise: a
case study of word alignment with Mechanical Turk. Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and
Language Data with Amazon’s Mechanical Turk,
(2010) Yakov Kronrod, Philip Resnik, Olivia Buzek, Chang
Hu, Alex Quinn, & Benjamin B.Bederson: Improving
translation via targeted paraphrasing. Contribution to workshop of
‘Collaborative translation’ at AMTA
2010: the Ninth conference of the Association for Machine Translation in the
Americas, Denver, Colorado, October 31, 2010; 4pp. [PDF, 79KB]
(2010) Robert
Munro, Steven Bethard, Victor Kuperman, Vicky Tzuyin Lai, Robin Melnick,
Christopher Potts, Tyler Schnoebelen, & Harry Tily: Crowdsourcing and language studies: the new
generation of linguistic data. Proceedings of the NAACL
HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical
Turk,
(2010)
Robert Munro: Crowdsourcing translation for
emergency response in Haiti: the global collaboration of local knowledge.
Contribution to workshop of ‘Collaborative translation’ at AMTA 2010: the Ninth conference of
the Association for Machine Translation in the Americas, Denver, Colorado,
October 31, 2010; 4pp. [PDF, 331KB]
(2010) Sharon O’Brien & Reinhard Schäler: Next generation translation and localization:
users are taking charge. Translating
and the Computer 32, 18-19 November 2010,
(2010) Mike O’Malley: The challenges of distributed
parallel corpora. AMTA 2010: the Ninth
conference of the Association for Machine Translation in the Americas,
(2010) Willem Stoeller: Community translation. Translingual Europe 2010,
(2010) Anas Tawileh: Managing
social translation: online tools for translators’ communities. Translating and the Computer 32, 18-19
November 2010,
(2010)
Jost Zetzsche: Crowdsourcing and the
professional translator. Contribution to workshop of ‘Collaborative
translation’ at AMTA 2010: the
Ninth conference of the Association for Machine Translation in the Americas,
Denver, Colorado, October 31, 2010; 1p. [PDF, 54KB]
Data elicitation
(2011) Sergei
Nirenburg & Marjorie McShane: Morphological
aspects of computer-driven elicitation of
knowledge about any language [abstract]. Machine Translation and Morphologically-rich Languages: Research Workshop
of the Israel Science Foundation,
(2011) Keiji
Yasuda, Hideo Okuma, Masao Utiyama, & Eiichiro Sumita: Annotating data selection for improving machine
translation. IWSLT 2011: Proceedings
of the International Workshop on Spoken Language Translation,
(2010) Vamshi Ambati, Stephan Vogel & Jaime Carbonell: Active learning-based elicitation for
semi-supervised word alignment. ACL
2010: the 48th Annual Meeting of the Association for Computational Linguistics,
Dictionaries see Lexical resources
Domain identification
(2013) Tsutomu Hirao, Tomoharu Iwata, & Masaaki
Nagata: Latent semantic matching: application to
cross-language text categorization without alignment information. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.212-216.
[PDF, 846KB]
(2013) Vivi Nastase & Carlo Strapparava: Bridging languages through etymology: the case of
cross language text categorization. ACL-2013: Proceedings of the 51st Meeting of
the Association for Computational Linguistics,
(2013) Magdalena Plamadă & Martin Volk: Mining for domain-specific text from Wikipedia. Proceedings
of the 6th Workshop on Building and Using Comparable Corpora,
(2011) Zhengxian Gong, Min Zhang, & Guodong Zhou: Cache-based document-level statistical machine translation.
[EMNLP 2011] Proceedings of the 2011
Conference on Empirical Methods in Natural Language Processing, Edinburgh,
Scotland, UK, July 27-31, 2011; pp.909-919. [PDF, 330KB]
(2011) Zhengxian Gong,
Guodong Zhou, & Liangyou Li: Improve SMT with
source-side “topic-document” distributions. MT Summit XIII: the Thirteenth Machine Translation Summit
[organized by the] Asia-Pacific Association for Machine Translation (AAMT),
19-23 September 2011,
(2011) Bruno Pouliquen, Christophe Mazenc & Aldo
Iorio: Tapta: a user-driven translation
system for patent documents based on domain-aware statistical machine
translation. [EAMT 2011]: proceedings
of the 15th conference of the European Association for Machine Translation,
30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere,
Vincent Vandeghinste; pp.5-12. [PDF, 342KB]; presentation, 15 slides [PDF, 1642KB]
(2011) Ivan Vulić, Wim De
Smet, & Marie-Francine Moens: Identifying word
translations from comparable corpora using latent topic models. ACL-HLT 2011: Proceedings of the 49th Annual
Meeting of the Association for Computational Linguistics: Short papers,
Domain restriction, adaptation and specification
(2015) Jinhua Du, Andy Way, Zhengwei Qiu, Asanka
Wasala, & Reinhard Schäler: Domain adaptation for
social localisation-based SMT: a case study using the Trommons platform. MT Summit XV, October 30 – November 3, 2015,
Miami, Florida, USA. Proceedings of MT Summit XV: Fourth Workshop on Post-editing
Technology and Practice (WPTP 4); pp.57-65. [PDF, 450KB]
(2015) Nadir Durrani, Hassan Sajjad, Shafiq Joty,
Ahmed Abdelali, & Stephan Vogel: Using joint models for
domain adaptation in statistical machine translation. MT Summit XV, October 30 – November 3, 2015, Miami, Florida, USA.
Proceedings of MT Summit XV: vol.1: MT Researchers’ Track; pp.117-130. [PDF,
637KB]
(2015) Matthias Huck, Alexandra Birch, & Barry
Haddow: Mixed domain
vs. multi-domain statistical machine translation. MT Summit XV, October 30 – November 3, 2015, Miami, Florida, USA.
Proceedings of MT Summit XV: vol.1: MT Researchers’ Track; pp.240-255. [PDF,
578KB]
(2015) Minh-Thang Luong & Christopher Manning: Stanford neural
machine translation systems for spoken language domains. [IWSLT 2015] Proceedings of the
International Workshop on Spoken Language Translation, December 3-4, 2015,
Da Nang, Vietnam; pp.76-79. [PDF, 1.2MB]
(2015) Keisuke Noguchi & Takashi Ninomiya: Resampling
approach for instance-based domain adaptation from patent domain to newspaper
domain in statistical machine translation. MT Summit XV, October 30 – November 3, 2015, Miami, Florida, USA.
Proceedings of MT Summit XV: Sixth Workshop on Patent and Scientific Literature
Translation (PSLT6); pp.81-88. [PDF, 447KB]
(2015) Krzysztof Wolk & Krzysztof Marasek: PJAIT systems for the
IWSLT 2015 evaluation campaign enhanced by comparable corpora. [IWSLT 2015] Proceedings of the
International Workshop on Spoken Language Translation, December 3-4, 2015,
Da Nang, Vietnam; pp.101-104. [PDF, 2.9MB]
(2014) Marine Carpuat, Cyril Goutte, & George
Foster: Linear mixture models for robust machine
translation. [WMT 2014] Proceedings
of the Ninth Workshop on Statistical Machine Translation,
(2014) Mauro Cettolo, Nicola Bertoldi, & Marcello
Federico: The
repetition rate of text as a predictor of the effectiveness of machine
translation adaptation. AMTA 2014:
proceedings of the eleventh conference of the Association for Machine
Translation in the Americas, Vancouver, BC, October 22-26; pp.166-179. [PDF,
954KB]
(2014) Boxing Chen, Roland Kuhn, & George Foster: A comparison of
mixture and vector space techniques for translation model adaptation. AMTA 2014: proceedings of the eleventh conference of
the Association for Machine Translation in the Americas, Vancouver, BC, October
22-26; pp.124-138. [PDF, 521KB]
(2014) Eva Hasler, Barry Haddow, & Philipp Koehn: Dynamic topic adaptation for SMT using distribution
profiles. [WMT 2014] Proceedings of
the Ninth Workshop on Statistical Machine Translation,
(2014) Ann Irvine & Chris Callison-Burch: Using comparable corpora to adapt MT models to new
domains. [WMT 2014] Proceedings of
the Ninth Workshop on Statistical Machine Translation,
(2014) Yi Lu, Longyue Wang, Derek F.Wong, Lidia S.Chao,
Yiming Wang, & Francisco Oliveira: Domain
adaptation for medical text translation using web resources. [WMT 2014] Proceedings of the Ninth Workshop on
Statistical Machine Translation,
(2014) Saab Mansour & Hermann Ney: Translation model based weighting for phrase
extraction. Proceedings of the 17th annual conference of the European
Association for Machine Translation, EAMT 2014, Dubrovnik, Croatia, 16th-18th
June 2014; pp.35-43. [PDF, 448KB]
(2014) Saab Mansour & Hermann Ney: Unsupervised adaptation for statistical machine
translation. [WMT 2014] Proceedings
of the Ninth Workshop on Statistical Machine Translation,
(2014) Shachar Mirkin & Laurent Besacier: Data selection for
compact adapted SMT models. AMTA
2014: proceedings of the eleventh conference of the Association for Machine
Translation in the Americas, Vancouver, BC, October 22-26; pp.301-314. [PDF,
610KB]
(2014) Katsuhito Sudoh, Masaaki Nagata, Shinsuke Mori,
& Tatsuya Kawahara: Japanese-to-English patent translation system based on
domain-adapted word segmentation and post-ordering. AMTA 2014: proceedings of the eleventh conference of
the Association for Machine Translation in the Americas, Vancouver, BC, October
22-26; pp.234-248. [PDF, 743KB]
(2014) Longyue Wang, Yi Lu, Derek F.Wong, Lidia Chao,
Yiming Wang, & Francisco Oliveira: Combining
domain adaptation approaches for medical text translation. [WMT 2014] Proceedings of the Ninth Workshop on
Statistical Machine Translation,
(2014) Marion Weller, Alexander Fraser, & Ulrich
Heid: Combining bilingual terminology mining and
morphological modeling for domain adaptation in SMT. Proceedings of the
17th annual conference of the European Association for Machine Translation,
EAMT 2014, Dubrovnik, Croatia, 16th-18th June 2014, edited by Marko Tadić,
Philipp Koehn, Johann Roturier, Andy Way; pp.11-18. [PDF, 387KB]
(2013) Mihael
Arcan, Susan Marie Thomas, Derek de Brandt, & Paul Buitelaar: Translating the FINREP taxonomy using a
domain-specific corpus. Proceedings
of the XIV Machine Translation
(2013) Pratyush Banerjee, Raphael Rubino, Johann
Roturier, & Josef van Genabith: Quality
estimation-guided data selection for domain adaptation of SMT. Proceedings of the XIV Machine Translation
Summit, Nice, September 2-6, 2013; ed. K.Sima’an, M.L.Forcada, D.Grasmick,
H.Depraetere, A.Way; pp.107-108. [PDF, 658KB]
(2013) Peter Bell, Fergus McInnes, Siva Reddy
Gangireddy, Mark Sinclair, Alexandra Birch, & Steve Renals: The UEDIN English ASR system for the IWSLT 2013
evaluation. [IWSLT 2013]
Proceedings of the 10th International
Workshop on Spoken Language Translation,
(2013) Nicola Bertoldi, Mauro Cettolo, & Marcello
Federico: Cache-based online adaptation for
machine translation enhanced computer assisted translation. Proceedings of the XIV Machine Translation
(2013) Dhouha Bouamor, Adrian Popescu,
Nasredine Semmar, & Pierre Zweigenbaum: Building
specialized bilingual lexicons using large-scale background knowledge. [EMNLP 2013] Proceedings of the 2013
Conference on Empirical Methods in Natural Language Processing,
(2013) Pierrette Bouillon: Automated Community Content Editing PorTal (ACCEPT).
Proceedings of the XIV Machine
Translation
(2013) Mauro Cettolo, Christophe Servan, Nicola
Bertoldi, Marcello Federico, Loïc Barrault, & Holger Schwenk: Issues in incremental adaptation of statistical
MT from human post-edits. Proceedings
of MT
(2013) Boxing Chen, Roland Kuhn, & George Foster: Vector space model for adaptation in statistical
machine translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics,
(2013) Lei Cui, Xilun Chen, Dongdong
Zhang, Shujie Liu, Mu Li, & Ming Zhou: Multi-domain
adaptation for SMT using multi-task learning. [EMNLP 2013] Proceedings of the 2013 Conference on Empirical Methods in
Natural Language Processing, Seattle, Washington, USA, 18-21 October 2013;
pp.1055-1065. [PDF, 302KB]
(2013) Kevin Duh, Graham Neubig, Katsuhito Sudoh,
& Hajime Tsukada: Adaptation data selection using
neural language models: experiments in machine translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.678-683.
[PDF, 215KB]
(2013) Mirela-Ştefania Duma & Cristina
Vertan: Integration of machine translation in
on-line multilingual applications – domain adaptation. Translation: Computation, Corpora,
Cognition 3 (1), June 2013; pp.67-74. [PDF, 395KB]
(2013) Nadir
Durrani, Barry Haddow, Kenneth Heafield & Philipp Koehn: Edinburgh’s
machine translation systems for European language pairs. WMT 2013: 8th Workshop on Statistical
Machine Translation, Proceedings of the Workshop, August 8-9, 2013,
(2013) Marcello Federico, Philipp Koehn, Holger Schwenk,
& Marco Trombetti: Matecat: Machine
Translation Enhanced Computer Assisted Translation. Proceedings of the XIV Machine Translation
(2013) Lluís
Formiga, Marta R. Costa-jussà, José B. Mariño, José A. R. Fonollosa, Alberto
Barrón-Cedeño & Lluis Marquez: The TALP-UPC phrase-based translation systems for
WMT13: system combination with morphology generation, domain adaptation and
corpus filtering. WMT 2013:
8th Workshop on Statistical Machine Translation, Proceedings of the Workshop,
August 8-9, 2013,
(2013) George Foster, Boxing Chen, & Roland Kuhn: Simulating discriminative training for linear
mixture adaptation in statistical machine translation. Proceedings of the XIV Machine Translation
(2013) Thanh-Le Ha, Teresa Herrmann, Jan Niehues,
Mohammed Mediani, Eunah Cho, Yuqi Zhang, Isabel Slawik, & Alex Waibel: The KIT translation systems for IWSLT 2013. [IWSLT 2013] Proceedings of the 10th International Workshop on Spoken
Language Translation,
(2013) Sanjika Hewavitharana, Dennis N.Mehay,
Sankaranarayanan Ananthakrishnan, & Prem Natarajan: Incremental topic-based translation model
adaptation for conversational spoken language translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.697-701.
[PDF, 279KB]
(2013) Felix Hieber, Laura Jehl, & Stefan Riezler:
Task alternation in parallel sentence retrieval
for Twitter translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.323-327.
[PDF, 208KB]
(2013) An-Chang Hsieh, Hen-Hsen Huang, & Hsin-Hsi
Chen: Uses of monolingual in-domain corpora for
cross-domain adaptation with hybrid MT approaches. Proceedings of the Second Workshop on Hybrid Approaches to Translation,
(2013) Ann Irvine, Chris Quirk, &
Hal Daumé III: Monolingual marginal matching
for translation model adaptation. [EMNLP
2013] Proceedings of the 2013 Conference on Empirical Methods in Natural
Language Processing, Seattle, Washington, USA, 18-21 October 2013;
pp.1077-1088. [PDF, 247KB]
(2013) Yun Jin, Oh-Woog Kwon, Seung-Hoon Na &
Young-Gil Kim: Patent translation as technical
document translation: customizing a Chinese-Korean MT system to patent domain.
[MT
(2013) Samuel Läubli, Mark Fishel, Manuela Weibel,
& Martin Volk: Statistical machine
translation for automobile marketing texts. Proceedings of the XIV Machine Translation Summit, Nice, September
2-6, 2013; ed. K.Sima’an, M.L.Forcada, D.Grasmick, H.Depraetere, A.Way;
pp.265-272. [PDF, 541KB]
(2013) Stephan
Peitz, Saab Mansour, Jan-Thorsten Peter, Christoph Schmidt, Joern Wuebker,
Matthias Huck, Markus Freitag, & Hermann Ney: The RWTH Aachen machine translation
system for WMT 2013. WMT 2013:
8th Workshop on Statistical Machine Translation, Proceedings of the Workshop,
August 8-9, 2013,
(2013) Hassan Sajjad, Francisco Guzmán, Preslav Nakov,
Ahmed Abdelali, Kenton Murray, Fahad Al Obaidli, & Stephan Vogel: QCRI at IWSLT 2013: experiments in Arabic-English
and English-Arabic spoken language translation. [IWSLT
2013] Proceedings of the 10th
International Workshop on Spoken Language Translation,
(2013) Rico Sennrich, Holger Schwenk & Walid
Aransa: A multi-domain translation model
framework for statistical machine translation. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics,
(2013) Longyue Wang, Derek F.Wong,
Lidia S.Chao, Junwen Xing, Yi Lu, & Isabel Trancoso: Edit distance: a new data selection criterion for
domain adaptation in SMT. Proceedings
of Recent Advances in Natural Language
Processing, Hissar, Bulgaria, 7-13 September 2013; pp.727-732. [PDF, 231KB]
(2013) Petra Wolf & Ulrike Bernardi: Hybrid domain adaptation for a rule based MT system.
Proceedings of the XIV Machine
Translation Summit, Nice, September 2-6, 2013; ed. K.Sima’an, M.L.Forcada,
D.Grasmick, H.Depraetere, A.Way; pp.321-328. [PDF, 415KB]
(2013) Heng Yu, Jinsong Su, Yajuan Lü,
& Qun Liu: A topic-triggered language model
for statistical machine translation. International
Joint Conference on Natural Language Processing,
(2013) Jiajun Zhang & Chengqing Zong: Learning a phrase-based translation model from monolingual
data with application to domain adaptation.
ACL-2013: Proceedings of the 51st
Meeting of the Association for Computational Linguistics,
(2013) Conghui Zhu, Taro Watanabe, Eiichiro Sumita,
& Tiejun Zhao: Hierarchical phrase table
combination for machine translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics,
(2012) A.Ryan Aminzadeh, Jennifer Drexler, Timothy
Anderson, & Wade Shen: Improved phrase translation modeling using MAP
adaptation. TSD 2012: 15th International
Conference on Text, Speech and Dialogue, Brno, Czech Republic, September
3-7, 2012; abstract #496, 1p. [HTML]
(2012) Mihael Arcan, Christian Federmann, & Paul
Buitelaar: Experiments with term translation.
Proceedings of COLING 2012: Technical
Papers, Mumbai, December 2012; pp.67-82. [PDF, 178KB]
(2012) Mihael Arcan, Paul
Buitelaar, & Christian Federmann: Using
domain-specific and collaborative resources for term translation. SSST-6, Sixth Workshop on Syntax, Semantics and
Structure in Statistical Translation, Jeju,
(2012) Amittai Axelrod, QingJun Li, & William
D.Lewis: Applications of data selection via
cross-entropy difference for real-world statistical machine translation. IWSLT-2012: 9th International Workshop on
Spoken Language Translation,
(2012) Ming-Hong Bai, Yu-Ming
Hsieh, Keh-Jiann Chen, & Jason S.Chang: DOMCAT:
a bilingual concordancer for domain-specific computer assisted translation.
[ACL 2012] Proceedings of the 50th Annual
Meeting of the Association for Computational Linguistics, Jeju,
(2012) Pratyush
Banerjee, Sudip Kumar Naskar, Johann Roturier, Andy Way, & Josef van
Genabith: Domain adaptation in SMT of user-generated
forum content guided by OOV word reduction: normalization and/or supplementary
data? EAMT 2012: Proceedings of the
16th Annual Conference of the European Association for Machine Translation,
Trento, Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia
Specia, Andy Way; pp.169-176. [PDF, 160KB]
(2012) Pratyush Banerjee, Sudip Kumar Naskar, Johann
Roturier,
(2012) Núria Bel, Vassilis
Papavassiliou, Prokopis Prokopidis, Antonio Toral, & Victoria Arranz: Mining and exploiting domain-specific corpora in the
PANACEA platform. [BUCC 2012] The 5th
Workshop on Building and Using Comparable Corpora: “Language Resources for
Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Nicola Bertoldi, Mauro
Cettolo, Marcello Federico, & Christian Buck: Evaluating the learning curve of domain adaptive statistical
machine translation systems. WMT
2012: 7th Workshop on Statistical Machine Translation. Proceedings of the
workshop, June 7-8, 2012,
(2012) Nicola Bertoldi & Marcello Federico:
Practical domain adaptation in SMT. [Tutorial at] AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the Americas. Proceedings,
(2012) Arianna Bisazza & Marcello Federico: Cutting the long tail: hybrid language models for
translation style adaptation. [EACL
2012] Proceedings of the 13th Conference of the European Chapter of the
Association for Computational Linguistics,
(2012) Frédéric Blain, Holger Schwenk, & Jean
Senellart: Incremental adaptation using
translation information and post-editing analysis. IWSLT-2012: 9th International Workshop on Spoken Language Translation,
(2012) Han-Bin Chen, Hen-Hsen Huang, Hsin-Hsi Chen,
& Ching-Ting Tan: A
simplification-translation-restoration framework for cross-domain SMT
applications. Proceedings of COLING
2012: Technical Papers, Mumbai, December 2012; pp.545-560. [PDF, 744KB]
(2012) Jinying Chen, Jacob Devlin, Huaigu Cao, Rohit
Prasad, & Premkumar Natarajan: Automatic tune
set generation for machine translation with limited in-domain data. EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.161-168. [PDF, 287KB]
(2012) Jonathan H.Clark, Alon Lavie, & Chris Dyer:
One system, many domains: open-domain statistical
machine translation via feature augmentation. AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the
(2012) Béatrice Daille: Building bilingual terminologies from comparable
corpora: the TTC TermSuite. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Hal Daumé III, Marine Carpuat, Alex Fraser,
& Chris Quirk: Domain adaptation in machine translation: findings from the
2012 Johns Hopkins University Summer Workshop. Keynote [abstract]. AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the
(2012) Estelle Delpech, Béatrice Daille, Emmanuel
Morin, & Claire Lemaire: Extraction of
domain-specific bilingual lexicon from comparable corpora: compositional
translation and ranking. Proceedings
of COLING 2012: Technical Papers, Mumbai, December 2012; pp.745-761. [PDF,
319KB]
(2012) Qing Dou & Kevin
Knight: Large scale decipherment for out-of-domain
machine translation. EMNLP-CoNLL
2012: Joint Conference on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning, Proceedings of the conference,
July 12-14, Jeju Island, Korea; pp.266-275. [PDF, 578KB]
(2012) Vladimir Eidelman,
Jordan Boyd-Graber, & Philip Resnik: Topic
models for dynamic translation model adaptation. [ACL 2012] Proceedings of the 50th Annual Meeting of the Association
for Computational Linguistics, Jeju,
(2012) Atefeh Farzindar & Wael Khreich: Evaluation of domain adaptation techniques for
TRANSLI in a real-world environment. AMTA-2012:
the Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Lluís Formiga, Carlos
A.Henríquez Q.,
(2012) Li Gong, Aurélien Max, & François Yvon: Towards contextual adaptation for any-text
translation. IWSLT-2012: 9th
International Workshop on Spoken Language Translation,
(2012) Roger Granada, Lucelene Lopes, Carlos Ramisch,
Cassia Trojahn, Renata Vieira, & Aline Villavicencio: A comparable corpus based on aligned multilingual
ontologies. [ACL 2012] Proceedings of the
First Workshop on Multilingual Modeling, Jeju,
(2012) Barry Haddow &
Philipp Koehn: Analysing the effect of
out-of-domain data on SMT systems. WMT
2012: 7th Workshop on Statistical Machine Translation. Proceedings of the
workshop, June 7-8, 2012,
(2012) Eva Hasler, Barry Haddow, & Philipp Koehn: Sparse lexicalised features and topic adaptation
for SMT. IWSLT-2012: 9th International
Workshop on Spoken Language Translation,
(2012) Amir Hazem &
Emmanuel Morin: ICA for bilingual lexicon
extraction from comparable corpora.
[BUCC 2012] The 5th Workshop on
Building and Using Comparable Corpora: “Language Resources for Machine
Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Claire Jaja, Douglas M.Briesch, Jamal Laoudi,
& Claire R.Voss: Assessing divergence measures
for automated document routing in an adaptive MT system. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Laura Jehl, Felix
Hieber, & Stefan Riezler: Twitter translation
using translation-based cross-lingual retrieval. WMT 2012: 7th Workshop on Statistical Machine Translation.
Proceedings of the workshop, June 7-8, 2012,
(2012) Maxim Khalilov & Rahzeb Choudhury: Building English-Chinese and Chinese-English MT
engines for the computer software domain. EAMT 2012: Proceedings of the 16th Annual Conference of the European
Association for Machine Translation, Trento, Italy, May 28-30 2012, ed.
Mauro Cettolo, Marcello
Federico, Lucia Specia, Andy Way; pp.7-11. [PDF, 193KB]
(2012) Patrik
Lambert, Holger Schwenk, & Frédéric Blain: Automatic
translation of scientific documents in the HAL archive. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Nikola Ljubešić,
Špela Vintar, & Darja Fišer: Multi-word
term extraction from comparable corpora by combining contextual and constituent
clues. [BUCC 2012] The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Shixiang Lu, Wei Wei,
Xiaoyin Fu, & Bo Xu: Translation model based
cross-lingual language model adaptation: from word models to phrase models.
EMNLP-CoNLL 2012: Joint Conference on
Empirical Methods in Natural Language Processing and Computational Natural
Language Learning, Proceedings of the conference, July 12-14, Jeju Island,
Korea; pp.512-522. [PDF, 196KB]
(2012) Saab Mansour & Hermann Ney: A simple and effective weighted phrase extraction
for machine translation adaptation. IWSLT-2012:
9th International Workshop on Spoken Language Translation,
(2012) Evgeny Matusov: Incremental
re-training of a hybrid English-French MT system with customer translation
memory data. AMTA-2012: the Tenth
Biennial Conference of the Association for Machine Translation in the
(2012) Jan Niehues & Alex Waibel: Detailed analysis of different strategies for
phrase table adaptation in SMT. AMTA-2012:
the Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Lene Offersgaard &
Dorte Haltrup Hansen: SMT systems for
less-resourced languages based on domain-specific data. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Tsuyoshi Okita, Antonio Toral, & Josef van
Genabith: Topic modeling-based domain
adaptation for system combination. COLING
2012: Second Workshop on Applying Machine Learning Techniques to Optimise the
Division of Labour in
(2012) Pavel Pecina, Antonio Toral, Vassilis Papavassiliou, Prokopis
Prokopidis, & Josef van Genabith: Domain
adaptation of statistical machine translation using web-crawled resources: a
case study. EAMT 2012: Proceedings of
the 16th Annual Conference of the European Association for Machine Translation,
Trento, Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.145-152. [PDF, 201KB]
(2012) Pavel Pecina, Antonio Toral, & Josef van
Genabith: Simple and effective parameter
tuning for domain adaptation of statistical machine translation. Proceedings of COLING 2012: Technical Papers,
Mumbai, December 2012; pp.2209-2224. [PDF, 166KB]
(2012) Stephan Peitz, Saab Mansour, Markus Freitag,
Minwei Feng, Matthias Huck, Joern Wuebker, Malte Nuhn, Markus Nußbaum-Thom,
& Hermann Ney: The RWTH Aachen speech
recognition and machine translation system for IWSLT 2012. IWSLT-2012: 9th International Workshop on
Spoken Language Translation,
(2012) Magdalena Plamada &
Martin Volk: Towards a Wikipedia-extracted
Alpine corpus. [BUCC 2012] The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Majid Razmara, George
Foster, Baskaran Sankaran, & Anoop Sarkar: Mixing
multiple translation models in statistical machine translation. [ACL 2012] Proceedings of the 50th Annual
Meeting of the Association for Computational Linguistics, Jeju,
(2012) Robert Remus &
Mathias Bank: Textual characteristics of different-sized
corpora. [BUCC 2012] The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Raphaël Rubino, Stéphane Huet, Fabrice Lefèvre,
& Georges Linarès: Statistical post-editing
of machine translation for domain adaptation. EAMT 2012: Proceedings of the 16th Annual Conference of the European Association
for Machine Translation, Trento, Italy, May 28-30 2012, ed. Mauro Cettolo,
Marcello Federico, Lucia
Specia, Andy Way;
pp.221-228. [PDF, 225KB]
(2012) Nick Ruiz & Marcello Federico: MDI adaptation for the lazy: avoiding
normalization in LM adaptation for lecture translation. IWSLT-2012: 9th International Workshop on
Spoken Language Translation,
(2012) Rico Sennrich: Mixture-modeling
with unsupervised clusters for domain adaptation in statistical machine
translation. EAMT 2012: Proceedings
of the 16th Annual Conference of the European Association for Machine
Translation, Trento, Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.185-192. [PDF, 217KB]
(2012) Rico Sennrich: Perplexity
minimization for translation model domain adaptation in statistical machine
translation. [EACL 2012] Proceedings of the 13th Conference of the European Chapter
of the Association for Computational Linguistics,
(2012) Kashif Shah, Loïc Barrault, & Holger
Schwenk: A general framework to weight
heterogeneous parallel data for model adaptation in statistical machine
translation. AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the
(2012) Chunqi Shi, Donghui
Lin, & Toru Ishida: Service composition
scenarios for task-oriented translation. LREC 2012: Eighth
international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Inguna Skadiņa: Analysis and evaluation
of comparable corpora for under-resourced areas of machine translation. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Jinsong Su, Hua Wu,
Haifeng Wang, Yidong Chen, Xiaodong Shi, Huailin Dong, & Qun Liu: Translation model adaptation for statistical machine
translation with monolingual topic information. [ACL 2012] Proceedings of the 50th Annual Meeting of the Association
for Computational Linguistics, Jeju,
(2012) John Tinsley, Alexandru Ceausu, Jian Zhang,
Heidi Depraetere, & Joeri Van de Walle: IPTranslator:
facilitating patent search with machine translation. AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the
(2012) Wei Wang, Klaus Macherey, Wolfgang Macherey,
Franz Och, & Peng Xu: Improved domain
adaptation for statistical machine translation. AMTA-2012:
the Tenth Biennial Conference of the Association for Machine Translation in the
(2011) Pratyush Banerjee, Hala Almaghout, Sudip
Naskar, Johann Roturier, Jie Jiang, Andy Way, & Josef van Genabith: The DCU machine translation systems for IWSLT
2011. IWSLT 2011: Proceedings of the
International Workshop on Spoken Language Translation,
(2011) Pratyush Banerjee,
Sudip Kumar Naskar, Johann Roturier, Andy Way, & Josef van Genabith: Domain adaptation in statistical machine
translation of user-forum data using component-level mixture modelling. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011,
(2011) Arianna Bisazza, Nick Ruiz, & Marcello
Federico: Fill-up versus interpolation methods
for phrase-based SMT adaptation. IWSLT
2011: Proceedings of the International Workshop on Spoken Language Translation,
(2011) Alexandru Ceauşu, John Tinsley, Jian
Zhang, &
(2011) Han-Bin Chen, Hen-Hsen Huang, Jengwei Tjiu, Ching-Ting Tan,
& Hsin-Hsi Chen: Identification and
translation of significant patterns for cross-domain SMT applications. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association
for Machine Translation (AAMT), 19-23 September 2011,
(2011) Hal Daumé & Jagadeesh Jagarlamudi: Domain adaptation for machine translation by mining
unseen words. ACL-HLT 2011:
Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics: Short papers,
(2011) Kevin Duh, Katsuhito
Sudoh, Tomoharu Iwata, & Hajime Tsukada: Alignment
inference and Bayesian adaptation for machine translation. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011,
(2011) Kevin Duh, Akinori Fujino, & Masaaki Nagata: Is machine translation ripe for cross-lingual sentiment
classification? ACL-HLT 2011: Proceedings of the 49th Annual Meeting of the Association
for Computational Linguistics: Short papers,
(2011) Cristina
España-Bonet, Ramona Enache,
(2011) Souhir Gahbiche-Braham, Hélène Bonneau-Maynard, & François
Yvon: Two ways to use a noisy parallel news
corpus for improving statistical machine translation. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Monica Gavrila &
(2011) Miguel A.Jiménez-Crespo: To adapt or not to adapt in web localization: a
contrastive genre-based study of original and localised legal sections in
corporate websites. Journal of Specialised Translation 15 (January
2011); pp.2-27. [PDF, 237KB]
(2011) Mitesh M.Khapra, Salil Joshi, & Pushpak
Bhattacharyya: It takes two to tango: a bilingual unsupervised approach for estimating sense distributions
using expectation maximization. [IJCNLP 2011] Proceedings of the 5th International Joint
Conference on Natural Language Processing,
(2011) Patrik Lambert, Holger Schwenk, Christophe Servan, & Sadaf
Abdul-Rauf: Investigations on translation model
adaptation using monolingual data. [WMT 2011] Proceedings of the 6th Workshop on Statistical Machine Translation,
(2011) Thomas Lavergne, Alexandre Allauzen, Hai-Son Le, & François Yvon: LIMSI’s experiments in domain adaptation for
IWSLT11. IWSLT 2011: Proceedings of
the International Workshop on Spoken Language Translation,
(2011) Abby Levenberg, Miles Osborne, & David Matthews: Multi-stream language models for statistical
machine translation. [WMT 2011] Proceedings of the 6th Workshop on
Statistical Machine Translation,
(2011) John McCrae, Maurizio Espinoza, Elena Montiel-Ponsoda, Guadalupe
Aguado-de-Cea, & Philipp Cimiano: Combining
statistical and semantic approaches to the translation of ontologies and
taxonomies. Proceedings of SSST-5,
Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation,
ACL HLT 2011, Portland, Oregon, USA, June 2011; pp.116-125. [PDF, 576KB]
(2011) Paul Maergner, Kevin
Kilgour,
(2011) Saab Mansour, Joern Wuebker, & Hermann Ney:
Combining translation and language model
scoring for domain-specific data filtering. IWSLT 2011: Proceedings of the International Workshop on Spoken
Language Translation,
(2011) Emmanuel Morin & Emmanuel Prochasson: Bilingual lexicon extraction from comparable corpora
enhanced with parallel corpora. ACL 2011:
Proceedings of the Fourth Workshop on Building and Using Comparable Corpora,
(2011) Jan Niehues & Alex Waibel: Using Wikipedia to translate domain-specific
terms in SMT. IWSLT 2011: Proceedings
of the International Workshop on Spoken Language Translation,
(2011) Pavel Pecina, Antonio Toral, Andy Way, Vassilis
Papavassiliou, Prokopis Prokopidis, & Maria Giagkou: Towards using web-crawled data for domain
adaptation in statistical machine translation. [EAMT 2011]: proceedings of the
15th conference of the European Association for Machine Translation, 30-31
May 2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; pp.297-304. [PDF, 363KB]; presentation,
25 slides [PDF]
(2011) Anders Søgaard & Martin Haulrich: Sentence-level instance-weighting for graph-based
and transition-based dependency parsing. IWPT 2011: 12th International Conference on Parsing Technologies,
October 5-7, 2011,
(2011) Linfeng Song, Haitao
Mi, Yajuan Lü, & Qun Liu: Bagging-based system
combination for domain adaptation. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) George Tambouratzis, Fotini Simistira, Sokratis
Sofianopoulos, Nikos Tsimboukakis, & Marina Vassiliou: A resource-light phrase scheme for
language-portable MT. [EAMT 2011]:
proceedings of the 15th conference of the European Association for Machine
Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi
Depraetere, Vincent Vandeghinste; pp.185-192. [PDF, 186KB]
(2011) Špela Vintar & Darja Fišer: Enriching Slovene WordNet with domain-specific terms.
Translation: Computation, Corpora, Cognition 1 (1), December 2011; pp.29-44.
[PDF, 631KB]
(2011) Joern Wuebker, Matthias Huck, Saab Mansour,
Markus Freitag, Minwei Feng, Stephan Peitz, Christoph Schmidt, & Hermann
Ney: The RWTH Aachen machine translation
system for IWSLT 2011. IWSLT 2011: Proceedings
of the International Workshop on Spoken Language Translation,
(2010) Pratyush Banerjee, Jinhua Du, Sudip Naskar, Baoli
Li,
(2010) Josep Maria
Crego, Aurélien Max, & François Yvon: Local lexical adaptation in machine translation through
triangulation: SMT helping SMT. Coling
2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010) Kevin Duh, Katsuhito Sudoh, & Hajime Tsukada: Analysis of translation model adaptation in
statistical machine translation. Proceedings of the 7th International
Workshop on Spoken Language Translation, 2-3 December 2010,
(2010) George Foster, Cyril
Goutte, & Roland Kuhn: Discriminative
instance weighting for domain adaptation in statistical machine translation. [EMNLP 2010] Proceedings of the 2010 Conference on Empirical Methods in Natural
Language Processing, MIT, Massachusetts, USA, 9-11 October 2010;
pp.451-459. [PDF, 256KB]
(2010) Laura Elisabeth Jehl: Machine translation for Twitter.
Master of Science,
(2010) Hiroyuki Kaji, Takashi Tsunakawa, &
Daisuke Okada: Using comparable corpora to adapt a
translation model to domains. LREC
2010: proceedings of the seventh
international conference on Language Resources and Evaluation, 17-23 May
2010,
(2010) Petr Knoth, Trevor Collins, Elsa Sklavounou, & Zdenek
Zdrahal: Facilitating cross-language retrieval
and machine translation by multilingual domain ontologies. [LREC 2010] Workshop on Supporting eLearning with
Language Resources and Semantic Data,
(2010) William D.Lewis, Chris Wendt, & David
Bullock: Achieving domain specificity in SMT
without overt siloing. LREC 2010:
proceedings of the seventh international
conference on Language Resources and Evaluation, 17-23 May 2010,
(2010) Jan Niehues &
Alex Waibel: Domain adaptation in statistical
machine translation using factored translation models. EAMT 2010: Proceedings of the 14th Annual conference of the European
Association for Machine Translation, 27-28 May 2010,
(2010) Mohammad Taher Pilevar & Heshaam Faili: PersianSMT:
a first attempt to English-Persian statistical machine translation. JADT 2010: 10th International Conference on
Statistical Analysis of Textual Data, 9-11 June 2010,
(2010) Germán Sanchis-Trilles & Mauro Cettolo: Online language model adaptation via
n-gram mixtures for statistical machine translation. EAMT 2010: Proceedings of the 14th Annual conference of the European
Association for Machine Translation, 27-28 May 2010,
(2010) Germán Sanchis-Trilles, Jesús Andrés-Ferrer, Guillem
Gascó, Jesús González-Rubio, Pascual Martínez-Gómez, Martha-Alicia Rocha, Joan-Andreu Sánchez, & Francisco
Casacuberta: UPV-PRHLT English-Spanish
system for WMT10. ACL 2010: Joint
Fifth Workshop on Statistical Machine Translation and MetricsMATR.
Proceedings of the workshop, 15-16 July 2010,
(2010) Kashif Shah, Loïc Barrault, & Holger Schwenk: Translation
model adaptation by resampling. ACL
2010: Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR.
Proceedings of the workshop, 15-16 July 2010,
(2010) Jörg Tiedemann: Context adaptation in statistical machine
translation using models with exponentially decaying cache. Proceedings of the 2010 Workshop on Domain
Adaptation for Natural Language Processing, ACL 2010, Uppsala, Sweden, 15
July 2010; pp.8-15. [PDF, 148KB]
(2010) Jörg Tiedemann: To cache or not to cache? Experiments with
adaptive models in statistical machine translation. ACL
2010: Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR.
Proceedings of the workshop, 15-16 July 2010,
(2010) Bin Wei & Christopher Pal: Cross
lingual adaptation: an experiment in sentiment classifications. ACL 2010: the 48th Annual Meeting of the
Association for Computational Linguistics,
Filtering see Cleaning and filtering
Knowledge representation see Ontologies
Knowledge resources
(2013) Timmy Oumai Wang & Mark Shuttleworth: Knowledge management issues in the workflow of
translation memory systems. [Aslib
2013] Translating and the Computer 35,
28-29 November 2013, etc.venues, Paddington,
Language resources (see also Bilingual corpora, Lexical resources, Multilingual corpora, Scarce resources)
(2015) Andrzej Zydroń: FALCON: building
the localization web. Proceedings of
the 37th Conference Translating and the Computer, London, November 26-27,
2015; pp.33-36. [PDF, 125KB]
(2014) Michael Carl, Mercedes García Martínez,
Bartolomé Mesa-Lao, & Nancy Underwood: CFT13:
a resource for research into the post-editing process. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.1757-1764. [PDF, 1011KB]
(2014) Grégoire Détrez, Víctor M.Sánchez-Cartagena,
& Aarne Ranta: Sharing resources between
free/open-source rule-based machine translation systems: Grammatical Framework
and Apertium. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.4394-4400. [PDF, 146KB]
(2014) Nizar Ghoula, Jacques Guyot, & Gilles
Falquet: Terminology management revisited.
Translating and the Computer 36:
proceedings. Asling: International Society for Advancement in Language
Technology, 27-28 November 2014; pp.56-65. [PDF, 916KB]
(2014) Jorge Gracia, Elena Montiel-Ponsoda, Daniel
Vila-Suero, & Guadalupe Aguado-de-Cea: Enabling
language resources to expose translations as linked data on the Web. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.409-413. [PDF, 346KB]
(2014) Stelios Piperidis, Harris Papageorgiou,
Christian Spurk, Georg Rehm, Khalid Choukri, Olivier Hamon, Nicoletta
Calzolari, Riccardo del Gratta, Bernardo Magnini, & Christian Girardi: META-SHARE: one year after. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1208-1211. [PDF, 861KB]
(2014) Georg Rehm et al.: The
strategic impact of META-NET on the regional, national and international level.
LREC 2014: Ninth International Conference
on Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall
and Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1517-1524. [PDF, 184KB]
(2013) Núria Bel, Marc Poch & Antonio Toral: PANACEA: platform for automatic, normalised
annotation and cost-effective acquisition of language resources for human
language technologies. Proceedings of
the XIV Machine Translation
(2013) Olzhas Makhambetov, Aibek
Makazhanov, Zhandos Yessenbayev, Bakhyt Matkarimov, Islam Sabyrgaliyev, &
Anuar Sharafudinov: Assembling the Kazakh
language corpus. [EMNLP 2013] Proceedings
of the 2013 Conference on Empirical Methods in Natural Language Processing,
Seattle, Washington, USA, 18-21 October 2013; pp.1022-1031. [PDF, 184KB]
(2013) Marc Poch & Antonio Toral: PANACEA tutorial. Proceedings of the XIV Machine Translation Summit, Nice, September
3, 2013; 39 slides. [PDF of PPT, 1538KB]
(2013) Raivis Skadiņš, Mārcis Pinnis,
Tatiana Gornostay, & Andrejs Vasiļjevs: Application
of online terminology services in statistical machine translation. Proceedings of the XIV Machine Translation
Summit, Nice, September 2-6, 2013; ed. K.Sima’an, M.L.Forcada, D.Grasmick,
H.Depraetere, A.Way; pp.281-286. [PDF, 573KB]
(2012) Núria Bel, Vassilis
Papavassiliou, Prokopis Prokopidis, Antonio Toral, & Victoria Arranz: Mining and exploiting domain-specific corpora in the
PANACEA platform. [BUCC 2012] The 5th
Workshop on Building and Using Comparable Corpora: “Language Resources for
Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) António Branco: Language technology for
Portuguese: progress and prospects [abstract].
In: Crosslingual
Language Technology in service of an integrated multilingual Europe,
4-5 May 2012,
(2012) António Branco: METANET4U: contribution to META-SHARE. META-FORUM,
(2012) Nicola Cancedda: Private access to phrase tables for statistical
machine translation. [ACL 2012] Proceedings
of the 50th Annual Meeting of the Association for Computational Linguistics,
Jeju,
(2012) Mauro Cettolo, Christian Girardi, &
Marcello Federico: WIT3: web
inventory of transcribed and translated talks. EAMT 2012: Proceedings of the 16th Annual Conference of the European
Association for Machine Translation, Trento, Italy, May 28-30 2012, ed.
Mauro Cettolo, Marcello
Federico, Lucia Specia, Andy Way; pp.261-268. [PDF, 197KB]
(2012) Dan Cristea & Ionuţ Cristian Pistol:
Multilingual linguistic workflows [abstract].
In: Crosslingual
Language Technology in service of an integrated multilingual Europe,
4-5 May 2012,
(2012) Christian Federmann, Ioanna Giannopoulou,
Christian Girardi, Olivier Hamon, Dimitris Mavroeidis, Salvatore Minutoli,
& Marc Schröder: META-SHARE v2: an open
network of repositories for language resources including data and tools. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Darja Fišer: Language resources and tools for
semantically enhanced processing of Slovene [abstract]. In: Crosslingual
Language Technology in service of an integrated multilingual Europe,
4-5 May 2012,
(2012) Monica Gavrila, Walther v.Hahn, & Cristina
Vertan: Same domain different discourse style:
a case study on language resources for data-driven machine translation. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Maria Gavrilidou, Penny Labropoulou, Elina
Desipri, Stelios Piperidis,
(2012) Maria Gavrilidou: Using the META-SHARE model implementation for
describing and documenting language resources. Tutorial at: LREC
2012: Eighth international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Masood Ghayoomi: From grammar rule extraction to treebanking: a
bootstrapping approach. LREC 2012:
Eighth international conference on Language Resources and Evaluation, 21-27
May 2012,
(2012)
(2012) Judith Klavans: Government catalog of language
resources (GCLR) [abstract]. AMTA-2012: the Tenth Biennial Conference of
the Association for Machine Translation in the
(2012) Xuansong Li, Stephanie M.Strassel, Heng Ji,
Kira Griffitt, & Joe Ellis: Linguistic
resources for entity linking evaluation: from monolingual to cross-lingual. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Joseph Mariani: Language resources and
evaluation for a multilingual
(2012) Elaine Marsh: Return on investment for
government human language technology systems [abstract].
AMTA-2012: the Tenth Biennial Conference
of the Association for Machine Translation in the
(2012) Tomás Pariente: Text analytics and big data. META-FORUM,
(2012) Bolette Sandford
Pedersen: The META-NET language white paper
series: overview and key results. META-FORUM,
(2012) Stelios Piperidis: META-SHARE: the open exchange platform:
overview – current state – towards v3.0. META-FORUM,
(2012) Marc Poch, Antonio Toral, & Núria Bel: Language resources factory: case study on the
acquisition of translation memories.
[EACL 2012] Proceedings of the
Demonstrations at the 13th Conference of the European Chapter of the
Association for Computational Linguistics,
(2012) Adam Przepiórkowski: Polish language resources
and tools: towards multilinguality [abstract]. In: Crosslingual Language Technology in service
of an integrated multilingual Europe, 4-5 May 2012,
(2012) Mike Rosner & Jan Joachimsen: Maltese:
mixed language and multilingual technology [abstract]. In: Crosslingual Language Technology in service
of an integrated multilingual Europe, 4-5 May 2012,
(2012) Rahma Sellami, Fatiha Sadat, & Lamia
Hadrich Belguith: Exploiting Wikipedia as a
knowledge base for the extraction of linguistic resources: application on
Arabic-French comparable corpora and bilingual lexicons. AMTA-2012: Fourth workshop on computational
approaches to Arabic script-based languages. Proceedings,
(2012) Ralf Steinberger,
Andreas Eisele, Szymon Klocek, Spyridon Pilos, & Patrick Schlüter: DGT-TM: a freely available translation memory in 22 languages. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Jörg Tiedemann,
(2012) Tamás Váradi: CESAR:
comprehensive language resources and tools for Europe. HLT Days 27-28 September 2012,
(2012) Tamás Váradi: The contribution of CESAR to META-SHARE. META-FORUM,
(2012) Tamás Váradi & Marko Tadić: Central and South-East European resources in
META-SHARE. Proceedings of COLING
2012: Demonstration Papers, Mumbai, December 2012; pp. 431-437. [PDF,
1205KB]
(2012) Andrejs Vasiļjevs:
META-NORD overview. META-FORUM,
(2012) Yunqing Xia, Guoyu
Tang, Peng Jin, & Xia Yang: CLTC: a
Chinese-English cross-lingual topic corpus.
LREC 2012: Eighth international conference on Language
Resources and Evaluation,
21-27 May 2012,
(2012) Andrius Utka: Multilingual resources and their
application for the Lithuanian language [abstract]. In: Crosslingual Language Technology in service
of an integrated multilingual Europe, 4-5 May 2012,
(2012)
(2012) Andrejs Vasiļjevs, Tatiana Gornostay,
Inguna Skadiņa, Daiga Deksne, Raivis Skadiņš, & Mārcis
Pinnis: Recent advances in the development and sharing of language resources
and tools for Latvian [abstract]. In: Crosslingual
Language Technology in service of an integrated multilingual Europe,
4-5 May 2012,
(2012) CESAR:
Central and
(2012) PANACEA
(Platform for Automatic, Normalised Annotation and Cost-Effective Acquisition
of Language Resources for Human Language Technologies). [Project paper at] EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; p.90. [PDF, 87KB]
(2011) Pushpak Bhattacharyya: IndoWordNet
and multilingual resource conscious word sense disambiguation. Proceedings of the 8th international NLPSC
workshop. Special theme: Human-machine interaction in translation,
Copenhagen Business School, 20-21 August 2011; ed.Bernadette Sharp, Michael
Zock, Michael Carl, Arnt Lykke Jakobsen (Copenhagen Studies in Language 41),
Frederiksberg: Samfundslitteratur, 2011; pp.29-30. [PDF, 677KB]
(2011) Svetla Koeva: Furthering natural language processing in
Bulgaria. META-FORUM 2011: Solutions for multilingual Europe, June
27/28 2011, Hotel Marriott,
(2011) Luís Marujo, Nuno Grazina, Tiago Luís, Wang
Ling, Luísa Coheur, & Isabel Trancoso: BP2EP
– adaptation of Brazilian Portuguese texts to European Portuguese. [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; pp.129-136. [PDF, 352KB]
(2011) Maciej Ogrodniczuk & Adam Przepiórkowski: Polish LRTs: CESAR’s story. META-FORUM
2011: Solutions for multilingual
Europe, June 27/28 2011, Hotel Marriott,
(2011) Stelios Piperidis: META-SHARE: an open resource exchange
infrastructure for stimulating research and innovation. META-FORUM
2011: Solutions for multilingual
Europe, June 27/28 2011, Hotel Marriott,
(2011) Marko Tadić: The CESAR project: enabling LRT for 70m+
speakers. META-FORUM 2011: Solutions
for multilingual Europe, June 27/28 2011, Hotel Marriott,
(2011)
(2011) Antonio Toral, Pavel Pecina, Andy Way, &
Marc Poch: Towards a user-friendly webservice
architecture for statistical machine translation in the PANACEA project. [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; pp.63-70. [PDF, 511KB]
(2011) Tamás Váradi: Hungarian language technology - from platform
to alliance. META-FORUM 2011: Solutions
for multilingual Europe, June 27/28 2011, Hotel Marriott,
(2011) LIWP – EU
language industry web platform. (European Machine Translation Projects.) [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; p.339. [PDF, 40KB]
(2011) PANACEA
(Platform for automatic, normalised annotation and cost-effective acquisition
of language resources for human language technologies). (European Machine
Translation Projects.) [EAMT 2011]:
proceedings of the 15th conference of the European Association for Machine
Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi
Depraetere, Vincent Vandeghinste; p.349. [PDF, 106KB]
(2010) Gilles Adda
& Joseph Mariani: Language resources &
Amazon Mechanical Turk: ethical, legal and other issues. LREC 2010: Legal Issues for Sharing
Language Resources - LISLR2010 Workshop, 17 May 2010,
(2010) Mossab Al-Hunaity: Utilizing web service technology to create
Danish Arabic language resources. LREC
2010: Web Services and Processing Pipelines in HLT - WSPP2010 Workshop, 17
May 2010,
(2010) Lynne Bowker & Elizabeth Marshman:
Toward a model of active and situated learning in the teaching of
computer-aided translation: introducing the CERTT project [abstract].
Journal of
Translation Studies 13 (1-2), Special issue: The teaching of computer-aided translation, ed. Chan
Sin-wai; pp. 199-226.
(2010) Arif Bramantoro,
Ulrich Schäfer, & Toru Ishida: Towards
an integrated architecture for composite language services and multiple
linguistic processing components. LREC
2010: proceedings of the seventh
international conference on Language Resources and Evaluation, 17-23 May
2010,
(2010) Jennifer DeCamp: Language Technology Resource Center. EAMT 2010: Proceedings of the 14th Annual
conference of the European Association for Machine Translation, 27-28 May
2010,
(2010) Alice Dijkstra: Dutch/Flemish HLT cooperation. Translingual
Europe 2010,
(2010) Sabine
Kirchmeier-Andersen: Linguistic
diversity and language change – future challenges for MT. Translingual Europe 2010,
(2010)
Swaran Lata: Human language computing in
Indian languages – a holistic perspective. META-FORUM 2010: Challenges for multilingual Europe, November 17/18
2010,
(2010) Rūta
Marcinkevičienė & Daiva Vitkutė-Adžgauskienė: Developing the human language
technology infrastructure in Lithuania.
Human Language Technologies—The
Baltic Perspective, 4th International
Conference, Riga, Latvia, October 7-8, 2010; 24 slides [PDF of PPT, 6452KB]
(2010) Einar
Meister, Jaak Vilo & Neeme Kahusk: National
programme for Estonian language technology: a pre-final summary. Human Language Technologies—The Baltic
Perspective, 4th International
Conference,
(2010) Robert Munro, Steven Bethard, Victor
Kuperman, Vicky Tzuyin Lai, Robin Melnick, Christopher Potts, Tyler
Schnoebelen, & Harry Tily: Crowdsourcing
and language studies: the new generation of linguistic data. Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language
Data with Amazon’s Mechanical Turk,
(2010) Stelios Piperidis: META-SHARE: the open resource exchange
facility. META-FORUM 2010:
Challenges for multilingual Europe, November 17/18 2010,
(2010) Marc Poch: PANACEA – platform for the automatic,
normalized annotation and cost-effective acquisition of language resources.
(European Community supported project.) Presented at EAMT 2010: 14th Annual conference of the European Association for
Machine Translation, 28 May 2010,
(2010) Georg Rehm: META-NET and META-SHARE: an overview. Human Language Technologies – the Baltic
Perspective, 4th International
Conference, Riga,
(2010) Mike Rosner: Maltese on the brink. Translingual Europe 2010,
(2010) Víctor M.Sánchez-
(2010) Inguna Skadiņa,
Ilze Auziņa, Normunds Grūzītis, Kristīna
Levāne-Petrova, Gunta Nešpore, Raivis Skadiņš, & Andrejs
Vasiļjevs: Language resources and
technology for humanities in Latvia 2004-2010. Human
Language Technologies—The Baltic Perspective, 4th International Conference,
(2010) Zhiyi Song, Stephanie Strassel, Gary Krug,
& Kazuaki Maeda: Enhanced infrastructure for
creation and collection of translation resources. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
(2010) Andrejs
Vasiļjevs: Big solutions for
small languages. Translingual Europe
2010,
(2010) Karthik
Visweswariah, Vijil Chenthamarakshan, & Nandakishore Kambhatla: Urdu and Hindi: translation and sharing
of linguistic resources. Coling 2010:
23rd International Conference on Computational Linguistics, 23-27 August
2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.1283-1291. [PDF,
217KB]
(2010) John Hendrik Weitzmann & Prodromos Tsiavos:
Language resources and legal issues:
problems and solutions for basic and industrial research. META-FORUM
2010: Challenges for multilingual Europe, November 17/18 2010,
Lexical resources and lexical acquisition
(2015) Oliver Adams, Graham Neubig, Trevor Cohn &
Steven Bird: Inducing
bilingual lexicons from small quantities of sentence-aligned phonemic
transcriptions. [IWSLT 2015]
Proceedings of the International Workshop on Spoken Language Translation,
December 3-4, 2015, Da Nang, Vietnam; pp.248-255. [PDF, 2.8MB]
(2015) Gerard de Melo: Wiktionary-based word
embeddings. MT Summit XV, October 30
– November 3, 2015, Miami, Florida, USA. Proceedings of MT Summit XV:
vol.1: MT Researchers’ Track; pp.346-359. [PDF, 709KB]
(2014) Judit Ács: Pivot-based
multilingual dictionary building using Wiktionary. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1938-1942. [PDF, 81KB]
(2014) Krasimir Angelov: Bootstrapping open-source English-Bulgarian
computational dictionary. LREC 2014:
Ninth International Conference on Language Resources and Evaluation, May
26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.1018-1023. [PDF, 158KB]
(2014) Anabela
(2014) Olga Beregovaya & David Landan: Source content analysis and training data
selection impact on an MT-driven program design. Proceedings of the 17th
annual conference of the European Association for Machine Translation, EAMT
2014, Dubrovnik, Croatia, 16th-18th June 2014, edited by Marko Tadić,
Philipp Koehn, Johann Roturier, Andy Way; p.59. [PDF, 303KB]
(2014) Kurt Eberle: AutoLearn<Word>.
Translating and the Computer 36:
proceedings. Asling: International Society for Advancement in Language
Technology, 27-28 November 2014; pp.145-154. [PDF, 446KB]
(2014) Maud Ehrmann,
Francesco Cecconi, Daniele Vannella, John McCrae, Philipp Cimiano, &
Roberto Navigli: Representing multilingual data
as linked data: the case of BabelNet 2.0.
LREC 2014: Ninth International
Conference on Language Resources and Evaluation, May 26-31, 2014 Harpa
Concert Hall and Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari
et al.; pp.401-408. [PDF, 674KB]
(2014) Miquel Esplà-Gomis, Víctor M.Sánchez-Cartagena,
Felipe Sánchez-Martínez, Rafael C.Carrasco, Mikel L.Forcada, & Juan Antonio
Pérez-Ortiz: An efficient method to assist
non-expert users in extending dictionaries by assigning stems and inflectional
paradigms to unknown words. Proceedings of the 17th annual conference of
the European Association for Machine Translation, EAMT 2014, Dubrovnik,
Croatia, 16th-18th June 2014, edited by Marko Tadić, Philipp Koehn, Johann
Roturier, Andy Way; pp.19-26. [PDF, 311KB]
(2014) Mozhgan Ghassemiazghandi & Tengku Sepora
Tengku Mahadi: Losses and gains in
computer-assisted translation: some remarks on online translation of English to
Malay. Translating and the Computer
36: proceedings. Asling: International Society for Advancement in Language
Technology, 27-28 November 2014; pp.194-201. [PDF, 160KB]
(2014) B.R.Laranjeira, V.P.Moreira, A.Villavicencio,
C.Ramisch, & M.J.Finatto: Comparing the
quality of focused crawlers and of the translation resources obtained from them.
LREC 2014: Ninth International Conference
on Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall
and Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.3572-3578. [PDF, 803KB]
(2014) John Richardson, Toshiaki Nakazawa, & Sadao
Kurohashi: Bilingual dictionary construction
with transliteration filtering. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1013-1017. [PDF, 164KB]
(2014) Michael Rosner & Kurt Sultana: Automatic methods for the extension of a bilingual
dictionary using comparable corpora. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.3790-3797. [PDF, 195KB]
(2014) Yves Scherrer & Benoît Sagot: A language-independent and fully unsupervised
approach to lexicon induction and part-of-speech tagging for closely related
languages. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.502-508. [PDF, 184KB]
(2013) Judit Ács, Katalin Pajkossy, & András
Kornai: Building basic vocabulary across 40
languages. Proceedings of the 6th
Workshop on Building and Using Comparable Corpora,
(2013) Mihael Arcan & Paul Buitelaar: MONNET: multilingual ontologies for networked
knowledge. Proceedings of the XIV
Machine Translation
(2013) Brijesh Bhatt, Lahari Poddar, & Pushpak
Bhattacharyya: IndoNet: a multilingual lexical
knowledge network for Indian languages. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.268-272.
[PDF, 364KB]
(2013) Dhouha Bouamor, Nasredine
Semmar, & Pierre Zweigenbaum: Building
specialized bilingual lexicons using word sense disambiguation. International Joint Conference on Natural
Language Processing,
(2013) Dhouha Bouamor, Adrian Popescu,
Nasredine Semmar, & Pierre Zweigenbaum: Building
specialized bilingual lexicons using large-scale background knowledge. [EMNLP 2013] Proceedings of the 2013
Conference on Empirical Methods in Natural Language Processing,
(2013) Dhouha Bouamor, Nasredine Semmar, & Pierre
Zweigenbaum: Context vector disambiguation for
bilingual lexicon extraction from comparable corpora. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.759-764.
[PDF, 199KB]
(2013) Dhouha Bouamor, Nasredine Semmar, & Pierre
Zweigenbaum: Towards a generic approach for
bilingual lexicon extraction from comparable corpora. Proceedings of the XIV Machine Translation
(2013) Rahma Boujelbane, Mariem Ellouze Khemekhem,
Siwar BenAyed, & Lamia Hadrich Belguith: Building
bilingual lexicon to create dialect Tunisian corpora and adapt language model.
Proceedings of the Second Workshop on
Hybrid Approaches to Translation,
(2013) Silvana Hartmann & Iryna Gurevych: FrameNet on the way to Babel: creating a bilingual
FrameNet using Wiktionary as interlingual connection. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics,
(2013) Amir Hazem & Emmanuel Morin: A comparison of smoothing techniques for bilingual
lexicon extraction from comparable corpora.
Proceedings of the 6th Workshop on
Building and Using Comparable Corpora,
(2013) Amir Hazem & Emmanuel Morin:
Word co-occurrence counts prediction for
bilingual terminology extraction from comparable corpora. International Joint Conference on Natural
Language Processing,
(2013) Tsutomu Hirao, Tomoharu Iwata, & Masaaki
Nagata: Latent semantic matching: application to
cross-language text categorization without alignment information. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.212-216.
[PDF, 846KB]
(2013) Ann Irvine & Chris Callison-Burch: Supervised bilingual lexicon induction with
multiple monolingual signals. [NAACL-HLT
2013] The 2013 conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 9-14 June 2013,
(2013) Hong-Seok Kwon, Hyeong-Won Seo, & Jae-Hoon
Kim: Bilingual lexicon extraction via pivot language
and word alignment tool. Proceedings of the 6th Workshop on Building
and Using Comparable Corpora,
(2013) Khang Nhut Lam & Jugal Kalita: Creating reverse bilingual dictionaries. [NAACL-HLT 2013] The 2013 conference of the
North American Chapter of the Association for Computational Linguistics: Human
Language Technologies, 9-14 June 2013,
(2013) Lian Tze Lim, Lay-Ki Soon, Tek Yong Lim, Enya
Kong Tang, & Bali Ranaivo-Malançon: Context-dependent
multilingual lexical lookup for under-resourced languages. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational Linguistics,
Short papers, Sofia, Bulgaria, August 4-9 2013; pp.294-299. [PDF, 373KB]
(2013) Xiaodong Liu, Kevin Duh, & Yuji Matsumoto: Topic models + word alignment = a flexible framework
for extracting bilingual dictionary from comparable corpus. Proceedings of the Seventeenth Conference on
Computational Natural Language Learning, Sofia, Bulgaria, 8-9 August 2013;
pp.212-221. [PDF, 487KB]
(2013) Michael Matuschek, Christian M.Meyer, &
Iryna Gurevych: Multilingual knowledge in
aligned Wiktionary and OmegaWiki for translation applications. Translation: Computation, Corpora, Cognition 3 (1), June 2013; pp.87-118.
[PDF, 2898KB]
(2013) Vassilis Papavassiliou, Prokopis Prokopidis,
& Gregor Thurmair: A modular
open-source focused crawler for mining monolingual and bilingual corpora from
the web. Proceedings of the 6th Workshop on Building and Using Comparable
Corpora,
(2013) Magdalena Plamadă & Martin Volk: Mining for domain-specific text from Wikipedia. Proceedings
of the 6th Workshop on Building and Using Comparable Corpora,
(2013) Rahma Sellami, Fatiha Sadat, & Lamia
Hadrich Belguith: Exploiting multiple
resources for Japanese to English patent translation. [MT
(2013) Jason R.Smith, Herve Saint-Amand, Magdalena
Plamada, Philipp Koehn, Chris Callison-Burch, & Adam Lopez: Dirt cheap web-scale parallel text from the Common
Crawl. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics,
(2013) Itsuki
(2013) Takashi Tsunakawa, Yosuke
Yamamoto, & Hiroyuki Kaji: Improving
calculation of contextual similarity for constructing a bilingual dictionary
via a third language. International
Joint Conference on Natural Language Processing,
(2013) Ivan Vulić &
Marie-Francine Moens: A study on bootstrapping
bilingual vector spaces from non-parallel data (and nothing else). [EMNLP 2013] Proceedings of the 2013
Conference on Empirical Methods in Natural Language Processing, Seattle,
Washington, USA, 18-21 October 2013; pp.1044-1054. [PDF, 261KB]
(2012) Dhouha Bouamor, Nasredine Semmar, & Pierre
Zweigenbaum: Automatic construction of a
multi-word expressions bilingual lexicon: a statistical machine translation
evaluation perspective. COLING 2012:
Proceedings of the 3rd Workshop on Cognitive Aspects of the Lexicon
(CogALex-III), Mumbai, December 2012; pp.95-107. [PDF, 210KB]
(2012) Valeria Caruso & Anna De Meo: What else can databases do to assist translators?
Illustrating a rated inventory of Web dictionaries. [Aslib 2012] Translating and the Computer 34, 29-30
November 2012, One Birdcage Walk, London, UK; 12pp. [PDF, 848KB], presentation by Martin Thomas: 50 slides
[PDF, 3336KB]
(2012) Estelle Delpech, Béatrice Daille, Emmanuel Morin,
& Claire Lemaire: Extraction of
domain-specific bilingual lexicon from comparable corpora: compositional
translation and ranking. Proceedings
of COLING 2012: Technical Papers, Mumbai, December 2012; pp.745-761. [PDF,
319KB]
(2012) Douwe Gelling &
Trevor Cohn: Using senses in HMM word
alignment. NAACL-HLT Workshop on the
Induction of Linguistic Structure,
(2012) Amir Hazem &
Emmanuel Morin: ICA for bilingual lexicon
extraction from comparable corpora.
[BUCC 2012] The 5th Workshop on
Building and Using Comparable Corpora: “Language Resources for Machine
Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Elena Irimia: Experimenting with extracting lexical dictionaries
from comparable corpora for English-Romanian language pair. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Angelina Ivanova: Evaluation of a bilingual dictionary extracted
from Wikipedia. [BUCC 2012] The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in Less-Resourced
Languages and Domains”, LREC 2012
Workshop, 26 May 2012,
(2012) Salil Joshi, Arindam Chatterjee, Arun
Karthikeyan Karra, & Pushpak Bhattacharyya: Eating your own cooking: automatically linking wordnet synsets of two
languages. Proceedings of
COLING 2012: Demonstration Papers, Mumbai, December 2012; pp. 239-246.
[PDF, 1527KB]
(2012) Mahdi Khademian, Kaveh Taghipour, Saab Mansour,
& Shahram Khadivi: A holistic approach to
bilingual sentence fragment extraction from comparable corpora. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Nikola Ljubešić,
Špela Vintar, & Darja Fišer: Multi-word
term extraction from comparable corpora by combining contextual and constituent
clues. [BUCC 2012] The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Clara Inés López Rodríguez, Miriam Buendía Castro &
Alejandro García Aragón: User needs to the test:
evaluating a terminological knowledge base on the environment by trainee
translators. Journal of Specialised Translation 18 (July 2012);
pp.57-76. [PDF, 785KB]
(2012) Gerard de Melo &
Gerhard Weikum: UWN: a large multilingual lexical
knowledge base. [ACL 2012]
Proceedings of the 50th Annual Meeting of the Association for Computational
Linguistics, Jeju,
(2012) Xinfan Meng, Furu Wei, Ge Xu, Longkai Zhang,
Xiaohua Liu, Ming Zhou, & Houfeng Wang: Lost
in translations? Building sentiment lexicons using context based machine
translation. Proceedings of COLING
2012: Posters, Mumbai, December 2012; pp.829-838. [PDF, 232KB]
(2012) Christian M.Meyer & Iryna Gurevych: To exhibit is not to loiter: a multilingual,
sense-disambiguated Wiktionary for measuring verb similarity. Proceedings of COLING 2012: Technical Papers,
Mumbai, December 2012; pp.1763-1780. [PDF, 771KB]
(2012) Pavel Pecina, Antonio Toral, Vassilis Papavassiliou, Prokopis
Prokopidis, & Josef van Genabith: Domain
adaptation of statistical machine translation using web-crawled resources: a
case study. EAMT 2012: Proceedings of
the 16th Annual Conference of the European Association for Machine Translation,
Trento, Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.145-152. [PDF, 201KB]
(2012) Reinhard Rapp, Serge
Sharoff, & Bogdan Babych: Identifying word
translations from comparable documents without a seed lexicon. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Víctor M.Sánchez-
(2012) Felipe Sánchez-Martínez, Rafael C.Carrasco,
Miguel A.Martínez-Prieto, & Joaquín Adiego: Generalized biwords for bitext
compression and translation spotting. Journal
of Artificial Intelligence Research 43; pp.389-418. [PDF, 418KB]
(2012) Xabier Saralegi, Iker
Manterola, & Iñaki San Vicente: Building a
Basque-Chinese dictionary by using English as pivot. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Rahma Sellami, Fatiha Sadat, & Lamia
Hadrich Belguith: Exploiting Wikipedia as a
knowledge base for the extraction of linguistic resources: application on
Arabic-French comparable corpora and bilingual lexicons. AMTA-2012: Fourth workshop on computational
approaches to Arabic script-based languages. Proceedings,
(2012) Sanja Štajner &
Ruslan Mitkov: Using comparable corpora to
track diachronic and synchronic changes in lexical density and lexical richness. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Akihiro Tamura,
(2012) Gregor Thurmair & Vera Aleksić: Creating term and lexicon entries from phrase
tables. EAMT 2012: Proceedings of the
16th Annual Conference of the European Association for Machine Translation,
Trento, Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.253-260. [PDF, 423KB]
(2012) Paola Valli: How
long is a piece of string? Concordance searches and user behavior
investigated. [Aslib 2012] Translating
and the Computer 34, 29-30 November 2012, One Birdcage Walk, London, UK;
11pp. [PDF, 359KB], presentation: 20
slides [PDF, 2909KB]
(2012) Ivan Vulić & Marie-Francine Moens: Detecting highly confident word translations from
comparable corpora without any prior knowledge. [EACL 2012] Proceedings of the 13th Conference of the European Chapter
of the Association for Computational Linguistics,
(2012)
(2011) Vamshi Ambati, Sanjika Hewavitharana, Stephan Vogel, & Jaime
Carbonell: Active learning with multiple
annotations for comparable data classification task. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Miquel Esplà-Gomis, Víctor M.Sánchez-Cartagena,
& Juan Antonio Pérez-Ortiz: Enlarging
monolingual dictionaries for machine translation with active learning and
non-expert users. [RANLP 2011]
Proceedings of Recent Advances in Natural Language Processing, Hissar,
Bulgaria, 12-14 September 2011; pp.339-346. [PDF, 148KB]
(2011) Miquel Esplà-Gomis,
Víctor M.Sánchez-Cartagena & Juan Antonio Pérez-Ortiz: Multimodal building of monolingual
dictionaries for machine translation by non-expert users. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011,
(2011) Darja Fišer & Nikola Ljubešić: Bilingual lexicon extraction from comparable
corpora for closely related languages.
[RANLP 2011] Proceedings of Recent
Advances in Natural Language Processing, Hissar, Bulgaria, 12-14 September
2011; pp.125-131. [PDF, 95KB]
(2011) Darja Fišer, Nikola Ljubešić, Špela Vintar, & Senja
Pollak: Building and using comparable corpora for
domain-specific bilingual lexicon extraction. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Elena Grasso, Piercarlo Rossi, & Andrea
Violato: Towards on-line knowledge sharing
dictionaries for European law: the Legal Taxonomy Syllabus 3.0. Translating
and the Computer 33, 17-18
November 2011,
(2011) Amir Hazem, Emmanuel Morin & Sebastian Peña Saldarriaga: Bilingual lexicon extraction from comparable corpora
as metasearch. ACL 2011: Proceedings
of the Fourth Workshop on Building and Using Comparable Corpora,
(2011) Sanjika Hewavitharana & Stephan Vogel: Extracting parallel phrases from comparable
data. ACL 2011: Proceedings of the
Fourth Workshop on Building and Using Comparable Corpora,
(2011) Matthias Huck, Saab Mansour, Simon Wiesler,
& Hermann Ney: Lexicon models for
hierarchical phrase-based machine translation. IWSLT 2011: Proceedings of the International Workshop on Spoken
Language Translation,
(2011) Johannes Knopp: Extending
a multilingual lexical resource by bootstrapping named entity classification
using Wikipedia’s category system. [IJCNLP 2011] Proceedings of the 5th Workshop on Cross Lingual Information Access,
(2011) Bo Li, Eric Gaussier, & Akiko Aizawa: Clustering comparable corpora for bilingual lexicon
extraction. ACL-HLT 2011: Proceedings
of the 49th Annual Meeting of the Association for Computational Linguistics:
Short papers,
(2011) Emmanuel Morin & Emmanuel Prochasson: Bilingual lexicon extraction from comparable corpora
enhanced with parallel corpora. ACL 2011:
Proceedings of the Fourth Workshop on Building and Using Comparable Corpora,
(2011) Seiji Okura, Yuji
Yamamoto,
(2011) Emmanuel Prochasson & Pascale Fung: Rare word translation extraction from aligned
comparable documents. ACL-HLT 2011:
Proceedings of the 49th Annual Meeting of the Association for Computational
Linguistics,
(2011) Markus Saers & Dekai Wu: Principled induction of phrasal bilexica. [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; pp.313-320. [PDF, 373KB]
(2010)
Nasredine Semmar, Christophe Servan, Gaël de Chalendar, Benoît Le Ny,
& Jean-Jacques Bouzaglou: A hybrid word
alignment approach to improve translation lexicons with compound words and
idiomatic expressions. Translating
and the Computer 32, 18-19 November 2010,
(2011) Petra Wolf, Ulrike Bernardi, Christian
Federmann, & Sabine Hunsicker: From
statistical term extraction to hybrid machine translation. [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; pp.225-232. [PDF, 329KB]; presentation,
24 slides [PDF]
(2010) Kathryn Baker, Michael Bloodgood, Bonnie
J.Dorr, Nathaniel W.Filardo, Lori Levin, & Christine Piatko: A modality lexicon and its use in automatic tagging.
LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Timothy Baldwin,
Jonathan Pool, & Susan M.Colowick: PanLex
and LEXTRACT: translating all words of all languages of the world. Coling 2010: 23rd International Conference
on Computational Linguistics, 23-27 August 2010, Beijing International
Convention Center, Beijing, China, Demonstrations
volume; pp.37-40. [PDF, 309KB]
(2010) Bruno Cartoni
& Marie-Aude Lefer: The MuLeXFoR database:
representing word-formation processes in a multilingual lexicographic
environment. LREC
2010: proceedings of the seventh
international conference on Language Resources and Evaluation, 17-23 May
2010,
(2010) Diptesh
Chatterjee, Sudeshna Sarkar, & Arpit Mishra: Co-occurrence graph based iterative bilingual
lexicon extraction from comparable corpora. [Coling 2010] Proceedings of the 4th Workshop on Cross Lingual
Information Access,
(2010) Josep Maria
Crego, Aurélien Max, & François Yvon: Local lexical adaptation in machine translation
through triangulation: SMT helping SMT. Coling
2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010)
Hercules Dalianis, Hao-chun Xing, & Xin Zhang: Creating a reusable English-Chinese parallel
corpus for bilingual dictionary construction. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
(2010) Do Thi Ngoc Diep, Laurent Besacier, &
Eric Castelli: Improved Vietnamese-French
parallel corpus mining using English language. Proceedings of the 7th International Workshop on Spoken Language
Translation, 2-3 December 2010,
(2010)
Steffen Eger & Ineta Sejane: Computing
semantic similarity from bilingual dictionaries. JADT 2010: 10th International Conference on Statistical Analysis of
Textual Data, 9-11 juin 2010,
(2010) Pascale Fung, Emmanuel Prochasson, & Simon Shi: Trillions of comparable documents. [LREC 2010] Proceedings
of the 3rd Workshop on Building and Using Comparable Corpora,
(2010) Benoît Gaillard, Malek Boualem, & Olivier Collin: Query translation using Wikipedia-based resources
for analysis and disambiguation. EAMT
2010: Proceedings of the 14th Annual conference of the European Association for
Machine Translation, 27-28 May 2010,
(2010) Filip Graliński:
Mining parenthetical translations for Polish-English lexica [abstract]. CICLING 2010: 11th International Conference on Intelligent Text
Processing and Computational Linguistics, March 21-27, 2010,
(2010) Matthias Huck, Martin Ratajczak, Patrick Lehnen,
& Hermann Ney: A comparison of various types
of extended lexicon models for statistical machine translation. AMTA 2010: the Ninth conference of the
Association for Machine Translation in the Americas, Denver, Colorado,
October 31 – November 4, 2010; 8pp. [PDF, 130KB]
(2010) Minwoo Jeong, Kristina Toutanova, Hisami Suzuki,
& Chris Quirk: A discriminative lexicon model
for complex morphology. AMTA 2010:
the Ninth conference of the Association for Machine Translation in the Americas,
Denver, Colorado, October 31 – November 4, 2010; 10pp. [PDF, 374KB]
(2010) Mitesh M.Khapra,
Saurabh Sohoney, Anup Kulkarni, & Pushpak Bhattacharyya: Value for money: balancing annotation effort,
lexicon building and accuracy for multilingual WSD. Coling 2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010) Amit Kirschenbaum & Shuly Wintner: A general method for creating a bilingual
transliteration dictionary. LREC
2010: proceedings of the seventh international
conference on Language Resources and Evaluation, 17-23 May 2010,
(2010) Adrien
Lardilleux, Julien Gosme, & Yves Lepage: Bilingual
lexicon induction: effortless evaluation of word alignment tools and production
of resources for improbable language pairs. [LREC 2010]: Proceedings of the Second Workshop on
African Language Technology, AFLAT 2010,
(2010) Bo Li & Eric
Gaussier: Improving corpus comparability for
bilingual lexicon extraction from comparable corpora. Coling 2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010) Wang Ling, Tiago Luís, João Graça, Luísa
Coheur & Isabel Trancoso: Towards a general
and extensible phrase-extraction algorithm. Proceedings of the 7th International Workshop on Spoken Language
Translation, 2-3 December 2010,
(2010) Reinhard Rapp
& Michael Zock: The noisier the better: identifying
multilingual word translations using a single monolingual corpus. [Coling 2010] Proceedings of the 4th
Workshop on Cross Lingual Information Access,
(2010) Reinhard Rapp
& Michael Zock: Utilizing citations of
foreign words in corpus-based dictionary generation. [Coling 2010] Proceedings of the
Second Workshop on NLP Challenges in the Information Explosion Era,
(2010) Nasredine
Semmar & Laib Meriama: Using a hybrid word
alignment approach for automatic construction and updating of Arabic to French
lexicons. LREC 2010: Workshop on
Language Resources and Human Language Technology for Semitic Languages,
(2010) Jakob Uszkoreit,
Jay M.Ponte, Ashok C.Popat, & Moshe Dubiner: Large scale parallel document mining for
machine translation. Coling 2010:
23rd International Conference on Computational Linguistics. Proceedings of
the conference, 23-27 August 2010,
(2010) David Vilar,
Daniel Stein, Matthias Huck, & Hermann Ney: Jane:
open source hierarchical translation, extended with reordering and lexicon
models. ACL 2010: Joint Fifth
Workshop on Statistical Machine Translation and MetricsMATR. Proceedings of
the workshop, 15-16 July 2010,
Limited domain see Domain restriction and specification
Limited resources see Scarce resources, Rapid development of MT
Low resourced languages see Scarce resources
Monolingual corpora
(2013) Yuki Arase & Ming Zhou: Machine translation detection from monolingual
web-text. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics,
(2013) An-Chang Hsieh, Hen-Hsen Huang, & Hsin-Hsi
Chen: Uses of monolingual in-domain corpora for
cross-domain adaptation with hybrid MT approaches. Proceedings of the Second Workshop on Hybrid Approaches to Translation,
(2013) Ann Irvine: Statistical machine translation in low
resource settings. [NAACL-HLT 2013]
Proceedings of the NAACL HLT 2013 Student Research Workshop, 13 June 2013,
(2013) Ann Irvine & Chris Callison-Burch: Supervised bilingual lexicon induction with
multiple monolingual signals. [NAACL-HLT
2013] The 2013 conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 9-14 June 2013,
(2013) Mike Lewis & Mark Steedman: Unsupervised induction of cross-lingual semantic
relations. [EMNLP 2013] Proceedings
of the 2013 Conference on Empirical Methods in Natural Language Processing,
Seattle, Washington, USA, 18-21 October 2013; pp.681-692. [PDF, 149KB]
(2013) Linda Mitchell, Johann Roturier, & Sharon
O’Brien: Community-based post-editing of
machine-translated content: monolingual vs. bilingual. Proceedings of MT
(2013) Vassilis Papavassiliou, Prokopis Prokopidis,
& Gregor Thurmair: A modular
open-source focused crawler for mining monolingual and bilingual corpora from
the web. Proceedings of the 6th Workshop on Building and Using Comparable
Corpora,
(2013) Majid Razmara, Maryam Siahbani, Gholamreza
Haffari, & Anoop Sarkar: Graph propagation
for paraphrasing out-of-vocabulary words in statistical machine translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics,
(2013) George Tambouratzis, Sokratis Sofianopoulos,
& Marina Vassiliou: Language-independent
hybrid MT with PRESEMT. Proceedings
of the Second Workshop on Hybrid Approaches to Translation,
(2013) George Tambouratzis, Marina Vassiliou, &
Sokratis Sofianopoulos: A review of the PRESEMT
project. Proceedings of the XIV
Machine Translation
(2013) Elke Teich, Stefania Degaetano-Ortlieb, Hannah
Kermes, & Ekaterina Lapshinova-Koltunski: Scientific
registers and disciplinary diversification: a comparable corpus approach. Proceedings of the 6th Workshop on Building
and Using Comparable Corpora,
(2013) Xuchen
(2013) Jiajun Zhang & Chengqing Zong: Learning a phrase-based translation model from
monolingual data with application to domain adaptation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics,
(2013) Guangyou Zhou, Fang Liu, Yang Liu, Shizhu He,
& Jun Zhao: Statistical machine translation
improves question retrieval in community question answering via matrix
factorization. ACL-2013: Proceedings
of the 51st Meeting of the Association for Computational Linguistics,
(2012) Houda Bouamor, Aurélien Max, & Anne Vilnat:
Validation of sub-sentential paraphrases
acquired from parallel monolingual corpora. [EACL 2012] Proceedings of the 13th Conference of the European Chapter
of the Association for Computational Linguistics,
(2012) Qing Dou & Kevin
Knight: Large scale decipherment for out-of-domain
machine translation. EMNLP-CoNLL
2012: Joint Conference on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning, Proceedings of the conference,
July 12-14, Jeju Island, Korea; pp.266-275. [PDF, 578KB]
(2012) Jie Jiang, Andy Way, Nelson Ng, Rejwanul Haque,
Mike Dillinger, & Jun Lu: Monolingual data
optimisation for bootstrapping SMT engines. AMTA-2012: Monolingual machine translation-2012 workshop.
Proceedings,
(2012) Adam Kilgarriff &
George Tambouratzis: The PRESEMT project.
[BUCC 2012] The 5th Workshop on Building
and Using Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) Alexandre Klementiev, Ann Irvine, Chris
Callison-Burch, & David Yarowsky: Toward
statistical machine translation without parallel corpora. [EACL 2012] Proceedings of the 13th Conference
of the European Chapter of the Association for Computational Linguistics,
(2012) Takanori Kusumoto & Tomoyosi Akiba: Statistical machine translation without a
source-side parallel corpus using word lattice and phrase extension. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) André Lynum, Erwin Marsi, Lars Bungum, &
Björn Gambäck: Disambiguating word translations with target language models. TSD 2012: 15th International Conference on
Text, Speech and Dialogue, Brno, Czech Republic, September 3-7, 2012; abstract #477, 1p. [HTML]
(2012) Toshiaki Nakazawa & Sadao Kurohashi: Alignment by bilingual generation and
monolingual derivation. Proceedings
of COLING 2012: Technical Papers, Mumbai, December 2012; pp.1963-1978.
[PDF, 1172KB]
(2012) Dávid Márk Nemeskey
& Eszter Simon: Automatically generated NE
tagged corpora for English and Hungarian. [ACL 2012] Proceedings of NEWS 2012 Named Entities Workshop, July
12, 2012, Jeju,
(2012) Malte Nuhn, Arne
Mauser, & Hermann Ney: Deciphering foreign
language by combining language models and context vectors. [ACL 2012] Proceedings of the 50th Annual
Meeting of the Association for Computational Linguistics, Jeju,
(2012) V.M.Sánchez-Cartagena, M.Esplà-Gomis,
F.Sánchez-Martínez, & J.A.Pérez-Ortiz: Choosing the correct paradigm for
unknown words in rule-based machine translation systems. In: Free/Open-Source
Rule-Based Machine Translation, ed. Cristina España-Bonet and Aarne Ranta. Proceedings of a Workshop held in
Gothenburg, 14-15 June, 2012; pp.27-39. [PDF, 502KB]
(2012) Víctor M.Sánchez-
(2012) Sokratis
(2012) Aleš Tamchyna, Petra
Galuščáková, Amir Kamran, Miloš Stanojević, & Ondřej Bojar: Selecting data for English-to-Czech machine
translation. WMT 2012: 7th Workshop
on Statistical Machine Translation. Proceedings of the workshop, June 7-8,
2012,
(2012) George Tambouratzis,
Michalis Troullinos, Sokratis Sofianopoulos, & Marina Vassiliou: Accurate phrase alignment in a bilingual
corpus for EBMT systems. [BUCC 2012]
The 5th Workshop on Building and Using
Comparable Corpora: “Language Resources for Machine Translation in
Less-Resourced Languages and Domains”,
LREC 2012 Workshop, 26 May 2012,
(2012) George Tambouratzis, Sokratis Sofianopoulos,
& Marina Vassiliou: Evaluating the
translation accuracy of a novel language-independent MT methodology. Proceedings of COLING 2012: Technical Papers,
Mumbai, December 2012; pp.2569-2583. [PDF, 340KB]
(2012) Jinsong Su, Hua Wu,
Haifeng Wang, Yidong Chen, Xiaodong Shi, Huailin Dong, & Qun Liu: Translation model adaptation for statistical machine
translation with monolingual topic information. [ACL 2012] Proceedings of the 50th Annual Meeting of the Association
for Computational Linguistics, Jeju,
(2012) George Tambouratzis, Marina Vassiliou, &
Sokratis Sofianopoulos: PRESEMT: pattern
recognition-based statistically enhanced MT. EACL Joint Workshop on Exploiting Synergies between Information
Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine
Translation (HyTra): Proceedings of the workshop, 23-24 April 2012,
Avignon, France; pp.65-68. [PDF, 170KB]
(2012) Sander Wubben, Antal
van den Bosch, & Emiel Krahmer: Sentence
simplification by monolingual machine translation. [ACL 2012] Proceedings of the 50th Annual Meeting of the Association
for Computational Linguistics, Jeju,
(2011) Vamshi Ambati,
Stephan Vogel, & Jaime Carbonell: Multi-strategy
approaches to active learning for statistical machine translation. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) Ondřej Bojar & Aleš Tamchyna: Forms wanted: training SMT on monolingual data.
Machine
Translation and Morphologically-rich Languages: Research Workshop of the Israel Science Foundation,
(2011) Ondřej Bojar &
Aleš Tamchyna: Improving translation model by
monolingual data. [WMT 2011] Proceedings
of the 6th Workshop on Statistical Machine Translation,
(2011) Han-Bin Chen, Hen-Hsen
Huang, Jengwei Tjiu, Ching-Ting Tan, & Hsin-Hsi Chen: Identification and translation of significant
patterns for cross-domain SMT applications. MT Summit XIII: the Thirteenth Machine Translation Summit
[organized by the] Asia-Pacific Association for Machine Translation (AAMT),
19-23 September 2011,
(2011) Patrik Lambert, Holger
Schwenk, Christophe Servan, & Sadaf Abdul-Rauf: Investigations on translation model adaptation using
monolingual data. [WMT 2011] Proceedings
of the 6th Workshop on Statistical Machine Translation,
(2011) Gennadi Lembersky, Noam
Ordan, & Shuly Wintner: Language models
for machine translation: original vs. translated texts. [EMNLP 2011] Proceedings of the 2011 Conference on
Empirical Methods in Natural Language Processing, Edinburgh, Scotland, UK,
July 27-31, 2011; pp.363-374. [PDF, 288KB]
(2011) Gennadi
Lembersky, Noam Ordan, & Shuly Wintner: Language
models for machine translation: original vs. translated texts. Machine Translation and Morphologically-rich
Languages: Research Workshop of the Israel Science Foundation,
(2011) Zhifei Li, Jason Eisner,
Ziyuan Wang, Sanjeev Khudanpur, & Brian Roark: Minimum
imputed risk: unsupervised discriminative training for machine translation.
[EMNLP 2011] Proceedings of the 2011
Conference on Empirical Methods in Natural Language Processing, Edinburgh,
Scotland, UK, July 27-31, 2011; pp.920-929. [PDF, 236KB]
(2011) Jeff Ma, Spyros Matsoukas, &
Richard Schwartz: Improving low-resource
statistical machine translation with a novel semantic word clustering algorithm.
MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011,
(2011) Erwin Marsi, André Lynum, Lars Bungum, &
(2011) Nasredine Semmar & Dhouha Bouamor: A new hybrid machine translation approach using
cross-language information retrieval and only target text corpora. [LIHMT] International Workshop on Using Linguistic
Information for Hybrid Machine Translation, 18th November 2011, Universitat
Politècnica de Catalunya,
(2011) Rui Wang & Chris Callison-Burch: Paraphrase
fragment extraction from monolingual comparable corpora. ACL 2011: Proceedings of the Fourth Workshop
on Building and Using Comparable Corpora,
(2011) Jia Xu & Weiwei
Sun: Generating virtual parallel corpus: a
compatibility centric method. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2010) Wilker Aziz, Marc
Dymetman, Shachar Mirkin, Lucia Specia, Nicola Cancedda, & Ido Dagan: Learning an expert from human annotations in
statistical machine translation: the case of out-of-vocabulary words. EAMT 2010: Proceedings of the 14th Annual
conference of the European Association for Machine Translation, 27-28 May
2010,
(2010) Chen Yuncong
& Pascale Fung: Unsupervised synthesis of
multilingual Wikipedia articles. Coling
2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010) Rashmi
Gangadharaiah, Ralf D.Brown, & Jaime Carbonell: Monolingual distributional profiles for
word substitution in machine translation. Coling 2010: 23rd International Conference on Computational Linguistics,
23-27 August 2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.320-328. [PDF, 605KB]
(2010) Zhanyi Liu,
Haifeng Wang, Hua Wu, & Sheng Li: Improving
statistical machine translation with monolingual collocation. ACL 2010: the 48th Annual Meeting of the
Association for Computational Linguistics,
(2010) Yuval Marton: Improved
statistical machine translation with hybrid phrasal paraphrases derived from
monolingual text and shallow lexical resource. AMTA 2010: the Ninth conference of the Association for Machine
Translation in the Americas, Denver, Colorado, October 31 – November 4,
2010; 10pp. [PDF, 293KB]
(2010) Smruthi Mukund,
Debanjan Ghosh, & Rohini K.Srihari: Using
cross-lingual projections to generate semantic role labeled corpus for Urdu – a
resource poor language. Coling 2010:
23rd International Conference on Computational Linguistics. Proceedings of
the conference, 23-27 August 2010,
(2010) Reinhard Rapp
& Michael Zock: The noisier the better:
identifying multilingual word translations using a single monolingual corpus.
[Coling 2010] Proceedings of the 4th
Workshop on Cross Lingual Information Access,
(2010) Stefan Riezler & Yi Liu: Query rewriting using monolingual statistical
machine translation. Computational
Linguistics 36 (3), pp. 569-582 [PDF, 145KB]
(2010) Xabier Saralegi & Maddalen Lopez de
Lacalle: Dictionary and monolingual
corpus-based query translation for Basque-English CLIR. LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Hiroyuki Shindo,
Akinori Fujino, & Masaaki Nagata: Word
alignment with synonym regularization. ACL
2010: the 48th Annual Meeting of the Association for Computational Linguistics,
(2010) Yanli Sun, Sharon
O’Brien, Minako O’Hagan, & Fred Hollowood: A
novel statistical pre-processing model for rule-based machine translation
system. EAMT 2010: Proceedings of the 14th Annual conference of the European
Association for Machine Translation, 27-28 May 2010,
(2010) Yulia Tsvetkov
& Shuly Wintner: Extraction of
multi-word expressions from small parallel corpora. Coling 2010: 23rd International Conference on Computational Linguistics,
23-27 August 2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.1256-1264. [PDF,
257KB]
(2010) Gae-won You, Seung-won Hwang, Young-In Song, Long Jiang, &
Zaiqing Nie: Mining name translations from entity
graph mapping. [EMNLP 2010] Proceedings of the 2010 Conference on
Empirical Methods in Natural Language Processing, MIT, Massachusetts, USA,
9-11 October 2010; pp.430-439. [PDF, 1542KB]
Multilingual corpora
(2015) Martin Benjamin, Amar Mukunda & Jeff Allen:
Kamusi pre-D-source-side
disambiguation and a sense aligned multilingual lexicon. Proceedings of the 37th Conference
Translating and the Computer, London, November 26-27, 2015; pp.27-32. [PDF,
188KB]
(2015) Zied Elloumi, Hervé Blanchon, Gilles Serasset,
& Laurent Besacier: METEOR for multiple
target languages using DBnary. MT
Summit XV, October 30 – November 3, 2015, Miami, Florida, USA. Proceedings
of MT Summit XV: vol.1: MT Researchers’ Track; pp.80-89. [PDF, 569KB]
(2014) Judit Ács: Pivot-based
multilingual dictionary building using Wiktionary. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1938-1942. [PDF, 81KB]
(2014) Shyam S.Agrawal, Abhimane, Shweta Bansal, &
Minakshi Mahakshi: Statistical analysis of multilingual
text corpus and development of language models. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.2436-2440. [PDF, 815KB]
(2014) Hernani Costa, Gloria Corpas Pastor, &
Miriam Seghiri: iCompileCorpora: a web-based
application to semi-automatically compile multilingual comparable corpora. Translating and the Computer 36: proceedings.
Asling: International Society for Advancement in Language Technology, 27-28
November 2014; pp.51-55. [PDF, 119KB]
(2014) Maud Ehrmann, Francesco Cecconi, Daniele
Vannella, John McCrae, Philipp Cimiano, & Roberto Navigli: Representing multilingual data as linked data: the
case of BabelNet 2.0. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.401-408. [PDF, 674KB]
(2014) Tatiana Erekhinskaya, Meghana Satpute, &
Dan Moldovan: Multilingual eXtended
WordNet Knowledge Base: semantic parsing and translation of glosses. LREC 2014:
Ninth International Conference on Language Resources and Evaluation, May
26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.2990-2994. [PDF, 82KB]
(2014) Miquel Esplà-Gomis, Filip Klubička, Nikola
Ljubešić, Sergio Ortiz-Rojas, Vassilis Papavassiliou, & Prokopis
Prokopidis: Comparing two acquisition
systems for automatically building an English-Croatian parallel corpus from
multilingual websites. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1252-1258. [PDF, 158KB]
(2014) Najeh Hajlaoui, David Kolovratnik, Jaakko
Väyrynen, Ralf Steinberger, & Daniel Varga: DCEP
– digital corpus of the European Parliament. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.3164-3171. [PDF, 252KB]
(2014) Valérie Hanoka & Benoît Sagot: YaMTG: an open-source heavily multilingual
translation graph extracted from wiktionaries and parallel corpora. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.3179-3186. [PDF, 245KB]
(2014) Lars Hellan, Dorothee Beermann, Tore Bruland,
Mary Esther Kropp Dakubu, & Montserrat Marimon: MultiVal
– towards a multilingual valence lexicon. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.2478-2485. [PDF, 247KB]
(2014) Guillaume Jacquet, Maud Ehrmann, & Ralf
Steinberger: Clustering of multi-word named
entity variants: multilingual evaluation. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.2696-2700. [PDF, 217KB]
(2014) David Kamholz, Jonathan Pool, & Susan
M.Colowick: PanLex: building a resource for
panlingual lexical translation. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.2696-2700. [PDF, 145KB]
(2014) Thomas Mayer & Michael Cysouw: Creating a massively parallel Bible corpus. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.3158-3163. [PDF, 575KB]
(2014) Anita Rácz, István Nagy T., & Veronika Vincze: 4FX: light verb constructions in a multilingual
parallel corpus. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.710-715. [PDF, 140KB]
(2013) Judit Ács, Katalin Pajkossy, & András
Kornai: Building basic vocabulary across 40
languages. Proceedings of the 6th
Workshop on Building and Using Comparable Corpora,
(2013) Yegin Genc, Elizabeth A.Lennon, Winter Mason,
& Jeffrey V.Nickerson: Building ontologies
from collaborative knowledge bases to search and interpret multilingual corpora. Proceedings
of the 6th Workshop on Building and Using Comparable Corpora,
(2013) Michael Matuschek, Christian M.Meyer, & Iryna
Gurevych: Multilingual knowledge in aligned
Wiktionary and OmegaWiki for translation applications. Translation: Computation, Corpora,
Cognition 3 (1), June 2013; pp.87-118. [PDF, 2898KB]
(2013) Motaz Saad, David Langlois, & Kamel Smaili:
Comparing multilingual comparable articles based
on opinions. Proceedings of the 6th
Workshop on Building and Using Comparable Corpora,
(2012) Eleftherios Avramidis, Marta R.Costa-jussà,
Christian Federmann, Maite Melero, Pavel Pecina, & Josef van Genabith: A richly annotated, multilingual parallel
corpus for hybrid machine translation. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Cristina Bosco, Manuela Sanguinetti, &
Leonardo Lesmo: The Parallel-TUT: a multilingual
and multiformat treebank. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Bruno Cartoni & Thomas Meyer: Extracting directional and comparable corpora from
a multilingual corpus for translation studies. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Yu Chen & Andreas Eisele: MultiUN v2: UN documents with multilingual alignments. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Susanne Jekat: Multilingual information
management for special purposes [abstract].
In: Crosslingual
Language Technology in service of an integrated multilingual Europe,
4-5 May 2012,
(2012)
(2012) Gerard de Melo &
Gerhard Weikum: UWN: a large multilingual lexical
knowledge base. [ACL 2012]
Proceedings of the 50th Annual Meeting of the Association for Computational
Linguistics, Jeju,
(2012) Hervé Saint-Amand, Jason Smith, & Magdalena
Plamada: Parallel corpus
extraction from CommonCrawl. Machine
Translation Marathon 2012 September 3-8,
(2012) Oscar Täckström: Nudging the envelope of direct transfer
methods for multilingual named entity recognition. NAACL-HLT Workshop on the Induction of Linguistic Structure,
(2011) Steven Abney & Steven Bird: Towards
a data model for the universal corpus. ACL
2011: Proceedings of the Fourth Workshop on Building and Using Comparable
Corpora,
(2011) Shane Bergsma, David Yarowsky, & Kenneth Church: Using large monolingual and bilingual corpora to
improve coordination disambiguation. ACL-HLT
2011: Proceedings of the 49th Annual Meeting of the Association for
Computational Linguistics,
(2011) Michael Elhadad, Meni Adler, Yoav Goldberg, & Rafi Cohen: Topic models for morphologically rich languages and
their usage to explore multilingual corpora [abstract]. Machine
Translation and Morphologically-rich Languages: Research Workshop of the Israel Science Foundation,
(2011) Kriste Krstovski & David A.Smith: A minimally supervised approach for detecting and
ranking document translation pairs.
[WMT 2011] Proceedings of the 6th
Workshop on Statistical Machine Translation,
(2011) Bin Lu, Ka Po Chow,
& Benjamin K.Tsou: The cultivation of a
Chinese-English-Japanese trilingual parallel corpus from comparable patents.
MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011,
(2011) Matteo Negri, Luisa Bentivogli, Yashar Mehdad, Danilo
Giampiccolo, & Alessandro Marchetti: Divide
and conquer: crowdsourcing the creation of cross-lingual textual entailment
corpora. [EMNLP 2011] Proceedings of the 2011 Conference on
Empirical Methods in Natural Language Processing, Edinburgh, Scotland, UK,
July 27-31, 2011; pp.670-679. [PDF, 493KB]
(2011) Violeta Seretan & Eric Wehrli: FipsCoView:
on-line visualisation of collocations extracted from multilingual parallel
corpora. Proceedings of the Workshop
on Multiword Expressions: from Parsing and Generation to the Real World (MWE
2011),
(2011) Johanka Spoustová & Miroslav Spousta: Comparable fora. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Wolfgang Täger: The
sentence-aligned European patent corpus. [EAMT 2011]: proceedings of the 15th conference of the European
Association for Machine Translation, 30-31 May 2011, Leuven, Belgium; eds.
Mikel L.Forcada, Heidi Depraetere, Vincent Vandeghinste; pp.177-184. [PDF,
234KB]
(2011) ACCURAT:
analysis and evaluation of comparable corpora for under resourced areas of
machine translation. (European Machine Translation Projects.) [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; p.323. [PDF, 287KB]
(2010) Thomas Eckart & Uwe Quasthoff: Statistical
corpus and language comparison using comparable corpora. [LREC 2010] Proceedings
of the 3rd Workshop on Building and Using Comparable Corpora,
(2010) Andreas Eisele
& Yu Chen: MultiUN: a multilingual corpus
from United Nation documents. LREC
2010: proceedings of the seventh
international conference on Language Resources and Evaluation, 17-23 May
2010,
(2010) Tomaž Erjavec: MULTEXT-East version 4:
multilingual morphosyntactic specifications, lexicons and corpora. LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Pablo Gamallo Otero & Isaac González López: Wikipedia as multilingual source of comparable
corpora. [LREC 2010] Proceedings of the 3rd Workshop on Building and
Using Comparable Corpora,
(2010) Tomoharu Iwata, Daichi Mochihashi, & Hiroshi Sawada: Learning common grammar from multilingual corpus.
ACL 2010: the 48th Annual Meeting of the
Association for Computational Linguistics,
(2010) Adam Kilgarriff: Comparable corpora
within and across languages, word frequency lists and the KELLY project.
[LREC 2010] Proceedings of the 3rd Workshop on Building and Using Comparable
Corpora,
(2010) Petr Knoth, Trevor Collins, Elsa Sklavounou, & Zdenek
Zdrahal: Facilitating cross-language retrieval
and machine translation by multilingual domain ontologies. [LREC 2010] Workshop on Supporting eLearning with
Language Resources and Semantic Data,
(2010) Adrien
Lardilleux, Julien Gosme, & Yves Lepage: Bilingual
lexicon induction: effortless evaluation of word alignment tools and production
of resources for improbable language pairs. [LREC 2010]: Proceedings of the Second Workshop on
African Language Technology, AFLAT 2010,
(2010) Els Lefever & Véronique Hoste: Construction of a benchmark data set for
cross-lingual word sense disambiguation. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
(2010) Simon Mille & Leo Wanner: Syntactic dependencies for multilingual and
multilevel corpus annotation. LREC
2010: proceedings of the seventh
international conference on Language Resources and Evaluation, 17-23 May
2010,
(2010)
Gudrun Rawoens: Multilingual corpora in
cross-linguistic research: focus on the compilation of a Dutch-Swedish parallel
corpus. JADT 2010: 10th International
Conference on Statistical Analysis of Textual Data, 9-11 juin 2010,
(2010) Fei Xia, Carrie Lewis, & William
D.Lewis: The problems of language identification
within hugely multilingual data sets.
LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Martin Volk, Noah Bubenhofer, Adrian
Althaus, Maya Bangerter, Lenz Furrer, & Beni Ruef: Challenges in building a multilingual Alpine heritage
corpus. LREC 2010: proceedings of
the seventh international conference on
Language Resources and Evaluation, 17-23 May 2010,
Ontologies
(2014) Anabela
(2014) Maud Ehrmann, Francesco Cecconi, Daniele
Vannella, John McCrae, Philipp Cimiano, & Roberto Navigli: Representing multilingual data as linked data: the
case of BabelNet 2.0. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.401-408. [PDF, 674KB]
(2014) Jorge Gracia, Elena Montiel-Ponsoda, Daniel
Vila-Suero, & Guadalupe Aguado-de-Cea: Enabling
language resources to expose translations as linked data on the Web. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.409-413.
[PDF, 346KB]
(2013) Mihael Arcan & Paul Buitelaar: MONNET: multilingual ontologies for networked
knowledge. Proceedings of the XIV
Machine Translation
(2013) Mihael Arcan & Paul Buitelaar: Ontology label translation. [NAACL-HLT 2013] Proceedings of the NAACL
HLT 2013 Student Research Workshop, 13 June 2013,
(2013) Maria Pia di Buono, Johanna Monti, Mario
Monteleone, & Federica Marano: Multi-word
processing in an ontology-based cross-language information retrieval model for
specific domain collections. [MT
(2013) Yegin Genc, Elizabeth A.Lennon, Winter Mason,
& Jeffrey V.Nickerson: Building ontologies
from collaborative knowledge bases to search and interpret multilingual corpora. Proceedings
of the 6th Workshop on Building and Using Comparable Corpora,
(2013) Meritxell Gonzàlez, Maria Mateva, Ramona
Enache, Cristina España, Lluís Màrquez, Borislav Popov, & Aarne Ranta: MT techniques in a retrieval system of
semantically enriched patents. Proceedings
of the XIV Machine Translation Summit, Nice, September 2-6, 2013; ed.
K.Sima’an, M.L.Forcada, D.Grasmick, H.Depraetere, A.Way; pp.273-280. [PDF,
908KB]
(2013) Clara López Rodríguez, Juan Antonio Prieto
Velasco, & Maribel Tercedor Sánchez: Multimodal
representation of specialised knowledge in ontology-based terminological
databases: the case of EcoLexicon. Journal
of Specialised Translation 20, July 2013; pp.49-67. [PDF, 336KB]
(2013) XLIKE:
cross-lingual knowledge extraction (XLike). Proceedings of the XIV Machine Translation Summit, Nice, September
2-6, 2013; ed. K.Sima’an, M.L.Forcada, D.Grasmick, H.Depraetere, A.Way; p.451.
[PDF, 191KB]
(2012) Mihael Arcan, Christian Federmann, & Paul
Buitelaar: Experiments with term translation.
Proceedings of COLING 2012: Technical
Papers, Mumbai, December 2012; pp.67-82. [PDF, 178KB]
(2012) Mihael Arcan, Paul
Buitelaar, & Christian Federmann: Using
domain-specific and collaborative resources for term translation. SSST-6, Sixth Workshop on Syntax, Semantics
and Structure in Statistical Translation, Jeju,
(2012) Kartik Asooja, Jorge Gracia, Nitish Aggarwal,
& Asunción Gómez Pérez: Using cross-lingual
explicit semantic analysis for improving ontology translation. COLING 2012: Second Workshop on Applying
Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT
(ML4HMT), Mumbai, December 2012; pp.25-35. [PDF, 143KB]
(2012) Gerhard Budin: Terminological ontologies in
multi-lingual cross-domain communities of practice [abstract]. In: Crosslingual
Language Technology in service of an integrated multilingual Europe,
4-5 May 2012,
(2012) Isabel Durán Muñoz, Gloria Corpas Pastor & Le
An Ha: ProTermino: a comprehensive
web-based terminological management tool based on knowledge representation.
[Aslib 2012] Translating and the Computer
34, 29-30 November 2012, One Birdcage Walk,
(2012) Roger Granada, Lucelene Lopes, Carlos Ramisch,
Cassia Trojahn, Renata Vieira, & Aline Villavicencio: A comparable corpus based on aligned multilingual
ontologies. [ACL 2012] Proceedings of the
First Workshop on Multilingual Modeling, Jeju,
(2012) Oliver Kutz, Christoph Lange, Till Mossakowski,
C.Maria Keet, Fabian Neuhaus, & Michael Gruninger: The Babel of the Semantic Web tongues – in search of
the Rosetta stone of interoperability. Semantic
Web Conference 2012; 6pp. [PDF, 813KB]
(2012) Cristina Vertan: Two approaches for integrating translation and
retrieval in real applications. EACL
Joint Workshop on Exploiting Synergies between Information Retrieval and
Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation
(HyTra): Proceedings of the workshop, 23-24 April 2012, Avignon, France;
pp. 59-64. [PDF, 226KB]
(2012) Manuela Yapomo, Gloria
Corpas, & Ruslan Mitkov: CLIR- and
ontology-based approach for bilingual extraction of comparable documents. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2011) Fumiko Kano Glückstad: Application
of classical psychology theory to terminological ontology alignment. Proceedings of the 8th international NLPSC
workshop. Special theme: Human-machine interaction in translation,
Copenhagen Business School, 20-21 August 2011; ed.Bernadette Sharp, Michael
Zock, Michael Carl, Arnt Lykke Jakobsen (Copenhagen Studies in Language 41),
Frederiksberg: Samfundslitteratur, 2011; pp.227-238. [PDF, 1187KB]
(2011) John McCrae, Maurizio Espinoza, Elena Montiel-Ponsoda, Guadalupe
Aguado-de-Cea, & Philipp Cimiano: Combining
statistical and semantic approaches to the translation of ontologies and
taxonomies. Proceedings of SSST-5,
Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation,
ACL HLT 2011, Portland, Oregon, USA, June 2011; pp.116-125. [PDF, 576KB]
(2011) Junichi Tsujii: Resource-rich research on
natural language processing and understanding. Keynote at: IWSLT 2011: Proceedings of the International
Workshop on Spoken Language Translation, San Francisco, December 8-9, 2011,
ed. Marcello Federico, Mei-Yuh Hwang, Margit Rödder, Sebastian Stüker
(2011) MONNET: multilingual
ontologies for networked knowledge. (European Machine Translation
Projects.) [EAMT 2011]: proceedings of
the 15th conference of the European Association for Machine Translation,
30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere,
Vincent Vandeghinste; p.343. [PDF, 36KB]
(2010) Gosse Bouma: Cross-lingual ontology alignment using EuroWordNet
and Wikipedia. LREC 2010: proceedings
of the seventh international conference on
Language Resources and Evaluation, 17-23 May 2010,
(2010) Helena de
Medeiros Caseli, Bruno Akio Sugiyama, & Junia Coutinho Anacleto: Using common sense to generate culturally
contextualized machine translation. Proceedings
of the NAACL HLT 2010 Young Investigators Workshop on Computational Approaches
to Languages of the Americas,
(2010) Petr Knoth, Trevor Collins, Elsa Sklavounou, & Zdenek
Zdrahal: Facilitating cross-language retrieval
and machine translation by multilingual domain ontologies. [LREC 2010] Workshop on Supporting eLearning with
Language Resources and Semantic Data,
Open source
(2015) Christophe Servan, Ngoc Tien Le, Ngoc Quang
Luong, Benjamin Lecouteux, & Laurent Besacier: An open-source
toolkit for word-level confidence estimation in machine translation. [IWSLT 2015] Proceedings of the
International Workshop on Spoken Language Translation, December 3-4, 2015,
Da Nang, Vietnam; pp.196-203. [PDF, 3.2MB]
(2014) Krasimir Angelov: Bootstrapping open-source English-Bulgarian
computational dictionary. LREC 2014:
Ninth International Conference on Language Resources and Evaluation, May
26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.1018-1023. [PDF, 158KB]
(2014) Grégoire Détrez, Víctor M.Sánchez-Cartagena,
& Aarne Ranta: Sharing resources between
free/open-source rule-based machine translation systems: Grammatical Framework
and Apertium. LREC 2014: Ninth
International Conference on Language Resources and Evaluation, May 26-31,
2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland; ed.
Nicoletta Calzolari et al.; pp.4394-4400. [PDF, 146KB]
(2014) Maud Ehrmann, Francesco Cecconi, Daniele
Vannella, John McCrae, Philipp Cimiano, & Roberto Navigli: Representing multilingual data as linked data: the
case of BabelNet 2.0. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.401-408. [PDF, 674KB]
(2014) Marcello Federico, Nicola Bertoldi, Marco
Trombetti, & Alessandro Cattelan: MateCat: an open source CAT tool for MT post-editing. AMTA 2014: proceedings of the eleventh conference of
the Association for Machine Translation in the Americas, Vancouver, BC, October
22-26; Tutorials, 98 slides
(2014) Spence Green, Daniel Cer, & Christopher
D.Manning: Phrasal: a toolkit for new directions
in statistical machine translation. [WMT 2014] Proceedings of the Ninth Workshop on Statistical Machine Translation,
(2014) David Kamholz, Jonathan Pool, & Susan
M.Colowick: PanLex: building a resource for
panlingual lexical translation. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.2696-2700. [PDF, 145KB]
(2014) Juan Luo & Yves Lepage: Production of phrase tables in 11 European languages
using an improved sub-sentential aligner. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.664-669. [PDF, 206KB]
(2014) Stelios Piperidis, Harris Papageorgiou,
Christian Spurk, Georg Rehm, Khalid Choukri, Olivier Hamon, Nicoletta
Calzolari, Riccardo del Gratta, Bernardo Magnini, & Christian Girardi: META-SHARE: one year after. LREC 2014: Ninth International Conference on
Language Resources and Evaluation, May 26-31, 2014 Harpa Concert Hall and
Conference Center, Reykjavik, Iceland; ed. Nicoletta Calzolari et al.;
pp.1208-1211. [PDF, 861KB]
(2014) Alex Rudnick, Taylor Skidmore, Alberto
Samaniego, & Michael Gasser: Guampa: a
toolkit for collaborative translation. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1659-1663. [PDF, 284KB]
(2014) Lane Schwartz: An open
source desktop post-editing tool.
AMTA 2014: proceedings of the eleventh conference of the Association for
Machine Translation in the Americas, Vancouver, BC, October 22-26; Workshop on Post-editing Technology and
Practice (WPTP-3); p. 122. [PDF, 103KB]
(2014) Antonio Toral, Raphael Rubino, Miquel
Esplà-Gomis, Tommi Pirinen, Andy Way, & Gema Ramírez-Sánchez: Extrinsic
evaluation of web-crawlers in machine translation: a study on Croatian-English
for the tourism domain. Proceedings of the 17th annual conference of the
European Association for Machine Translation, EAMT 2014, Dubrovnik, Croatia,
16th-18th June 2014, edited by Marko Tadić, Philipp Koehn, Johann
Roturier, Andy Way; pp. 221-224. [PDF, 345KB]
(2014) Jonathan North Washington, Ilnar Salimzyanov,
& Francis M.Tyers: Finite-state
morphological transducers for three Kypchak languages. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.3378-3385. [PDF, 1727KB]
(2013) Christian Hardmeier, Sara Stymne, Jörg
Tiedemann, & Joakim Nivre: Docent: a
document-level decoder for phrase-based statistical machine translation. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, System demonstrations,
(2013) Vassilis Papavassiliou, Prokopis Prokopidis,
& Gregor Thurmair: A modular
open-source focused crawler for mining monolingual and bilingual corpora from
the web. Proceedings of the 6th Workshop on Building and Using Comparable
Corpora,
(2013) Matt Post,
Juri Ganitkevitch, Luke Orland, Jonathan Weese, Yuan Cao & Chris
Callison-Burch: Joshua 5.0: sparser, better, faster, server. WMT 2013: 8th Workshop on Statistical
Machine Translation, Proceedings of the Workshop, August 8-9, 2013,
(2013) Kenneth Heafield, Ivan Pouzyrevsky, Jonathan
H.Clark, & Philipp Koehn: Scalable modified
Kneser-Ney language model estimation.
ACL-2013: Proceedings of the 51st
Meeting of the Association for Computational Linguistics, Short papers,
Sofia, Bulgaria, August 4-9 2013; pp.690-696. [PDF, 212KB]
(2013) Graham Neubig: Travatar:
a forest-to-string machine translation engine based on tree transducers. ACL-2013: Proceedings of the 51st Meeting of
the Association for Computational Linguistics, System demonstrations,
(2013) Ilnar Salimzyanov, Jonathan North Washington,
& Francis Morton Tyers: A
free/open-source Kazakh-Tatar machine translation system. Proceedings of the XIV Machine Translation
Summit, Nice, September 2-6, 2013; ed. K.Sima’an, M.L.Forcada, D.Grasmick,
H.Depraetere, A.Way; pp.175-182. [PDF, 327KB]
(2013) Lucia Specia, Kashif Shah, Jose G.C.de Souza,
& Trevor Cohn: QuEst – a translation quality
estimation framework. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, System demonstrations,
(2013) MosesCore:
Moses open source evaluation and support co-ordination for outreach and
exploitation. Proceedings of the XIV
Machine Translation
(2012) Wilker Aziz & Lucia Specia: PET: a standalone tool for assessing machine
translation through post-editing. [Aslib 2012] Translating and the Computer 34, 29-30 November 2012, One Birdcage
Walk, London, UK; 5pp. [PDF, 446KB]; presentation
by Lucia Specia: 54 slides [PDF, 694KB]
(2012) Jan Berka, Ondřej Bojar, Mark Fishel, Maja
Popović, & Daniel Zeman: Automatic MT error
analysis: Hjerson helping Addicter. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Christian Federmann: Appraise: an open-source toolkit for manual
evaluation of MT output. Prague
Bulletin of Mathematical Linguistics 98, October 2012; pp.25-35. [PDF,
487KB]
(2012) Christian Federmann: Appraise. Machine Translation Marathon 2012 September 3-8,
(2012) Christian Federmann, Ioanna Giannopoulou,
Christian Girardi, Olivier Hamon, Dimitris Mavroeidis, Salvatore Minutoli,
& Marc Schröder: META-SHARE v2: an open
network of repositories for language resources including data and tools. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Matthias Huck,
Jan-Thorsten Peter, Markus Freitag, Stephan Peitz, & Hermann Ney: Hierarchical phrase-based translation with Jane 2.
Prague Bulletin of Mathematical
Linguistics 98, October 2012; pp.37-50. [PDF, 168KB]; presentation, 23 slides [PDF of PPT,
242KB]
(2012) Philipp Koehn & Hieu Hoang: Open source
statistical machine translation: Moses, machine translation with open source
software. [Tutorial at] AMTA-2012: the
Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Hai-Son Le, Thomas
Lavergne, Alexandre Allauzen, Marianna Apidianaki, Li Gong, Aurélien Max, Artem
Sokolov, Guillaume Wisniewski, & François Yvon: LIMSI
@ WMT’12. WMT 2012: 7th Workshop on Statistical Machine Translation.
Proceedings of the workshop, June 7-8, 2012,
(2012) Aingeru Mayor, Mans Hulden, & Gorka Labaka:
Developing an open-source FST grammar for verb
chain transfer in a Spanish-Basque MT system. Proceedings of the 10th International Workshop on Finite State Methods
and Natural Language Processing, Donostia-San Sebastián, July 23-25, 2012;
pp.65-69. [PDF, 157KB]
(2012) Maja Popović: rgbF: an open source tool for n-gram based
automatic evaluation of machine translation output. Prague Bulletin of Mathematical Linguistics 98, October 2012;
pp.99-108. [PDF, 126KB]
(2012) V.M.Sánchez-Cartagena, F.Sánchez-Martínez,
& J.A.Pérez-Ortiz: An
open-source toolkit for integrating shallow-transfer rules into phrase-based
statistical machine translation. In: Free/Open-Source
Rule-Based Machine Translation, ed.Cristina España-Bonet and Aarne Ranta. Proceedings of a Workshop held in
Gothenburg, 14-15 June, 2012; pp.41-54. [PDF, 620KB]
(2012) Max Silberztein, Tamás Váradi, & Marko
Tadić: Open source multi-platform
NooJ for NLP. Proceedings of COLING
2012: Demonstration Papers, Mumbai, December 2012; pp. 401-408. [PDF,
557KB]
(2012) Andrejs Vasiļjevs,
Markus Forsberg, Tatiana Gornostay, Dorte H.Hansen, Kristin M.Jóhannsdóttir,
Krister Lindén, Gunn I.Lyse, Lene Offersgaard, Ville Oksanen, Sussi Olsen,
Bolette S.Pedersen, Eiríkur Rögnvaldsson, Roberts Rozis, Inguna Skadiņa,
& Koenraad De Smet: Creation of an open
shared language resource repository in the Nordic and Baltic countries. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Xianchao Wu, Takuya
Matsuzaki, & Jun’ichi Tsujii: Akamon: an open source
toolkit for tree/forest-based statistical machine translation. [ACL 2012] Proceedings of the 50th Annual
Meeting of the Association for Computational Linguistics, Jeju,
(2012) Joern Wuebker, Matthias Huck, Stephan Peitz,
Malte Nuhn, Markus Freitag, Jan-Thorsten Peter, Saab Mansour, & Hermann
Ney: Jane 2: open source phrase-based and
hierarchical statistical machine translation. Proceedings of COLING 2012: Demonstration Papers, Mumbai, December
2012; pp. 483-491. [PDF, 160KB]
(2012) Tong Xiao, Jingbo Zhu,
Hao Zhang, & Qiang Li: NiuTrans: an open
source toolkit for phrase-based and syntax-based machine translation. [ACL 2012] Proceedings of the 50th Annual
Meeting of the Association for Computational Linguistics, Jeju,
(2012) MosesCore:
Moses open source evaluation and support co-ordination for outreach and
exploitation. [Project paper at] EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; p.201. [PDF, 76KB]
(2011) Steven Abney & Steven Bird: Towards
a data model for the universal corpus. ACL
2011: Proceedings of the Fourth Workshop on Building and Using Comparable
Corpora,
(2011) Martha Dís Brandt, Hrafn Loftsson, Hlynur
Sigurþórsson, & Francis M.Tyers: Apertium-IceNLP:
a rule-based Icelandic to English machine translation system. [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; pp.217-224. [PDF, 332KB]; presentation,
29 slides [PDF]
(2011) Josep M.Crego, François Yvon,
& José B.Mariño: Ncode: an open source
bilingual n-gram SMT toolkit. Sixth Machine Translation Marathon, 5-10 September 2011,
(2011) Philipp Koehn: Moses statistical machine translation system. META-FORUM
2011: Solutions for multilingual
Europe, June 27/28 2011, Hotel Marriott,
(2011) Aaron B.Phillips & Ralf D.Brown: Training machine translation with a second-order
Taylor approximation of weighted translation instances. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association
for Machine Translation (AAMT), 19-23 September 2011,
(2011) Stelios Piperidis: META-SHARE: an open resource exchange
infrastructure for stimulating research and innovation. META-FORUM
2011: Solutions for multilingual
Europe, June 27/28 2011, Hotel Marriott,
(2011) Maja Popović: Hjerson: an open source tool for automatic error
classification of machine translation output. Sixth Machine Translation
Marathon, 5-10
September 2011,
(2011) Felipe Sánchez-Martínez and Juan Antonio
Pérez-Ortiz (eds.): Proceedings of the Second
International Workshop on Free/Open-Source Rule-Based Machine Translation,
20-21 January 2011,
(2011) Daniel Stein, David Vilar, Stephan
Peitz, Markus Freitag, Matthias Huck, & Hermann Ney: A guide to Jane, an open source hierarchical
translation toolkit. Prague Bulletin of Mathematical Linguistics, no.95, April 2011; pp.5-18.
[PDF, 192KB]
(2011) Andrejs Vasiļjevs, Raivis Skadiņš,
& Jörg Tiedemann: LetsMT!: cloud-based
platform for building user tailored machine translation engines. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011, Xiamen, China; pp.507-511. [PDF,
211KB]
(2011) Jonathan Weese, Juri Ganitkevitch, Chris Callison-Burch, Matt
Post, & Adam Lopez: Joshua 3.0: syntax-based
machine translation with the Thrax grammar extractor. [WMT 2011] Proceedings of the 6th Workshop on
Statistical Machine Translation,
(2011) Omar F.Zaidan: MAISE: a flexible,
configurable, extensible open source package for mass AI system evaluation.
[WMT 2011] Proceedings of the 6th
Workshop on Statistical Machine Translation,
(2011) [LIHMT 2011] Introductions, and About the OpenMT-2 project;
1p. [PDF, 124KB]
(2010) Proceedings of the Fourth Machine Translation Marathon “Open Source Tools for Machine Translation”,
25-30 January, Dublin, Ireland; Prague Bulletin of Mathematical Linguistics,
no.93, January 2010.
(2010) Loïc Barrault: MANY: open source MT system
combination at WMT’10. ACL 2010:
Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR.
Proceedings of the workshop, 15-16 July 2010,
(2010)
Nicola Bertoldi: IRSTLM toolkit.
Presentation at Fifth Machine Translation
Marathon, 13-18 September 2010,
(2010) Anton Bryl & Josef van Genabith: f-align: an open-source alignment tool for LFG
f-structures. AMTA 2010: the Ninth
conference of the Association for Machine Translation in the Americas,
Denver, Colorado, October 31 – November 4, 2010; 8pp. [PDF, 229KB]
(2010) Dana Dannélls & John J.Camilleri:
Verb morphology of Hebrew and Maltese –
towards an open source type theoretical resource grammar in GF. LREC 2010: Workshop on Language Resources
and Human Language Technology for Semitic Languages,
(2010) Jo Drugan & Bogdan Babych: Shared resources, shared values? Ethical
implications of sharing translation resources. JEC 2010: Second joint
EM+/CNGL Workshop “
(2010) Chris Dyer, Adam
Lopez, Juri Ganitkevitch, Jonathan Weese, Ferhan Ture, Phil Blunsom, Hendra
Setiawan, Vladimir Eidelman, & Philip Resnik: cdec:
a decoder, alignment, and learning framework for finite-state and context-free
translation models. Proceedings of
the ACL 2010 System Demonstrations,
(2010) Christian
Federmann: Appraise: an open-source toolkit
for manual phrase-based evaluation of translations. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation, 17-23 May 2010,
(2010)
Christian Federmann & Andreas Eisele: MT
Server Land: an open-source MT architecture. Fifth Machine Translation Marathon, 13-18 September,
(2010) Mikel L.Forcada, Boyan
Ivanov Bonev, Sergio Ortiz Rojas, Juan Antonio Pérez Ortiz, Gema Ramírez
Sánchez, Felipe Sánchez Martínez, Carme Armentano-Oller, Marco A.Montava, &
Francis M.Tyers: Documentation of the open-source shallow-transfer
machine translation platform Apertium; ed. Mireia Ginestí Rosell.
Departament de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, March
10, 2010; 214pp. [PDF, 700KB]
(2010) Mikel L.Forcada: Free/open-source machine
translation: the Apertium platform. Translingual
Europe 2010,
(2010)
Philipp Koehn & Hieu Hoang: Moses: machine
translation with open source software. Tutorial at AMTA 2010: the Ninth conference of the
Association for Machine Translation in the Americas, Denver, Colorado,
November 4, 2010; 29 slides [PDF of PPT, 520KB]
(2010) Zhifei Li, Chris
Callison-Burch, Chris Dyer, Juri Ganitkevitch, Ann Irvine, Sanjeev Khudanpur,
Lane Schwartz, Wren N.G.Thornton, Ziyuan Wang, Jonathan Weese, & Omar
F.Zaidan: Joshua 2.0: a toolkit for parsing-based machine
translation with syntax, semirings, discriminative training and other goodies.
ACL 2010: Joint Fifth Workshop on
Statistical Machine Translation and MetricsMATR. Proceedings of the
workshop, 15-16 July 2010,
(2010) Aaron B.Phillips:
The Cunei machine translation platform for
WMT’10. ACL 2010: Joint Fifth
Workshop on Statistical Machine Translation and MetricsMATR. Proceedings of
the workshop, 15-16 July 2010,
(2010) Stelios Piperidis: META-SHARE: the open resource exchange
facility. META-FORUM 2010:
Challenges for multilingual Europe, November 17/18 2010,
(2010) Ting Qian, Kristy Hollingshead, Su-youn
Yoon, Kyoung-young Kim, & Richard Sproat: A
Python toolkit for universal transliteration. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
(2010) Aarne Ranta,
Krasimir Angelov, & Thomas Hallgren: Tools for
multilingual grammar-based translation on the web. Proceedings of the ACL 2010 System Demonstrations,
(2010) Manny Rayner, Pierrette Bouillon, Nikos
Tsourakis, Johanna Gerlach, Maria Georgescul, Yukie Nakao, & Claudia Baur: A multilingual CALL game based on speech
translation. LREC 2010: proceedings
of the seventh international conference
on Language Resources and Evaluation, 17-23 May 2010,
(2010) Antoine Rey: GlobalSight MT integration. Translingual Europe 2010,
(2010) Achim Ruopp: The
Moses for Localization open source project. AMTA 2010: the Ninth conference of the Association for Machine
Translation in the Americas, Denver, Colorado, October 31 – November 4,
2010; 4pp. [PDF, 111KB]
(2010) Lane Schwartz: Reproducible results in parsing-based machine
translation: the JHU shared task submission. ACL
2010: Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR.
Proceedings of the workshop, 15-16 July 2010,
(2010)
Daniel Stein, David Vilar, Stephan Peitz, & Hermann Ney: Jane: a guide to RWTH’s hierarchical
machine translation toolkit. Presentation at Fifth Machine Translation Marathon, 13-18 September,
(2010) Josef van
Genabith: EuroMatrixPlus –
evaluation, localisation, open source. Translingual
Europe 2010,
(2010) David Vilar,
Daniel Stein, Matthias Huck, & Hermann Ney: Jane:
open source hierarchical translation, extended with reordering and lexicon
models. ACL 2010: Joint Fifth
Workshop on Statistical Machine Translation and MetricsMATR. Proceedings of
the workshop, 15-16 July 2010,
(2010) E.Yuste, M.Herranz, A.-L.Lagarda, L.Tarazón, I.Sánchez-Cortina,
& F.Casacuberta: PangeaMT – putting open
standards to work…well. AMTA 2010:
the Ninth conference of the Association for Machine Translation in the Americas,
Denver, Colorado, October 31 – November 4, 2010; 8pp. [PDF, 480KB]
Parallel text corpora see Bilingual corpora, Multilingual corpora
Pruning see Cleaning and filtering
Scarce resources (see also Language resources, Rapid development of MT)
(2014) Burak Aydın & Arzucan Özgür: Expanding machine
translation training data with an out-of-domain corpus using language modeling
based vocabulary saturation. AMTA
2014: proceedings of the eleventh conference of the Association for Machine
Translation in the Americas, Vancouver, BC, October 22-26; pp.180-192. [PDF,
523KB]
(2014) Peter Baumann & Janet Pierrehumbert: Using resource-rich languages to improve
morphological analysis of under-resourced languages. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.3355-3359. [PDF, 143KB]
(2014) Daniel Beck, Kashif Shah, & Lucia Specia: SHEF-Lite 2.0: sparse multi-task Gaussian processes
for translation quality estimation. [WMT 2014] Proceedings of the Ninth Workshop on Statistical Machine Translation,
(2014) Alex Rudnick, Taylor Skidmore, Alberto
Samaniego, & Michael Gasser: Guampa: a
toolkit for collaborative translation. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1659-1663. [PDF, 284KB]
(2014) Raphael Rubino, Antonio Toral, Nikola
Ljubešić, & Gema Ramírez-Sánchez: Quality
estimation for synthetic parallel data generation. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.1843-1849. [PDF, 248KB]
(2014) Raivis Skadiņš, Mārcis Pinnis,
Andrejs Vasiļjevs, Inguna Skadiņa, & Tomas Hudik: Application of machine translation in localization
into low-resourced languages.
Proceedings of the 17th annual conference of the European Association
for Machine Translation, EAMT 2014, Dubrovnik, Croatia, 16th-18th June 2014,
edited by Marko Tadić, Philipp Koehn, Johann Roturier, Andy Way; pp.209-216.
[PDF, 871KB]
(2013) Haithem Afli, Loïc Barrault
& Holger Schwenk: Multimodal comparable
corpora as resources for extracting parallel data: parallel phrases extraction.
International Joint Conference on Natural
Language Processing,
(2013) Qing Dou & Kevin Knight: Dependency-based decipherment for resource-limited
machine translation. [EMNLP 2013]
Proceedings of the 2013 Conference on Empirical Methods in Natural Language
Processing, Seattle, Washington, USA, 18-21 October 2013; pp.1668-1676.
[PDF, 153KB]
(2013) Mirela-Ştefania Duma & Cristina
Vertan: Integration of machine translation in
on-line multilingual applications – domain adaptation. Translation: Computation, Corpora,
Cognition 3 (1), June 2013; pp.67-74. [PDF, 395KB]
(2013) Ahmed El Kholy, Nizar Habash, Gregor Leusch,
Evgeny Matusov, & Hassan Sawaf: Language
independent connectivity strength features for phrasal pivot statistical
machine translation. ACL-2013: Proceedings of the 51st Meeting of
the Association for Computational Linguistics, Short papers, Sofia, Bulgaria,
August 4-9 2013; pp.412-418. [PDF, 200KB]; revised
version.
(2013) Ankur Gandhe & Rashmi
Gangadharaiah: Hypothesis refinement using
agreement constraints in machine translation. International Joint Conference on Natural Language Processing,
(2013) Ann Irvine
& Chris Callison-Burch: Combining bilingual and comparable corpora for low
resource machine translation. WMT
2013: 8th Workshop on Statistical Machine Translation, Proceedings of the
Workshop, August 8-9, 2013,
(2013) Ann Irvine: Statistical machine translation in low
resource settings. [NAACL-HLT 2013]
Proceedings of the NAACL HLT 2013 Student Research Workshop, 13 June 2013,
(2013) Igor Leturia, Kepa Sarasola, Xabier Arregi,
Arantza Diaz de Ilarraza, Eva Navas, Iñaki Sainz, Arantza del Pozo, David
Baranda, & Urtza Iturraspe: The BerbaTek
project for Basque: promoting a less-resourced language via language technology
for translation, content management and learning. Translation: Computation, Corpora,
Cognition 3 (1), June 2013; pp.119-135. [PDF, 785KB]
(2013) Khang Nhut Lam & Jugal Kalita: Creating reverse bilingual dictionaries. [NAACL-HLT 2013] The 2013 conference of the
North American Chapter of the Association for Computational Linguistics: Human
Language Technologies, 9-14 June 2013,
(2013) Lian Tze Lim, Lay-Ki Soon, Tek Yong Lim, Enya
Kong Tang, & Bali Ranaivo-Malançon: Context-dependent
multilingual lexical lookup for under-resourced languages. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.294-299. [PDF,
373KB]
(2013) Oscar Täckström, Ryan McDonald, & Joakim
Nivre: Target language adaptation of
discriminative transfer parsers. [NAACL-HLT
2013] The 2013 conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, 9-14 June 2013,
(2013) Oscar Täckström, Dipanjan Das, Slav Petrov,
Ryan McDonald, & Joakim Nivre: Token and
type constraints for cross-lingual part-of-speech tagging. Transactions of the Association for
Computational Linguistics 1 (2013); pp.1-12 [PDF, 3217KB]
(2013) Jörg Tiedemann & Preslav
Nakov: Analyzing the use of character-level
translation with sparse and noisy datasets. Proceedings of Recent Advances in Natural Language Processing, Hissar, Bulgaria,
7-13 September 2013; pp.676-684. [PDF, 133KB]
(2013) Mo Yu, Tiejun Zhao, Yalong Bai, Hao Tian, &
Dianhai Yu: Cross-lingual projections between
languages from different families. ACL-2013:
Proceedings of the 51st Meeting of the Association for Computational
Linguistics, Short papers, Sofia, Bulgaria, August 4-9 2013; pp.312-317.
[PDF, 187KB]
(2012) Khan Md.Anwarus Salam, Setsuo Yamada, &
Tetsuro Nishino: Sublexical translations for
low-resource language. COLING 2012:
Proceedings of the Workshop on Machine Translation and Parsing in Indian
Languages (MTPIL-2012), Mumbai, December 2012; pp.39-51. [PDF, 514KB]
(2012) Damir Ćavar: Bootstrapping NLP and MT resources
for under-resourced languages. In: Crosslingual Language Technology in service of an integrated
multilingual Europe, 4-5 May 2012,
(2012) Sherri Condon, Luis Hernandez, Dan Parvaz,
Mohammad S.Khan, & Hazrat Jahed: Producing
data for under-resourced languages: a Dari-English parallel corpus of
multi-genre text. AMTA-2012: the
Tenth Biennial Conference of the Association for Machine Translation in the
(2012) Doren Singh, Thoudam: Addressing some issues of data sparsity
towards improving English-Manipuri SMT using morphological information. AMTA-2012: Monolingual machine translation-2012
workshop. Proceedings,
(2012) Greg Durrett, Adam
Pauls, & Dan Klein: Syntactic transfer
using a bilingual lexicon. EMNLP-CoNLL
2012: Joint Conference on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning, Proceedings of the conference,
July 12-14, Jeju Island, Korea; pp.1-11. [PDF, 286KB]
(2012) Georgi Iliev & Angel Genov: Expanding parallel resources for medium-density
languages for free. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Anoop Kunchukuttan, Shourya Roy,
(2012) Takanori Kusumoto & Tomoyosi Akiba: Statistical machine translation without a
source-side parallel corpus using word lattice and phrase extension. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Septina Dian Larasati: Improving word alignment by exploiting adapted
word similarity. AMTA-2012:
Monolingual machine translation-2012 workshop. Proceedings,
(2012) William D.Lewis & Phong Yang: Building MT for a severely under-resourced language:
White Hmong. AMTA-2012: the Tenth
Biennial Conference of the Association for Machine Translation in the
(2012) Preslav Nakov & Hwee Tou Ng: Improving statistical machine translation for a
resource-poor language using related resource-rich languages. Journal of Artificial Intelligence Research
44 (2012); pp.179-222. [PDF, 421KB]
(2012) Lene Offersgaard &
Dorte Haltrup Hansen: SMT systems for
less-resourced languages based on domain-specific data. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Matt Post, Chris
Callison-Burch, & Miles Osborne: Constructing
parallel corpora for six Indian languages via crowdsourcing. WMT 2012: 7th Workshop on Statistical
Machine Translation. Proceedings of the workshop, June 7-8, 2012,
(2012) Xabier Saralegi, Iker
Manterola, & Iñaki San Vicente: Building a
Basque-Chinese dictionary by using English as pivot. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Inguna Skadiņa: Analysis and evaluation
of comparable corpora for under-resourced areas of machine translation. [BUCC 2012] The 5th Workshop on Building and Using Comparable Corpora: “Language
Resources for Machine Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) Sokratis
(2012) Fangzhong Su & Bogdan Babych: Measuring comparability of documents in non-parallel
corpora for efficient extraction of (semi-)parallel translation equivalents.
EACL Joint Workshop on Exploiting
Synergies between Information Retrieval and Machine Translation (ESIRMT) and
Hybrid Approaches to Machine Translation (HyTra): Proceedings of the
workshop, 23-24 April 2012, Avignon, France; pp.10-19. [PDF, 188KB]
(2012) Feifei Zhai, Jiajun Zhang, Yu Zhou, &
Chengqing Zong: Tree-based translation without
using parse trees. Proceedings of
COLING 2012: Technical Papers, Mumbai, December 2012; pp.3037-3054. [PDF,
1800KB]
(2012) Jörg Tiedemann: Character-based pivot translation for
under-resourced languages and domains. [EACL
2012] Proceedings of the 13th Conference of the European Chapter of the
Association for Computational Linguistics,
(2012) Antonio Toral: Pivot-based
machine translation between statistical and black box systems. EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.321-328. [PDF, 187KB]
(2012) Andrius Utka: Multilingual resources and their
application for the Lithuanian language [abstract]. In: Crosslingual Language Technology in service
of an integrated multilingual Europe, 4-5 May 2012,
(2012)
(2012) Ping Xu & Pascale
Fung: Cross-lingual language modelling with
syntactic reordering for low-resource speech recognition. EMNLP-CoNLL 2012: Joint Conference on
Empirical Methods in Natural Language Processing and Computational Natural
Language Learning, Proceedings of the conference, July 12-14, Jeju Island,
Korea; pp.766-776. [PDF, 275KB]
(2012) Xiaoning Zhu, Yiming Cui, Conghui Zhu, Tiejun
Zhao, & Hailong Cao: The HIT-LTRC machine
translation system for IWSLT 2012. IWSLT-2012:
9th International Workshop on Spoken Language Translation,
(2012) ACCURAT: Analysis
and evaluation of comparable corpora for under resourced areas of machine
translation. [Project paper at] EAMT
2012: Proceedings of the 16th Annual Conference of the European Association for
Machine Translation, Trento, Italy, May 28-30 2012, ed. Mauro Cettolo,
Marcello Federico, Lucia Specia, Andy Way; p.205. [PDF, 72KB]
(2012) [BUCC 2012] The 5th Workshop
on Building and Using Comparable Corpora: “Language Resources for Machine
Translation in Less-Resourced Languages and Domains”, LREC 2012 Workshop, 26 May 2012,
(2012) proceedings of SALTMIL 2012,
“Language technology for normalisation of less-resourced languages”,
(2011) Vamshi Ambati, Sanjika Hewavitharana, Stephan Vogel, & Jaime
Carbonell: Active learning with multiple
annotations for comparable data classification task. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Vamshi Ambati,
Stephan Vogel, & Jaime Carbonell: Multi-strategy
approaches to active learning for statistical machine translation. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) Sankaranarayanan Ananthakrishnan, Shiv Vitaladevuni,
Rohit Prasad, & Prem Natarajan: Source
error-projection for sample selection in phrase-based SMT for resource-poor
languages. [IJCNLP 2011] Proceedings
of the 5th International Joint Conference on Natural Language Processing,
(2011) Khan Md. Anwarus
Salam, Setsuo Yamada, & Tetsuro Nishino: Example-based machine translation for
low-resource language using chunk-string templates. MT Summit XIII: the Thirteenth Machine Translation Summit
[organized by the] Asia-Pacific Association for Machine Translation (AAMT),
19-23 September 2011,
(2011) Ondřej Bojar & Aleš Tamchyna: Improving translation model by monolingual data.
[WMT 2011] Proceedings of the 6th
Workshop on Statistical Machine Translation,
(2011) Alexandru Ceauşu & Dan Tufiş: Addressing SMT data sparseness when translating
into morphologically-rich languages. Proceedings
of the 8th international NLPSC workshop. Special theme: Human-machine
interaction in translation, Copenhagen Business School, 20-21 August 2011;
ed.Bernadette Sharp, Michael Zock, Michael Carl, Arnt Lykke Jakobsen
(Copenhagen Studies in Language 41), Frederiksberg: Samfundslitteratur, 2011;
pp.57-68. [PDF, 1886KB]
(2011) Marta R.Costa-jussà, Carlos Henríquez, &
Rafael E.Banchs: Enhancing
scarce-resource language translation through pivot combinations. [IJCNLP
2011] Proceedings of the 5th
International Joint Conference on Natural Language Processing,
(2011) Sandipan Dandapat, Sara Morrissey,
(2011) Vladimir Eidelman, Kristy Hollingshead, & Philip Resnik: Noisy SMS machine translation in low-density
languages. [WMT 2011] Proceedings of
the 6th Workshop on Statistical Machine Translation,
(2011) Monica
Gavrila: Constrained recombination in an
example-based machine translation system. [EAMT 2011]: proceedings of the 15th conference of the European
Association for Machine Translation, 30-31 May 2011, Leuven, Belgium; eds.
Mikel L.Forcada, Heidi Depraetere, Vincent Vandeghinste; pp.193-200. [PDF,
417KB]; presentation, 36 slides [PDF]
(2011) Monica Gavrila & Natalia Elita: Experiments with small-sized corpora in CBMT.
[RANLP 2011] Proceedings of the Student
Research Workshop associated with RANLP 2011,
(2011) Sanjika Hewavitharana, Nguyen Bach, Qin Gao, Vamshi Ambati,
& Stephan Vogel: CMU Haitian
Creole-English translation system for WMT 2011. [WMT 2011] Proceedings of the 6th Workshop on
Statistical Machine Translation,
(2011) Deirdre Hogan, Jennifer Foster, & Josef van Genabith: Decreasing lexical data sparsity in statistical
syntactic parsing – experiments with named entities. Proceedings of the Workshop on Multiword Expressions: from Parsing and
Generation to the Real World (MWE 2011),
(2011) Suhel Jaber, Sara Tonelli, & Rodolfo Delmonte: Venetan to English machine translation: issues and
possible solutions. Proceedings of
the 8th international NLPSC workshop. Special theme: Human-machine interaction
in translation, Copenhagen Business School, 20-21 August 2011;
ed.Bernadette Sharp, Michael Zock, Michael Carl, Arnt Lykke Jakobsen
(Copenhagen Studies in Language 41), Frederiksberg: Samfundslitteratur, 2011;
pp.69-80. [PDF, 873KB]
(2011) Mitesh M.Khapra, Salil Joshi, Arindam Chatterjee, & Pushpak
Bhattacharyya: Together we can:
bilingual bootstrapping for WSD. ACL-HLT 2011: Proceedings of the 49th Annual Meeting of the Association
for Computational Linguistics,
(2011) Kiran Kumar N,
(2011) William D.Lewis, Robert Munro, & Stephan Vogel: Crisis MT: developing a cookbook for MT in crisis
situations. [WMT 2011] Proceedings of
the 6th Workshop on Statistical Machine Translation,
(2011) Jeff Ma, Spyros Matsoukas, & Richard
Schwartz: Improving low-resource statistical
machine translation with a novel semantic word clustering algorithm. MT Summit XIII: the Thirteenth Machine
Translation Summit [organized by the] Asia-Pacific Association for Machine
Translation (AAMT), 19-23 September 2011,
(2011) Victor
M.Sánchez-Cartagena, Felipe Sánchez-Martínez, & Juan Antonio Pérez-Ortiz: Enriching a statistical machine
translation system trained on small parallel corpora with rule-based bilingual
phrases. [RANLP 2011] Proceedings of
Recent Advances in Natural Language Processing, Hissar, Bulgaria, 12-14
September 2011; pp.90-96. [PDF, 125KB]
(2011) Reshef Shilon,
Nizar Habash, Alon Lavie, & Shuly Wintner: Machine
translation between Hebrew and Arabic: needs, challenges and preliminary
solutions. Machine Translation and
Morphologically-rich Languages: Research Workshop of the Israel Science
Foundation,
(2011) Raivis Skadiņš, Maris Puriņš, Inguna
Skadiņa, & Andrejs Vasiļjevs: Evaluation
of SMT in localization to under-resourced inflected language. [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent Vandeghinste;
pp.35-40. [PDF, 287KB]; presentation,
17 slides [PDF, 796KB]
(2011) Zhiyang Wang, Yajuan Lü, & Qun Liu:
Multi-granularity word alignment and decoding for
agglutinative language translation. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) Jia Xu & Weiwei
Sun: Generating virtual parallel corpus: a
compatibility centric method. MT
Summit XIII: the Thirteenth Machine Translation Summit [organized by the]
Asia-Pacific Association for Machine Translation (AAMT), 19-23 September 2011,
(2011) ACCURAT:
analysis and evaluation of comparable corpora for under resourced areas of
machine translation. (European Machine Translation Projects.) [EAMT 2011]: proceedings of the 15th
conference of the European Association for Machine Translation, 30-31 May
2011, Leuven, Belgium; eds. Mikel L.Forcada, Heidi Depraetere, Vincent
Vandeghinste; p.323. [PDF, 287KB]
(2011) LetsMT! Platform
for online sharing of training data and building user tailored MT.
(European Machine Translation Projects.) [EAMT
2011]: proceedings of the 15th conference of the European Association for
Machine Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada,
Heidi Depraetere, Vincent Vandeghinste; p.337. [PDF, 66KB]
(2010) proceedings of 7th SaLTMiL Workshop on Creation and use of
basic lexical resources for less-resourced languages, LREC-2010: proceedings of the seventh international conference on Language Resources
and Evaluation,
(2010) Sisay Adugna
& Andreas Eisele: English-Oromo machine
translation: an experiment using a statistical approach. LREC 2010: proceedings of the seventh
international conference on Language Resources and Evaluation, 17-23 May
2010,
(2010) Vamshi Ambati,
Stephan Vogel, & Jaime Carbonell: Active
learning and crowd-sourcing for machine translation. LREC 2010: proceedings of the seventh international conference on
Language Resources and Evaluation, 17-23 May 2010,
(2010) Sankaranarayanan
Ananthakrishnan, Rohit Prasad, David Stallard, & Prem Natarajan: Discriminative sample selection for
statistical machine translation.
[EMNLP 2010] Proceedings of the
2010 Conference on Empirical Methods in Natural Language Processing, MIT,
Massachusetts, USA, 9-11 October 2010; pp.626-635. [PDF, 329KB]
(2010) Alberto Barrón-Cedeño, Paolo Rosso, Eneko
Agirre, & Gorka Labaka: Plagiarism
detection across distant language pairs. Coling 2010: 23rd International Conference on Computational Linguistics.
Proceedings of the conference, 23-27 August 2010,
(2010) Ondřej Bojar, Pavel Straňák, & Daniel Zeman: Data issues in English-to-Hindi machine
translation. LREC 2010: proceedings
of the seventh international conference
on Language Resources and Evaluation, 17-23 May 2010,
(2010) Ondřej Bojar, Kamil Kos, & David Mareček: Tackling sparse data issue in machine translation
evaluation. ACL 2010: the 48th Annual
Meeting of the Association for Computational Linguistics,
(2010) Bill Dolan: Building
partnerships with language communities: the importance of shared technology and
shared data. META-FORUM 2010:
Challenges for multilingual Europe, November 17/18 2010,
(2010) Jinhua Du, Jie Jiang, &
(2010) Andreas Eisele & Jia Xu: Improving
machine translation performance using comparable corpora. [LREC 2010] Proceedings of the 3rd
Workshop on Building and Using Comparable Corpora,
(2010) Jan Hajic: Building
bridges using innovative approaches in machine translation. META-FORUM 2010: Challenges for
multilingual Europe, November 17/18 2010,
(2010) Md.Zahurul Islam, Jörg Tiedemann & Andreas Eisele: English to Bangla phrase-based machine translation. EAMT
2010: Proceedings of the 14th Annual conference of the European Association for
Machine Translation, 27-28 May 2010,
(2010) Sittichai
Jiampojamarn, Kenneth Dwyer, Shane Bergsma, Aditya Bhargava, Qing Dou, Mi-Young
Kim, & Grzegorz Kondrak: Transliteration
generation and mining with limited training resources. NEWS 2010: Proceedings of the
2010 Named Entities Workshop, ACL 2010,
(2010) Jae-Hee Lee,
Seung-Wook Lee, Gumwon Hong, Young-Sook Hwang, Sang-Bum Kim, & Hae-Chang
Rim: A post-processing approach to statistical
word alignment reflecting alignment tendency between part-of-speeches. Coling 2010: 23rd International Conference
on Computational Linguistics, 23-27 August 2010, Beijing International
Convention Center, Beijing, China, Posters
volume; pp.623-629. [PDF, 438KB]
(2010) William Lewis: Haitian Creole: developing MT for a
low data language. Translingual
Europe 2010,
(2010) William D.Lewis: Haitian Creole: how to build and ship an MT engine
from scratch in 4 days, 17 hours, & 30 minutes. EAMT 2010: Proceedings of the 14th Annual conference of the European
Association for Machine Translation, 27-28 May 2010,
(2010) Jan Niehues &
Alex Waibel: Domain adaptation in statistical
machine translation using factored translation models. EAMT 2010: Proceedings of the 14th Annual conference of the European
Association for Machine Translation, 27-28 May 2010,
(2010) Reinhard Rapp
& Michael Zock: The noisier the better:
identifying multilingual word translations using a single monolingual corpus.
[Coling 2010] Proceedings of the 4th
Workshop on Cross Lingual Information Access,
(2010) Reinhard Rapp
& Michael Zock: Utilizing citations of
foreign words in corpus-based dictionary generation. [Coling 2010] Proceedings of the
Second Workshop on NLP Challenges in the Information Explosion Era,
(2010) Tanja
Schultz & Alan W.Black: Multilingual
speech processing – rapid language adaptation tools and technologies. Interspeech 2010,
(2010) Libin Shen, Bing Zhang,
Spyros Matsoukas, Jinxi Xu, & Ralph Weischedel: Statistical machine translation with a factorized
grammar. [EMNLP 2010] Proceedings of the 2010 Conference on
Empirical Methods in Natural Language Processing, MIT, Massachusetts, USA,
9-11 October 2010; pp.616-625. [PDF, 147KB]
(2010) Reshef Shilon, Nizar Habash, Alon Lavie, &
Shuly Wintner: Machine translation between
Hebrew and Arabic: needs, challenges and preliminary solutions. AMTA 2010: the Ninth conference of the
Association for Machine Translation in the Americas, Denver, Colorado,
October 31 – November 4, 2010; 10pp. [PDF, 141KB]
(2010) Inguna Skadiņa, Andrejs Vasiļjevs, Raivis
Skadiņš, Robert Gaizauskas, Dan Tufiş, & Tatiana Gornostay: Analysis and evaluation of comparable corpora for
under resourced areas of machine translation. [LREC 2010] Proceedings of
the 3rd Workshop on Building and Using Comparable Corpora,
(2010) Daniel Stein, Christoph Schmidt, &
Hermann Ney: Sign language machine translation
overkill. Proceedings of the 7th
International Workshop on Spoken Language Translation, 2-3 December 2010,
(2010) Yulia Tsvetkov & Shuly Wintner: Automatic acquisition of parallel corpora from websites
with dynamic content. LREC 2010: proceedings of the seventh international conference on Language
Resources and Evaluation, 17-23 May 2010,
(2010) Francis M.Tyers: Rule-based Breton to French machine translation.
EAMT 2010: Proceedings of the 14th Annual
conference of the European Association for Machine Translation, 27-28 May
2010,
(2010) Andrejs Vasiljevs: LetsMT! – towards cloud based
service for MT generation. Translingual
Europe 2010,
(2010) Karthik
Visweswariah, Vijil Chenthamarakshan, & Nandakishore Kambhatla: Urdu and Hindi: translation and sharing
of linguistic resources. Coling 2010:
23rd International Conference on Computational Linguistics, 23-27 August
2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.1283-1291. [PDF,
217KB]
(2010) Bing Xiang,
Yonggang Deng, & Bowen Zhou: Diversify and
combine: improving word alignment for machine translation on low-resource
languages. ACL 2010: the 48th Annual
Meeting of the Association for Computational Linguistics,
Software resources
(2013) Valeria Aliperta: Streamlining your workflow:
useful desktop software and mobile applications for the interpreting and
translation industry [abstract]. [Aslib 2013] Translating and the Computer 35, 28-29 November 2013, etc.venues,
Paddington,
(2011) Liang Tian, Fai Wong,
& Sam Chao: Word alignment using GIZA++ on
Windows. MT Summit XIII: the
Thirteenth Machine Translation Summit [organized by the] Asia-Pacific
Association for Machine Translation (AAMT), 19-23 September 2011,
Sparse data see Scarce resources
Spoken language resources
(2014) Eunah Cho, Sarah Fünfer, Sebastian Stüker,
& Alex Waibel: A corpus of spontaneous speech
in lectures: the KIT lecture corpus for spoken language processing and
translation. LREC 2014: Ninth International Conference on Language Resources and
Evaluation, May 26-31, 2014 Harpa Concert Hall and Conference Center,
Reykjavik, Iceland; ed. Nicoletta Calzolari et al.; pp.1554-1559. [PDF, 146KB]
(2013) Pierrette Bouillon, Johanna Gerlach, Ulrich
Germann, Barry Haddow & Manny Rayner: Two
approaches to correcting homophone confusion in a hybrid machine translation
system. Proceedings of the Second
Workshop on Hybrid Approaches to Translation,
(2013) Matt Post, Gaurav Kumar, Adam Lopez, Damianos
Karakos, Chris Callison-Burch, & Sanjeev Khudanpur: Improved speech-to-text translation with the Fisher
and Callhome Spanish-English speech translation corpus. [IWSLT 2013] Proceedings of the 10th International Workshop on Spoken Language
Translation,
(2013) Hiroaki
(2012) Sebastian Stüker,
Florian Kraft, Christian Mohr, Teresa Herrmann, Eunah Cho, & Alex Waibel: The KIT lecture corpus for speech translation. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2010) Tanja Schultz & Alan W.Black: Multilingual speech processing – rapid
language adaptation tools and technologies. Interspeech 2010,
Translation memory see Index of aids and tools
Treebanks
(see
also Semantic analysis and representation,
Thesaurus method)
(2014) Ann Bies, Justin Mott, Seth Kulick, Jennifer
Garland, & Colin Warner: Incorporating alternate
translations into English Translation Treebank. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.1863-1868. [PDF, 339KB]
(2013) Jiří Mírovský,
Kateřina Rysová, Magdaléna Rysová, & Eva Hajičová: (Pre-)annotation of topic-focus articulation in
Prague Czech-English dependency treebank. International Joint Conference on Natural Language Processing,
(2012) Ondřej Bojar, Zdeněk Žabokrtský,
Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček,
Jiří Maršík, Michal Novák, Martin Popel, & Aleš Tamchyna: The joy of parallelism with CzEng 1.0. LREC
2012: Eighth international conference on Language Resources and Evaluation,
21-27 May 2012,
(2012) Cristina Bosco, Manuela Sanguinetti, &
Leonardo Lesmo: The Parallel-TUT: a multilingual
and multiformat treebank. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Masood Ghayoomi: From grammar rule extraction to treebanking: a
bootstrapping approach. LREC 2012:
Eighth international conference on Language Resources and Evaluation, 21-27
May 2012,
(2012) Jan Hajič, Eva Hajičová,
(2012) Gideon Kotzé, Vincent Vandeghinste, Scott
Martens, & Jörg Tiedemann: Large aligned
treebanks for syntax-based machine translation. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Xuansong Li, Stephanie Strassel, Stephen
Grimes, Safa Ismael, Mohamed Maamouri, Ann Bies, & Nianwen Xue: Parallel aligned treebanks at LDC: new challenges interfacing
existing infrastructures. LREC
2012: Eighth international conference on Language Resources and Evaluation, 21-27 May 2012,
(2012) Annette Rios & Anne
Göhring: A tree is a Baum is an árbol is a sach’a: creating a trilingual treebank. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012)
(2012) Kateřina
Veselovská, Nguy Giang Linh, &
(2011) Manuela Sanguinetti & Cristina Bosco: Building the multilingual TUT parallel
treebank. AEPC 2011: proceedings of
the Second Workshop on Annotation and Exploitation of Parallel Corpora,
associated with the 8th International Conference on Recent Advances in Natural
Language Processing (RANLP 2011), 15th September 2011,
(2011) Kiril Simov, Petya Osenova, Laska Laskova,
Aleksandar Savkov, & Stanislava Kancheva: Bulgarian-English
parallel treebank: word and semantic level alignment. AEPC 2011: proceedings of the Second Workshop on Annotation and
Exploitation of Parallel Corpora, associated with the 8th International
Conference on Recent Advances in Natural Language Processing (RANLP 2011), 15th
September 2011,
(2011) Martin Volk, Torsten Marek, & Yvonne
Samuelsson: Building and querying parallel
treebanks. Translation: Computation,
Corpora, Cognition 1 (1), December
2011; pp.7-28. [PDF, 824KB]
(2010) Tagyoung Chung & Daniel Gildea: Effects of empty categories on machine translation. [EMNLP 2010] Proceedings of the 2010 Conference on Empirical Methods in Natural
Language Processing, MIT, Massachusetts, USA, 9-11 October 2010;
pp.636-645. [PDF, 198KB]
(2010) Stephen Grimes, Xuansong Li, Ann Bies, Seth
Kulick, Xiaoyi Ma, & Stephanie Strassel: Creating
Arabic-English parallel word-aligned treebank corpora at LDC. LREC 2010: Workshop on Language Resources and
Human Language Technology for Semitic Languages,
(2010) Jun Sun, Min
Zhang, & Chew Lim Tan: Exploring syntactic structural
features for sub-tree alignment using bilingual tree kernels. ACL 2010: the 48th Annual Meeting of the
Association for Computational Linguistics,
Wikis
(2012) CoSyne, a
project on multilingual content synchronization with wikis. [Project paper at] EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; p.206. [PDF, 87KB]
(2011) Federico Gaspari, Antonio Toral & Sudip
Kumar Naskar: User-focused task-oriented MT
evaluation for wikis: a case study. Proceedings
of the Third Joint EM+/CNGL Workshop “
(2011) CoSyne, a
project on multi-lingual content synchronization with wikis. (European
Machine Translation Projects.) [EAMT
2011]: proceedings of the 15th conference of the European Association for
Machine Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada,
Heidi Depraetere, Vincent Vandeghinste; p.329. [PDF, 46KB]
Wikis
(2015) Takashi Tsunakawa & Hiroyuki Kaji: Towards
cross-lingual patent wikification. MT
Summit XV, October 30 – November 3, 2015, Miami, Florida, USA. Proceedings
of MT Summit XV: Sixth Workshop on Patent and Scientific Literature Translation
(PSLT6); pp.89-95. [PDF, 1165KB]
Wiktionary
(2015) Zied Elloumi, Hervé Blanchon, Gilles Serasset,
& Laurent Besacier: METEOR for multiple
target languages using DBnary. MT
Summit XV, October 30 – November 3, 2015, Miami, Florida, USA. Proceedings
of MT Summit XV: vol.1: MT Researchers’ Track; pp.80-89. [PDF, 569KB]
(2015) Gerard de Melo: Wiktionary-based word
embeddings. MT Summit XV, October 30
– November 3, 2015, Miami, Florida, USA. Proceedings of MT Summit XV:
vol.1: MT Researchers’ Track; pp.346-359. [PDF, 709KB]
Wordnets
(see also WordNet in index of systems)
(2014) Tatiana Erekhinskaya, Meghana Satpute, &
Dan Moldovan: Multilingual eXtended WordNet
Knowledge Base: semantic parsing and translation of glosses. LREC
2014: Ninth International Conference on Language Resources and Evaluation,
May 26-31, 2014 Harpa Concert Hall and Conference Center, Reykjavik, Iceland;
ed. Nicoletta Calzolari et al.; pp.2990-2994. [PDF, 82KB]
(2014) Able-to-Include:
Improving accessibility for people with intellectual disabilities.
Proceedings of the 17th annual conference of the European Association for
Machine Translation, EAMT 2014, Dubrovnik, Croatia, 16th-18th June 2014, edited
by Marko Tadić, Philipp Koehn, Johann Roturier, Andy Way; p.134. [PDF]
(2013) Dhouha Bouamor, Nasredine Semmar, & Pierre
Zweigenbaum: Using WordNet and semantic
similarity for bilingual terminology mining from comparable corpora. Proceedings of the 6th Workshop on Building
and Using Comparable Corpora,
(2012) Darja Fišer: Language resources and tools for
semantically enhanced processing of Slovene [abstract]. In: Crosslingual
Language Technology in service of an integrated multilingual Europe,
4-5 May 2012,
(2012) Salil Joshi, Arindam Chatterjee, Arun Karthikeyan
Karra, & Pushpak Bhattacharyya: Eating your own cooking: automatically linking wordnet synsets of two languages.
Proceedings of COLING 2012: Demonstration
Papers, Mumbai, December 2012; pp.239-246. [PDF, 1527KB]
(2012) Gerard de Melo &
Gerhard Weikum: UWN: a large multilingual lexical
knowledge base. [ACL 2012]
Proceedings of the 50th Annual Meeting of the Association for Computational
Linguistics, Jeju,
(2012) Jyrki Niemi & Krister Lindén: Representing the translation relation in a bilingual
wordnet. LREC 2012: Eighth international
conference on Language Resources and Evaluation, 21-27 May 2012,
(2012)
(2012) Špela Vintar, Darja Fišer, & Aljoša
Vrščaj: Were the clocks striking or
surprising? Using WSD to improve MT performance. EACL Joint Workshop on Exploiting Synergies between Information
Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine
Translation (HyTra): Proceedings of the workshop, 23-24 April 2012,
Avignon, France; pp.87-92. [PDF, 283KB]
(2011) Pushpak Bhattacharyya: IndoWordNet
and multilingual resource conscious word sense disambiguation. Proceedings of the 8th international NLPSC
workshop. Special theme: Human-machine interaction in translation,
Copenhagen Business School, 20-21 August 2011; ed.Bernadette Sharp, Michael
Zock, Michael Carl, Arnt Lykke Jakobsen (Copenhagen Studies in Language 41),
Frederiksberg: Samfundslitteratur, 2011; pp.29-30. [PDF, 677KB]
(2011) Špela Vintar & Darja Fišer: Enriching Slovene WordNet with domain-specific terms.
Translation: Computation, Corpora, Cognition 1 (1), December 2011; pp.29-44.
[PDF, 631KB]
(2010) Pushpak Bhattacharyya:
IndoWordnet. LREC 2010: proceedings of the
seventh international conference on Language Resources and Evaluation,
17-23 May 2010,
World Wide Web [see also Internet, Semantic Web, Wikis]
(2014) Vicent Alabau & Luis A.Leiva: Collaborative web UI localization, or how to build
feature-rich multilingual datasets. Proceedings of the 17th annual
conference of the European Association for Machine Translation, EAMT 2014,
Dubrovnik, Croatia, 16th-18th June 2014, edited by Marko Tadić, Philipp
Koehn, Johann Roturier, Andy Way; pp.151-154. [PDF, 352KB]
(2014) Yi Lu, Longyue Wang, Derek F.Wong, Lidia
S.Chao, Yiming Wang, & Francisco Oliveira: Domain
adaptation for medical text translation using web resources. [WMT 2014] Proceedings of the Ninth Workshop on
Statistical Machine Translation,
(2014) Antonio Toral, Raphael Rubino, Miquel
Esplà-Gomis, Tommi Pirinen, Andy Way, & Gema Ramírez-Sánchez: Extrinsic
evaluation of web-crawlers in machine translation: a study on Croatian-English
for the tourism domain. Proceedings of the 17th annual conference of the
European Association for Machine Translation, EAMT 2014, Dubrovnik, Croatia,
16th-18th June 2014, edited by Marko Tadić, Philipp Koehn, Johann
Roturier, Andy Way; pp.221-224. [PDF, 345KB]
(2013) Vassilis Papavassiliou, Prokopis Prokopidis,
& Gregor Thurmair: A modular
open-source focused crawler for mining monolingual and bilingual corpora from
the web. Proceedings of the 6th Workshop on Building and Using Comparable
Corpora,
(2013) Felix Sasaki: Metadata
for the multilingual web. Translation:
Computation, Corpora, Cognition 3
(1), June 2013; pp.19-26. [PDF, 124KB]
(2013) Felix Sasaki: Metadata
for the multilingual web: introducing internationalization tag set (ITS) 2.0.
Proceedings of the XIV Machine
Translation
(2013) Jason R.Smith, Herve Saint-Amand, Magdalena
Plamada, Philipp Koehn, Chris Callison-Burch, & Adam Lopez: Dirt cheap web-scale parallel text from the Common
Crawl. ACL-2013: Proceedings of the 51st Meeting of the Association for
Computational Linguistics,
(2013) Chengzhi Zhang, Xuchen Yao & Chunyu Kit: Finding more bilingual webpages with high
credibility via link analysis. Proceedings
of the 6th Workshop on Building and Using Comparable Corpora,
(2012) Ahmet Aker, Evangelos Kanoulas, & Robert
Gaizauskas: A light way to collect comparable
corpora from the Web. LREC 2012:
Eighth international conference on Language Resources and Evaluation, 21-27
May 2012,
(2012) Anelia Belogay, Diman Karagyozov, Svetla Koeva,
Cristina Vertan, Adam Przepiórkowski, Polivios Raxis, & Dan Cristea: Harnessing NLP technologies in the processes of
multilingual content management. [EACL 2012] Proceedings of the
Demonstrations at the 13th Conference of the European Chapter of the
Association for Computational Linguistics,
(2012) Valeria Caruso & Anna De Meo: What else can databases do to assist translators?
Illustrating a rated inventory of Web dictionaries. [Aslib 2012] Translating and the Computer 34, 29-30
November 2012, One Birdcage Walk, London, UK; 12pp. [PDF, 848KB], presentation by Martin Thomas: 50 slides
[PDF, 3336KB]
(2012) Pavel Pecina, Antonio Toral, Vassilis Papavassiliou, Prokopis
Prokopidis, & Josef van Genabith: Domain
adaptation of statistical machine translation using web-crawled resources: a
case study. EAMT 2012: Proceedings of
the 16th Annual Conference of the European Association for Machine Translation,
Trento, Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; pp.145-152. [PDF, 201KB]
(2012) Feiliang Ren: A
practical Chinese-English ON translation method based on ON’s distribution
characteristics on the web. Proceedings
of COLING 2012: Demonstration Papers, Mumbai, December 2012; pp. 239-246.
[PDF, 102KB]
(2012) Stephen D.Richardson: Using the Microsoft Translator Hub at the
Church of Jesus Christ of Latter-day Saints. AMTA-2012: the Tenth Biennial Conference of the Association for Machine
Translation in the
(2012) Iñaki San Vicente &
Iker Manterola: PaCo2: a fully automated
tool for gathering parallel corpora from the Web. LREC 2012: Eighth international conference
on Language Resources and Evaluation, 21-27 May 2012,
(2012) Embedding machine
translation in ATLAS content management system. [Project paper at] EAMT 2012: Proceedings of the 16th Annual
Conference of the European Association for Machine Translation, Trento,
Italy, May 28-30 2012, ed. Mauro Cettolo, Marcello Federico, Lucia Specia, Andy Way; p.96. [PDF, 209KB]
(2011) Takeshi Abekawa & Kyo Kageura: Using seed terms for crawling bilingual
terminology lists on the Web. Translating and the Computer 33, 17-18 November 2011,
(2011) Sven Christian Andrä
& Jörg Schütz: The semantically-enriched
translation interoperability protocol. [IJCNLP 2011] Proceedings of Workshop on Language Resources, Technology and Services
in the Sharing Paradigm,
(2011) Duo Ding: Integrate
multilingual web search results using cross-lingual topic models. [IJCNLP
2011] Proceedings of the 5th
International Joint Conference on Natural Language Processing,
(2011) Théo Hoffenberg & Christophe Brun-Franc: An innovative platform to allow full
translation of internet sites. [EAMT
2011]: proceedings of the 15th conference of the European Association for
Machine Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada,
Heidi Depraetere, Vincent Vandeghinste; pp.41-45. [PDF, 316KB]
(2011) Richard Ishida: The multilingual web: latest
developments at the W3C/IETF. Translating and the Computer 33, 17-18 November 2011,
(2011) Miguel A.Jiménez-Crespo: To adapt or not to adapt in web localization: a
contrastive genre-based study of original and localised legal sections in
corporate websites. Journal of Specialised Translation 15 (January
2011); pp.2-27. [PDF, 237KB]
(2011) Wang Ling, Pável Calado, Bruno Martins, Isabel
Trancoso, Alan Black, & Luísa Coheur: Named
entity translation using anchor texts. IWSLT
2011: Proceedings of the International Workshop on Spoken Language Translation,
San Francisco, December 8-9, 2011, ed. Marcello Federico, Mei-Yuh Hwang, Margit
Rödder, Sebastian Stüker;
pp.206-213. [PDF, 442KB]
(2011) Spencer
Rarrick, Chris Quirk, & Will Lewis: MT
detection in web-scraped parallel corpora. MT Summit XIII: the Thirteenth Machine Translation Summit
[organized by the] Asia-Pacific Association for Machine Translation (AAMT),
19-23 September 2011, Xiamen, China; pp.422-429. [PDF, 323KB]
(2011) Simon Shi, Pascale Fung, Emmanuel Prochasson,
Chi-kiu Lo, & Dekai Wu: Mining parallel
documents using low bandwidth and high precision CLIR from the heterogeneous
web. [IJCNLP 2011] Proceedings of the
5th International Joint Conference on Natural Language Processing,
(2011) Johanka Spoustová & Miroslav Spousta: Comparable fora. ACL 2011: Proceedings of the Fourth Workshop on Building and Using
Comparable Corpora,
(2011) Cristina Vertan & Monica Gavrila: Using manual and parallel aligned corpora for
machine translation services within an on-line content management system. AEPC 2011: proceedings of the Second
Workshop on Annotation and Exploitation of Parallel Corpora, associated
with the 8th International Conference on Recent Advances in Natural Language
Processing (RANLP 2011), 15th September 2011, Hissar, Bulgaria; pp.53-58. [PDF,
361KB]
(2011) Arnaud Vié, Luis Villarejo Muñoz, Mireia Farrús Cabeceran, &
Jimmy O’Regan: Apertium advanced web interface:
a first step towards interactivity and language tools convergence. Proceedings of the Second International
Workshop on Free/Open-Source Rule-Based Machine Translation, Barcelona,
Spain, January 20-21, 2011, ed. F.Sánchez-Martínez and J.A.Pérez-Ortiz;
pp.45-51. [PDF, 280KB]
(2011) Cesare Zanca: Developing translation strategies and cultural
awareness using corpora and the web.
Tralogy,
(2011) LIWP – EU language industry web platform. (European
Machine Translation Projects.) [EAMT
2011]: proceedings of the 15th conference of the European Association for
Machine Translation, 30-31 May 2011, Leuven, Belgium; eds. Mikel L.Forcada,
Heidi Depraetere, Vincent Vandeghinste; p.339. [PDF, 40KB]
(2010) Ahmet Aker &
Robert Gaizauskas: Model summaries for
location-related images. LREC 2010:
proceedings of the seventh international conference on Language Resources and
Evaluation, 17-23 May 2010,
(2010) José João Almeida & Alberto Simões: Automatic parallel corpora and bilingual
terminology extraction from parallel websites. [LREC 2010] Proceedings
of the 3rd Workshop on Building and Using Comparable Corpora,
(2010) Christian Boitet, Huynh Cong Phap, Nguyen Hong Thai, &
Valérie Bellynck: The iMAG concept: multilingual
access gateway to an elected web sites with incremental quality increase
through collaborative post-edition of MT pretranslations. TALN 2010. Proceedings of Traitement
Automatique du Langage Naturel, 19-23 juillet 2010.
(2010) Sean Colbath: Terminology management for web
monitoring. AMTA 2010: the Ninth
conference of the Association for Machine Translation in the Americas,
(2010) Dmitry Davidov
& Ari Rappoport: Automated translation of
semantic relationships. Coling 2010:
23rd International Conference on Computational Linguistics. Proceedings of
the conference, 23-27 August 2010,
(2010) Alain Désilets: WeBiText: multilingual
concordancer built from public high quality web content. AMTA 2010: the Ninth conference of the Association for Machine
Translation in the Americas,
(2010) Miquel
Esplà-Gomis & Mikel L.Forcada: Combining
content-based and URL-based heuristics to harvest aligned bitexts from
multilingual sites with Bitextor. Fourth Machine Translation Marathon “Open Source Tools for
Machine Translation”, 25-30 January,
(2010) Yanhui Feng, Yu
Hong, Zhenxiang Yan, Jianmin Yao, & Qiaoming Zhu: A novel method for bilingual web page acquisition
from search engine web records. Coling
2010: 23rd International Conference on Computational Linguistics, 23-27
August 2010, Beijing International Convention Center, Beijing, China, Posters volume; pp.294-302. [PDF, 184KB]
(2010) Pascale Fung, Emmanuel Prochasson, & Simon Shi: Trillions of comparable documents. [LREC 2010] Proceedings
of the 3rd Workshop on Building and Using Comparable Corpora,
(2010) Aarne Ranta,
Krasimir Angelov, & Thomas Hallgren: Tools for
multilingual grammar-based translation on the web. Proceedings of the ACL 2010 System Demonstrations,
(2010) Osamuyimen Stewart, David Lubensky, Scott
Macdonald, & Julie Marcotte: Using machine
translation for localization of electronic support content: evaluating end-user
satisfaction. AMTA 2010: the Ninth
conference of the Association for Machine Translation in the Americas,
Denver, Colorado, October 31 – November 4, 2010; 6pp. [PDF, 39KB]
(2010) Yulia Tsvetkov & Shuly Wintner: Automatic acquisition of parallel corpora from
websites with dynamic content. LREC 2010: proceedings of the seventh international conference on Language Resources
and Evaluation, 17-23 May 2010,
(2010) Jakob Uszkoreit,
Jay M.Ponte, Ashok C.Popat, & Moshe Dubiner: Large scale parallel document mining for
machine translation. Coling 2010:
23rd International Conference on Computational Linguistics. Proceedings of
the conference, 23-27 August 2010,