Disambiguation of English PP attachment using multilingual aligned data

Lee Schwartz; Takako Aikawa; Chris Quirk

Abstract

Prepositional phrase attachment (PP attachment) is a major source of ambiguity in English. It poses a substantial challenge to Machine Translation (MT) between English and languages that are not characterized by PP attachment ambiguity. In this paper we present an unsupervised, bilingual, corpus-based approach to the resolution of English PP attachment ambiguity. As data we use aligned linguistic representations of the English and Japanese sentences from a large parallel corpus of technical texts. The premise of our approach is that with large aligned, parsed, bilingual (or multilingual) corpora, languages can learn non-trivial linguistic information from one another with high accuracy. We contend that our approach can be extended to linguistic phenomena other than PP attachment.

Anthology ID: 2003.mtsummit-papers.44
Volume: Proceedings of Machine Translation Summit IX: Papers
Month: September 23-27
Year: 2003
Address: New Orleans, USA
Venue: MTSummit
URL: https://aclanthology.org/2003.mtsummit-papers.44/
Cite (ACL): Lee Schwartz, Takako Aikawa, and Chris Quirk. 2003. Disambiguation of English PP attachment using multilingual aligned data. In Proceedings of Machine Translation Summit IX: Papers, New Orleans, USA.
Cite (Informal): Disambiguation of English PP attachment using multilingual aligned data (Schwartz et al., MTSummit 2003)
PDF: https://aclanthology.org/2003.mtsummit-papers.44.pdf