Shinjini Ghosh


pdf bib
Alignment via Mutual Information
Shinjini Ghosh | Yoon Kim | Ramon Fernandez Astudillo | Tahira Naseem | Jacob Andreas
Proceedings of the 27th Conference on Computational Natural Language Learning (CoNLL)

Many language learning tasks require learners to infer correspondences between data in two modalities. Often, these alignments are many-to-many and context-sensitive. For example, translating into morphologically rich languages requires learning not just how words, but morphemes, should be translated; words and morphemes may have different meanings (or groundings) depending on the context in which they are used. We describe an information-theoretic approach to context-sensitive, many-to-many alignment. Our approach first trains a masked sequence model to place distributions over missing spans in (source, target) sequences. Next, it uses this model to compute pointwise mutual information between source and target spans conditional on context. Finally, it aligns spans with high mutual information. We apply this approach to two learning problems: character-based word translation (using alignments for joint morphological segmentation and lexicon learning) and visually grounded reference resolution (using alignments to jointly localize referents and learn word meanings). In both cases, our proposed approach outperforms both structured and neural baselines, showing that conditional mutual information offers an effective framework for formalizing alignment problems in general domains.