The Cambridge language survey

Paul Procter


Abstract
The Cambridge Language Survey is a research project whose activities centre on the use of an Integrated Language Database, in which a computerised dictionary provides intelligent cross-reference during corpus analysis, for example by searching for all the inflections of a verb rather than just the base form. Types of grammatical coding and semantic categorisation appropriate to such a computerised dictionary are discussed, as are software tools for parsing, finding collocations, and performing sense-tagging. The weighted evaluation of semantic, grammatical, and collocational information to discriminate between word senses is described in some detail. Mention is made of several branches of research, including the development of parallel corpora, semantic interpretation by sense-tagging, and the use of a Learner Corpus for the analysis of errors made by non-native speakers. Sense-tagging is identified as an under-exploited approach to language analysis and one for which great opportunities for product development exist.
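As a rough illustration of two ideas named in the abstract, the sketch below shows a dictionary-backed search that matches every inflection of a verb, and a weighted combination of semantic, grammatical, and collocational evidence used to choose between word senses. Everything in it is an assumption made for illustration: the toy entries in `INFLECTIONS` and `SENSES`, the values in `WEIGHTS`, and the function names are hypothetical and are not taken from the paper, whose actual coding scheme and weighting are not given here.

```python
import re

# Hypothetical toy dictionary: base form -> all inflected forms.
INFLECTIONS = {
    "take": ["take", "takes", "took", "taken", "taking"],
}

def find_inflections(lemma, text):
    """Return (token_index, token) for each token that is a form of lemma."""
    forms = set(INFLECTIONS.get(lemma, [lemma]))
    tokens = re.findall(r"[a-z]+", text.lower())
    return [(i, tok) for i, tok in enumerate(tokens) if tok in forms]

# Hypothetical sense entries: cues grouped by the three evidence types.
SENSES = {
    "bank": {
        "bank/finance": {
            "semantic": {"money"},
            "grammatical": {"noun"},
            "collocational": {"account", "loan"},
        },
        "bank/river": {
            "semantic": {"water"},
            "grammatical": {"noun"},
            "collocational": {"river", "muddy"},
        },
    },
}

# Assumed weights; real values would have to be tuned on tagged data.
WEIGHTS = {"semantic": 2.0, "grammatical": 1.0, "collocational": 3.0}

def score_senses(word, evidence):
    """Score each sense as a weighted count of matching cues per type."""
    scores = {}
    for sense, cues in SENSES[word].items():
        scores[sense] = sum(
            WEIGHTS[kind] * len(cues[kind] & evidence.get(kind, set()))
            for kind in WEIGHTS
        )
    return scores

def best_sense(word, evidence):
    """Return the highest-scoring sense for the observed evidence."""
    scores = score_senses(word, evidence)
    return max(scores, key=scores.get)
```

For instance, `find_inflections("take", "She took the bus")` matches `took` even though the query is the base form, and `best_sense("bank", {"collocational": {"river"}})` prefers the river sense because collocational cues carry the largest assumed weight.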
Anthology ID:
1993.eamt-1.5
Volume:
Third International EAMT Workshop: Machine Translation and the Lexicon
Month:
April 26–28
Year:
1993
Address:
Heidelberg, Germany
Editors:
Robert E. Frederking, Kathryn B. Taylor
Venue:
EAMT
Publisher:
Springer Berlin Heidelberg
Pages:
77–84
URL:
https://aclanthology.org/1993.eamt-1.5
Cite (ACL):
Paul Procter. 1993. The Cambridge language survey. In Third International EAMT Workshop: Machine Translation and the Lexicon, pages 77–84, Heidelberg, Germany. Springer Berlin Heidelberg.
Cite (Informal):
The Cambridge language survey (Procter, EAMT 1993)