Sudha Verma


2012

Foundations of a Multilayer Annotation Framework for Twitter Communications During Crisis Events
William J. Corvey | Sudha Verma | Sarah Vieweg | Martha Palmer | James H. Martin
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In times of mass emergency, vast amounts of data are generated via computer-mediated communication (CMC) that are difficult to manually collect and organize into a coherent picture. Yet valuable information is broadcast, and can provide useful insight into time- and safety-critical situations if captured and analyzed efficiently and effectively. We describe a natural language processing component of the EPIC (Empowering the Public with Information in Crisis) Project infrastructure, designed to extract linguistic and behavioral information from tweet text to aid in the task of information integration. The system incorporates linguistic annotation, in the form of Named Entity Tagging, as well as behavioral annotations to capture tweets contributing to situational awareness and analyze the information type of the tweet content. We show classification results and describe future integration of these classifiers in the larger EPIC infrastructure.