Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules

Xiaoshi Zhong; Aixin Sun; Erik Cambria

doi:10.18653/v1/P17-1039

Abstract

Extracting time expressions from free text is a fundamental task for many applications. We analyze the time expressions from four datasets and find that only a small group of words is used to express time information, and that the words in time expressions demonstrate similar syntactic behaviour. Based on these findings, we propose a type-based approach, named SynTime, to recognize time expressions. Specifically, we define three main syntactic token types, namely time token, modifier, and numeral, to group time-related regular expressions over tokens. On top of these types, we design general heuristic rules to recognize time expressions. In recognition, SynTime first identifies the time tokens in raw text, then searches their surroundings for modifiers and numerals to form time segments, and finally merges the time segments into time expressions. As a light-weight rule-based tagger, SynTime runs in real time and can be easily expanded by simply adding keywords for text of different types and from different domains. Experiments on benchmark datasets and tweet data show that SynTime outperforms state-of-the-art methods.
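The three-step recognition procedure described in the abstract (identify time tokens, expand to neighbouring modifiers and numerals, merge segments) can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword sets, the `token_type` and `recognize` function names, and the adjacency-based merging rule are all assumptions chosen for illustration, and SynTime's actual token types and heuristic rules are richer than what is shown here.

```python
import re

# Hypothetical, deliberately tiny keyword sets (assumptions for illustration,
# not SynTime's actual lexicon).
TIME_TOKENS = {
    "monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday",
    "january", "february", "march", "april", "june", "july", "august",
    "september", "october", "november", "december",
    "day", "days", "week", "weeks", "month", "months", "year", "years",
    "today", "tomorrow", "yesterday",
}
MODIFIERS = {"last", "next", "this", "ago", "early", "late", "past"}
NUMERAL_WORDS = {"one", "two", "three", "four", "five", "six", "seven",
                 "eight", "nine", "ten"}
NUMERAL_RE = re.compile(r"^\d+(st|nd|rd|th)?$")


def token_type(token):
    """Map a token to one of three coarse types, or None if unrelated to time."""
    low = token.lower()
    if low in TIME_TOKENS:
        return "TIME"
    if low in MODIFIERS:
        return "MOD"
    if low in NUMERAL_WORDS or NUMERAL_RE.match(low):
        return "NUM"
    return None


def recognize(text):
    """Return the time expressions found in text as a list of strings."""
    tokens = re.findall(r"\w+", text)
    types = [token_type(t) for t in tokens]

    # Steps 1 and 2: for each time token, expand left and right over
    # adjacent modifiers and numerals to form a time segment.
    segments = []
    for i, kind in enumerate(types):
        if kind != "TIME":
            continue
        left = i
        while left > 0 and types[left - 1] in ("MOD", "NUM"):
            left -= 1
        right = i
        while right + 1 < len(tokens) and types[right + 1] in ("MOD", "NUM"):
            right += 1
        segments.append((left, right))

    # Step 3: merge segments that overlap or touch into one expression.
    merged = []
    for start, end in segments:
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    return [" ".join(tokens[s:e + 1]) for s, e in merged]
```

In this sketch, extending coverage to a new domain amounts to adding entries to the keyword sets, which mirrors the abstract's remark that the tagger can be expanded by adding keywords.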

Anthology ID: P17-1039
Volume: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2017
Address: Vancouver, Canada
Editors: Regina Barzilay, Min-Yen Kan
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 420–429
URL: https://aclanthology.org/P17-1039/
DOI: 10.18653/v1/P17-1039
Cite (ACL): Xiaoshi Zhong, Aixin Sun, and Erik Cambria. 2017. Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 420–429, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal): Time Expression Analysis and Recognition Using Syntactic Token Types and General Heuristic Rules (Zhong et al., ACL 2017)
PDF: https://aclanthology.org/P17-1039.pdf
Presentation: P17-1039.Presentation.pdf