Massively Increasing TIMEX3 Resources: A Transduction Approach

Leon Derczynski, Héctor Llorens, Estela Saquete


Abstract
Automatic annotation of temporal expressions is a research challenge of great interest in the field of information extraction. Gold-standard temporally annotated resources are limited in size, which makes research using them difficult. Standards have also evolved over the past decade, so not all temporally annotated data is in the same format. We vastly increase the available human-annotated temporal expression resources by converting older-format resources to TimeML/TIMEX3. This task is difficult because the annotation methods differ. We present a robust conversion tool and a new, large temporal expression resource. We evaluate the conversion process by using the new resource as training data for an existing TimeML annotation tool, which achieves an F1 measure of 0.87, better than any system in the TempEval-2 timex recognition exercise.
Anthology ID:
L12-1237
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3754–3761
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/451_Paper.pdf
Cite (ACL):
Leon Derczynski, Héctor Llorens, and Estela Saquete. 2012. Massively Increasing TIMEX3 Resources: A Transduction Approach. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3754–3761, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Massively Increasing TIMEX3 Resources: A Transduction Approach (Derczynski et al., LREC 2012)