Comparing PTB and UD information for PDTB discourseconnective identification

Kelvin Han, Phyllicia Leavitt, Srilakshmi Balard


Abstract
Our work on the automatic detection of English discourse connectives in the Penn Discourse Treebank (PDTB) shows that syntactic information from the Universal Dependencies (UD) framework is a viable alternative to that from the Penn Treebank (PTB) framework. In fact, we found minor increases when comparing between the use of gold standard PTB part-of-speech (POS) tag information and automatically parsed UD information. The former has traditionally been used for the task but there are now much more UD corpora and in many more languages than that available in the PTB framework. As such, this finding is promising for areas in discourse parsing such as in multilingual as well as under production settings, where gold standard PTB information may be scarce.
Anthology ID:
2020.jeptalnrecital-recital.10
Volume:
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 3 : Rencontre des Étudiants Chercheurs en Informatique pour le TAL
Month:
6
Year:
2020
Address:
Nancy, France
Editors:
Christophe Benzitoun, Chloé Braud, Laurine Huber, David Langlois, Slim Ouni, Sylvain Pogodalla, Stéphane Schneider
Venue:
JEP/TALN/RECITAL
SIG:
Publisher:
ATALA et AFCP
Note:
Pages:
123–136
Language:
URL:
https://aclanthology.org/2020.jeptalnrecital-recital.10
DOI:
Bibkey:
Cite (ACL):
Kelvin Han, Phyllicia Leavitt, and Srilakshmi Balard. 2020. Comparing PTB and UD information for PDTB discourseconnective identification. In Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 3 : Rencontre des Étudiants Chercheurs en Informatique pour le TAL, pages 123–136, Nancy, France. ATALA et AFCP.
Cite (Informal):
Comparing PTB and UD information for PDTB discourseconnective identification (Han et al., JEP/TALN/RECITAL 2020)
Copy Citation:
PDF:
https://aclanthology.org/2020.jeptalnrecital-recital.10.pdf