2016
Yes, We Care! Results of the Ethics and Natural Language Processing Surveys
Karën Fort | Alain Couillault
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
We present here the context and results of two surveys (a French one and an international one) concerning Ethics and NLP, which we designed and conducted between June and September 2015. These surveys follow other actions related to raising concern for ethics in our community, including a Journée d’études, a workshop and the Ethics and Big Data Charter. The concern for ethics proves to be quite similar in both surveys, despite a few differences, which we present and discuss. The surveys also suggest a growing awareness of ethical issues in the field, which translates into a willingness to get involved in ethics-related actions, to debate the topic and to see ethics included in the themes of major conferences. Finally, we discuss the limits of the surveys and the means of action we are considering for the future. The raw data from the two surveys are freely available online.
Covering various Needs in Temporal Annotation: a Proposal of Extension of ISO TimeML that Preserves Upward Compatibility
Anaïs Lefeuvre-Halftermeyer | Jean-Yves Antoine | Alain Couillault | Emmanuel Schang | Lotfi Abouda | Agata Savary | Denis Maurel | Iris Eshkol | Delphine Battistelli
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper reports a critical analysis of the ISO TimeML standard, in the light of several experiences of temporal annotation that were conducted on spoken French. It shows that the norm suffers from weaknesses that should be corrected to fit a larger variety of needs in NLP and in corpus linguistics. We present our proposal for some improvements to the norm before it is revised by the ISO Committee in 2017. These modifications mainly concern (1) enrichments of well-identified features of the norm: temporal function of TIMEX time expressions, additional types for TLINK temporal relations; (2) deeper modifications concerning the units or features annotated: clarification between time and tense for EVENT units, coherence of representation between temporal signals (the SIGNAL unit) and TIMEX modifiers (the MOD feature); (3) a recommendation to perform temporal annotation on top of a syntactic (rather than lexical) layer (temporal annotation on a treebank).
2014
Evaluating corpora documentation with regards to the Ethics and Big Data Charter
Alain Couillault | Karën Fort | Gilles Adda | Hugues de Mazancourt
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The authors have written the Ethics and Big Data Charter in collaboration with various agencies, private bodies and associations. This Charter aims at describing any large or complex resources, and in particular language resources, from a legal and ethical viewpoint and at ensuring the transparency of the process of creating and distributing such resources. In this article, we propose an analysis of the documentation coverage of the most frequently mentioned language resources with regard to the Charter, in order to show the benefit it offers.