Rachel Panckhurst


2017

La tâche de normalisation automatique des messages issus de la communication électronique médiée requiert une étape préalable consistant à identifier les phénomènes linguistiques. Dans cet article, nous proposons deux typologies pour l’annotation de textes non standard en français, relevant respectivement des niveaux morpho-lexical et morpho-syntaxique. Ces typologies ont été développées en conciliant les typologies existantes et en les faisant évoluer en parallèle d’une annotation manuelle de tweets et de SMS.

2014

In this paper, we propose a method for aligning text messages (entitled AlignSMS) in order to automatically build an SMS dictionary. An extract of 100 text messages from the 88milSMS corpus (Panckhurst el al., 2013, 2014) was used as an initial test. More than 90,000 authentic text messages in French were collected from the general public by a group of academics in the south of France in the context of the sud4science project (http://www.sud4science.org). This project is itself part of a vast international SMS data collection project, entitled sms4science (http://www.sms4science.org, Fairon et al. 2006, Cougnon, 2014). After corpus collation, pre-processing and anonymisation (Accorsi et al., 2012, Patel et al., 2013), we discuss how “raw” anonymised text messages can be transcoded into normalised text messages, using a statistical alignment method. The future objective is to set up a hybrid (symbolic/statistic) approach based on both grammar rules and our statistical AlignSMS method.

2008

We outline a methodological classification for evaluation approaches of software in general. This classification was initiated partly owing to involvement in a biennial European competition (the European Academic Software Award, EASA) which was held for over a decade. The evaluation grid used in EASA gradually became obsolete and inappropriate in recent years, and therefore needed to be revised. In order to do this, it was important to situate the competition in relation to other software evaluation procedures. A methodological perspective for the classification is adopted rather than a conceptual one, since a number of difficulties arise with the latter. We focus on three main questions: What to evaluate? How to evaluate? and Who does evaluate? The classification is therefore hybrid: it allows one to account for the most common evaluation approaches and is also an observatory. Two main approaches are differentiated: system and usage. We conclude that any evaluation always constructs its own object, and the objects to be evaluated only partially determine the evaluation which can be applied to them. Generally speaking, this allows one to begin apprehending what type of knowledge is objectified when one or another approach is chosen.