Overview of the 2014 NLP Unshared Task in PoliInformatics

We describe a research activity carried out from January to April 2014, seeking to increase engagement between the natural language processing research community and social science scholars. In this activity, participants were offered a corpus of text relevant to the 2007-8 financial crisis and an open-ended prompt. Their responses took the form of a short paper and an optional demonstration, to which a panel of judges will respond with the goal of identifying the efforts with the greatest potential for future interdisciplinary collaboration.


Introduction
In recent years, numerous interdisciplinary research meetings have sought to bring together computer scientists with expertise in automated text data analysis and scholars with substantive interests that might make use of text data. The latter group has included political scientists, economists, and communications scholars. An NSF Research Coordination Network grant to encourage research using open government data was awarded to co-authors Washington and Wilkerson in 2013. The network for Political Informatics, or PoliInformatics, brought together a steering committee from diverse research backgrounds that convened in February 2013. At that meeting, a substantive focus on the 2007-8 financial crisis was selected.
Drawing inspiration from the "shared task" model that has been successful in the natural language processing community, we designed a research competition for computer scientists. In a shared task, a gold-standard dataset is created in advance of the competition, and inputs and outputs are defined by the organizers, typically creating a supervised learning setup with held-out data used for evaluation. Constraints on the resources that may be used are typically set in place as well, to focus the energies of participants on a core problem, and the official evaluation scorer is published, usually as open-source software. Final systems (or system output) are submitted by a deadline and judged automatically against the gold standard. Participants report on their systems in short papers, typically presented at a meeting associated with a conference or workshop.
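The evaluation loop described above can be sketched in a few lines. This is an illustrative example, not code from any actual shared task: the metric (simple accuracy) and the labels are hypothetical stand-ins for whatever official measure and annotation scheme a given task defines.

```python
def accuracy(gold, predicted):
    """Fraction of held-out items where system output matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("system output must cover every gold-standard item")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


if __name__ == "__main__":
    # Hypothetical held-out gold standard and a submitted system's output.
    gold_labels = ["pos", "neg", "pos", "neu"]
    system_output = ["pos", "neg", "neu", "neu"]
    # The organizers run the published scorer and report one official score.
    print(f"official score: {accuracy(gold_labels, system_output):.2f}")  # prints 0.75
```

Publishing such a scorer as open-source software is what lets participants verify their own development-set results against the official evaluation before the submission deadline.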
With neither a clear definition of what the final outcome might be, nor the resources to create the necessary gold-standard data, we developed a more open-ended competition. A text corpus was collected and made available, and a prompt was offered. Participants were given freedom in how to respond; competition entries took the form of short research papers and optional demonstrations of the results of the projects. Rather than an objective score, a panel of judges organized by the PoliInformatics steering committee offered public reviews of the work, with an emphasis on potential for future interdisciplinary research efforts that might stem from these preliminary projects.

Setup
The prompts offered to participants were: Who was the financial crisis? We seek to understand the participants in the lawmaking and regulatory processes that formed the government's response to the crisis: the individuals, industries, and professionals targeted by those policies; the agencies and organizations responsible for implementing them; and the lobbyists, witnesses, advocates, and politicians who were actively involved, and the connections among them.
What was the financial crisis? We seek to understand the cause(s) of the crisis, proposals for reform, advocates for those proposals, arguments for and against, policies ultimately adopted by the government, and the impact of those policies.
The set of datasets made available is listed in Table 1. Several additional datasets were suggested on the task website (https://sites.google.com/site/unsharedtask2014), but these were not part of the official data.

Response
Forty teams initially registered to participate in the unshared task; ten submitted papers. The teams came from a variety of institutions spread across six countries. Half of the teams included links to online demonstrations or browsable system output. At this writing, the papers are under review by the panel of judges. We provide a very brief summary of the contributions of each team.

Who was the financial crisis?
Bordea et al. (2014) combined inference of topic importance and hierarchy with expertise mining to identify which participants in the discourse might be experts on which topics (e.g., Paul Volcker and "proprietary trading"), based on FOMC, FCIC, and Congressional hearing and report data. Baerg et al. (2014) considered transcripts of the FOMC, developing a method for scaling the preferences of its members with respect to inflation (hawks to doves); the method incorporates automatic dimensionality reduction and expert topic interpretation. Zirn et al. (2014) also focused on the transcripts, distinguishing between position-taking statements and shorter "discussion elements" that express agreement or disagreement rather than substance, and used this analysis to quantify similarity among FOMC members and take first steps toward extraction of sub-dialogues among them. Bourreau and Poibeau (2014) focused on the FCIC report and the two Congressional reports, identifying named entities and then visualizing correlations among mentions both statically (as networks) and dynamically. Clark et al. (2014) considered Congressional hearings, applying a reasoning model that integrates analysis of social roles and relationships with analysis of individual beliefs, in hope of detecting opinion shifts and signs of influence.
With an eye toward substantive hypotheses about how banks' access to bailout funds may have depended on underlying social connections, Morales et al. (2014) automatically extracted a social network from the corpus alongside structured data in Freebase.

What was the financial crisis?
Miller and McCoy (2014) considered FOMC transcripts, applying topic models for dimensionality reduction and viewing topic proportions as time series.
In a study of the TARP, Dodd-Frank, and health reform bills, Li et al. (2014) explored the ideas expressed in those bills, applying models of text reuse to bills introduced in the 110th and 111th Congresses. Wang et al. (2014) implemented a query-focused summarization system for FOMC and FCIC meeting transcripts and Congressional hearings, incorporating topic and expertise measures into the score, and queried the corpus with candidate causes of the crisis derived from Wikipedia (e.g., "subprime lending" and "growth housing bubble"). Kleinnijenhuis et al. (2014) considered Congressional hearings alongside news text from the United States and the United Kingdom, carrying out keyword analysis to compare the two and to measure directional effects between them, on different dimensions.

Conclusion
The unshared task was successful in attracting the interest of forty participants working on ten teams. A highly diverse range of activities ensued, each of which is being reviewed at this writing by a panel of judges. Reviews and final outcomes will be posted at https://sites.google.com/site/unsharedtask2014 as soon as they are available, and a presentation summarizing the competition will be part of the ACL 2014 Workshop on Language Technologies and Computational Social Science.