Advances in Debating Technologies: Building AI That Can Debate Humans

The tutorial focuses on Debating Technologies, a sub-field of computational argumentation defined as “computational technologies developed directly to enhance, support, and engage with human debating” (Gurevych et al., 2016). A recent milestone in this field is Project Debater, which was revealed in 2019 as the first AI system that can debate human experts on complex topics. Project Debater is the third in the series of IBM Research AI’s grand challenges, following Deep Blue and Watson. It has been developed for over six years by a large team of researchers and engineers, and its live demonstration in February 2019 received massive media attention. This research effort has resulted in more than 50 scientific papers to date, and many datasets freely available for research purposes. We discuss the scientific challenges that arise when building such a system, including argument mining, argument quality assessment, stance classification, principled argument detection, narrative generation, and rebutting a human opponent. Many of the underlying capabilities of Project Debater have been made freely available for academic research, and the tutorial will include a detailed explanation of how to use and leverage these tools. In addition to discussing individual components, the tutorial also provides a holistic view of a debating system. Such a view is largely missing in the academic literature, where each paper typically addresses a specific problem in isolation. We present a complete pipeline of a debating system, and discuss the information flow and the interaction between the various components. Finally, we discuss practical applications and future challenges of debating technologies.

1 Tutorial Description

Background and Goals
Argumentation and debating are fundamental capabilities of human intelligence. They are essential for a wide range of everyday activities that involve reasoning, decision making or persuasion. Computational Argumentation is defined as "the application of computational methods for analyzing and synthesizing argumentation and human debate". Over the last few years, this field has been rapidly evolving, as evidenced by the growing research community and the increasing number of publications in top NLP and AI conferences.
The tutorial focuses on Debating Technologies, a sub-field of computational argumentation defined as "computational technologies developed directly to enhance, support, and engage with human debating" (Gurevych et al., 2016). A recent milestone in this field is Project Debater, which was revealed in 2019 as the first AI system that can debate human experts on complex topics (see https://www.research.ibm.com/artificial-intelligence/project-debater/). Project Debater is the third in the series of IBM Research AI's grand challenges, following Deep Blue and Watson. It has been developed for over six years by a large team of researchers and engineers, and its live demonstration in February 2019 received massive media attention. This research effort has resulted in more than 50 scientific papers to date, and many datasets freely available for research purposes.
In this tutorial, we aim to answer the question: "what does it take to build a system that can debate humans?" Our main focus is on the scientific problems such a system must tackle. Some of these intriguing problems include argument retrieval for a given debate topic, argument quality assessment and stance classification, identifying relevant principled arguments to be used in conjunction with corpus-mined arguments, organizing the arguments into a compelling narrative, recognizing the arguments made by the human opponent and making a rebuttal. For each of these problems we will present relevant scientific work from various research groups as well as our own. Many of the underlying capabilities of Project Debater have been made freely available for academic research, and the tutorial will include a detailed explanation of how to use and leverage these tools.
A complementary goal of the tutorial is to provide a holistic view of a debating system. Such a view is largely missing in the academic literature, where each paper typically addresses a specific problem in isolation. We present a complete pipeline of a debating system, and discuss the information flow and the interaction between the various components. We will also share our experience and lessons learned from developing such a complex, large-scale NLP system. Finally, the tutorial will discuss practical applications and future challenges of debating technologies.

Contents
In this section we provide more details about the contents of the tutorial. The tutorial outline and estimated schedule are listed in Section 3.
Introduction. The tutorial first provides an introduction to computational argumentation. It then introduces the Project Debater grand challenge and provides a high-level view of the building blocks that comprise a debating system. The next parts of the tutorial describe each of these building blocks in depth.
Argument mining. The core of a debating system is argument mining: finding relevant arguments and argument components (claim/conclusion, evidence/premise) for a given debate topic, either in a given article, or in a large corpus.
Argument evaluation and analysis. The next tasks in the pipeline involve analysis of the extracted arguments. Argument quality assessment aims to select the more convincing arguments. Stance classification aims to distinguish between arguments that support our side in the debate and those supporting the opponent's side.
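To make the data flow of this stage concrete, the following minimal sketch filters candidate arguments by stance and ranks the remainder by quality. It is an illustration only, not the Project Debater implementation: the `Argument` class, the `select_arguments` function, and the hard-coded stance labels and quality scores are all hypothetical stand-ins for the outputs of trained classifiers.

```python
from dataclasses import dataclass


@dataclass
class Argument:
    text: str
    stance: int     # +1 supports the motion, -1 opposes it (assumed classifier output)
    quality: float  # assumed quality score in [0, 1] (assumed model output)


def select_arguments(candidates, our_stance, top_k=2):
    """Keep arguments on our side, then return the top_k by quality score."""
    on_our_side = [a for a in candidates if a.stance == our_stance]
    return sorted(on_our_side, key=lambda a: a.quality, reverse=True)[:top_k]


# Hypothetical candidates for the motion "we should subsidize preschool".
candidates = [
    Argument("Preschool improves later school performance.", +1, 0.9),
    Argument("Subsidies are expensive for taxpayers.", -1, 0.8),
    Argument("Preschool is good.", +1, 0.3),
    Argument("Early education narrows achievement gaps.", +1, 0.7),
]

selected = select_arguments(candidates, our_stance=+1)
for arg in selected:
    print(f"{arg.quality:.1f}  {arg.text}")
```

In this sketch the opposing argument is simply discarded; a full system could instead retain it as material for anticipating the opponent's case.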
Modeling human dilemma. A complementary source for argumentation that is widely used by professional human debaters is principled arguments, which are relevant for a wide variety of topics. A common example is the black market argument, potentially relevant in the context of debates on banning a specific product or a service (e.g., "we should ban alcohol"). By this argument, imposing a ban leads to the creation of a black market, which in turn makes products or services obtained therein less safe, leads to exploitation, attracts criminal elements, and so on. We discuss recent work on creating a taxonomy of common principled arguments and automatically matching relevant arguments from this taxonomy to a given debate topic.
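As a rough illustration of matching principled arguments to a topic, the sketch below uses a tiny hand-written taxonomy with trigger words. Everything in it is an assumption made for illustration: the taxonomy entries other than the black market example, the trigger words, and the keyword-overlap rule are hypothetical, and this simple rule stands in for whatever matching model a real system would learn.

```python
# Hypothetical mini-taxonomy: each principled argument lists trigger words for
# the kind of motion it usually applies to. Not the authors' actual taxonomy.
PRINCIPLED_ARGUMENTS = {
    "black market": {
        "triggers": {"ban", "prohibit", "outlaw"},
        "text": "A ban creates a black market, which makes the product less safe.",
    },
    "personal freedom": {
        "triggers": {"ban", "prohibit", "mandatory", "compulsory"},
        "text": "Adults should be free to make their own choices.",
    },
    "public funding priorities": {
        "triggers": {"subsidize", "fund"},
        "text": "Public money is limited and could be better spent elsewhere.",
    },
}


def match_principled_arguments(topic):
    """Return the names of principled arguments whose triggers occur in the topic."""
    words = set(topic.lower().replace(",", " ").split())
    return sorted(
        name
        for name, entry in PRINCIPLED_ARGUMENTS.items()
        if entry["triggers"] & words
    )


matches = match_principled_arguments("We should ban alcohol")
for name in matches:
    print(name, "->", PRINCIPLED_ARGUMENTS[name]["text"])
```

Keyword triggers are brittle (they miss paraphrases such as "make illegal"), which is one reason a learned matcher is preferable in practice.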
Listening comprehension and rebuttal. In addition to presenting one side of the debate, engaging in a competitive debate further requires a debating system to effectively rebut arguments raised by the human opponent. The system must listen to an argumentative speech in real-time, understand the main arguments, and produce persuasive counterarguments.
The nature of the argumentation domain and the characteristics of competitive debates make the understanding of such spoken content challenging. Expressed ideas often span multiple, nonconsecutive sentences and many arguments are alluded to rather than explicitly stated. Further difficulty stems from the requirement to identify and rebut the most important parts of a speech that is several minutes long. This contrasts with today's conversational agents, which aim at understanding a single functional command from short inputs.
Core NLP capabilities. This section describes several core NLP capabilities developed as part of Project Debater, including thematic clustering, highly scalable Wikification and semantic similarity for phrases and Wikipedia concepts.
From arguments to narrative. A debating system must arrange the arguments obtained from various sources (arguments mined from a corpus, principled arguments, and counterarguments for rebuttal) into a coherent and persuasive narrative that would keep the audience's attention for several minutes. This section describes the various steps in the narrative generation pipeline. We also discuss the role of humor in keeping a debate lively.
Moving forward: applications and implications. In this part we discuss possible applications and future directions for debating technologies. As an example, we present Speech by Crowd, a platform for crowdsourcing decision support. This platform collects arguments from large audiences on debatable topics and generates meaningful narratives summarizing the arguments for each side of the debate. We also discuss Key Point Analysis, a novel method for extracting the main points in a large collection of arguments, and quantifying the prevalence of each point in the data.
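The prevalence-counting half of Key Point Analysis can be sketched as follows. This is a simplified illustration under stated assumptions: the key points are given rather than extracted, and token-level Jaccard overlap with a hypothetical threshold of 0.2 replaces the learned argument-to-key-point matching model a real system would use. The function names and example data are invented for this sketch.

```python
def tokens(text):
    """Lower-cased whitespace tokens as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-set Jaccard similarity between two strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def key_point_prevalence(arguments, key_points, threshold=0.2):
    """Map each argument to its best-matching key point and count matches.

    Arguments whose best similarity is below the threshold stay unmatched.
    """
    counts = {kp: 0 for kp in key_points}
    for arg in arguments:
        best = max(key_points, key=lambda kp: jaccard(arg, kp))
        if jaccard(arg, best) >= threshold:
            counts[best] += 1
    return counts


# Hypothetical key points and crowd-contributed arguments.
key_points = ["vaccines save lives", "vaccines have side effects"]
arguments = [
    "routine vaccines save many lives",
    "vaccines save lives every year",
    "some vaccines have serious side effects",
    "the weather is nice today",
]

prevalence = key_point_prevalence(arguments, key_points)
print(prevalence)
```

The off-topic last argument matches neither key point and is left uncounted, which mirrors the idea that not every contribution maps to a key point.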
Demo session: using debating technologies in your application. Many of the Project Debater components presented in this tutorial have recently been released as cloud APIs, and are freely available for academic use. In the final part of the tutorial, we provide an overview of these APIs, and demonstrate their use for building practical applications.

Relevance to the Computational Linguistics Community
The tutorial is relevant to a broad audience of NLP researchers and practitioners, working on problems related to argumentation mining, stance classification, discourse analysis, text summarization, NLG, dialogue systems, and more.

Tutorial Type
This is a cutting-edge tutorial. The main difference between this tutorial and previous tutorials on computational argumentation or argument mining (Budzynska and Reed, 2019) is that we focus on the science behind debating systems, that is, systems that can engage in a live debate with humans. Accordingly, a large portion of the tutorial's topics, e.g., listening comprehension, rebuttal, narrative generation and modeling human dilemma, was not covered in previous tutorials. Some of the topics, such as argument mining, argument quality and stance classification, were discussed in an earlier tutorial; however, we will mostly focus on more recent advancements in these areas. The tutorial of Budzynska and Reed (2019) focused on argument structure parsing based on argumentation theory, which can be viewed as complementary to the content of the current tutorial.

Prerequisites
The tutorial will be self-contained. We assume basic knowledge of NLP and machine learning, at the level of introductory courses in these areas.