2021
Relying on Discourse Analysis to Answer Complex Questions by Neural Machine Reading Comprehension
Boris Galitsky | Dmitry Ilvovsky | Elizaveta Goncharova
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Machine reading comprehension (MRC) is one of the most challenging tasks in the natural language processing domain. Recent state-of-the-art results for MRC have been achieved with pre-trained language models, such as BERT and its modifications. Despite the high performance of these models, they still struggle to retrieve correct answers from detailed and lengthy passages. In this work, we introduce a novel scheme for incorporating the discourse structure of the text into a self-attention network and thus enrich the embedding obtained from the standard BERT encoder with additional linguistic knowledge. We also investigate the influence of different types of linguistic information on the model’s ability to answer complex questions that require deep understanding of the whole text. Experiments performed on the SQuAD benchmark and more complex question answering datasets have shown that linguistic enhancement boosts the performance of the standard BERT model significantly.
Correcting Texts Generated by Transformers using Discourse Features and Web Mining
Alexander Chernyavskiy | Dmitry Ilvovsky | Boris Galitsky
Proceedings of the Student Research Workshop Associated with RANLP 2021
Recent transformer-based approaches to NLG like GPT-2 can generate syntactically coherent original texts. However, these generated texts have serious flaws: global discourse incoherence and meaninglessness of sentences in terms of entity values. We address both of these flaws; the two corrections are independent but can be combined to generate original texts that are both consistent and truthful. This paper presents an approach to estimating the quality of discourse structure. Empirical results confirm that the discourse structure of currently generated texts is inaccurate. We propose research directions for correcting it using discourse features during the fine-tuning procedure. The suggested approach is universal and can be applied to different languages. Apart from that, we suggest a method to correct wrong entity values based on Web Mining and text alignment.
2020
Controlling Chat Bot Multi-Document Navigation with the Extended Discourse Trees
Dmitry Ilvovsky | Alexander Kirillovich | Boris Galitsky
Proceedings of the Fourth International Conference on Computational Linguistics in Bulgaria (CLIB 2020)
In this paper, we learn how to manage a dialogue by relying on the discourse of its utterances. We define extended discourse trees, introduce means to manipulate them, and outline scenarios of multi-document navigation to extend the abilities of an interactive information retrieval-based chat bot. We also provide evaluation results comparing conventional search with a chat bot enriched with multi-document navigation.
Automatic planning of the dialogue between human and machine using discourse trees
Boris Galitsky | Dmitry Ilvovsky
Proceedings of the Workshop on Discourse Theories for Text Planning
Interrupt me Politely: Recommending Products and Services by Joining Human Conversation
Boris Galitsky | Dmitry Ilvovsky
Proceedings of Workshop on Natural Language Processing in E-Commerce
We propose a novel way of conversational recommendation where, instead of asking the user questions to acquire their preferences, the recommender tracks their conversation with other people, including customer support agents (CSA), and joins the conversation only when it is time to introduce a recommendation. To build a recommender that joins a human conversation (RJC), we propose information extraction, discourse and argumentation analyses, as well as dialogue management techniques to compute a recommendation for a product or service that is needed by the customer, as inferred from the conversation. A special case of such conversations is considered where the customer raises their problem with the CSA in an attempt to resolve it, along with receiving a recommendation for a product with features addressing this problem. We evaluate the performance of RJC in a number of human-human and human-chat bot dialogues, and demonstrate that RJC is an efficient and less intrusive way to provide highly relevant and persuasive recommendations.
On a Chatbot Navigating a User through a Concept-Based Knowledge Model
Boris Galitsky | Dmitry Ilvovsky | Elizaveta Goncharova
Proceedings of Workshop on Natural Language Processing in E-Commerce
Information retrieval chatbots are widely used as assistants to help users formulate their requirements about the products they want to purchase and navigate to the set of items that best satisfies those requirements. The work of modern chatbots is based mostly on the deep learning theory behind the knowledge model that can improve the performance of the system. In our work, we are developing a concept-based knowledge model that encapsulates objects and their common descriptions. Leveraging the concept-based knowledge model allows the system to refine users’ initial requests and lead them to the set of objects with the maximal variability of the parameters that matter less to them. Introducing additional textual characteristics allows users to formulate their initial query as a phrase in natural language, rather than as a standard request in the form “Attribute - value”.
2019
Two Discourse Tree - Based Approaches to Indexing Answers
Boris Galitsky | Dmitry Ilvovsky
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
We explore the anatomy of answers with respect to which text fragments from an answer are worth matching with a question and which should not be matched. We apply Rhetorical Structure Theory to build a discourse tree of an answer and select elementary discourse units that are suitable for indexing. Manual rules for selecting these discourse units, as well as automated classification based on web search engine mining, are evaluated in terms of improving search accuracy. We form two sets of question-answer pairs for FAQ and community QA search domains and use them to evaluate the proposed indexing methodology, which delivers up to 16 percent improvement in search recall.
Discourse-Based Approach to Involvement of Background Knowledge for Question Answering
Boris Galitsky | Dmitry Ilvovsky
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
We introduce the concept of a virtual discourse tree to improve question answering (Q/A) recall for complex, multi-sentence questions. By augmenting the discourse tree of an answer with tree fragments obtained from text corpora playing the role of an ontology, we obtain on the fly a canonical discourse representation of this answer that is independent of the thought structure of a given author. This mechanism is critical for finding an answer that is relevant not only in terms of the question's entities but also in terms of the inter-relations between these entities in an answer and its style. We evaluate the Q/A system enabled with virtual discourse trees and observe a substantial increase in performance when answering complex questions such as those from Yahoo! Answers and www.2carpros.com.
On a Chatbot Providing Virtual Dialogues
Boris Galitsky | Dmitry Ilvovsky | Elizaveta Goncharova
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
We present a chatbot that delivers content in the form of virtual dialogues automatically produced from the plain texts that are extracted and selected from the documents. This virtual dialogue content is provided in the form of answers derived from the found and selected documents split into fragments, and questions that are automatically generated for these answers based on the initial text.
On a Chatbot Conducting a Virtual Dialogue in Financial Domain
Boris Galitsky | Dmitry Ilvovsky
Proceedings of the First Workshop on Financial Technology and Natural Language Processing
On a Chatbot Conducting Dialogue-in-Dialogue
Boris Galitsky | Dmitry Ilvovsky | Elizaveta Goncharova
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue
We demo a chatbot that delivers content in the form of virtual dialogues automatically produced from plain texts extracted and selected from documents. This virtual dialogue content is provided in the form of answers derived from the found and selected documents split into fragments, and questions are automatically generated for these answers.
2018
Building Dialogue Structure from Discourse Tree of a Question
Boris Galitsky | Dmitry Ilvovsky
Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI
In this section we propose a reasoning-based approach to dialogue management for a customer support chat bot. To build a dialogue scenario, we analyze the discourse tree (DT) of the initial query of a customer support dialogue, which is frequently complex and multi-sentence. We then enforce rhetorical agreement between the DT of the initial query and those of the answers, requests and responses. The chat bot finds answers that are not only relevant by topic but also suitable for a given step of a conversation and that match the question by style, communication means, experience level and other domain-independent attributes. We evaluate the performance of the proposed algorithm in the car repair domain and observe a 5 to 10% improvement for single and three-step dialogues respectively, in comparison with baseline approaches to dialogue management.
2017
Chatbot with a Discourse Structure-Driven Dialogue Management
Boris Galitsky | Dmitry Ilvovsky
Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics
We build a chat bot with iterative content exploration that leads a user through a personalized knowledge acquisition session. The chat bot is designed as an automated customer support or product recommendation agent assisting a user in learning product features, product usability, suitability, troubleshooting and other related tasks. To control the user navigation through content, we extend the notion of a linguistic discourse tree (DT) to a set of documents with multiple sections covering a topic. For a given paragraph, a DT is built by DT parsers. We then combine the DTs for the paragraphs of documents to form what we call an extended DT, which is the basis for interactive content exploration facilitated by the chat bot. To provide cohesive answers, we use a measure of rhetoric agreement between a question and an answer by tree kernel learning of their DTs.
On a Chat Bot Finding Answers with Optimal Rhetoric Representation
Boris Galitsky | Dmitry Ilvovsky
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
We demo a chat bot that focuses on complex, multi-sentence questions and enforces what we call rhetoric agreement of answers with questions. The chat bot finds answers that are not only relevant by topic but also match the question by style, argumentation patterns, communication means, experience level and other attributes. The system achieves rhetoric agreement by learning pairs of discourse trees (DTs) for question (Q) and answer (A). We build a library of best answer DTs for most types of complex questions. To better recognize a valid rhetoric agreement between Q and A, DTs are extended with labels for communicative actions. An algorithm for finding the best DT for an A, given a Q, is evaluated.
2016
A Tool for Efficient Content Compilation
Boris Galitsky
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations
We build a tool to assist in content creation by mining the web for information relevant to a given topic. This tool imitates the process of essay writing by humans: searching for topics on the web, selecting content fragments from the found documents, and then compiling these fragments to obtain a coherent text. The process of writing starts with automated building of a table of contents by obtaining the list of key entities for the given topic extracted from web resources such as Wikipedia. Once a table of contents is formed, each item forms a seed for web mining. The tool builds a full-featured structured Word document with a table of contents, section structure, images and captions, and web references for all mined text fragments. Two linguistic technologies are employed. For relevance verification, we use similarity computed as a tree similarity between parse trees for a seed and a candidate text fragment. For text coherence, we use a measure of agreement between a given and consecutive paragraph by tree kernel learning of their discourse trees. The tool is available at http://animatronica.io/submit.html.
2015
Rhetoric Map of an Answer to Compound Queries
Boris Galitsky | Dmitry Ilvovsky | Sergey O. Kuznetsov
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Text Classification into Abstract Classes Based on Discourse Structure
Boris Galitsky | Dmitry Ilvovsky | Sergey O. Kuznetsov
Proceedings of the International Conference Recent Advances in Natural Language Processing
News clustering approach based on discourse text structure
Tatyana Makhalova | Dmitry Ilvovsky | Boris Galitsky
Proceedings of the First Workshop on Computing News Storylines
2013
Matching sets of parse trees for answering multi-sentence questions
Boris Galitsky | Dmitry Ilvovsky | Sergei O. Kuznetsov | Fedor Strok
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013
2007
Book Reviews: Commonsense Reasoning, by Erik T. Mueller
Boris Galitsky
Computational Linguistics, Volume 33, Number 1, March 2007