@inproceedings{lichtefeld-etal-2024-redacted,
title = "Redacted Contextual Question Answering with Generative Large Language Models",
author = "Lichtefeld, Jacob and
Cecil, Joe A. and
Hedges, Alex and
Abramson, Jeremy and
Freedman, Marjorie",
editor = "Mitkov, Ruslan and
Ezzini, Saad and
Ranasinghe, Tharindu and
Ezeani, Ignatius and
Khallaf, Nouran and
Acarturk, Cengiz and
Bradbury, Matthew and
El-Haj, Mo and
Rayson, Paul",
booktitle = "Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security",
month = jul,
year = "2024",
address = "Lancaster, UK",
publisher = "International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security",
url = "https://aclanthology.org/2024.nlpaics-1.25/",
pages = "230--237",
abstract = "Many contexts, such as medicine, finance, and cybersecurity, require controlled release of private or internal information. Traditionally, manually redacting sensitive information for release is an arduous and costly process, and while generative Large Language Models (gLLM) show promise at document-based question answering and summarization, their ability to do so while redacting sensitive information has not been widely explored. To address this, we introduce a new task, called redacted contextual question answering (RC-QA). This explores a gLLM{'}s ability to collaborate with a trusted user in a question-answer task as a proxy for drafting a public release informed by the redaction of potentially sensitive information, presented here in the form of constraints on the answers. We introduce a sample question-answer dataset for this task using publicly available data with four sample constraints. We present evaluation results for five language models and two refined models. Our results show that most models{---}especially open-source models{---}struggle to accurately answer questions under these constraints. We hope that these preliminary results help catalyze further exploration into this topic, and to that end, we make our code and data available at https://github.com/isi-vista/redacted-contextual-question-answering."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="lichtefeld-etal-2024-redacted">
<titleInfo>
<title>Redacted Contextual Question Answering with Generative Large Language Models</title>
</titleInfo>
<name type="personal">
<namePart type="given">Jacob</namePart>
<namePart type="family">Lichtefeld</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Joe</namePart>
<namePart type="given">A</namePart>
<namePart type="family">Cecil</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alex</namePart>
<namePart type="family">Hedges</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jeremy</namePart>
<namePart type="family">Abramson</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Marjorie</namePart>
<namePart type="family">Freedman</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2024-07</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ruslan</namePart>
<namePart type="family">Mitkov</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Saad</namePart>
<namePart type="family">Ezzini</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tharindu</namePart>
<namePart type="family">Ranasinghe</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ignatius</namePart>
<namePart type="family">Ezeani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Nouran</namePart>
<namePart type="family">Khallaf</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Cengiz</namePart>
<namePart type="family">Acarturk</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Matthew</namePart>
<namePart type="family">Bradbury</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mo</namePart>
<namePart type="family">El-Haj</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Paul</namePart>
<namePart type="family">Rayson</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security</publisher>
<place>
<placeTerm type="text">Lancaster, UK</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Many contexts, such as medicine, finance, and cybersecurity, require controlled release of private or internal information. Traditionally, manually redacting sensitive information for release is an arduous and costly process, and while generative Large Language Models (gLLM) show promise at document-based question answering and summarization, their ability to do so while redacting sensitive information has not been widely explored. To address this, we introduce a new task, called redacted contextual question answering (RC-QA). This explores a gLLM’s ability to collaborate with a trusted user in a question-answer task as a proxy for drafting a public release informed by the redaction of potentially sensitive information, presented here in the form of constraints on the answers. We introduce a sample question-answer dataset for this task using publicly available data with four sample constraints. We present evaluation results for five language models and two refined models. Our results show that most models—especially open-source models—struggle to accurately answer questions under these constraints. We hope that these preliminary results help catalyze further exploration into this topic, and to that end, we make our code and data available at https://github.com/isi-vista/redacted-contextual-question-answering.</abstract>
<identifier type="citekey">lichtefeld-etal-2024-redacted</identifier>
<location>
<url>https://aclanthology.org/2024.nlpaics-1.25/</url>
</location>
<part>
<date>2024-07</date>
<extent unit="page">
<start>230</start>
<end>237</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Redacted Contextual Question Answering with Generative Large Language Models
%A Lichtefeld, Jacob
%A Cecil, Joe A.
%A Hedges, Alex
%A Abramson, Jeremy
%A Freedman, Marjorie
%Y Mitkov, Ruslan
%Y Ezzini, Saad
%Y Ranasinghe, Tharindu
%Y Ezeani, Ignatius
%Y Khallaf, Nouran
%Y Acarturk, Cengiz
%Y Bradbury, Matthew
%Y El-Haj, Mo
%Y Rayson, Paul
%S Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
%D 2024
%8 July
%I International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
%C Lancaster, UK
%F lichtefeld-etal-2024-redacted
%X Many contexts, such as medicine, finance, and cybersecurity, require controlled release of private or internal information. Traditionally, manually redacting sensitive information for release is an arduous and costly process, and while generative Large Language Models (gLLM) show promise at document-based question answering and summarization, their ability to do so while redacting sensitive information has not been widely explored. To address this, we introduce a new task, called redacted contextual question answering (RC-QA). This explores a gLLM’s ability to collaborate with a trusted user in a question-answer task as a proxy for drafting a public release informed by the redaction of potentially sensitive information, presented here in the form of constraints on the answers. We introduce a sample question-answer dataset for this task using publicly available data with four sample constraints. We present evaluation results for five language models and two refined models. Our results show that most models—especially open-source models—struggle to accurately answer questions under these constraints. We hope that these preliminary results help catalyze further exploration into this topic, and to that end, we make our code and data available at https://github.com/isi-vista/redacted-contextual-question-answering.
%U https://aclanthology.org/2024.nlpaics-1.25/
%P 230-237
Markdown (Informal)
[Redacted Contextual Question Answering with Generative Large Language Models](https://aclanthology.org/2024.nlpaics-1.25/) (Lichtefeld et al., NLPAICS 2024)
ACL
- Jacob Lichtefeld, Joe A. Cecil, Alex Hedges, Jeremy Abramson, and Marjorie Freedman. 2024. Redacted Contextual Question Answering with Generative Large Language Models. In Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security, pages 230–237, Lancaster, UK. International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security.