Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study

Renzo Alva Principe; Marco Viviani; Nicola Chiarini

Abstract

Information Extraction (IE) is a key task in Natural Language Processing (NLP) that transforms unstructured text into structured data. This study compares human annotation, rule-based systems, and Large Language Models (LLMs) for domain-specific IE, focusing on real estate auction documents. We assess each method in terms of accuracy, scalability, and cost-efficiency, highlighting the associated trade-offs. Our findings provide valuable insights into the effectiveness of using LLMs for the considered task and, more broadly, offer guidance on how organizations can balance automation, maintainability, and performance when selecting the most suitable IE solution.

Anthology ID: 2025.ldk-1.25
Volume: Proceedings of the 5th Conference on Language, Data and Knowledge
Month: September
Year: 2025
Address: Naples, Italy
Editors: Mehwish Alam, Andon Tchechmedjiev, Jorge Gracia, Dagmar Gromann, Maria Pia di Buono, Johanna Monti, Maxim Ionov
Venue: LDK
Publisher: Unior Press
Pages: 243–254
URL: https://aclanthology.org/2025.ldk-1.25/
Cite (ACL): Renzo Alva Principe, Marco Viviani, and Nicola Chiarini. 2025. Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study. In Proceedings of the 5th Conference on Language, Data and Knowledge, pages 243–254, Naples, Italy. Unior Press.
Cite (Informal): Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study (Principe et al., LDK 2025)
PDF: https://aclanthology.org/2025.ldk-1.25.pdf

BibTeX

@inproceedings{principe-etal-2025-enhancing,
    title = "Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study",
    author = "Principe, Renzo Alva  and
      Viviani, Marco  and
      Chiarini, Nicola",
    editor = "Alam, Mehwish  and
      Tchechmedjiev, Andon  and
      Gracia, Jorge  and
      Gromann, Dagmar  and
      di Buono, Maria Pia  and
      Monti, Johanna  and
      Ionov, Maxim",
    booktitle = "Proceedings of the 5th Conference on Language, Data and Knowledge",
    month = sep,
    year = "2025",
    address = "Naples, Italy",
    publisher = "Unior Press",
    url = "https://aclanthology.org/2025.ldk-1.25/",
    pages = "243--254",
    ISBN = "978-88-6719-333-2",
    abstract = "Information Extraction (IE) is a key task in Natural Language Processing (NLP) that transforms unstructured text into structured data. This study compares human annotation, rule-based systems, and Large Language Models (LLMs) for domain-specific IE, focusing on real estate auction documents. We assess each method in terms of accuracy, scalability, and cost-efficiency, highlighting the associated trade-offs. Our findings provide valuable insights into the effectiveness of using LLMs for the considered task and, more broadly, offer guidance on how organizations can balance automation, maintainability, and performance when selecting the most suitable IE solution."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="principe-etal-2025-enhancing">
    <titleInfo>
        <title>Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Renzo</namePart>
        <namePart type="given">Alva</namePart>
        <namePart type="family">Principe</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Marco</namePart>
        <namePart type="family">Viviani</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Nicola</namePart>
        <namePart type="family">Chiarini</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-09</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 5th Conference on Language, Data and Knowledge</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Mehwish</namePart>
            <namePart type="family">Alam</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Andon</namePart>
            <namePart type="family">Tchechmedjiev</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jorge</namePart>
            <namePart type="family">Gracia</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Dagmar</namePart>
            <namePart type="family">Gromann</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Maria</namePart>
            <namePart type="given">Pia</namePart>
            <namePart type="family">di Buono</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Johanna</namePart>
            <namePart type="family">Monti</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Maxim</namePart>
            <namePart type="family">Ionov</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Unior Press</publisher>
            <place>
                <placeTerm type="text">Naples, Italy</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">978-88-6719-333-2</identifier>
    </relatedItem>
    <abstract>Information Extraction (IE) is a key task in Natural Language Processing (NLP) that transforms unstructured text into structured data. This study compares human annotation, rule-based systems, and Large Language Models (LLMs) for domain-specific IE, focusing on real estate auction documents. We assess each method in terms of accuracy, scalability, and cost-efficiency, highlighting the associated trade-offs. Our findings provide valuable insights into the effectiveness of using LLMs for the considered task and, more broadly, offer guidance on how organizations can balance automation, maintainability, and performance when selecting the most suitable IE solution.</abstract>
    <identifier type="citekey">principe-etal-2025-enhancing</identifier>
    <location>
        <url>https://aclanthology.org/2025.ldk-1.25/</url>
    </location>
    <part>
        <date>2025-09</date>
        <extent unit="page">
            <start>243</start>
            <end>254</end>
        </extent>
    </part>
</mods>
</modsCollection>

%0 Conference Proceedings
%T Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study
%A Principe, Renzo Alva
%A Viviani, Marco
%A Chiarini, Nicola
%Y Alam, Mehwish
%Y Tchechmedjiev, Andon
%Y Gracia, Jorge
%Y Gromann, Dagmar
%Y di Buono, Maria Pia
%Y Monti, Johanna
%Y Ionov, Maxim
%S Proceedings of the 5th Conference on Language, Data and Knowledge
%D 2025
%8 September
%I Unior Press
%C Naples, Italy
%@ 978-88-6719-333-2
%F principe-etal-2025-enhancing
%X Information Extraction (IE) is a key task in Natural Language Processing (NLP) that transforms unstructured text into structured data. This study compares human annotation, rule-based systems, and Large Language Models (LLMs) for domain-specific IE, focusing on real estate auction documents. We assess each method in terms of accuracy, scalability, and cost-efficiency, highlighting the associated trade-offs. Our findings provide valuable insights into the effectiveness of using LLMs for the considered task and, more broadly, offer guidance on how organizations can balance automation, maintainability, and performance when selecting the most suitable IE solution.
%U https://aclanthology.org/2025.ldk-1.25/
%P 243-254

Markdown (Informal)

[Enhancing Information Extraction with Large Language Models: A Comparison with Human Annotation and Rule-Based Methods in a Real Estate Case Study](https://aclanthology.org/2025.ldk-1.25/) (Principe et al., LDK 2025)
