DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module

Krish Sharma; Niyar R. Barman; Akshay Chaturvedi; Nicholas Asher

Abstract

We look at reasoning on GSM8k, a dataset of short texts presenting primary school math problems. We find, with Mirzadeh et al. (2024), that current LLM progress on the dataset may not be explained by better reasoning but by exposure to a broader pretraining data distribution. We then introduce a novel information source for helping models with less data or inferior training reason better: discourse structure. We show that discourse structure improves performance for models like Llama2 13b by up to 160%. Even for models that have most likely memorized the dataset, adding discourse structural information to the model still improves predictions and dramatically improves large model performance on out-of-distribution examples.

Anthology ID: 2025.sigdial-1.24
Volume: Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month: August
Year: 2025
Address: Avignon, France
Editors: Frédéric Béchet, Fabrice Lefèvre, Nicholas Asher, Seokhwan Kim, Teva Merlin
Venue: SIGDIAL
SIG: SIGDIAL
Publisher: Association for Computational Linguistics
Pages: 294–310
URL: https://aclanthology.org/2025.sigdial-1.24/
Cite (ACL): Krish Sharma, Niyar R. Barman, Akshay Chaturvedi, and Nicholas Asher. 2025. DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module. In Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 294–310, Avignon, France. Association for Computational Linguistics.
Cite (Informal): DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module (Sharma et al., SIGDIAL 2025)
PDF: https://aclanthology.org/2025.sigdial-1.24.pdf

Export citation

BibTeX

@inproceedings{sharma-etal-2025-dimsum,
    title = "{DIMSUM}: Discourse in Mathematical Reasoning as a Supervision Module",
    author = "Sharma, Krish  and
      Barman, Niyar R.  and
      Chaturvedi, Akshay  and
      Asher, Nicholas",
    editor = "B{\'e}chet, Fr{\'e}d{\'e}ric  and
      Lef{\`e}vre, Fabrice  and
      Asher, Nicholas  and
      Kim, Seokhwan  and
      Merlin, Teva",
    booktitle = "Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue",
    month = aug,
    year = "2025",
    address = "Avignon, France",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.sigdial-1.24/",
    pages = "294--310",
    abstract = "We look at reasoning on GSM8k, a dataset of short texts presenting primary school math problems. We find, with Mirzadeh et al. (2024), that current LLM progress on the dataset may not be explained by better reasoning but by exposure to a broader pretraining data distribution. We then introduce a novel information source for helping models with less data or inferior training reason better: discourse structure. We show that discourse structure improves performance for models like Llama2 13b by up to 160{\%}. Even for models that have most likely memorized the dataset, adding discourse structural information to the model still improves predictions and dramatically improves large model performance on out-of-distribution examples."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="sharma-etal-2025-dimsum">
    <titleInfo>
        <title>DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Krish</namePart>
        <namePart type="family">Sharma</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Niyar</namePart>
        <namePart type="given">R</namePart>
        <namePart type="family">Barman</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Akshay</namePart>
        <namePart type="family">Chaturvedi</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Nicholas</namePart>
        <namePart type="family">Asher</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Frédéric</namePart>
            <namePart type="family">Béchet</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Fabrice</namePart>
            <namePart type="family">Lefèvre</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Nicholas</namePart>
            <namePart type="family">Asher</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Seokhwan</namePart>
            <namePart type="family">Kim</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Teva</namePart>
            <namePart type="family">Merlin</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Avignon, France</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>We look at reasoning on GSM8k, a dataset of short texts presenting primary school math problems. We find, with Mirzadeh et al. (2024), that current LLM progress on the dataset may not be explained by better reasoning but by exposure to a broader pretraining data distribution. We then introduce a novel information source for helping models with less data or inferior training reason better: discourse structure. We show that discourse structure improves performance for models like Llama2 13b by up to 160%. Even for models that have most likely memorized the dataset, adding discourse structural information to the model still improves predictions and dramatically improves large model performance on out-of-distribution examples.</abstract>
    <identifier type="citekey">sharma-etal-2025-dimsum</identifier>
    <location>
        <url>https://aclanthology.org/2025.sigdial-1.24/</url>
    </location>
    <part>
        <date>2025-08</date>
        <extent unit="page">
            <start>294</start>
            <end>310</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module
%A Sharma, Krish
%A Barman, Niyar R.
%A Chaturvedi, Akshay
%A Asher, Nicholas
%Y Béchet, Frédéric
%Y Lefèvre, Fabrice
%Y Asher, Nicholas
%Y Kim, Seokhwan
%Y Merlin, Teva
%S Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue
%D 2025
%8 August
%I Association for Computational Linguistics
%C Avignon, France
%F sharma-etal-2025-dimsum
%X We look at reasoning on GSM8k, a dataset of short texts presenting primary school math problems. We find, with Mirzadeh et al. (2024), that current LLM progress on the dataset may not be explained by better reasoning but by exposure to a broader pretraining data distribution. We then introduce a novel information source for helping models with less data or inferior training reason better: discourse structure. We show that discourse structure improves performance for models like Llama2 13b by up to 160%. Even for models that have most likely memorized the dataset, adding discourse structural information to the model still improves predictions and dramatically improves large model performance on out-of-distribution examples.
%U https://aclanthology.org/2025.sigdial-1.24/
%P 294-310

Markdown (Informal)

[DIMSUM: Discourse in Mathematical Reasoning as a Supervision Module](https://aclanthology.org/2025.sigdial-1.24/) (Sharma et al., SIGDIAL 2025)
