Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity

So Fukuda; Hayato Ogawa; Kaito Horio; Daisuke Kawahara; Tomohide Shibata

doi:10.18653/v1/2025.acl-srw.69

Abstract

To evaluate the creativity of large language models (LLMs) in Japanese, we construct three benchmarks: Japanese Creativity Questions (JCQ), Divergent Association Task (DAT), and Story Alteration Task (SAT). JCQ comprehensively evaluates creativity using LLMs. Meanwhile, DAT and SAT measure specific aspects of creative ability using embeddings. We also analyze correlations between JCQ and DAT, JCQ and SAT, and DAT and SAT. While JCQ provides comprehensive evaluation, it is relatively time and resource intensive. In contrast, DAT and SAT offer lower comprehensiveness but enable quick, low-cost assessment. Additionally, we investigate whether training with DAT contributes to enhancing LLM creativity.
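As a rough illustration of how an embedding-based measure such as DAT can be scored, the sketch below computes the mean pairwise cosine distance among word vectors, scaled by 100, which is how the Divergent Association Task is commonly scored. Whether the Japanese DAT in this paper uses exactly this formula is an assumption; the function names and the toy two-dimensional vectors are hypothetical, and a real setup would take vectors from a pretrained Japanese embedding model.

```python
from itertools import combinations
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(vectors):
    """Mean pairwise cosine distance over all vector pairs, scaled by 100.

    Assumption: this mirrors the commonly used DAT scoring rule; it is not
    taken from the paper itself.
    """
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy embeddings; real ones would come from an embedding model.
toy_embeddings = {
    "word_a": [1.0, 0.0],
    "word_b": [0.0, 1.0],
    "word_c": [1.0, 0.0],
}

score = dat_style_score(list(toy_embeddings.values()))
print(f"DAT-style score: {score:.2f}")
```

A higher score means the words are, on average, further apart in embedding space, which is why this kind of measure is cheap to compute compared with LLM-based judging.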

Anthology ID: 2025.acl-srw.69
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Jin Zhao, Mingyang Wang, Zhu Liu
Venues: ACL | WS
Publisher: Association for Computational Linguistics
Pages: 939–957
URL: https://aclanthology.org/2025.acl-srw.69/
DOI: 10.18653/v1/2025.acl-srw.69
Bibkey: fukuda-etal-2025-building
Cite (ACL): So Fukuda, Hayato Ogawa, Kaito Horio, Daisuke Kawahara, and Tomohide Shibata. 2025. Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 939–957, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity (Fukuda et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-srw.69.pdf

Export citation

BibTeX

@inproceedings{fukuda-etal-2025-building,
    title = "Building {J}apanese Creativity Benchmarks and Applying them to Enhance {LLM} Creativity",
    author = "Fukuda, So  and
      Ogawa, Hayato  and
      Horio, Kaito  and
      Kawahara, Daisuke  and
      Shibata, Tomohide",
    editor = "Zhao, Jin  and
      Wang, Mingyang  and
      Liu, Zhu",
    booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.acl-srw.69/",
    doi = "10.18653/v1/2025.acl-srw.69",
    pages = "939--957",
    ISBN = "979-8-89176-254-1",
    abstract = "To evaluate the creativity of large language models (LLMs) in Japanese, we construct three benchmarks: Japanese Creativity Questions (JCQ), Divergent Association Task (DAT), and Story Alteration Task (SAT). JCQ comprehensively evaluates creativity using LLMs. Meanwhile, DAT and SAT measure specific aspects of creative ability using embeddings. We also analyze correlations between JCQ and DAT, JCQ and SAT, and DAT and SAT. While JCQ provides comprehensive evaluation, it is relatively time and resource intensive. In contrast, DAT and SAT offer lower comprehensiveness but enable quick, low-cost assessment. Additionally, we investigate whether training with DAT contributes to enhancing LLM creativity."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="fukuda-etal-2025-building">
    <titleInfo>
        <title>Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">So</namePart>
        <namePart type="family">Fukuda</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Hayato</namePart>
        <namePart type="family">Ogawa</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Kaito</namePart>
        <namePart type="family">Horio</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Daisuke</namePart>
        <namePart type="family">Kawahara</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Tomohide</namePart>
        <namePart type="family">Shibata</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2025-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Jin</namePart>
            <namePart type="family">Zhao</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Mingyang</namePart>
            <namePart type="family">Wang</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Zhu</namePart>
            <namePart type="family">Liu</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Vienna, Austria</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
        <identifier type="isbn">979-8-89176-254-1</identifier>
    </relatedItem>
    <abstract>To evaluate the creativity of large language models (LLMs) in Japanese, we construct three benchmarks: Japanese Creativity Questions (JCQ), Divergent Association Task (DAT), and Story Alteration Task (SAT). JCQ comprehensively evaluates creativity using LLMs. Meanwhile, DAT and SAT measure specific aspects of creative ability using embeddings. We also analyze correlations between JCQ and DAT, JCQ and SAT, and DAT and SAT. While JCQ provides comprehensive evaluation, it is relatively time and resource intensive. In contrast, DAT and SAT offer lower comprehensiveness but enable quick, low-cost assessment. Additionally, we investigate whether training with DAT contributes to enhancing LLM creativity.</abstract>
    <identifier type="citekey">fukuda-etal-2025-building</identifier>
    <identifier type="doi">10.18653/v1/2025.acl-srw.69</identifier>
    <location>
        <url>https://aclanthology.org/2025.acl-srw.69/</url>
    </location>
    <part>
        <date>2025-07</date>
        <extent unit="page">
            <start>939</start>
            <end>957</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity
%A Fukuda, So
%A Ogawa, Hayato
%A Horio, Kaito
%A Kawahara, Daisuke
%A Shibata, Tomohide
%Y Zhao, Jin
%Y Wang, Mingyang
%Y Liu, Zhu
%S Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
%D 2025
%8 July
%I Association for Computational Linguistics
%C Vienna, Austria
%@ 979-8-89176-254-1
%F fukuda-etal-2025-building
%X To evaluate the creativity of large language models (LLMs) in Japanese, we construct three benchmarks: Japanese Creativity Questions (JCQ), Divergent Association Task (DAT), and Story Alteration Task (SAT). JCQ comprehensively evaluates creativity using LLMs. Meanwhile, DAT and SAT measure specific aspects of creative ability using embeddings. We also analyze correlations between JCQ and DAT, JCQ and SAT, and DAT and SAT. While JCQ provides comprehensive evaluation, it is relatively time and resource intensive. In contrast, DAT and SAT offer lower comprehensiveness but enable quick, low-cost assessment. Additionally, we investigate whether training with DAT contributes to enhancing LLM creativity.
%R 10.18653/v1/2025.acl-srw.69
%U https://aclanthology.org/2025.acl-srw.69/
%U https://doi.org/10.18653/v1/2025.acl-srw.69
%P 939-957

Markdown (Informal)

[Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity](https://aclanthology.org/2025.acl-srw.69/) (Fukuda et al., ACL 2025)
