A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment

Jing Wang; Mohit Bansal; Kevin Gimpel; Brian D. Ziebart; Clement T. Yu

doi:10.1162/tacl_a_00122

Abstract

Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods. We propose a sense-topic model for WSI, which treats sense and topic as two separate latent variables to be inferred jointly. Topics are informed by the entire document, while senses are informed by the local context surrounding the ambiguous word. We also discuss unsupervised ways of enriching the original corpus in order to improve model performance, including using neural word embeddings and external corpora to expand the context of each data instance. We demonstrate significant improvements over the previous state-of-the-art, achieving the best results reported to date on the SemEval-2013 WSI task.

Anthology ID: Q15-1005
Volume: Transactions of the Association for Computational Linguistics, Volume 3
Year: 2015
Address: Cambridge, MA
Editors: Michael Collins, Lillian Lee
Venue: TACL
Publisher: MIT Press
Pages: 59–71
URL: https://aclanthology.org/Q15-1005/
DOI: 10.1162/tacl_a_00122
Cite (ACL): Jing Wang, Mohit Bansal, Kevin Gimpel, Brian D. Ziebart, and Clement T. Yu. 2015. A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment. Transactions of the Association for Computational Linguistics, 3:59–71.
Cite (Informal): A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment (Wang et al., TACL 2015)
PDF: https://aclanthology.org/Q15-1005.pdf

Export citation

BibTeX

@article{wang-etal-2015-sense,
    title = "A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment",
    author = "Wang, Jing  and
      Bansal, Mohit  and
      Gimpel, Kevin  and
      Ziebart, Brian D.  and
      Yu, Clement T.",
    editor = "Collins, Michael  and
      Lee, Lillian",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "3",
    year = "2015",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q15-1005/",
    doi = "10.1162/tacl_a_00122",
    pages = "59--71",
    abstract = "Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods. We propose a sense-topic model for WSI, which treats sense and topic as two separate latent variables to be inferred jointly. Topics are informed by the entire document, while senses are informed by the local context surrounding the ambiguous word. We also discuss unsupervised ways of enriching the original corpus in order to improve model performance, including using neural word embeddings and external corpora to expand the context of each data instance. We demonstrate significant improvements over the previous state-of-the-art, achieving the best results reported to date on the SemEval-2013 WSI task."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="wang-etal-2015-sense">
    <titleInfo>
        <title>A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Jing</namePart>
        <namePart type="family">Wang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Mohit</namePart>
        <namePart type="family">Bansal</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Kevin</namePart>
        <namePart type="family">Gimpel</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Brian</namePart>
        <namePart type="given">D</namePart>
        <namePart type="family">Ziebart</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Clement</namePart>
        <namePart type="given">T</namePart>
        <namePart type="family">Yu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2015</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <genre authority="bibutilsgt">journal article</genre>
    <relatedItem type="host">
        <titleInfo>
            <title>Transactions of the Association for Computational Linguistics</title>
        </titleInfo>
        <originInfo>
            <issuance>continuing</issuance>
            <publisher>MIT Press</publisher>
            <place>
                <placeTerm type="text">Cambridge, MA</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">periodical</genre>
        <genre authority="bibutilsgt">academic journal</genre>
    </relatedItem>
    <abstract>Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods. We propose a sense-topic model for WSI, which treats sense and topic as two separate latent variables to be inferred jointly. Topics are informed by the entire document, while senses are informed by the local context surrounding the ambiguous word. We also discuss unsupervised ways of enriching the original corpus in order to improve model performance, including using neural word embeddings and external corpora to expand the context of each data instance. We demonstrate significant improvements over the previous state-of-the-art, achieving the best results reported to date on the SemEval-2013 WSI task.</abstract>
    <identifier type="citekey">wang-etal-2015-sense</identifier>
    <identifier type="doi">10.1162/tacl_a_00122</identifier>
    <location>
        <url>https://aclanthology.org/Q15-1005/</url>
    </location>
    <part>
        <date>2015</date>
        <detail type="volume"><number>3</number></detail>
        <extent unit="page">
            <start>59</start>
            <end>71</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Journal Article
%T A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment
%A Wang, Jing
%A Bansal, Mohit
%A Gimpel, Kevin
%A Ziebart, Brian D.
%A Yu, Clement T.
%J Transactions of the Association for Computational Linguistics
%D 2015
%V 3
%I MIT Press
%C Cambridge, MA
%F wang-etal-2015-sense
%X Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods. We propose a sense-topic model for WSI, which treats sense and topic as two separate latent variables to be inferred jointly. Topics are informed by the entire document, while senses are informed by the local context surrounding the ambiguous word. We also discuss unsupervised ways of enriching the original corpus in order to improve model performance, including using neural word embeddings and external corpora to expand the context of each data instance. We demonstrate significant improvements over the previous state-of-the-art, achieving the best results reported to date on the SemEval-2013 WSI task.
%R 10.1162/tacl_a_00122
%U https://aclanthology.org/Q15-1005/
%U https://doi.org/10.1162/tacl_a_00122
%P 59-71

Markdown (Informal)

[A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment](https://aclanthology.org/Q15-1005/) (Wang et al., TACL 2015)

ACL

Jing Wang, Mohit Bansal, Kevin Gimpel, Brian D. Ziebart, and Clement T. Yu. 2015. A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment. Transactions of the Association for Computational Linguistics, 3:59–71.