Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing

Xindi Wang; Robert E. Mercer

doi:10.18653/v1/W19-5018

Abstract

The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of adopting human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the normal use of the text from the abstract and title for model training, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments curated from 257,590 full text articles together with the articles’ MEDLINE/PubMed MeSH terms is publicly available.

Anthology ID: W19-5018
Volume: Proceedings of the 18th BioNLP Workshop and Shared Task
Month: August
Year: 2019
Address: Florence, Italy
Editors: Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, Junichi Tsujii
Venue: BioNLP
SIG: SIGBIOMED
Publisher: Association for Computational Linguistics
Pages: 165–175
URL: https://aclanthology.org/W19-5018/
DOI: 10.18653/v1/W19-5018
Bibkey: wang-mercer-2019-incorporating
Cite (ACL): Xindi Wang and Robert E. Mercer. 2019. Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing. In Proceedings of the 18th BioNLP Workshop and Shared Task, pages 165–175, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing (Wang & Mercer, BioNLP 2019)
PDF: https://aclanthology.org/W19-5018.pdf

Export citation

BibTeX

@inproceedings{wang-mercer-2019-incorporating,
    title = "Incorporating Figure Captions and Descriptive Text in {M}e{SH} Term Indexing",
    author = "Wang, Xindi  and
      Mercer, Robert E.",
    editor = "Demner-Fushman, Dina  and
      Cohen, Kevin Bretonnel  and
      Ananiadou, Sophia  and
      Tsujii, Junichi",
    booktitle = "Proceedings of the 18th BioNLP Workshop and Shared Task",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-5018/",
    doi = "10.18653/v1/W19-5018",
    pages = "165--175",
    abstract = "The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of adopting human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the normal use of the text from the abstract and title for model training, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments curated from 257,590 full text articles together with the articles' MEDLINE/PubMed MeSH terms is publicly available."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="wang-mercer-2019-incorporating">
    <titleInfo>
        <title>Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Xindi</namePart>
        <namePart type="family">Wang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Robert</namePart>
        <namePart type="given">E</namePart>
        <namePart type="family">Mercer</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 18th BioNLP Workshop and Shared Task</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Dina</namePart>
            <namePart type="family">Demner-Fushman</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Kevin</namePart>
            <namePart type="given">Bretonnel</namePart>
            <namePart type="family">Cohen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sophia</namePart>
            <namePart type="family">Ananiadou</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Junichi</namePart>
            <namePart type="family">Tsujii</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Florence, Italy</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of adopting human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the normal use of the text from the abstract and title for model training, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments curated from 257,590 full text articles together with the articles’ MEDLINE/PubMed MeSH terms is publicly available.</abstract>
    <identifier type="citekey">wang-mercer-2019-incorporating</identifier>
    <identifier type="doi">10.18653/v1/W19-5018</identifier>
    <location>
        <url>https://aclanthology.org/W19-5018/</url>
    </location>
    <part>
        <date>2019-08</date>
        <extent unit="page">
            <start>165</start>
            <end>175</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing
%A Wang, Xindi
%A Mercer, Robert E.
%Y Demner-Fushman, Dina
%Y Cohen, Kevin Bretonnel
%Y Ananiadou, Sophia
%Y Tsujii, Junichi
%S Proceedings of the 18th BioNLP Workshop and Shared Task
%D 2019
%8 August
%I Association for Computational Linguistics
%C Florence, Italy
%F wang-mercer-2019-incorporating
%X The goal of text classification is to automatically assign categories to documents. Deep learning automatically learns effective features from data instead of adopting human-designed features. In this paper, we focus specifically on biomedical document classification using a deep learning approach. We present a novel multichannel TextCNN model for MeSH term indexing. Beyond the normal use of the text from the abstract and title for model training, we also consider figure and table captions, as well as paragraphs associated with the figures and tables. We demonstrate that these latter text sources are important feature sources for our method. A new dataset consisting of these text segments curated from 257,590 full text articles together with the articles’ MEDLINE/PubMed MeSH terms is publicly available.
%R 10.18653/v1/W19-5018
%U https://aclanthology.org/W19-5018/
%U https://doi.org/10.18653/v1/W19-5018
%P 165-175

Markdown (Informal)

[Incorporating Figure Captions and Descriptive Text in MeSH Term Indexing](https://aclanthology.org/W19-5018/) (Wang & Mercer, BioNLP 2019)
