MeSH-based dataset for measuring the relevance of text retrieval

Won Gyu Kim; Lana Yeganova; Donald C. Comeau; W. John Wilbur; Zhiyong Lu

doi:10.18653/v1/W18-2320

Abstract

Creating simulated search environments has been of significant interest in information retrieval, in both general and biomedical search domains. Existing collections include a modest number of queries and are constructed by manually evaluating retrieval results. In this work we propose leveraging MeSH term assignments for creating synthetic test beds. We select a suitable subset of MeSH terms as queries, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. Using well-studied retrieval functions, we show that their performance on the proposed data is consistent with similar findings in previous work. We further use the proposed retrieval evaluation framework to better understand how to combine heterogeneous sources of textual information.
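The idea of using MeSH term assignments as relevance judgments can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy documents, the ranking, the helper names `mesh_relevance` and `average_precision`, and the choice of average precision as the metric. The abstract does not say which metric or data layout the authors used.

```python
from typing import Dict, List, Set


def mesh_relevance(mesh_assignments: Dict[str, Set[str]], query_term: str) -> Set[str]:
    """Treat a document as relevant when the query MeSH term is assigned to it."""
    return {doc_id for doc_id, terms in mesh_assignments.items() if query_term in terms}


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant document IDs."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant_ids)


# Hypothetical toy collection: document IDs mapped to their assigned MeSH terms.
mesh_assignments = {
    "d1": {"Neoplasms", "Humans"},
    "d2": {"Asthma"},
    "d3": {"Neoplasms"},
    "d4": {"Humans"},
}

# Hypothetical ranking produced by some retrieval function for the query "Neoplasms".
ranking = ["d1", "d2", "d3", "d4"]

relevant = mesh_relevance(mesh_assignments, "Neoplasms")
score = average_precision(ranking, relevant)
print(f"AP = {score:.4f}")
```

Averaging this score over many MeSH-term queries would give one number per retrieval function, which is one plausible way to compare functions without manual relevance judgments.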

Anthology ID: W18-2320
Volume: Proceedings of the BioNLP 2018 workshop
Month: July
Year: 2018
Address: Melbourne, Australia
Editors: Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, Junichi Tsujii
Venue: BioNLP
Publisher: Association for Computational Linguistics
Pages: 161–165
URL: https://aclanthology.org/W18-2320/
DOI: 10.18653/v1/W18-2320
Bibkey: kim-etal-2018-mesh
Cite (ACL): Won Gyu Kim, Lana Yeganova, Donald Comeau, W John Wilbur, and Zhiyong Lu. 2018. MeSH-based dataset for measuring the relevance of text retrieval. In Proceedings of the BioNLP 2018 workshop, pages 161–165, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal): MeSH-based dataset for measuring the relevance of text retrieval (Kim et al., BioNLP 2018)
PDF: https://aclanthology.org/W18-2320.pdf
Note: W18-2320.Notes.pdf

BibTeX

@inproceedings{kim-etal-2018-mesh,
    title = "{M}e{SH}-based dataset for measuring the relevance of text retrieval",
    author = "Kim, Won Gyu  and
      Yeganova, Lana  and
      Comeau, Donald  and
      Wilbur, W John  and
      Lu, Zhiyong",
    editor = "Demner-Fushman, Dina  and
      Cohen, Kevin Bretonnel  and
      Ananiadou, Sophia  and
      Tsujii, Junichi",
    booktitle = "Proceedings of the {B}io{NLP} 2018 workshop",
    month = jul,
    year = "2018",
    address = "Melbourne, Australia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-2320/",
    doi = "10.18653/v1/W18-2320",
    pages = "161--165",
    abstract = "Creating simulated search environments has been of significant interest in information retrieval, in both general and biomedical search domains. Existing collections include a modest number of queries and are constructed by manually evaluating retrieval results. In this work we propose leveraging MeSH term assignments for creating synthetic test beds. We select a suitable subset of MeSH terms as queries, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. Using well-studied retrieval functions, we show that their performance on the proposed data is consistent with similar findings in previous work. We further use the proposed retrieval evaluation framework to better understand how to combine heterogeneous sources of textual information."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="kim-etal-2018-mesh">
    <titleInfo>
        <title>MeSH-based dataset for measuring the relevance of text retrieval</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Won</namePart>
        <namePart type="given">Gyu</namePart>
        <namePart type="family">Kim</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Lana</namePart>
        <namePart type="family">Yeganova</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Donald</namePart>
        <namePart type="family">Comeau</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">W</namePart>
        <namePart type="given">John</namePart>
        <namePart type="family">Wilbur</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Zhiyong</namePart>
        <namePart type="family">Lu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2018-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the BioNLP 2018 workshop</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Dina</namePart>
            <namePart type="family">Demner-Fushman</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Kevin</namePart>
            <namePart type="given">Bretonnel</namePart>
            <namePart type="family">Cohen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sophia</namePart>
            <namePart type="family">Ananiadou</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Junichi</namePart>
            <namePart type="family">Tsujii</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Melbourne, Australia</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Creating simulated search environments has been of significant interest in information retrieval, in both general and biomedical search domains. Existing collections include a modest number of queries and are constructed by manually evaluating retrieval results. In this work we propose leveraging MeSH term assignments for creating synthetic test beds. We select a suitable subset of MeSH terms as queries, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. Using well-studied retrieval functions, we show that their performance on the proposed data is consistent with similar findings in previous work. We further use the proposed retrieval evaluation framework to better understand how to combine heterogeneous sources of textual information.</abstract>
    <identifier type="citekey">kim-etal-2018-mesh</identifier>
    <identifier type="doi">10.18653/v1/W18-2320</identifier>
    <location>
        <url>https://aclanthology.org/W18-2320/</url>
    </location>
    <part>
        <date>2018-07</date>
        <extent unit="page">
            <start>161</start>
            <end>165</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T MeSH-based dataset for measuring the relevance of text retrieval
%A Kim, Won Gyu
%A Yeganova, Lana
%A Comeau, Donald
%A Wilbur, W. John
%A Lu, Zhiyong
%Y Demner-Fushman, Dina
%Y Cohen, Kevin Bretonnel
%Y Ananiadou, Sophia
%Y Tsujii, Junichi
%S Proceedings of the BioNLP 2018 workshop
%D 2018
%8 July
%I Association for Computational Linguistics
%C Melbourne, Australia
%F kim-etal-2018-mesh
%X Creating simulated search environments has been of significant interest in information retrieval, in both general and biomedical search domains. Existing collections include a modest number of queries and are constructed by manually evaluating retrieval results. In this work we propose leveraging MeSH term assignments for creating synthetic test beds. We select a suitable subset of MeSH terms as queries, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. Using well-studied retrieval functions, we show that their performance on the proposed data is consistent with similar findings in previous work. We further use the proposed retrieval evaluation framework to better understand how to combine heterogeneous sources of textual information.
%R 10.18653/v1/W18-2320
%U https://aclanthology.org/W18-2320/
%U https://doi.org/10.18653/v1/W18-2320
%P 161-165

Markdown (Informal)

[MeSH-based dataset for measuring the relevance of text retrieval](https://aclanthology.org/W18-2320/) (Kim et al., BioNLP 2018)