Benchmark for Evaluation of Danish Clinical Word Embeddings

Martin Sundahl Laursen; Jannik Skyttegaard Pedersen; Pernille Just Vinholt; Rasmus Søgaard Hansen; Thiusius Rajeeth Savarimuthu

doi:10.3384/nejlt.2000-1533.2023.4132

Abstract

In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of a Danish benchmark for clinical word embeddings. The clinical benchmark consists of ten datasets: eight intrinsic and two extrinsic. Moreover, we evaluate word embeddings trained on text from the clinical domain, general practitioner domain and general domain on the established benchmark. All the intrinsic tasks of the benchmark are publicly available.

Anthology ID: 2023.nejlt-1.4
Volume: Northern European Journal of Language Technology, Volume 9
Year: 2023
Address: Linköping, Sweden
Editor: Leon Derczynski
Venue: NEJLT
Publisher: Linköping University Electronic Press
URL: https://aclanthology.org/2023.nejlt-1.4/
DOI: 10.3384/nejlt.2000-1533.2023.4132
Bibkey: laursen-etal-2023-benchmark
Cite (ACL): Martin Sundahl Laursen, Jannik Skyttegaard Pedersen, Pernille Just Vinholt, Rasmus Søgaard Hansen, and Thiusius Rajeeth Savarimuthu. 2023. Benchmark for Evaluation of Danish Clinical Word Embeddings. Northern European Journal of Language Technology, 9.
Cite (Informal): Benchmark for Evaluation of Danish Clinical Word Embeddings (Laursen et al., NEJLT 2023)
PDF: https://aclanthology.org/2023.nejlt-1.4.pdf

Export citation

BibTeX

@article{laursen-etal-2023-benchmark,
    title = "Benchmark for Evaluation of {D}anish Clinical Word Embeddings",
    author = "Laursen, Martin Sundahl  and
      Pedersen, Jannik Skyttegaard  and
      Vinholt, Pernille Just  and
      Hansen, Rasmus S{\o}gaard  and
      Savarimuthu, Thiusius Rajeeth",
    editor = "Derczynski, Leon",
    journal = "Northern European Journal of Language Technology",
    volume = "9",
    year = "2023",
    address = {Link{\"o}ping, Sweden},
    publisher = {Link{\"o}ping University Electronic Press},
    url = "https://aclanthology.org/2023.nejlt-1.4/",
    doi = "10.3384/nejlt.2000-1533.2023.4132",
    abstract = "In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of a Danish benchmark for clinical word embeddings. The clinical benchmark consists of ten datasets: eight intrinsic and two extrinsic. Moreover, we evaluate word embeddings trained on text from the clinical domain, general practitioner domain and general domain on the established benchmark. All the intrinsic tasks of the benchmark are publicly available."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="laursen-etal-2023-benchmark">
    <titleInfo>
        <title>Benchmark for Evaluation of Danish Clinical Word Embeddings</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Martin</namePart>
        <namePart type="given">Sundahl</namePart>
        <namePart type="family">Laursen</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jannik</namePart>
        <namePart type="given">Skyttegaard</namePart>
        <namePart type="family">Pedersen</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Pernille</namePart>
        <namePart type="given">Just</namePart>
        <namePart type="family">Vinholt</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Rasmus</namePart>
        <namePart type="given">Søgaard</namePart>
        <namePart type="family">Hansen</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Thiusius</namePart>
        <namePart type="given">Rajeeth</namePart>
        <namePart type="family">Savarimuthu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2023</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <genre authority="bibutilsgt">journal article</genre>
    <relatedItem type="host">
        <titleInfo>
            <title>Northern European Journal of Language Technology</title>
        </titleInfo>
        <originInfo>
            <issuance>continuing</issuance>
            <publisher>Linköping University Electronic Press</publisher>
            <place>
                <placeTerm type="text">Linköping, Sweden</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">periodical</genre>
        <genre authority="bibutilsgt">academic journal</genre>
    </relatedItem>
    <abstract>In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of a Danish benchmark for clinical word embeddings. The clinical benchmark consists of ten datasets: eight intrinsic and two extrinsic. Moreover, we evaluate word embeddings trained on text from the clinical domain, general practitioner domain and general domain on the established benchmark. All the intrinsic tasks of the benchmark are publicly available.</abstract>
    <identifier type="citekey">laursen-etal-2023-benchmark</identifier>
    <identifier type="doi">10.3384/nejlt.2000-1533.2023.4132</identifier>
    <location>
        <url>https://aclanthology.org/2023.nejlt-1.4/</url>
    </location>
    <part>
        <date>2023</date>
        <detail type="volume"><number>9</number></detail>
    </part>
</mods>
</modsCollection>

Endnote

%0 Journal Article
%T Benchmark for Evaluation of Danish Clinical Word Embeddings
%A Laursen, Martin Sundahl
%A Pedersen, Jannik Skyttegaard
%A Vinholt, Pernille Just
%A Hansen, Rasmus Søgaard
%A Savarimuthu, Thiusius Rajeeth
%J Northern European Journal of Language Technology
%D 2023
%V 9
%I Linköping University Electronic Press
%C Linköping, Sweden
%F laursen-etal-2023-benchmark
%X In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of a Danish benchmark for clinical word embeddings. The clinical benchmark consists of ten datasets: eight intrinsic and two extrinsic. Moreover, we evaluate word embeddings trained on text from the clinical domain, general practitioner domain and general domain on the established benchmark. All the intrinsic tasks of the benchmark are publicly available.
%R 10.3384/nejlt.2000-1533.2023.4132
%U https://aclanthology.org/2023.nejlt-1.4/
%U https://doi.org/10.3384/nejlt.2000-1533.2023.4132

Markdown (Informal)

[Benchmark for Evaluation of Danish Clinical Word Embeddings](https://aclanthology.org/2023.nejlt-1.4/) (Laursen et al., NEJLT 2023)
