Adapting the TTL Romanian POS Tagger to the Biomedical Domain

Maria Mitrofan; Radu Ion

doi:10.26615/978-954-452-044-1_002

Abstract

This paper presents the adaptation of the Hidden Markov Models-based TTL part-of-speech tagger to the biomedical domain. TTL is a text processing platform that performs sentence splitting, tokenization, POS tagging, chunking and Named Entity Recognition (NER) for a number of languages, including Romanian. The POS tagging accuracy obtained by the TTL POS tagger exceeds 97% when TTL’s baseline model is updated with training information from a Romanian biomedical corpus. This corpus is developed in the context of the CoRoLa (a reference corpus for the contemporary Romanian language) project. Informative description and statistics of the Romanian biomedical corpus are also provided.

Anthology ID: W17-8002
Volume: Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
Month: September
Year: 2017
Address: Varna, Bulgaria
Editors: Svetla Boytcheva, Kevin Bretonnel Cohen, Guergana Savova, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd.
Pages: 8–14
URL: https://doi.org/10.26615/978-954-452-044-1_002
DOI: 10.26615/978-954-452-044-1_002
Cite (ACL): Maria Mitrofan and Radu Ion. 2017. Adapting the TTL Romanian POS Tagger to the Biomedical Domain. In Proceedings of the Biomedical NLP Workshop associated with RANLP 2017, pages 8–14, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal): Adapting the TTL Romanian POS Tagger to the Biomedical Domain (Mitrofan & Ion, RANLP 2017)
PDF: https://doi.org/10.26615/978-954-452-044-1_002

Export citation

BibTeX

@inproceedings{mitrofan-ion-2017-adapting,
    title = "Adapting the {TTL} {R}omanian {POS} Tagger to the Biomedical Domain",
    author = "Mitrofan, Maria  and
      Ion, Radu",
    editor = "Boytcheva, Svetla  and
      Cohen, Kevin Bretonnel  and
      Savova, Guergana  and
      Angelova, Galia",
    booktitle = "Proceedings of the Biomedical {NLP} Workshop associated with {RANLP} 2017",
    month = sep,
    year = "2017",
    address = "Varna, Bulgaria",
    publisher = "INCOMA Ltd.",
    url = "https://aclanthology.org/W17-8002/",
    doi = "10.26615/978-954-452-044-1_002",
    pages = "8--14",
    abstract = "This paper presents the adaptation of the Hidden Markov Models-based TTL part-of-speech tagger to the biomedical domain. TTL is a text processing platform that performs sentence splitting, tokenization, POS tagging, chunking and Named Entity Recognition (NER) for a number of languages, including Romanian. The POS tagging accuracy obtained by the TTL POS tagger exceeds 97{\%} when TTL{'}s baseline model is updated with training information from a Romanian biomedical corpus. This corpus is developed in the context of the CoRoLa (a reference corpus for the contemporary Romanian language) project. Informative description and statistics of the Romanian biomedical corpus are also provided."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="mitrofan-ion-2017-adapting">
    <titleInfo>
        <title>Adapting the TTL Romanian POS Tagger to the Biomedical Domain</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Maria</namePart>
        <namePart type="family">Mitrofan</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Radu</namePart>
        <namePart type="family">Ion</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2017-09</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Biomedical NLP Workshop associated with RANLP 2017</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Svetla</namePart>
            <namePart type="family">Boytcheva</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Kevin</namePart>
            <namePart type="given">Bretonnel</namePart>
            <namePart type="family">Cohen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Guergana</namePart>
            <namePart type="family">Savova</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Galia</namePart>
            <namePart type="family">Angelova</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>INCOMA Ltd.</publisher>
            <place>
                <placeTerm type="text">Varna, Bulgaria</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>This paper presents the adaptation of the Hidden Markov Models-based TTL part-of-speech tagger to the biomedical domain. TTL is a text processing platform that performs sentence splitting, tokenization, POS tagging, chunking and Named Entity Recognition (NER) for a number of languages, including Romanian. The POS tagging accuracy obtained by the TTL POS tagger exceeds 97% when TTL’s baseline model is updated with training information from a Romanian biomedical corpus. This corpus is developed in the context of the CoRoLa (a reference corpus for the contemporary Romanian language) project. Informative description and statistics of the Romanian biomedical corpus are also provided.</abstract>
    <identifier type="citekey">mitrofan-ion-2017-adapting</identifier>
    <identifier type="doi">10.26615/978-954-452-044-1_002</identifier>
    <location>
        <url>https://aclanthology.org/W17-8002/</url>
    </location>
    <part>
        <date>2017-09</date>
        <extent unit="page">
            <start>8</start>
            <end>14</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Adapting the TTL Romanian POS Tagger to the Biomedical Domain
%A Mitrofan, Maria
%A Ion, Radu
%Y Boytcheva, Svetla
%Y Cohen, Kevin Bretonnel
%Y Savova, Guergana
%Y Angelova, Galia
%S Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
%D 2017
%8 September
%I INCOMA Ltd.
%C Varna, Bulgaria
%F mitrofan-ion-2017-adapting
%X This paper presents the adaptation of the Hidden Markov Models-based TTL part-of-speech tagger to the biomedical domain. TTL is a text processing platform that performs sentence splitting, tokenization, POS tagging, chunking and Named Entity Recognition (NER) for a number of languages, including Romanian. The POS tagging accuracy obtained by the TTL POS tagger exceeds 97% when TTL’s baseline model is updated with training information from a Romanian biomedical corpus. This corpus is developed in the context of the CoRoLa (a reference corpus for the contemporary Romanian language) project. Informative description and statistics of the Romanian biomedical corpus are also provided.
%R 10.26615/978-954-452-044-1_002
%U https://aclanthology.org/W17-8002/
%U https://doi.org/10.26615/978-954-452-044-1_002
%P 8-14

Markdown (Informal)

[Adapting the TTL Romanian POS Tagger to the Biomedical Domain](https://aclanthology.org/W17-8002/) (Mitrofan & Ion, RANLP 2017)

ACL

Maria Mitrofan and Radu Ion. 2017. Adapting the TTL Romanian POS Tagger to the Biomedical Domain. In Proceedings of the Biomedical NLP Workshop associated with RANLP 2017, pages 8–14, Varna, Bulgaria. INCOMA Ltd.