@inproceedings{radev-etal-2004-cst,
    title = "{CST} Bank: A Corpus for the Study of Cross-document Structural Relationships",
    author = "Radev, Dragomir  and
      Otterbacher, Jahna  and
      Zhang, Zhu",
    editor = "Lino, Maria Teresa  and
      Xavier, Maria Francisca  and
      Ferreira, F{\'a}tima  and
      Costa, Rute  and
      Silva, Raquel",
    booktitle = "Proceedings of the Fourth International Conference on Language Resources and Evaluation ({LREC}{'}04)",
    month = may,
    year = "2004",
    address = "Lisbon, Portugal",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://aclanthology.org/L04-1239/",
    abstract = "Clusters of multiple news stories related to the same topic exhibit a number of interesting properties. For example, when documents have been published at various points in time or by different authors or news agencies, one finds many instances of paraphrasing, information overlap and even contradiction. The current paper presents the Cross-document Structure Theory (CST) Bank, a collection of multi-document clusters in which pairs of sentences from different documents have been annotated for cross-document structure theory relationships. We will describe how we built the corpus, including our method for reducing the number of sentence pairs to be annotated by our hired judges, using lexical similarity measures. Finally, we will describe how CST and the CST Bank can be applied to different research areas such as multi-document summarization."
}<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="radev-etal-2004-cst">
    <titleInfo>
        <title>CST Bank: A Corpus for the Study of Cross-document Structural Relationships</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Dragomir</namePart>
        <namePart type="family">Radev</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jahna</namePart>
        <namePart type="family">Otterbacher</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Zhu</namePart>
        <namePart type="family">Zhang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2004-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Maria</namePart>
            <namePart type="given">Teresa</namePart>
            <namePart type="family">Lino</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Maria</namePart>
            <namePart type="given">Francisca</namePart>
            <namePart type="family">Xavier</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Fátima</namePart>
            <namePart type="family">Ferreira</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Rute</namePart>
            <namePart type="family">Costa</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Raquel</namePart>
            <namePart type="family">Silva</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>European Language Resources Association (ELRA)</publisher>
            <place>
                <placeTerm type="text">Lisbon, Portugal</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Clusters of multiple news stories related to the same topic exhibit a number of interesting properties. For example, when documents have been published at various points in time or by different authors or news agencies, one finds many instances of paraphrasing, information overlap and even contradiction. The current paper presents the Cross-document Structure Theory (CST) Bank, a collection of multi-document clusters in which pairs of sentences from different documents have been annotated for cross-document structure theory relationships. We will describe how we built the corpus, including our method for reducing the number of sentence pairs to be annotated by our hired judges, using lexical similarity measures. Finally, we will describe how CST and the CST Bank can be applied to different research areas such as multi-document summarization.</abstract>
    <identifier type="citekey">radev-etal-2004-cst</identifier>
    <location>
        <url>https://aclanthology.org/L04-1239/</url>
    </location>
    <part>
        <date>2004-05</date>
    </part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T CST Bank: A Corpus for the Study of Cross-document Structural Relationships
%A Radev, Dragomir
%A Otterbacher, Jahna
%A Zhang, Zhu
%Y Lino, Maria Teresa
%Y Xavier, Maria Francisca
%Y Ferreira, Fátima
%Y Costa, Rute
%Y Silva, Raquel
%S Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
%D 2004
%8 May
%I European Language Resources Association (ELRA)
%C Lisbon, Portugal
%F radev-etal-2004-cst
%X Clusters of multiple news stories related to the same topic exhibit a number of interesting properties. For example, when documents have been published at various points in time or by different authors or news agencies, one finds many instances of paraphrasing, information overlap and even contradiction. The current paper presents the Cross-document Structure Theory (CST) Bank, a collection of multi-document clusters in which pairs of sentences from different documents have been annotated for cross-document structure theory relationships. We will describe how we built the corpus, including our method for reducing the number of sentence pairs to be annotated by our hired judges, using lexical similarity measures. Finally, we will describe how CST and the CST Bank can be applied to different research areas such as multi-document summarization.
%U https://aclanthology.org/L04-1239/
Markdown (Informal)
[CST Bank: A Corpus for the Study of Cross-document Structural Relationships](https://aclanthology.org/L04-1239/) (Radev et al., LREC 2004)
ACL