@inproceedings{reale-etal-2018-spot,
    title = "Can You Spot the Semantic Predicate in this Video?",
    author = "Reale, Christopher and
        Bonial, Claire and
        Kwon, Heesung and
        Voss, Clare",
    editor = "Caselli, Tommaso and
        Miller, Ben and
        van Erp, Marieke and
        Vossen, Piek and
        Palmer, Martha and
        Hovy, Eduard and
        Mitamura, Teruko and
        Caswell, David and
        Brown, Susan W. and
        Bonial, Claire",
    booktitle = "Proceedings of the Workshop Events and Stories in the News 2018",
    month = aug,
    year = "2018",
    address = "Santa Fe, New Mexico, U.S.A.",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-4307",
    pages = "55--60",
    abstract = "We propose a method to improve human activity recognition in video by leveraging semantic information about the target activities from an expert-defined linguistic resource, VerbNet. Our hypothesis is that activities that share similar event semantics, as defined by the semantic predicates of VerbNet, will be more likely to share some visual components. We use a deep convolutional neural network approach as a baseline and incorporate linguistic information from VerbNet through multi-task learning. We present results of experiments showing the added information has negligible impact on recognition performance. We discuss how this may be because the lexical semantic information defined by VerbNet is generally not visually salient given the video processing approach used here, and how we may handle this in future approaches.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="reale-etal-2018-spot">
    <titleInfo>
      <title>Can You Spot the Semantic Predicate in this Video?</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Christopher</namePart>
      <namePart type="family">Reale</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Claire</namePart>
      <namePart type="family">Bonial</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Heesung</namePart>
      <namePart type="family">Kwon</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Clare</namePart>
      <namePart type="family">Voss</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2018-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the Workshop Events and Stories in the News 2018</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Tommaso</namePart>
        <namePart type="family">Caselli</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Ben</namePart>
        <namePart type="family">Miller</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Marieke</namePart>
        <namePart type="family">van Erp</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Piek</namePart>
        <namePart type="family">Vossen</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Martha</namePart>
        <namePart type="family">Palmer</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Eduard</namePart>
        <namePart type="family">Hovy</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Teruko</namePart>
        <namePart type="family">Mitamura</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">David</namePart>
        <namePart type="family">Caswell</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Susan</namePart>
        <namePart type="given">W</namePart>
        <namePart type="family">Brown</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Claire</namePart>
        <namePart type="family">Bonial</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Santa Fe, New Mexico, U.S.A.</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>We propose a method to improve human activity recognition in video by leveraging semantic information about the target activities from an expert-defined linguistic resource, VerbNet. Our hypothesis is that activities that share similar event semantics, as defined by the semantic predicates of VerbNet, will be more likely to share some visual components. We use a deep convolutional neural network approach as a baseline and incorporate linguistic information from VerbNet through multi-task learning. We present results of experiments showing the added information has negligible impact on recognition performance. We discuss how this may be because the lexical semantic information defined by VerbNet is generally not visually salient given the video processing approach used here, and how we may handle this in future approaches.</abstract>
    <identifier type="citekey">reale-etal-2018-spot</identifier>
    <location>
      <url>https://aclanthology.org/W18-4307</url>
    </location>
    <part>
      <date>2018-08</date>
      <extent unit="page">
        <start>55</start>
        <end>60</end>
      </extent>
    </part>
  </mods>
</modsCollection>
%0 Conference Proceedings
%T Can You Spot the Semantic Predicate in this Video?
%A Reale, Christopher
%A Bonial, Claire
%A Kwon, Heesung
%A Voss, Clare
%Y Caselli, Tommaso
%Y Miller, Ben
%Y van Erp, Marieke
%Y Vossen, Piek
%Y Palmer, Martha
%Y Hovy, Eduard
%Y Mitamura, Teruko
%Y Caswell, David
%Y Brown, Susan W.
%Y Bonial, Claire
%S Proceedings of the Workshop Events and Stories in the News 2018
%D 2018
%8 August
%I Association for Computational Linguistics
%C Santa Fe, New Mexico, U.S.A.
%F reale-etal-2018-spot
%X We propose a method to improve human activity recognition in video by leveraging semantic information about the target activities from an expert-defined linguistic resource, VerbNet. Our hypothesis is that activities that share similar event semantics, as defined by the semantic predicates of VerbNet, will be more likely to share some visual components. We use a deep convolutional neural network approach as a baseline and incorporate linguistic information from VerbNet through multi-task learning. We present results of experiments showing the added information has negligible impact on recognition performance. We discuss how this may be because the lexical semantic information defined by VerbNet is generally not visually salient given the video processing approach used here, and how we may handle this in future approaches.
%U https://aclanthology.org/W18-4307
%P 55-60
Markdown (Informal)
[Can You Spot the Semantic Predicate in this Video?](https://aclanthology.org/W18-4307) (Reale et al., EventStory 2018)
ACL
Christopher Reale, Claire Bonial, Heesung Kwon, and Clare Voss. 2018. Can You Spot the Semantic Predicate in this Video? In Proceedings of the Workshop Events and Stories in the News 2018, pages 55–60, Santa Fe, New Mexico, U.S.A. Association for Computational Linguistics.
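The abstract describes the approach only at a high level: a deep convolutional network baseline for activity recognition, with VerbNet information folded in via multi-task learning. The sketch below shows one plausible reading of that setup, a shared backbone feeding a main activity-classification head and an auxiliary head that predicts the VerbNet semantic predicates associated with each activity's verb. This is a minimal illustration, not the authors' implementation; the backbone architecture, class counts, multi-label predicate encoding, and loss weighting are all assumptions.

```python
# Minimal multi-task sketch (not the paper's code): a shared video
# backbone with two heads, one for activity labels and one for the
# VerbNet semantic predicates linked to each activity. The backbone,
# class counts, and auxiliary loss weight are illustrative assumptions.
import torch
import torch.nn as nn

class MultiTaskActivityNet(nn.Module):
    def __init__(self, feat_dim=512, n_activities=101, n_predicates=150):
        super().__init__()
        # Stand-in for a deep convolutional video feature extractor.
        self.backbone = nn.Sequential(
            nn.Conv3d(3, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
            nn.Linear(64, feat_dim),
            nn.ReLU(),
        )
        self.activity_head = nn.Linear(feat_dim, n_activities)
        # Predicates are treated as multi-label: one activity's verb may
        # map to several VerbNet semantic predicates (e.g. motion, cause).
        self.predicate_head = nn.Linear(feat_dim, n_predicates)

    def forward(self, clips):
        # clips: (batch, 3, frames, height, width)
        feats = self.backbone(clips)
        return self.activity_head(feats), self.predicate_head(feats)

def multitask_loss(act_logits, pred_logits, act_labels, pred_targets,
                   aux_weight=0.5):
    # Cross-entropy for the main activity task, binary cross-entropy
    # for the auxiliary predicate task, combined with a tunable weight.
    main = nn.functional.cross_entropy(act_logits, act_labels)
    aux = nn.functional.binary_cross_entropy_with_logits(
        pred_logits, pred_targets.float())
    return main + aux_weight * aux
```

Under this reading, each training step backpropagates the weighted sum of both losses through the shared backbone, so the predicate task can only help by shaping the shared features. The paper reports that this added signal had negligible impact on recognition performance, which the authors attribute to VerbNet's lexical semantic predicates generally not being visually salient under their video processing approach.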