Visual TTR - Modelling Visual Question Answering in Type Theory with Records

Ronja Utescher

doi:10.18653/v1/W19-0602

Abstract

In this paper, I will describe a system that was developed for the task of Visual Question Answering. The system uses the rich type universe of Type Theory with Records (TTR) to model both the utterances about the image, the image itself and classifications made related to the two. At its most basic, the decision of whether any given predicate can be assigned to an object in the image is delegated to a CNN. Consequently, images can be judged as evidence for propositions. The end result is a model whose application of perceptual classifiers to a given image is guided by the accompanying utterance.
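
As a rough illustration of the idea in the abstract, the sketch below treats a proposition as a bundle of predicates that must all hold of a single object, and asks whether an image provides evidence for it. This is a minimal sketch under stated assumptions, not the system described in the paper: `Region`, `Proposition`, `make_stub_classifier` and `is_evidence` are hypothetical names, the 0.5 threshold is arbitrary, and the CNN is replaced by a stub that reads precomputed scores.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for one detected object in the image: a mapping from
# predicate names to scores. In the paper these judgements come from a CNN.
Region = Dict[str, float]

# A perceptual classifier maps a region to a confidence in [0, 1].
Classifier = Callable[[Region], float]


def make_stub_classifier(predicate: str) -> Classifier:
    """Return a toy classifier that only looks up a precomputed score."""
    return lambda region: region.get(predicate, 0.0)


@dataclass(frozen=True)
class Proposition:
    """A simplified proposition: all predicates must hold of one object."""
    predicates: Tuple[str, ...]


def is_evidence(image: List[Region], prop: Proposition,
                threshold: float = 0.5) -> bool:
    """Return True if some region satisfies every predicate of the proposition.

    Only classifiers for predicates named in the proposition are built, so the
    utterance decides which perceptual classifiers are applied to the image.
    """
    classifiers = {p: make_stub_classifier(p) for p in prop.predicates}
    return any(
        all(clf(region) >= threshold for clf in classifiers.values())
        for region in image
    )


# Toy image with two objects and made-up scores.
image = [{"dog": 0.9, "brown": 0.8}, {"cat": 0.7, "black": 0.6}]
print(is_evidence(image, Proposition(("dog", "brown"))))  # True
print(is_evidence(image, Proposition(("dog", "black"))))  # False
```

The second query fails because no single region scores above the threshold for both "dog" and "black", even though each predicate holds of some object in the image.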

Anthology ID: W19-0602
Volume: Proceedings of the 13th International Conference on Computational Semantics - Student Papers
Month: May
Year: 2019
Address: Gothenburg, Sweden
Editors: Simon Dobnik, Stergios Chatzikyriakidis, Vera Demberg, Kathrein Abu Kwaik, Vladislav Maraev
Venue: IWCS
SIG: SIGSEM
Publisher: Association for Computational Linguistics
Pages: 9–14
URL: https://aclanthology.org/W19-0602/
DOI: 10.18653/v1/W19-0602
Bibkey: utescher-2019-visual
Cite (ACL): Ronja Utescher. 2019. Visual TTR - Modelling Visual Question Answering in Type Theory with Records. In Proceedings of the 13th International Conference on Computational Semantics - Student Papers, pages 9–14, Gothenburg, Sweden. Association for Computational Linguistics.
Cite (Informal): Visual TTR - Modelling Visual Question Answering in Type Theory with Records (Utescher, IWCS 2019)
PDF: https://aclanthology.org/W19-0602.pdf

Export citation

BibTeX

@inproceedings{utescher-2019-visual,
    title = "Visual {TTR} - Modelling Visual Question Answering in Type Theory with Records",
    author = "Utescher, Ronja",
    editor = "Dobnik, Simon  and
      Chatzikyriakidis, Stergios  and
      Demberg, Vera  and
      Abu Kwaik, Kathrein  and
      Maraev, Vladislav",
    booktitle = "Proceedings of the 13th International Conference on Computational Semantics - Student Papers",
    month = may,
    year = "2019",
    address = "Gothenburg, Sweden",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-0602/",
    doi = "10.18653/v1/W19-0602",
    pages = "9--14",
    abstract = "In this paper, I will describe a system that was developed for the task of Visual Question Answering. The system uses the rich type universe of Type Theory with Records (TTR) to model both the utterances about the image, the image itself and classifications made related to the two. At its most basic, the decision of whether any given predicate can be assigned to an object in the image is delegated to a CNN. Consequently, images can be judged as evidence for propositions. The end result is a model whose application of perceptual classifiers to a given image is guided by the accompanying utterance."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="utescher-2019-visual">
    <titleInfo>
        <title>Visual TTR - Modelling Visual Question Answering in Type Theory with Records</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Ronja</namePart>
        <namePart type="family">Utescher</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 13th International Conference on Computational Semantics - Student Papers</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Simon</namePart>
            <namePart type="family">Dobnik</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Stergios</namePart>
            <namePart type="family">Chatzikyriakidis</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Vera</namePart>
            <namePart type="family">Demberg</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Kathrein</namePart>
            <namePart type="family">Abu Kwaik</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Vladislav</namePart>
            <namePart type="family">Maraev</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Gothenburg, Sweden</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>In this paper, I will describe a system that was developed for the task of Visual Question Answering. The system uses the rich type universe of Type Theory with Records (TTR) to model both the utterances about the image, the image itself and classifications made related to the two. At its most basic, the decision of whether any given predicate can be assigned to an object in the image is delegated to a CNN. Consequently, images can be judged as evidence for propositions. The end result is a model whose application of perceptual classifiers to a given image is guided by the accompanying utterance.</abstract>
    <identifier type="citekey">utescher-2019-visual</identifier>
    <identifier type="doi">10.18653/v1/W19-0602</identifier>
    <location>
        <url>https://aclanthology.org/W19-0602/</url>
    </location>
    <part>
        <date>2019-05</date>
        <extent unit="page">
            <start>9</start>
            <end>14</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Visual TTR - Modelling Visual Question Answering in Type Theory with Records
%A Utescher, Ronja
%Y Dobnik, Simon
%Y Chatzikyriakidis, Stergios
%Y Demberg, Vera
%Y Abu Kwaik, Kathrein
%Y Maraev, Vladislav
%S Proceedings of the 13th International Conference on Computational Semantics - Student Papers
%D 2019
%8 May
%I Association for Computational Linguistics
%C Gothenburg, Sweden
%F utescher-2019-visual
%X In this paper, I will describe a system that was developed for the task of Visual Question Answering. The system uses the rich type universe of Type Theory with Records (TTR) to model both the utterances about the image, the image itself and classifications made related to the two. At its most basic, the decision of whether any given predicate can be assigned to an object in the image is delegated to a CNN. Consequently, images can be judged as evidence for propositions. The end result is a model whose application of perceptual classifiers to a given image is guided by the accompanying utterance.
%R 10.18653/v1/W19-0602
%U https://aclanthology.org/W19-0602/
%U https://doi.org/10.18653/v1/W19-0602
%P 9-14

Markdown (Informal)

[Visual TTR - Modelling Visual Question Answering in Type Theory with Records](https://aclanthology.org/W19-0602/) (Utescher, IWCS 2019)

ACL

Ronja Utescher. 2019. Visual TTR - Modelling Visual Question Answering in Type Theory with Records. In Proceedings of the 13th International Conference on Computational Semantics - Student Papers, pages 9–14, Gothenburg, Sweden. Association for Computational Linguistics.