Leonard Bongard
2023
ML Mob at SemEval-2023 Task 1: Probing CLIP on Visual Word-Sense Disambiguation
Clifton Poth | Martin Hentschel | Tobias Werner | Hannah Sterz | Leonard Bongard
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Successful word sense disambiguation (WSD) is a fundamental element of natural language understanding. As part of SemEval-2023 Task 1, we investigate WSD in a multimodal setting, where ambiguous words are to be matched with candidate images representing word senses. We compare multiple systems based on pre-trained CLIP models. In our experiments, we find CLIP to have solid zero-shot performance on monolingual and multilingual data. By employing different fine-tuning techniques, we are able to further enhance performance. However, transferring knowledge between data distributions proves to be more challenging.
ML Mob at SemEval-2023 Task 5: “Breaking News: Our Semi-Supervised and Multi-Task Learning Approach Spoils Clickbait”
Hannah Sterz | Leonard Bongard | Tobias Werner | Clifton Poth | Martin Hentschel
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Online articles using striking headlines that promise intriguing information are often used to attract readers. Most of the time, the information provided in the text is disappointing to the reader after the headline promised exciting news. As part of the SemEval-2023 challenge, we propose a system to generate a spoiler for these headlines. The spoiler provides the information promised by the headline and eliminates the need to read the full article. In our system, we consider Multi-Task Learning and generating more data using a distillation approach. With this, we achieve an F1 score of up to 51.48% on extracting the spoiler from the articles.
2022
The Legal Argument Reasoning Task in Civil Procedure
Leonard Bongard | Lena Held | Ivan Habernal
Proceedings of the Natural Legal Language Processing Workshop 2022
We present a new NLP task and dataset from the domain of U.S. civil procedure. Each instance of the dataset consists of a general introduction to the case, a particular question, and a possible solution argument, accompanied by a detailed analysis of why the argument applies in that case. Since the dataset is based on a book aimed at law students, we believe that it represents a truly complex task for benchmarking modern legal language models. Our baseline evaluation shows that fine-tuning a legal transformer provides some advantage over random baseline models, but our analysis reveals that the actual ability to infer legal arguments remains a challenging open research question.
Co-authors
- Martin Hentschel 2
- Clifton Poth 2
- Hannah Sterz 2
- Tobias Werner 2
- Ivan Habernal 1