Tomáš Brychcín


2018

UWB at SemEval-2018 Task 10: Capturing Discriminative Attributes from Word Distributions
Tomáš Brychcín | Tomáš Hercig | Josef Steinberger | Michal Konkol
Proceedings of The 12th International Workshop on Semantic Evaluation

We present our UWB system for the task of capturing discriminative attributes at SemEval 2018. Given two words and an attribute, the system decides whether this attribute is discriminative between the words or not. Assuming the Distributional Hypothesis, i.e., that a word's meaning is related to its distribution across contexts, we introduce several approaches to compare word contextual information. We experiment with state-of-the-art semantic spaces and with simple co-occurrence statistics. We show that the word distribution in the corpus has potential for detecting discriminative attributes. Our system achieves an F1 score of 72.1% and is ranked #4 among 26 submitted systems.
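
As a rough illustration of the "simple co-occurrence statistics" idea in the abstract above, the following sketch counts sentence-level co-occurrences in a toy corpus and calls an attribute discriminative when it co-occurs with the first word more often than with the second. The function names, the toy corpus, and the `margin` threshold are assumptions made for this example; this is not the UWB system itself.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_counts(sentences):
    """Count how often two distinct tokens appear in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        for a, b in permutations(tokens, 2):
            counts[(a, b)] += 1
    return counts


def is_discriminative(counts, word1, word2, attribute, margin=1):
    """Hypothetical rule: the attribute co-occurs with word1 at least
    `margin` more times than with word2."""
    return counts[(word1, attribute)] - counts[(word2, attribute)] >= margin


# Toy corpus, invented for illustration only.
toy_corpus = [
    "the banana is yellow",
    "a ripe banana turns yellow",
    "the apple is red",
    "a red apple and a yellow banana",
]
counts = cooccurrence_counts(toy_corpus)
banana_vs_apple = is_discriminative(counts, "banana", "apple", "yellow")
apple_vs_banana = is_discriminative(counts, "apple", "banana", "yellow")
```

The rule is asymmetric on purpose: "yellow" separates "banana" from "apple", but not the other way round.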

2017

Geographical Evaluation of Word Embeddings
Michal Konkol | Tomáš Brychcín | Michal Nykl | Tomáš Hercig
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Word embeddings are commonly compared either with human-annotated word similarities or through improvements in natural language processing tasks. We propose a novel principle which compares the information from word embeddings with reality. We implement this principle by comparing the information in the word embeddings with the geographical positions of cities. Our evaluation linearly transforms the semantic space to optimally fit the real positions of cities and measures the deviation between the position given by word embeddings and the real position. We evaluate a set of well-known word embeddings with state-of-the-art results. We also introduce a visualization that helps with error analysis.
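
The fit-and-measure step described in the abstract above can be sketched with ordinary least squares. This is a minimal illustration under several assumptions: positions are treated as plain 2-D coordinates, the transform includes a bias term, the deviation is the mean Euclidean distance, and the data are synthetic. The paper's actual transformation, distance measure, and data may differ.

```python
import numpy as np


def add_bias(embeddings):
    """Append a constant column so the linear fit includes a bias term."""
    ones = np.ones((embeddings.shape[0], 1))
    return np.hstack([embeddings, ones])


def fit_linear_map(embeddings, positions):
    """Least-squares map from embedding space to 2-D positions."""
    W, *_ = np.linalg.lstsq(add_bias(embeddings), positions, rcond=None)
    return W


def mean_deviation(embeddings, positions, W):
    """Mean Euclidean distance between predicted and real positions."""
    predicted = add_bias(embeddings) @ W
    return float(np.mean(np.linalg.norm(predicted - positions, axis=1)))


# Synthetic stand-ins for city embeddings and city coordinates.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(50, 5))
true_map = rng.normal(size=(5, 2))
toy_positions = toy_embeddings @ true_map + 0.01 * rng.normal(size=(50, 2))

W = fit_linear_map(toy_embeddings, toy_positions)
deviation = mean_deviation(toy_embeddings, toy_positions, W)
```

A lower deviation would indicate that the embedding space encodes more of the geographical layout. Fitting and measuring on the same cities, as done here for brevity, overstates the fit; a held-out split would give a fairer number.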

Cross-lingual Flames Detection in News Discussions
Josef Steinberger | Tomáš Brychcín | Tomáš Hercig | Peter Krejzl
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We introduce Flames Detector, an online system for measuring flames, i.e., strong negative feelings or emotions, insults or other verbal offences, in news commentaries across five languages. It is designed to assist journalists, public institutions or discussion moderators in detecting news topics which evoke wrangles. We propose a machine learning approach to flames detection and calculate an aggregated score for a set of comment threads. The demo application shows the most flaming topics of the current period in several language variants. The search functionality makes it possible to measure flames in any topic specified by a query. The evaluation shows that flame detection in discussions is a difficult task; however, the application can already reveal interesting information about actual news discussions.

Pyramid-based Summary Evaluation Using Abstract Meaning Representation
Josef Steinberger | Peter Krejzl | Tomáš Brychcín
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We propose a novel metric for evaluating summary content coverage. The evaluation framework follows the Pyramid approach to measure how many summarization content units, considered important by human annotators, are contained in an automatic summary. Our approach automates the evaluation process, so that no manual intervention is needed on the evaluated summary side. It compares abstract meaning representations of each content unit mention and each summary sentence. We found that the proposed metric complements the widely used ROUGE metrics well.

Unsupervised Dialogue Act Induction using Gaussian Mixtures
Tomáš Brychcín | Pavel Král
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

This paper introduces a new unsupervised approach for dialogue act induction. Given a sequence of dialogue utterances, the task is to assign them labels representing their function in the dialogue. Utterances are represented as real-valued vectors encoding their meaning. We model the dialogue as a hidden Markov model with emission probabilities estimated by Gaussian mixtures. We use Gibbs sampling for posterior inference. We present the results on the standard Switchboard-DAMSL corpus. Our algorithm achieves promising results compared with strong supervised baselines and outperforms other unsupervised algorithms.

2016

Latent Tree Language Model
Tomáš Brychcín
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

UWB at SemEval-2016 Task 5: Aspect Based Sentiment Analysis
Tomáš Hercig | Tomáš Brychcín | Lukáš Svoboda | Michal Konkol
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

UWB at SemEval-2016 Task 1: Semantic Textual Similarity using Lexical, Syntactic, and Semantic Information
Tomáš Brychcín | Lukáš Svoboda
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

UWB at SemEval-2016 Task 2: Interpretable Semantic Textual Similarity with Distributional Semantics for Chunks
Miloslav Konopík | Ondřej Pražák | David Steinberger | Tomáš Brychcín
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2014

UWB: Machine Learning Approach to Aspect-Based Sentiment Analysis
Tomáš Brychcín | Michal Konkol | Josef Steinberger
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Aspect-Level Sentiment Analysis in Czech
Josef Steinberger | Tomáš Brychcín | Michal Konkol
Proceedings of the 5th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

2013

Unsupervised Improving of Sentiment Analysis Using Global Target Context
Tomáš Brychcín | Ivan Habernal
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013