Lluís Màrquez

Also published as: L. Màrquez, Lluis Marquez, Lluis Màrquez, Lluis Márquez


2024

Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators
Matéo Mahaut | Laura Aina | Paula Czarnowska | Momchil Hardalov | Thomas Müller | Lluis Marquez
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large Language Models (LLMs) tend to be unreliable on fact-based answers. To address this problem, NLP researchers have proposed a range of techniques to estimate LLMs’ confidence over facts. However, due to the lack of a systematic comparison, it is not clear how the different methods compare to one another. To fill this gap, we present a rigorous survey and empirical comparison of estimators of factual confidence. We define an experimental framework allowing for fair comparison, covering both fact-verification and QA. Our experiments across a series of LLMs indicate that trained hidden-state probes provide the most reliable confidence estimates, albeit at the expense of requiring access to weights and supervision data. We also conduct a deeper assessment of the methods, in which we measure the consistency of model behavior under meaning-preserving variations in the input. We find that the factual confidence of LLMs is often unstable across semantically equivalent inputs, suggesting there is much room for improvement for the stability of models’ parametric knowledge.

2023

Diable: Efficient Dialogue State Tracking as Operations on Tables
Pietro Lesci | Yoshinari Fujinuma | Momchil Hardalov | Chao Shang | Yassine Benajiba | Lluis Marquez
Findings of the Association for Computational Linguistics: ACL 2023

Sequence-to-sequence state-of-the-art systems for dialogue state tracking (DST) use the full dialogue history as input, represent the current state as a list with all the slots, and generate the entire state from scratch at each dialogue turn. This approach is inefficient, especially when the number of slots is large and the conversation is long. We propose Diable, a new task formalisation that simplifies the design and implementation of efficient DST systems and allows one to easily plug and play large language models. We represent the dialogue state as a table and formalise DST as a table manipulation task. At each turn, the system updates the previous state by generating table operations based on the dialogue context. Extensive experimentation on the MultiWoz datasets demonstrates that Diable (i) outperforms strong efficient DST baselines, (ii) is 2.4x more time efficient than current state-of-the-art methods while retaining competitive Joint Goal Accuracy, and (iii) is robust to noisy data annotations due to the table operations approach.
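The idea of updating a tabular dialogue state through operations, rather than regenerating the whole state at every turn, can be illustrated with a minimal sketch. The operation names (`INSERT`, `DELETE`), the tuple format, and the function `apply_operations` are assumptions made for illustration and are not taken from the paper; in the actual system the operations would be generated by a language model, whereas here they are hard-coded.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical operation format (an assumption, not the paper's):
# ("INSERT", slot, value) adds or overwrites a row,
# ("DELETE", slot, None) removes a row.
Operation = Tuple[str, str, Optional[str]]


def apply_operations(state: Dict[str, str], operations: List[Operation]) -> Dict[str, str]:
    """Return a new state table after applying one turn's operations."""
    new_state = dict(state)  # copy so the previous state is left untouched
    for name, slot, value in operations:
        if name == "INSERT":
            if value is None:
                raise ValueError(f"INSERT on slot {slot!r} needs a value")
            new_state[slot] = value
        elif name == "DELETE":
            new_state.pop(slot, None)  # deleting a missing slot is a no-op
        else:
            raise ValueError(f"unknown operation: {name!r}")
    return new_state


# Hard-coded operations standing in for model output, one list per turn.
turn_operations: List[List[Operation]] = [
    [("INSERT", "hotel-area", "centre"), ("INSERT", "hotel-stars", "4")],
    [("INSERT", "hotel-area", "north")],   # the user changes their mind
    [("DELETE", "hotel-stars", None)],     # the user drops a constraint
]

state: Dict[str, str] = {}
for operations in turn_operations:
    state = apply_operations(state, operations)
```

Each turn only touches the rows that change, so the work per turn depends on the number of operations rather than on the total number of slots.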

2019

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Anna Korhonen | David Traum | Lluís Màrquez
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Book QA: Stories of Challenges and Opportunities
Stefanos Angelidis | Lea Frermann | Diego Marcheggiani | Roi Blanco | Lluís Màrquez
Proceedings of the 2nd Workshop on Machine Reading for Question Answering

We present a system for answering questions based on the full text of books (BookQA), which first selects book passages given a question at hand, and then uses a memory network to reason and predict an answer. To improve generalization, we pretrain our memory network using artificial questions generated from book sentences. We experiment with the recently published NarrativeQA corpus, on the subset of Who questions, which expect book characters as answers. We experimentally show that BERT-based retrieval and pretraining improve over baseline results significantly. At the same time, we confirm that NarrativeQA is a highly challenging data set, and that there is a need for novel research in order to achieve high-precision BookQA results. We analyze some of the bottlenecks of the current approach, and we argue that more research is needed on text representation, retrieval of relevant passages, and reasoning, including commonsense knowledge.

It Takes Nine to Smell a Rat: Neural Multi-Task Learning for Check-Worthiness Prediction
Slavena Vasileva | Pepa Atanasova | Lluís Màrquez | Alberto Barrón-Cedeño | Preslav Nakov
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

We propose a multi-task deep-learning approach for estimating the check-worthiness of claims in political debates. Given a political debate, such as the 2016 US Presidential and Vice-Presidential ones, the task is to predict which statements in the debate should be prioritized for fact-checking. While different fact-checking organizations would naturally make different choices when analyzing the same debate, we show that it pays to learn from multiple sources simultaneously (PolitiFact, FactCheck, ABC, CNN, NPR, NYT, Chicago Tribune, The Guardian, and Washington Post) in a multi-task learning setup, even when a particular source is chosen as a target to imitate. Our evaluation shows state-of-the-art results on a standard dataset for the task of check-worthiness prediction.

2018

Automatic Stance Detection Using End-to-End Memory Networks
Mitra Mohtarami | Ramy Baly | James Glass | Preslav Nakov | Lluís Màrquez | Alessandro Moschitti
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We present an effective end-to-end memory network model that jointly (i) predicts whether a given document can be considered as relevant evidence for a given claim, and (ii) extracts snippets of evidence that can be used to reason about the factuality of the target claim. Our model combines the advantages of convolutional and recurrent neural networks as part of a memory network. We further introduce a similarity matrix at the inference level of the memory network in order to extract snippets of evidence for input claims more accurately. Our experiments on a public benchmark dataset, FakeNewsChallenge, demonstrate the effectiveness of our approach.

Integrating Stance Detection and Fact Checking in a Unified Corpus
Ramy Baly | Mitra Mohtarami | James Glass | Lluís Màrquez | Alessandro Moschitti | Preslav Nakov
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

A reasonable approach for fact checking a claim involves retrieving potentially relevant documents from different sources (e.g., news websites and social media), determining the stance of each document with respect to the claim, and finally making a prediction about the claim’s factuality by aggregating the strength of the stances, while taking the reliability of the source into account. Moreover, a fact checking system should be able to explain its decision by providing relevant extracts (rationales) from the documents. Yet, this setup is not directly supported by existing datasets, which treat fact checking, document retrieval, source credibility, stance detection and rationale extraction as independent tasks. In this paper, we support the interdependencies between these tasks as annotations in the same corpus. We implement this setup on an Arabic fact checking corpus, the first of its kind.

ClaimRank: Detecting Check-Worthy Claims in Arabic and English
Israa Jaradat | Pepa Gencheva | Alberto Barrón-Cedeño | Lluís Màrquez | Preslav Nakov
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

We present ClaimRank, an online system for detecting check-worthy claims. While originally trained on political debates, the system can work for any kind of text, e.g., interviews or just regular news articles. Its aim is to facilitate manual fact-checking efforts by prioritizing the claims that fact-checkers should consider first. ClaimRank supports both Arabic and English. It is trained on actual annotations from nine reputable fact-checking organizations (PolitiFact, FactCheck, ABC, CNN, NPR, NYT, Chicago Tribune, The Guardian, and Washington Post), and thus it can mimic the claim selection strategy of any one of them, as well as that of the union of them all.

Joint Multitask Learning for Community Question Answering Using Task-Specific Embeddings
Shafiq Joty | Lluís Màrquez | Preslav Nakov
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We address jointly two important tasks for Question Answering in community forums: given a new question, (i) find related existing questions, and (ii) find relevant answers to this new question. We further use an auxiliary task to complement the previous two, i.e., (iii) find good answers with respect to the thread question in a question-comment thread. We use deep neural networks (DNNs) to learn meaningful task-specific embeddings, which we then incorporate into a conditional random field (CRF) model for the multitask setting, performing joint learning over a complex graph structure. While DNNs alone achieve competitive results when trained to produce the embeddings, the CRF, which makes use of the embeddings and the dependencies between the tasks, improves the results significantly and consistently across a variety of evaluation metrics, thus showing the complementarity of DNNs and structured learning.

2017

Cross-language Learning with Adversarial Neural Networks
Shafiq Joty | Preslav Nakov | Lluís Màrquez | Israa Jaradat
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

We address the problem of cross-language adaptation for question-question similarity reranking in community question answering, with the objective to port a system trained on one input language to another input language given labeled training data for the first language and only unlabeled data for the second language. In particular, we propose to use adversarial training of neural networks to learn high-level features that are discriminative for the main learning task, and at the same time are invariant across the input languages. The evaluation results show sizable improvements for our cross-language adversarial neural network (CLANN) model over a strong non-adversarial system.

Discourse Structure in Machine Translation Evaluation
Shafiq Joty | Francisco Guzmán | Lluís Màrquez | Preslav Nakov
Computational Linguistics, Volume 43, Issue 4 - December 2017

In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation. We first design discourse-aware similarity measures, which use all-subtree kernels to compare discourse parse trees in accordance with the Rhetorical Structure Theory (RST). Then, we show that a simple linear combination with these measures can help improve various existing machine translation evaluation metrics regarding correlation with human judgments both at the segment level and at the system level. This suggests that discourse information is complementary to the information used by many of the existing evaluation metrics, and thus it could be taken into account when developing richer evaluation metrics, such as the WMT-14 winning combined metric DiscoTKparty. We also provide a detailed analysis of the relevance of various discourse elements and relations from the RST parse trees for machine translation evaluation. In particular, we show that (i) all aspects of the RST tree are relevant, (ii) nuclearity is more useful than relation type, and (iii) the similarity of the translation RST tree to the reference RST tree is positively correlated with translation quality.

Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)
Nancy Ide | Aurélie Herbelot | Lluís Màrquez
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

SemEval-2017 Task 3: Community Question Answering
Preslav Nakov | Doris Hoogeveen | Lluís Màrquez | Alessandro Moschitti | Hamdy Mubarak | Timothy Baldwin | Karin Verspoor
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

We describe SemEval–2017 Task 3 on Community Question Answering. This year, we reran the four subtasks from SemEval-2016: (A) Question–Comment Similarity, (B) Question–Question Similarity, (C) Question–External Comment Similarity, and (D) Rerank the correct answers for a new question in Arabic, providing all the data from 2015 and 2016 for training, and fresh data for testing. Additionally, we added a new subtask E in order to enable experimentation with Multi-domain Question Duplicate Detection in a larger-scale scenario, using StackExchange subforums. A total of 23 teams participated in the task, and submitted a total of 85 runs (36 primary and 49 contrastive) for subtasks A–D. Unfortunately, no teams participated in subtask E. A variety of approaches and features were used by the participating systems to address the different subtasks. The best systems achieved an official score (MAP) of 88.43, 47.22, 15.46, and 61.16 in subtasks A, B, C, and D, respectively. These scores are better than the baselines, especially for subtasks A–C.
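The official score mentioned above is Mean Average Precision (MAP). As a rough illustration of how such a score is computed, here is a minimal sketch of the textbook definition. The function names are hypothetical, and the official task scorer may differ in details not stated here (for example, rank cutoffs or how queries without relevant items are handled), so this should not be read as the task's actual evaluation script.

```python
from typing import List


def average_precision(relevance: List[bool]) -> float:
    """Average precision for one ranked list.

    relevance[i] is True if the item at rank i + 1 is relevant.
    Assumption: lists with no relevant item score 0.0.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: List[List[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two toy queries: relevant items at ranks 1 and 3, then at rank 2.
example_rankings = [[True, False, True], [False, True]]
example_map = mean_average_precision(example_rankings)
```

For the toy data, the first query scores (1/1 + 2/3) / 2 and the second scores 1/2, so the mean is 2/3. The scores in the abstract are reported on a 0 to 100 scale, which corresponds to multiplying this value by 100.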

Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks
Yonatan Belinkov | Lluís Màrquez | Hassan Sajjad | Nadir Durrani | Fahim Dalvi | James Glass
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

While neural machine translation (NMT) models provide improved translation quality in an elegant framework, it is less clear what they learn about language. Recent work has started evaluating the quality of vector representations learned by NMT models on morphological and syntactic tasks. In this paper, we investigate the representations learned at different layers of NMT encoders. We train NMT systems on parallel data and use the models to extract features for training a classifier on two tasks: part-of-speech and semantic tagging. We then measure the performance of the classifier as a proxy to the quality of the original NMT model for the given task. Our quantitative analysis yields interesting insights regarding representation learning in NMT models. For instance, we find that higher layers are better at learning semantics while lower layers tend to be better for part-of-speech tagging. We also observe little effect of the target language on source-side representations, especially in higher quality models.

A Context-Aware Approach for Detecting Worth-Checking Claims in Political Debates
Pepa Gencheva | Preslav Nakov | Lluís Màrquez | Alberto Barrón-Cedeño | Ivan Koychev
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

In the context of investigative journalism, we address the problem of automatically identifying which claims in a given document are most worthy and should be prioritized for fact-checking. Despite its importance, this is a relatively understudied problem. Thus, we create a new corpus of political debates, containing statements that have been fact-checked by nine reputable sources, and we train machine learning models to predict which claims should be prioritized for fact-checking, i.e., we model the problem as a ranking task. Unlike previous work, which has looked primarily at sentences in isolation, in this paper we focus on a rich input representation modeling the context: relationship between the target statement and the larger context of the debate, interaction between the opponents, and reaction by the moderator and by the public. Our experiments show state-of-the-art results, outperforming a strong rivaling system by a margin, while also confirming the importance of the contextual information.

Fully Automated Fact Checking Using External Sources
Georgi Karadzhov | Preslav Nakov | Lluís Màrquez | Alberto Barrón-Cedeño | Ivan Koychev
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

Given the constantly growing proliferation of false claims online in recent years, there has also been a growing research interest in automatically distinguishing false rumors from factually true claims. Here, we propose a general-purpose framework for fully-automatic fact checking using external sources, tapping the potential of the entire Web as a knowledge source to confirm or reject a claim. Our framework uses a deep neural network with LSTM text encoding to combine semantic kernels with task-specific embeddings that encode a claim together with pieces of potentially relevant text fragments from the Web, taking the source reliability into account. The evaluation results show good performance on two different tasks and datasets: (i) rumor detection and (ii) fact checking of the answers to a question in community question answering forums.

Do Not Trust the Trolls: Predicting Credibility in Community Question Answering Forums
Preslav Nakov | Tsvetomila Mihaylova | Lluís Màrquez | Yashkumar Shiroya | Ivan Koychev
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

We address information credibility in community forums, in a setting in which the credibility of an answer posted in a question thread by a particular user has to be predicted. First, we motivate the problem and we create a publicly available annotated English corpus by crowdsourcing. Second, we propose a large set of features to predict the credibility of the answers. The features model the user, the answer, the question, the thread as a whole, and the interaction between them. Our experiments with ranking SVMs show that the credibility labels can be predicted with high performance according to several standard IR ranking metrics, thus supporting the potential usage of this layer of credibility information in practical applications. The features modeling the profile of the user (in particular trollness) turn out to be most important, but embedding features modeling the answer and the similarity between the question and the answer are also very relevant. Overall, half of the gap between the baseline performance and the perfect classifier can be covered using the proposed features.

2016

SemEval-2016 Task 3: Community Question Answering
Preslav Nakov | Lluís Màrquez | Alessandro Moschitti | Walid Magdy | Hamdy Mubarak | Abed Alhakim Freihat | Jim Glass | Bilal Randeree
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

MTE-NN at SemEval-2016 Task 3: Can Machine Translation Evaluation Help Community Question Answering?
Francisco Guzmán | Preslav Nakov | Lluís Màrquez
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

It Takes Three to Tango: Triangulation Approach to Answer Ranking in Community Question Answering
Preslav Nakov | Lluís Màrquez | Francisco Guzmán
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Joint Learning with Global Inference for Comment Classification in Community Question Answering
Shafiq Joty | Lluís Màrquez | Preslav Nakov
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Semi-supervised Question Retrieval with Gated Convolutions
Tao Lei | Hrishikesh Joshi | Regina Barzilay | Tommi Jaakkola | Kateryna Tymoshenko | Alessandro Moschitti | Lluís Màrquez
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

An Interactive System for Exploring Community Question Answering Forums
Enamul Hoque | Shafiq Joty | Lluís Màrquez | Alberto Barrón-Cedeño | Giovanni Da San Martino | Alessandro Moschitti | Preslav Nakov | Salvatore Romeo | Giuseppe Carenini
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

We present an interactive system to provide effective and efficient search capabilities in Community Question Answering (cQA) forums. The system integrates state-of-the-art technology for answer search with a Web-based user interface specifically tailored to support the cQA forum readers. The answer search module automatically finds relevant answers for a new question by exploring related questions and the comments within their threads. The graphical user interface presents the search results and supports the exploration of related information. The system is running live at http://www.qatarliving.com/betasearch/.

Machine Translation Evaluation Meets Community Question Answering
Francisco Guzmán | Lluís Màrquez | Preslav Nakov
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

QCRI: Answer Selection for Community Question Answering - Experiments for Arabic and English
Massimo Nicosia | Simone Filice | Alberto Barrón-Cedeño | Iman Saleh | Hamdy Mubarak | Wei Gao | Preslav Nakov | Giovanni Da San Martino | Alessandro Moschitti | Kareem Darwish | Lluís Màrquez | Shafiq Joty | Walid Magdy
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

SemEval-2015 Task 3: Answer Selection in Community Question Answering
Preslav Nakov | Lluís Màrquez | Walid Magdy | Alessandro Moschitti | Jim Glass | Bilal Randeree
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

Document-Level Machine Translation with Word Vector Models
Eva Martinez Garcia | Cristina Espana-Bonet | Lluis Marquez
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

High-Order Low-Rank Tensors for Semantic Role Labeling
Tao Lei | Yuan Zhang | Lluís Màrquez | Alessandro Moschitti | Regina Barzilay
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
Lluís Màrquez | Chris Callison-Burch | Jian Su
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Global Thread-level Inference for Comment Classification in Community Question Answering
Shafiq Joty | Alberto Barrón-Cedeño | Giovanni Da San Martino | Simone Filice | Lluís Màrquez | Alessandro Moschitti | Preslav Nakov
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

A Factory of Comparable Corpora from Wikipedia
Alberto Barrón-Cedeño | Cristina España-Bonet | Josu Boldoba | Lluís Màrquez
Proceedings of the Eighth Workshop on Building and Using Comparable Corpora

Document-Level Machine Translation with Word Vector Models
Eva Martínez Garcia | Cristina España-Bonet | Lluís Màrquez
Proceedings of the 18th Annual Conference of the European Association for Machine Translation

Pairwise Neural Machine Translation Evaluation
Francisco Guzmán | Shafiq Joty | Lluís Màrquez | Preslav Nakov
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Thread-Level Information for Comment Classification in Community Question Answering
Alberto Barrón-Cedeño | Simone Filice | Giovanni Da San Martino | Shafiq Joty | Lluís Màrquez | Preslav Nakov | Alessandro Moschitti
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

IPA and STOUT: Leveraging Linguistic and Source-based Features for Machine Translation Evaluation
Meritxell Gonzàlez | Alberto Barrón-Cedeño | Lluís Màrquez
Proceedings of the Ninth Workshop on Statistical Machine Translation

DiscoTK: Using Discourse Structure for Machine Translation Evaluation
Shafiq Joty | Francisco Guzmán | Lluís Màrquez | Preslav Nakov
Proceedings of the Ninth Workshop on Statistical Machine Translation

Word’s Vector Representations meet Machine Translation
Eva Martínez Garcia | Jörg Tiedemann | Cristina España-Bonet | Lluís Màrquez
Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation

Learning to Differentiate Better from Worse Translations
Francisco Guzmán | Shafiq Joty | Lluís Màrquez | Alessandro Moschitti | Preslav Nakov | Massimo Nicosia
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

A Shortest-path Method for Arc-factored Semantic Role Labeling
Xavier Lluís | Xavier Carreras | Lluís Màrquez
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Semantic Kernels for Semantic Parsing
Iman Saleh | Alessandro Moschitti | Preslav Nakov | Lluís Màrquez | Shafiq Joty
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Using Discourse Structure Improves Machine Translation Evaluation
Francisco Guzmán | Shafiq Joty | Lluís Màrquez | Preslav Nakov
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A Study of using Syntactic and Semantic Structures for Concept Segmentation and Labeling
Iman Saleh | Scott Cyphers | Jim Glass | Shafiq Joty | Lluís Màrquez | Alessandro Moschitti | Preslav Nakov
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

tSEARCH: Flexible and Fast Search over Automatic Translations for Improved Quality/Error Analysis
Meritxell Gonzàlez | Laura Mascarell | Lluís Màrquez
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Selectional Preferences for Semantic Role Classification
Beñat Zapirain | Eneko Agirre | Lluís Màrquez | Mihai Surdeanu
Computational Linguistics, Volume 39, Issue 3 - September 2013

The TALP-UPC Phrase-Based Translation Systems for WMT13: System Combination with Morphology Generation, Domain Adaptation and Corpus Filtering
Lluís Formiga | Marta R. Costa-jussà | José B. Mariño | José A. R. Fonollosa | Alberto Barrón-Cedeño | Lluís Màrquez
Proceedings of the Eighth Workshop on Statistical Machine Translation

The TALP-UPC Approach to System Selection: Asiya Features and Pairwise Classification Using Random Forests
Lluís Formiga | Meritxell Gonzàlez | Alberto Barrón-Cedeño | José A. R. Fonollosa | Lluís Màrquez
Proceedings of the Eighth Workshop on Statistical Machine Translation

Joint Arc-factored Parsing of Syntactic and Semantic Dependencies
Xavier Lluís | Xavier Carreras | Lluís Màrquez
Transactions of the Association for Computational Linguistics, Volume 1

In this paper we introduce a joint arc-factored model for syntactic and semantic dependency parsing. The semantic role labeler predicts the full syntactic paths that connect predicates with their arguments. This process is framed as a linear assignment task, which allows us to control some well-formedness constraints. For the syntactic part, we define a standard arc-factored dependency model that predicts the full syntactic tree. Finally, we employ dual decomposition techniques to produce consistent syntactic and predicate-argument structures while searching over a large space of syntactic configurations. In experiments on the CoNLL-2009 English benchmark we observe very competitive results.

UPC-CORE: What Can Machine Translation Evaluation Metrics and Wikipedia Do for Estimating Semantic Textual Similarity?
Alberto Barrón-Cedeño | Lluís Màrquez | Maria Fuentes | Horacio Rodríguez | Jordi Turmo
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

Real-life Translation Quality Estimation for MT System Selection
Lluis Formiga | Lluis Marquez | Jaume Pujantell
Proceedings of Machine Translation Summit XIV: Papers

MT Techniques in a Retrieval System of Semantically Enriched Patents
Meritxell Gonzalez | Maria Mateva | Ramona Enache | Cristina Espana-Bonet | Lluis Marquez | Borislav Popov | Aarne Ranta
Proceedings of Machine Translation Summit XIV: Posters

FAUST: Feedback Analysis for User Adaptive Statistical Translation
William Byrne | Lluis Marquez
Proceedings of Machine Translation Summit XIV: European projects

2012

A Graphical Interface for MT Evaluation and Error Analysis
Meritxell Gonzàlez | Jesús Giménez | Lluís Màrquez
Proceedings of the ACL 2012 System Demonstrations

The UPC Submission to the WMT 2012 Shared Task on Quality Estimation
Daniele Pighin | Meritxell González | Lluís Màrquez
Proceedings of the Seventh Workshop on Statistical Machine Translation

An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output
Daniele Pighin | Lluís Màrquez | Jonathan May
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present an annotated resource consisting of open-domain translation requests, automatic translations and user-provided corrections collected from casual users of the translation portal http://reverso.net. The layers of annotation provide: 1) quality assessments for 830 correction suggestions for translations into English, at the segment level, and 2) 814 usefulness assessments for English-Spanish and English-French translation suggestions, a suggestion being useful if it contains at least local clues that can be used to improve translation quality. We also discuss the results of our preliminary experiments concerning 1) the development of an automatic filter to separate useful from non-useful feedback, and 2) the incorporation of bilingual phrases extracted from the suggestions into the machine translation pipeline. The annotated data, available for download from ftp://mi.eng.cam.ac.uk/data/faust/LW-UPC-Oct11-FAUST-feedback-annotation.tgz, is released under a Creative Commons license. To the best of our knowledge, this is the first resource of this kind that has ever been made publicly available.

The FAUST Corpus of Adequacy Assessments for Real-World Machine Translation Output
Daniele Pighin | Lluís Màrquez | Lluís Formiga
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a corpus consisting of 11,292 real-world English to Spanish automatic translations annotated with relative (ranking) and absolute (adequate/non-adequate) quality assessments. The translation requests, collected through the popular translation portal http://reverso.net, provide a highly varied sample of real-world machine translation (MT) usage, from complete sentences to units of one or two words, from well-formed to hardly intelligible texts, from technical documents to colloquial and slang snippets. In this paper, we present 1) a preliminary annotation experiment that we carried out to select the most appropriate quality criterion to be used for these data, 2) a graph-based methodology inspired by Interactive Genetic Algorithms to reduce the annotation effort, and 3) the outcomes of the full-scale annotation experiment, which result in a valuable and original resource for the analysis and characterization of MT-output quality.

Deep evaluation of hybrid architectures: use of different metrics in MERT weight optimization
Cristina España-Bonet | Gorka Labaka | Arantza Díaz de Ilarraza | Lluís Màrquez | Kepa Sarasola
Proceedings of the Third International Workshop on Free/Open-Source Rule-Based Machine Translation

pdf bib
A Graph-based Strategy to Streamline Translation Quality Assessments
Daniele Pighin | Lluís Formiga | Lluís Màrquez
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers

We present a detailed analysis of a graph-based annotation strategy that we employed to annotate a corpus of 11,292 real-world English to Spanish automatic translations with relative (ranking) and absolute (adequate/non-adequate) quality assessments. The proposed approach, inspired by previous work in Interactive Evolutionary Computation and Interactive Genetic Algorithms, results in a simpler and faster annotation process. We empirically compare the method against a traditional, explicit ranking approach, and show that the graph-based strategy: 1) is considerably faster, and 2) produces consistently more reliable annotations.

pdf bib
Context-Aware Machine Translation for Software Localization
Victor Muntés-Mulero | Patricia Paladini Adell | Cristina España-Bonet | Lluís Màrquez
Proceedings of the 16th Annual Conference of the European Association for Machine Translation

pdf bib
A Hybrid System for Patent Translation
Ramona Enache | Cristina España-Bonet | Aarne Ranta | Lluís Màrquez
Proceedings of the 16th Annual Conference of the European Association for Machine Translation

2011

pdf bib
Automatic Projection of Semantic Structures: an Application to Pairwise Translation Ranking
Daniele Pighin | Lluís Màrquez
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf bib
Proceedings of TextGraphs-6: Graph-based Methods for Natural Language Processing
Irina Matveeva | Alessandro Moschitti | Lluís Màrquez | Fabio Massimo Zanzotto
Proceedings of TextGraphs-6: Graph-based Methods for Natural Language Processing

pdf bib
Hybrid Machine Translation Guided by a Rule–Based System
Cristina España-Bonet | Gorka Labaka | Arantza Díaz de Ilarraza | Lluís Màrquez
Proceedings of Machine Translation Summit XIII: Papers

pdf bib
Patent translation within the MOLTO project
Cristina España-Bonet | Ramona Enache | Adam Slaski | Aarne Ranta | Lluís Màrquez | Meritxell Gonzàlez
Proceedings of the 4th Workshop on Patent Translation

2010

pdf bib
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Lluís Màrquez | Haifeng Wang
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

pdf bib
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Hang Li | Lluís Màrquez
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

pdf bib
Improving Semantic Role Classification with Selectional Preferences
Beñat Zapirain | Eneko Agirre | Lluís Màrquez | Mihai Surdeanu
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
SemEval-2010 Task 1: Coreference Resolution in Multiple Languages
Marta Recasens | Lluís Màrquez | Emili Sapena | M. Antònia Martí | Mariona Taulé | Véronique Hoste | Massimo Poesio | Yannick Versley
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib
Document-Level Automatic MT Evaluation based on Discourse Representations
Elisabet Comelles | Jesús Giménez | Lluís Màrquez | Irene Castellón | Victoria Arranz
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

pdf bib
Robust Estimation of Feature Weights in Statistical Machine Translation
Cristina España-Bonet | Lluís Màrquez
Proceedings of the 14th Annual Conference of the European Association for Machine Translation

2009

pdf bib
Proceedings of the 13th Annual Conference of the European Association for Machine Translation
Lluís Màrquez | Harold Somers
Proceedings of the 13th Annual Conference of the European Association for Machine Translation

pdf bib
On the Robustness of Syntactic and Semantic Features for Automatic MT Evaluation
Jesús Giménez | Lluís Màrquez
Proceedings of the Fourth Workshop on Statistical Machine Translation

pdf bib
The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages
Jan Hajič | Massimiliano Ciaramita | Richard Johansson | Daisuke Kawahara | Maria Antònia Martí | Lluís Màrquez | Adam Meyers | Joakim Nivre | Sebastian Padó | Jan Štěpánek | Pavel Straňák | Mihai Surdeanu | Nianwen Xue | Yi Zhang
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task

pdf bib
A Second-Order Joint Eisner Model for Syntactic and Semantic Dependency Parsing
Xavier Lluís | Stefan Bott | Lluís Màrquez
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task

pdf bib
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)
Eneko Agirre | Lluís Màrquez | Richard Wicentowski
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

pdf bib
SemEval-2010 Task 1: Coreference Resolution in Multiple Languages
Marta Recasens | Toni Martí | Mariona Taulé | Lluís Màrquez | Emili Sapena
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

pdf bib
Generalizing over Lexical Features: Selectional Preferences for Semantic Role Classification
Beñat Zapirain | Eneko Agirre | Lluís Màrquez
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

pdf bib
Semantic Role Labeling: Past, Present and Future
Lluís Màrquez
Tutorial Abstracts of ACL-IJCNLP 2009

2008

pdf bib
A Smorgasbord of Features for Automatic MT Evaluation
Jesús Giménez | Lluís Màrquez
Proceedings of the Third Workshop on Statistical Machine Translation

pdf bib
The CoNLL 2008 Shared Task on Joint Parsing of Syntactic and Semantic Dependencies
Mihai Surdeanu | Richard Johansson | Adam Meyers | Lluís Màrquez | Joakim Nivre
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

pdf bib
A Joint Model for Parsing Syntactic and Semantic Dependencies
Xavier Lluís | Lluís Màrquez
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

pdf bib
Robustness and Generalization of Role Sets: PropBank vs. VerbNet
Beñat Zapirain | Eneko Agirre | Lluís Màrquez
Proceedings of ACL-08: HLT

pdf bib
Heterogeneous Automatic MT Evaluation Through Non-Parametric Metric Combinations
Jesús Giménez | Lluís Màrquez
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib
Towards Heterogeneous Automatic MT Error Analysis
Jesús Giménez | Lluís Màrquez
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This work studies the viability of performing heterogeneous automatic MT error analyses. Error analysis is, undoubtedly, one of the most crucial stages in the development cycle of an MT system. However, often not enough attention is paid to this process. The reason is that performing an accurate error analysis requires intensive human labor. In order to speed up the error analysis process, we suggest partially automating it by having automatic evaluation metrics play a more active role. For that purpose, we have compiled a large and heterogeneous set of features at different linguistic levels and at different levels of granularity. Through a practical case study, we show how these features provide an effective means of elaborating interpretable and detailed automatic reports of translation quality.

pdf bib
Special Issue Introduction: Semantic Role Labeling: An Introduction to the Special Issue
Lluís Màrquez | Xavier Carreras | Kenneth C. Litkowski | Suzanne Stevenson
Computational Linguistics, Volume 34, Number 2, June 2008 - Special Issue on Semantic Role Labeling

2007

pdf bib
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)
Eneko Agirre | Lluís Màrquez | Richard Wicentowski
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf bib
SemEval-2007 Task 09: Multilevel Semantic Annotation of Catalan and Spanish
Lluís Màrquez | Luis Villarejo | M. A. Martí | Mariona Taulé
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf bib
UBC-UPC: Sequential SRL Using Selectional Preferences. An approach with Maximum Entropy Markov Models
Beñat Zapirain | Eneko Agirre | Lluís Màrquez
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf bib
UPC: Experiments with Joint Learning within SemEval Task 9
Lluís Màrquez | Lluís Padró | Mihai Surdeanu | Luis Villarejo
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf bib
Context-aware Discriminative Phrase Selection for Statistical Machine Translation
Jesús Giménez | Lluís Màrquez
Proceedings of the Second Workshop on Statistical Machine Translation

pdf bib
Linguistic Features for Automatic Evaluation of Heterogenous MT Systems
Jesús Giménez | Lluís Màrquez
Proceedings of the Second Workshop on Statistical Machine Translation

2006

pdf bib
Generation of Language Resources for the Development of Speech Technologies in Catalan
A. Moreno | Albert Febrer | Lluis Márquez
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes a joint initiative of the Catalan and Spanish Governments to produce Language Resources for the Catalan language. A methodology similar to the Basic Language Resource Kit (BLARK) concept was applied to determine the priorities for the production of the Language Resources. The paper presents the LRs and tools currently available for the Catalan language, for both Language and Speech technologies. The production of large databases for Automatic Speech Recognition purposes has already started. All the resources generated in the project follow EU standards, will be validated by an external centre, and will be freely and publicly available through ELRA.

pdf bib
MT Evaluation: Human-Like vs. Human Acceptable
Enrique Amigó | Jesús Giménez | Julio Gonzalo | Lluís Màrquez
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
Low-Cost Enrichment of Spanish WordNet with Automatically Translated Glosses: Combining General and Specialized Models
Jesús Giménez | Lluís Màrquez
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

pdf bib
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)
Lluís Màrquez | Dan Klein
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

pdf bib
Projective Dependency Parsing with Perceptron
Xavier Carreras | Mihai Surdeanu | Lluís Màrquez
Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X)

pdf bib
The LDV-COMBO system for SMT
Jesús Giménez | Lluís Màrquez
Proceedings on the Workshop on Statistical Machine Translation

2005

pdf bib
Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling
Xavier Carreras | Lluís Màrquez
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

pdf bib
Semantic Role Labeling as Sequential Tagging
Lluís Màrquez | Pere Comas | Jesús Giménez | Neus Català
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

pdf bib
Combining Linguistic Data Views for Phrase-based SMT
Jesús Giménez | Lluís Màrquez
Proceedings of the ACL Workshop on Building and Using Parallel Texts

pdf bib
A Robust Combination Strategy for Semantic Role Labeling
Lluís Màrquez | Mihai Surdeanu | Pere Comas | Jordi Turmo
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

pdf bib
Senseval-3: The Spanish lexical sample task
Lluis Màrquez | Mariona Taulé | Antonia Martí | Núria Artigas | Mar García | Francis Real | Dani Ferrés
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

pdf bib
TALP system for the English lexical sample task
Gerard Escudero | Lluis Màrquez | German Rigau
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

pdf bib
Senseval-3: The Catalan lexical sample task
Lluis Màrquez | Mariona Taulé | Antonia Martí | Mar García | Francis Real | Dani Ferrés
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

pdf bib
The “Meaning” system on the English all-words task
Luís Villarejo | Lluis Màrquez | Eneko Agirre | David Martínez | Bernardo Magnini | Carlo Strapparava | Diana McCarthy | Andrés Montoyo | Armando Suárez
Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text

pdf bib
Introduction to the CoNLL-2004 Shared Task: Semantic Role Labeling
Xavier Carreras | Lluís Màrquez
Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004

pdf bib
Hierarchical Recognition of Propositional Arguments with Perceptrons
Xavier Carreras | Lluís Màrquez | Grzegorz Chrupała
Proceedings of the Eighth Conference on Computational Natural Language Learning (CoNLL-2004) at HLT-NAACL 2004

pdf bib
MiniCors and Cast3LB: Two Semantically Tagged Spanish Corpora
M. Taulé | M. Civit | N. Artigas | M. García | L. Màrquez | M.A. Martí | B. Navarro
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

pdf bib
SVMTool: A general POS Tagger Generator Based on Support Vector Machines
Jesús Giménez | Lluís Màrquez
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

pdf bib
Named Entity Recognition For Catalan Using Only Spanish Resources and Unlabelled Data
Xavier Carreras | Lluís Màrquez | Lluís Padró
10th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
A Simple Named Entity Extractor using AdaBoost
Xavier Carreras | Lluís Màrquez | Lluís Padró
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
Learning a Perceptron-Based Named Entity Chunker via Online Recognition Feedback
Xavier Carreras | Lluís Màrquez | Lluís Padró
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
Low-cost Named Entity Classification for Catalan: Exploiting Multilingual Resources and Unlabeled Data
Lluís Màrquez | Adrià de Gispert | Xavier Carreras | Lluís Padró
Proceedings of the ACL 2003 Workshop on Multilingual and Mixed-language Named Entity Recognition

pdf bib
Automatic Lexical Acquisition from Raw Corpora: An Application to Russian
Antoni Oliver | Irene Castellón | Lluís Màrquez
Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages

2002

pdf bib
Named Entity Extraction using AdaBoost
Xavier Carreras | Lluís Màrquez | Lluís Padró
COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002)

pdf bib
Syntactic Features for High Precision Word Sense Disambiguation
David Martínez | Eneko Agirre | Lluís Màrquez
COLING 2002: The 19th International Conference on Computational Linguistics

2001

pdf bib
Boosting trees for clause splitting
Xavier Carreras | Lluís Màrquez
Proceedings of the ACL 2001 Workshop on Computational Natural Language Learning (ConLL)

pdf bib
Using LazyBoosting for Word Sense Disambiguation
Gerard Escudero | Lluís Màrquez | German Rigau
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

2000

pdf bib
A Comparison between Supervised Learning Algorithms for Word Sense Disambiguation
Gerard Escudero | Lluís Màrquez | German Rigau
Fourth Conference on Computational Natural Language Learning and the Second Learning Language in Logic Workshop

pdf bib
An Empirical Study of the Domain Dependence of Supervised Word Disambiguation Systems
Gerard Escudero | Lluis Marquez | German Rigau
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1999

pdf bib
Improving POS Tagging Using Machine-Learning Techniques
Lluis Marquez | Horacio Rodriguez | Josep Carmona | Josep Montolio
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1998

pdf bib
On the Evaluation and Comparison of Taggers: the Effect of Noise in Testing Corpora
Lluis Padro | Lluis Marquez
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf bib
On the Evaluation and Comparison of Taggers: the Effect of Noise in Testing Corpora.
Lluis Padro | Lluis Marquez
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

1997

pdf bib
A Flexible POS Tagger Using an Automatically Acquired Language Model
Lluis Marquez | Lluis Padro
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics
