Silviu Cucerzan

Also published as: Silviu-Petru Cucerzan


2024

Knowledge-Centric Templatic Views of Documents
Isabel Alyssa Cachola | Silviu Cucerzan | Allen Herring | Vuksan Mijovic | Erik Oveson | Sujay Kumar Jauhar
Findings of the Association for Computational Linguistics: EMNLP 2024

Authors seeking to communicate with broader audiences often share their ideas in various document formats, such as slide decks, newsletters, reports, and posters. Prior work on document generation has generally treated the creation of each format as a separate task, leading to fragmented learning processes, redundancy in models and methods, and disjointed evaluation. We consider each of these documents as templatic views of the same underlying knowledge/content, and we aim to unify the generation and evaluation of these templatic views. We begin by showing that current LLMs are capable of generating various document formats with little to no supervision. Further, a simple augmentation involving a structured intermediate representation can improve performance, especially for smaller models. We then introduce a novel unified evaluation framework that can be adapted to measuring the quality of document generators for heterogeneous downstream applications. This evaluation is adaptable to a range of user-defined criteria and application scenarios, obviating the need for task-specific evaluation metrics. Finally, we conduct a human evaluation, which shows that people prefer 82% of the documents generated with our method, and that human judgments correlate more highly with our unified evaluation framework than with prior metrics in the literature.

2018

Multi-lingual Entity Discovery and Linking
Avi Sil | Heng Ji | Dan Roth | Silviu-Petru Cucerzan
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

The primary goals of this tutorial are to review the framework of cross-lingual EL and motivate it as a broad paradigm for the Information Extraction task. We will start by discussing the traditional EL techniques and metrics and address questions about their adequacy across domains and languages. We will then present more recent approaches such as Neural EL, discuss the basic building blocks of a state-of-the-art neural EL system, and analyze some of the current results on English EL. We will then proceed to Cross-lingual EL and discuss methods that work across languages. In particular, we will discuss and compare multiple methods that make use of multi-lingual word embeddings. We will also present EL methods that work for both name tagging and linking in very low-resource languages. Finally, we will discuss the uses of cross-lingual EL in a variety of applications like search engines and commercial product-selling applications. Unlike the 2014 EL tutorial, we will also focus on Entity Discovery, which is an essential component of EL.

2014

Towards Temporal Scoping of Relational Facts based on Wikipedia Data
Avirup Sil | Silviu-Petru Cucerzan
Proceedings of the Eighteenth Conference on Computational Natural Language Learning

2008

Augmenting Wikipedia with Named Entity Tags
Wisam Dakka | Silviu Cucerzan
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

2007

Large-Scale Named Entity Disambiguation Based on Wikipedia Data
Silviu Cucerzan
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2004

Spelling Correction as an Iterative Process that Exploits the Collective Knowledge of Web Users
Silviu Cucerzan | Eric Brill
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing

2003

Minimally Supervised Induction of Grammatical Gender
Silviu Cucerzan | David Yarowsky
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

2002

Augmented Mixture Models for Lexical Disambiguation
Silviu Cucerzan | David Yarowsky
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day
Silviu Cucerzan | David Yarowsky
COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002)

Language Independent NER using a Unified Model of Internal and Contextual Evidence
Silviu Cucerzan | David Yarowsky
COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002)

2001

The Johns Hopkins SENSEVAL-2 System Descriptions
David Yarowsky | Silviu Cucerzan | Radu Florian | Charles Schafer | Richard Wicentowski
Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems

2000

Language Independent, Minimally Supervised Induction of Lexical Probabilities
Silviu Cucerzan | David Yarowsky
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

1999

Language Independent Named Entity Recognition Combining Morphological and Contextual Evidence
Silviu Cucerzan | David Yarowsky
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora