Srinivas Bangalore - ACL Anthology

Srinivas Bangalore

Also published as: Srinivas, B. Srinivas

2023

1-step Speech Understanding and Transcription Using CTC Loss
Karan Singla | Shahab Jalalv | Yeon-Jun Kim | Andrej Ljolje | Antonio Moreno Daniel | Srinivas Bangalore | Benjamin Stern
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

Recent studies have made some progress in refining end-to-end (E2E) speech recognition encoders by applying Connectionist Temporal Classification (CTC) loss to enhance named entity recognition within transcriptions. However, these methods have been constrained by their exclusive use of the ASCII character set, allowing only a limited array of semantic labels. We propose 1SPU, a 1-step Speech Processing Unit which can recognize speech events (e.g., speaker change) or an NL event (Intent, Emotion) while also transcribing vocal content. It extends the E2E automatic speech recognition (ASR) system’s vocabulary by adding a set of unused placeholder symbols, conceptually akin to the <pad> tokens used in sequence modeling. These placeholders are then assigned to represent semantic events (in the form of tags) and are integrated into the transcription process as distinct tokens. We demonstrate notable improvements on the SLUE benchmark and yield results that are on par with those for the SLURP dataset. Additionally, we provide a visual analysis of the system’s proficiency in accurately pinpointing meaningful tokens over time, illustrating the enhancement in transcription quality through the utilization of supplementary semantic tags.

E2E Spoken Entity Extraction for Virtual Agents
Karan Singla | Yeon-Jun Kim | Srinivas Bangalore
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track

In human-computer conversations, extracting entities such as names, street addresses and email addresses from speech is a challenging task. In this paper, we study the impact of fine-tuning pre-trained speech encoders on extracting spoken entities in human-readable form directly from speech without the need for text transcription. We illustrate that such a direct approach optimizes the encoder to transcribe only the entity relevant portions of speech ignoring the superfluous portions such as carrier phrases, or spell name entities. In the context of dialog from an enterprise virtual agent, we demonstrate that the 1-step approach outperforms the typical 2-step approach which first generates lexical transcriptions followed by text-based entity extraction for identifying spoken entities.

Combining Pre trained Speech and Text Encoders for Continuous Spoken Language Processing
Karan Singla | Mahnoosh Mehrabani | Daniel Pressel | Ryan Price | Bhargav Srinivas Chinnari | Yeon-Jun Kim | Srinivas Bangalore
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

2021

A Hybrid Approach to Scalable and Robust Spoken Language Understanding in Enterprise Virtual Agents
Ryan Price | Mahnoosh Mehrabani | Narendra Gupta | Yeon-Jun Kim | Shahab Jalalvand | Minhua Chen | Yanjie Zhao | Srinivas Bangalore
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

Spoken language understanding (SLU) extracts the intended meaning from a user utterance and is a critical component of conversational virtual agents. In enterprise virtual agents (EVAs), language understanding is substantially challenging. First, the users are infrequent callers who are unfamiliar with the expectations of a pre-designed conversation flow. Second, the users are paying customers of an enterprise who demand a reliable, consistent and efficient user experience when resolving their issues. In this work, we describe a general and robust framework for intent and entity extraction utilizing a hybrid of statistical and rule-based approaches. Our framework includes confidence modeling that incorporates information from all components in the SLU pipeline, a critical addition for EVAs to ensure accuracy. Our focus is on creating accurate and scalable SLU that can be deployed rapidly for a large class of EVA applications with little need for human intervention.

Intent Features for Rich Natural Language Understanding
Brian Lester | Sagnik Ray Choudhury | Rashmi Prasad | Srinivas Bangalore
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers

Complex natural language understanding modules in dialog systems have a richer understanding of user utterances, and thus are critical in providing a better user experience. However, these models are often created from scratch, for specific clients and use cases and require the annotation of large datasets. This encourages the sharing of annotated data across multiple clients. To facilitate this we introduce the idea of intent features: domain and topic agnostic properties of intents that can be learnt from the syntactic cues only, and hence can be shared. We introduce a new neural network architecture, the Global-Local model, that shows significant improvement over strong baselines for identifying these features in a deployed, multi-intent natural language understanding module, and more generally in a classification setting where a part of an utterance has to be classified utilizing the whole context.

DialogActs based Search and Retrieval for Response Generation in Conversation Systems
Nidhi Arora | Rashmi Prasad | Srinivas Bangalore
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Designing robust conversation systems with great customer experience requires a team of design experts to think of all probable ways a customer can interact with the system and then author responses for each use case individually. The responses are authored from scratch for each new client and application even though similar responses have been created in the past. This happens largely because the responses are encoded using a domain-specific set of intents and entities. In this paper, we present preliminary work to define a dialog act schema to merge and map responses from different domains and applications using a consistent domain-independent representation. These representations are stored and maintained using an Elasticsearch system to facilitate generation of responses through a search and retrieval process. We experimented with generating different surface realizations for a response given a desired information state of the dialog.

2020

Constrained Decoding for Computationally Efficient Named Entity Recognition Taggers
Brian Lester | Daniel Pressel | Amy Hemmeter | Sagnik Ray Choudhury | Srinivas Bangalore
Findings of the Association for Computational Linguistics: EMNLP 2020

Current state-of-the-art models for named entity recognition (NER) are neural models with a conditional random field (CRF) as the final layer. Entities are represented as per-token labels with a special structure in order to decode them into spans. Current work eschews prior knowledge of how the span encoding scheme works and relies on the CRF learning which transitions are illegal and which are not to facilitate global coherence. We find that by constraining the output to suppress illegal transitions we can train a tagger with a cross-entropy loss twice as fast as a CRF with differences in F1 that are statistically insignificant, effectively eliminating the need for a CRF. We analyze the dynamics of tag co-occurrence to explain when these constraints are most effective and provide open source implementations of our tagger in both PyTorch and TensorFlow.

2018

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)
Srinivas Bangalore | Jennifer Chu-Carroll | Yunyao Li
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

The SLT-Interactions Parsing System at the CoNLL 2018 Shared Task
Riyaz A. Bhat | Irshad Bhat | Srinivas Bangalore
Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

This paper describes our system (SLT-Interactions) for the CoNLL 2018 shared task: Multilingual Parsing from Raw Text to Universal Dependencies. Our system performs three main tasks: word segmentation (only for a few treebanks), POS tagging and parsing. While segmentation is learned separately, we use neural stacking for joint learning of POS tagging and parsing tasks. For all the tasks, we employ simple neural network architectures that rely on long short-term memory (LSTM) networks for learning task-dependent features. At the basis of our parser, we use an arc-standard algorithm with Swap action for general non-projective parsing. Additionally, we use neural stacking as a knowledge transfer mechanism for cross-domain parsing of low-resource domains. Our system shows substantial gains against the UDPipe baseline, with an average improvement of 4.18% in LAS across all languages. Overall, we are placed at the 12th position on the official test sets.

Bootstrapping Multilingual Intent Models via Machine Translation for Dialog Automation
Nicholas Ruiz | Srinivas Bangalore | John Chen
Proceedings of the 21st Annual Conference of the European Association for Machine Translation

With the resurgence of chat-based dialog systems in consumer and enterprise applications, there has been much success in developing data-driven and rule-based natural language models to understand human intent. Since these models require large amounts of data and in-domain knowledge, expanding an equivalent service into new markets is disrupted by language barriers that inhibit dialog automation. This paper presents a user study to evaluate the utility of out-of-the-box machine translation technology to (1) rapidly bootstrap multilingual spoken dialog systems and (2) enable existing human analysts to understand foreign language utterances. We additionally evaluate the utility of machine translation in human assisted environments, where a portion of the traffic is processed by analysts. In English→Spanish experiments, we observe a high potential for dialog automation, as well as the potential for human analysts to process foreign language utterances with high accuracy.

2017

Underspecification in Natural Language Understanding for Dialog Automation
John Chen | Srinivas Bangalore
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

With the increasing number of communication platforms that offer a variety of ways of connecting two interlocutors, there is a resurgence of chat-based dialog systems. These systems, typically known as chatbots, have been successfully applied in a range of consumer and enterprise applications. A key technology in such chatbots is robust natural language understanding (NLU), which can significantly influence and impact the efficacy of the conversation and ultimately the user experience. While NLU is far from perfect, this paper illustrates the role of underspecification and its impact on successful dialog completion.

Proceedings of the Workshop on Speech-Centric Natural Language Processing
Nicholas Ruiz | Srinivas Bangalore
Proceedings of the Workshop on Speech-Centric Natural Language Processing

2016

Revisiting Supertagging and Parsing: How to Use Supertags in Transition-Based Parsing
Wonchang Chung | Suhas Siddhesh Mhatre | Alexis Nasr | Owen Rambow | Srinivas Bangalore
Proceedings of the 12th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+12)

Rapid Prototyping of Form-driven Dialogue Systems Using an Open-source Framework
Svetlana Stoyanchev | Pierre Lison | Srinivas Bangalore
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

AT&T: The Tag&Parse Approach to Semantic Parsing of Robot Spatial Commands
Svetlana Stoyanchev | Hyuckchul Jung | John Chen | Srinivas Bangalore
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Predicting post-editor profiles from the translation process
Karan Singla | David Orrego-Carmona | Ashleigh Rhea Gonzales | Michael Carl | Srinivas Bangalore
Workshop on interactive and adaptive machine translation

The purpose of the current investigation is to predict post-editor profiles based on user behaviour and demographics using machine learning techniques to gain a better understanding of post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database from the CRITT Translation Process Research Database (TPR-DB). The analysis has two main research goals: We create n-gram models based on user activity and part-of-speech sequences to automatically cluster post-editors, and we use discriminative classifier models to characterize post-editors based on a diverse range of translation process features. The classification and clustering of participants resulting from our study suggest this type of exploration could be used as a tool to develop new translation tool features or customization possibilities.

A Framework for Translating SMS Messages
Vivek Kumar Rangarajan Sridhar | John Chen | Srinivas Bangalore | Ron Shacham
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Exploring System Combination approaches for Indo-Aryan MT Systems
Karan Singla | Anupam Singh | Nishkarsh Shastri | Megha Jhunjhunwala | Srinivas Bangalore | Dipti Misra Sharma
Proceedings of the EMNLP’2014 Workshop on Language Technology for Closely Related Languages and Language Variants

Reducing the Impact of Data Sparsity in Statistical Machine Translation
Karan Singla | Kunal Sachdeva | Srinivas Bangalore | Dipti Misra Sharma | Diksha Yadav
Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation

SEECAT: ASR & Eye-tracking enabled computer-assisted translation
Mercedes García-Martínez | Karan Singla | Aniruddha Tammewar | Bartolomé Mesa-Lao | Ankita Thakur | Anusuya M.A. | Srinivas Bangalore | Michael Carl
Proceedings of the 17th Annual Conference of the European Association for Machine Translation

Towards simultaneous interpreting: the timing of incremental machine translation and speech synthesis
Timo Baumann | Srinivas Bangalore | Julia Hirschberg
Proceedings of the 11th International Workshop on Spoken Language Translation: Papers

In simultaneous interpreting, human experts incrementally construct and extend partial hypotheses about the source speaker’s message, and start to verbalize a corresponding message in the target language, based on a partial translation – which may have to be corrected occasionally. They commence the target utterance in the hope that they will be able to finish understanding the source speaker’s message and determine its translation in time for the unfolding delivery. Of course, both incremental understanding and translation by humans can be garden-pathed, although experts are able to optimize their delivery so as to balance the goals of minimal latency, translation quality and high speech fluency with few corrections. We investigate the temporal properties of both translation input and output to evaluate the tradeoff between low latency and translation quality. In addition, we estimate the improvements that can be gained with a tempo-elastic speech synthesizer.

2013

Segmentation Strategies for Streaming Speech Translation
Vivek Kumar Rangarajan Sridhar | John Chen | Srinivas Bangalore | Andrej Ljolje | Rathinavelu Chengalvarayan
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Incremental Segmentation and Decoding Strategies for Simultaneous Translation
Mahsa Yarmohammadi | Vivek Kumar Rangarajan Sridhar | Srinivas Bangalore | Baskaran Sankaran
Proceedings of the Sixth International Joint Conference on Natural Language Processing

2012

Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Eric Fosler-Lussier | Ellen Riloff | Srinivas Bangalore
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Real-time Incremental Speech-to-Speech Translation of Dialogs
Srinivas Bangalore | Vivek Kumar Rangarajan Sridhar | Prakash Kolan | Ladan Golipour | Aura Jimenez
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Harvesting Parallel Text in Multiple Languages with Limited Supervision
Luciano Barbosa | Vivek Kumar Rangarajan Sridhar | Mahsa Yarmohammadi | Srinivas Bangalore
Proceedings of COLING 2012

2011

Predicting Relative Prominence in Noun-Noun Compounds
Taniya Mishra | Srinivas Bangalore
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Crawling Back and Forth: Using Back and Out Links to Locate Bilingual Sites
Luciano Barbosa | Srinivas Bangalore | Vivek Kumar Rangarajan Sridhar
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Qme! : A Speech-based Question-Answering system on Mobile Devices
Taniya Mishra | Srinivas Bangalore
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Speech-Driven Access to the Deep Web on Mobile Devices
Taniya Mishra | Srinivas Bangalore
Proceedings of the ACL 2010 System Demonstrations

Phrase Based Decoding using a Discriminative Model
Prasanth Kolachina | Sriram Venkatapathy | Srinivas Bangalore | Sudheer Kolachina | Avinesh PVS
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation

Proceedings of the 10th International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+10)
Srinivas Bangalore | Robert Frank | Maribel Romero
Proceedings of the 10th International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+10)

2009

Incremental Parsing Models for Dialog Task Structure
Srinivas Bangalore | Amanda Stent
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Effects of Word Confusion Networks on Voice Search
Junlan Feng | Srinivas Bangalore
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

MICA: A Probabilistic Dependency Parser Based on Tree Insertion Grammars (Application Note)
Srinivas Bangalore | Pierre Boullier | Alexis Nasr | Owen Rambow | Benoît Sagot
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Robust Understanding in Multimodal Interfaces
Srinivas Bangalore | Michael Johnston
Computational Linguistics, Volume 35, Number 3, September 2009

Tightly coupling Speech Recognition and Search
Taniya Mishra | Srinivas Bangalore
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2008

Trainable Speaker-Based Referring Expression Generation
Giuseppe Di Fabbrizio | Amanda Stent | Srinivas Bangalore
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

Enriching Spoken Language Translation with Dialog Acts
Vivek Kumar Rangarajan Sridhar | Srinivas Bangalore | Shrikanth Narayanan
Proceedings of ACL-08: HLT, Short Papers

Referring Expression Generation Using Speaker-based Attribute Selection and Trainable Realization (ATTR)
Giuseppe Di Fabbrizio | Amanda J. Stent | Srinivas Bangalore
Proceedings of the Fifth International Natural Language Generation Conference

HotSpots: Visualizing Edits to a Text
Srinivas Bangalore | David Smith
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

Exploiting Acoustic and Syntactic Features for Prosody Labeling in a Maximum Entropy Framework
Vivek Kumar Rangarajan Sridhar | Srinivas Bangalore | Shrikanth Narayanan
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Three models for discriminative machine translation using Global Lexical Selection and Sentence Reconstruction
Sriram Venkatapathy | Srinivas Bangalore
Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation

Statistical Machine Translation through Global Lexical Selection and Sentence Reconstruction
Srinivas Bangalore | Patrick Haffner | Stephan Kanthak
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Edit Machines for Robust Multimodal Language Processing
Srinivas Bangalore | Michael Johnston
11th Conference of the European Chapter of the Association for Computational Linguistics

Finite-state transducer-based statistical machine translation using joint probabilities
Srinivas Bangalore | Stephan Kanthak | Patrick Haffner
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

Learning the Structure of Task-Driven Human-Human Dialogs
Srinivas Bangalore | Giuseppe Di Fabbrizio | Amanda Stent
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2004

MATCHkiosk: A Multimodal Interactive City Guide
Michael Johnston | Srinivas Bangalore
Proceedings of the ACL Interactive Poster and Demonstration Sessions

Compiling Boostexter Rules into a Finite-state Transducer
Srinivas Bangalore
Proceedings of the ACL Interactive Poster and Demonstration Sessions

Balancing data-driven and rule-based approaches in the context of a Multimodal Conversational System
Srinivas Bangalore | Michael Johnston
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

2002

Creating a Finite-State Parser with Application Semantics
Owen Rambow | Srinivas Bangalore | Tahir Butt | Alexis Nasr | Richard Sproat
COLING 2002: The 19th International Conference on Computational Linguistics: Project Notes

Towards Automatic Generation of Natural Language Generation Systems
John Chen | Srinivas Bangalore | Owen Rambow | Marilyn A. Walker
COLING 2002: The 19th International Conference on Computational Linguistics

Bootstrapping Bilingual Data using Consensus Translation for a Multilingual Instant Messaging System
Srinivas Bangalore | Vanessa Murdock | Giuseppe Riccardi
COLING 2002: The 19th International Conference on Computational Linguistics

Context-Free Parsing of a Tree Adjoining Grammar Using Finite-State Machines
Alexis Nasr | Owen Rambow | John Chen | Srinivas Bangalore
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)

Extracting Clauses for Spoken Language Understanding in Conversational Systems
Narendra Gupta | Srinivas Bangalore
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

Reranking an n-gram supertagger
John Chen | Srinivas Bangalore | Michael Collins | Owen Rambow
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)

MATCH: An Architecture for Multimodal Dialogue Systems
Michael Johnston | Srinivas Bangalore | Gunaranjan Vasireddy | Amanda Stent | Patrick Ehlen | Marilyn Walker | Steve Whittaker | Preetam Maloor
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

2001

A Finite-State Approach to Machine Translation
Srinivas Bangalore | Giuseppe Riccardi
Second Meeting of the North American Chapter of the Association for Computational Linguistics

Natural Language Generation in Dialog Systems
Owen Rambow | Srinivas Bangalore | Marilyn Walker
Proceedings of the First International Conference on Human Language Technology Research

Impact of Quality and Quantity of Corpora on Stochastic Generation
Srinivas Bangalore | John Chen | Owen Rambow
Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing

2000

Stochastic Finite-State models for Spoken Language Machine Translation
Srinivas Bangalore | Giuseppe Riccardi
ANLP-NAACL 2000 Workshop: Embedded Machine Translation Systems

Using TAGs, a Tree Model, and a Language Model for Generation
Srinivas Bangalore | Owen Rambow
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

Learning dependency translation models as collections of finite state head transducers
Hiyan Alshawi | Srinivas Bangalore | Shona Douglas
Computational Linguistics, Volume 26, Number 1, March 2000

Evaluation Metrics for Generation
Srinivas Bangalore | Owen Rambow | Steve Whittaker
INLG’2000 Proceedings of the First International Conference on Natural Language Generation

Finite-state Multimodal Parsing and Understanding
Michael Johnston | Srinivas Bangalore
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

Corpus-Based Lexical Choice in Natural Language Generation
Srinivas Bangalore | Owen Rambow
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

Exploiting a Probabilistic Hierarchical Model for Generation
Srinivas Bangalore | Owen Rambow
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

1999

Supertagging: An Approach to Almost Parsing
Srinivas Bangalore | Aravind K. Joshi
Computational Linguistics, Volume 25, Number 2, June 1999

New Models for Improving Supertag Disambiguation
John Chen | Srinivas Bangalore
Ninth Conference of the European Chapter of the Association for Computational Linguistics

1998

Automatic Acquisition of Hierarchical Transduction Models for Machine Translation
Hiyan Alshawi | Srinivas Bangalore | Shona Douglas
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Automatic Acquisition of Phrase Grammars for Stochastic Language Modeling
Giuseppe Riccardi | Srinivas Bangalore
Sixth Workshop on Very Large Corpora

Transplanting supertags from English to Spanish
Srinivas Bangalore
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

Automatic Acquisition of Hierarchical Transduction Models for Machine Translation
Hiyan Alshawi | Srinivas Bangalore | Shona Douglas
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

1997

Maintaining the Forest and Burning out the Underbrush in XTAG
Christine Doran | Beth Hockey | Philip Hopely | Joseph Rosenzweig | Anoop Sarkar | B. Srinivas | Fei Xia
Computational Environments for Grammar Development and Linguistic Engineering

Performance Evaluation of Supertagging for Partial Parsing
B. Srinivas
Proceedings of the Fifth International Workshop on Parsing Technologies

In previous work we introduced the idea of supertagging as a means of improving the efficiency of a lexicalized grammar parser. In this paper, we present supertagging in conjunction with a lightweight dependency analyzer as a robust and efficient partial parser. The present work is significant for two reasons. First, we have vastly improved our results: 92% accuracy for supertag disambiguation, using lexical information, a larger training corpus and smoothing techniques. Second, we show how supertagging can be used for partial parsing and provide detailed evaluation results for detecting noun chunks, verb chunks, preposition phrase attachment and a variety of other linguistic constructions. Using supertag representation, we achieve a recall rate of 93.0% and a precision rate of 91.8% for noun chunking, improving on the best known result for noun chunking.

EAGLE: An Extensible Architecture for General Linguistic Engineering
Breck Baldwin | Christine Doran | Jeffrey C. Reynar | Michael Niv | B. Srinivas
Fifth Conference on Applied Natural Language Processing: Descriptions of System Demonstrations and Videos

1996

Motivations and Methods for Text Simplification
R. Chandrasekar | Christine Doran | B. Srinivas
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics

1995

University of Pennsylvania: Description of the University of Pennsylvania System Used for MUC-6
Breck Baldwin | Jeff Reynar | Mike Collins | Jason Eisner | Adwait Ratnaparkhi | Joseph Rosenzweig | Anoop Sarkar | Srinivas
Sixth Message Understanding Conference (MUC-6): Proceedings of a Conference Held in Columbia, Maryland, November 6-8, 1995

Heuristics and Parse Ranking
B. Srinivas | Christine Doran | Seth Kulick
Proceedings of the Fourth International Workshop on Parsing Technologies

There are currently two philosophies for building grammars and parsers: statistically induced grammars and wide-coverage grammars. One way to combine the strengths of both approaches is to have a wide-coverage grammar with a heuristic component which is domain independent but whose contribution is tuned to particular domains. In this paper, we discuss a three-stage approach to disambiguation in the context of a lexicalized grammar, using a variety of domain-independent heuristic techniques. We present a training algorithm which uses hand-bracketed treebank parses to set the weights of these heuristics. We compare the performance of our grammar against the performance of the IBM statistical grammar, using both untrained and trained weights for the heuristics.

Some Novel Applications of Explanation-Based Learning to Parsing Lexicalized Tree-Adjoining Grammars
B. Srinivas | Aravind K. Joshi
33rd Annual Meeting of the Association for Computational Linguistics

1994

Complexity of Description of Primitives: Relevance to Local Statistical Computations
Aravind K. Joshi | B. Srinivas
The Balancing Act: Combining Symbolic and Statistical Approaches to Language

XTAG System - A Wide Coverage Grammar for English
Christy Doran | Dania Egedi | Beth Ann Hockey | B. Srinivas | Martin Zaidel
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

Disambiguation of Super Parts of Speech (or Supertags): Almost Parsing
Aravind K. Joshi | B. Srinivas
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

Co-authors

Christine Doran 4

Aravind Joshi 4

Taniya Mishra 4

Giuseppe Riccardi 4

Hiyan Alshawi 3

Giuseppe Di Fabbrizio 3

Shona Douglas 3

Marilyn Walker 3

Breck Baldwin 2

Luciano Barbosa 2

Sagnik Ray Choudhury 2

Michael Collins 2

Narendra Gupta 2

Patrick Haffner 2

Beth Ann Hockey 2

Stephan Kanthak 2

Andrej Ljolje 2

Mahnoosh Mehrabani 2

Shrikanth Narayanan 2

Rashmi Prasad 2

Daniel Pressel 2

Joseph Rosenzweig 2

Nicholas Ruiz 2

Dipti Misra Sharma 2

Svetlana Stoyanchev 2

Sriram Venkatapathy 2

Steve Whittaker 2

Mahsa Yarmohammadi 2

Riyaz Ahmad Bhat 1

Pierre Boullier 1

Jaime G. Carbonell 1

Raman Chandrasekar 1

Rathinavelu Chengalvarayan 1

Bhargav Srinivas Chinnari 1

Jennifer Chu-Carroll 1

Wonchang Chung 1

Barbara Cuthill 1

Antonio Moreno Daniel 1

Christy Doran 1

Patrick Ehlen 1

Carol Espy-Wilson 1

Christiane Fellbaum 1

Eric Fosler-Lussier 1

Mercedes García-Martínez 1

John S. Garofolo 1

Ladan Golipour 1

Ashleigh Rhea Gonzales 1

Julia Hirschberg 1

Philip Hopely 1

Shahab Jalalv 1

Shahab Jalalvand 1

Megha Jhunjhunwala 1

Hyuckchul Jung 1

Prasanth Kolachina 1

Sudheer Kolachina 1

Prakash Kolan 1

Preetam Maloor 1

Andrew McCallum 1

Bartolomé Mesa-Lao 1

Suhas Siddhesh Mhatre 1

Nelson Morgan 1

Vanessa Murdock 1

David Orrego-Carmona 1

Michael Picheney 1

Lance Ramshaw 1

Adwait Ratnaparkhi 1

Jeffrey C. Reynar 1

Maribel Romero 1

Kunal Sachdeva 1

Benoît Sagot 1

Baskaran Sankaran 1

Nishkarsh Shastri 1

Hadar Shemtov 1

David A. Smith 1

Richard Sproat 1

Benjamin Stern 1

Aniruddha Tammewar 1

Ankita Thakur 1

Gunaranjan Vasireddy 1

Martin Zaidel 1
