2024
Unsupervised Contrast-Consistent Ranking with Language Models
Niklas Stoehr | Pengxiang Cheng | Jing Wang | Daniel Preotiuc-Pietro | Rajarshi Bhowmik
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Language models contain ranking-based knowledge and are powerful solvers of in-context ranking tasks. For instance, they may have parametric knowledge about the ordering of countries by size or may be able to rank product reviews by sentiment. We compare pairwise, pointwise and listwise prompting techniques to elicit a language model’s ranking knowledge. However, we find that even with careful calibration and constrained decoding, prompting-based techniques may not always be self-consistent in the rankings they produce. This motivates us to explore an alternative approach that is inspired by an unsupervised probing method called Contrast-Consistent Search (CCS). The idea is to train a probe guided by a logical constraint: a language model’s representation of a statement and its negation must be mapped to contrastive true-false poles consistently across multiple statements. We hypothesize that similar constraints apply to ranking tasks where all items are related via consistent, pairwise or listwise comparisons. To this end, we extend the binary CCS method to Contrast-Consistent Ranking (CCR) by adapting existing ranking methods such as the Max-Margin Loss, Triplet Loss and an Ordinal Regression objective. Across different models and datasets, our results confirm that CCR probing performs better or, at least, on a par with prompting.
HPipe: Large Language Model Pipeline Parallelism for Long Context on Heterogeneous Cost-effective Devices
Ruilong Ma | Xiang Yang | Jingyu Wang | Qi Qi | Haifeng Sun | Jing Wang | Zirui Zhuang | Jianxin Liao
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
Micro-enterprises and individual developers increasingly need to analyze long sequences with powerful Large Language Models (LLMs). They try to deploy LLMs locally, but possess only assorted commodity devices connected by unreliable links. Existing parallelism techniques are less effective in such constrained environments. The heterogeneity of the devices, coupled with their limited capacity and expensive communication, makes it challenging to maximize utilization of the available devices while masking latency in private deployments. Hence, we introduce HPipe, a pipeline inference framework that successfully migrates LLMs from high-performance clusters to heterogeneous commodity devices. By ensuring a balanced distribution of workloads, HPipe facilitates the parallel execution of LLMs by pipelining sequences along the token dimension. The evaluation conducted on LLaMA-7B and GPT3-2B demonstrates that HPipe holds potential for long-context analysis with LLMs on heterogeneous devices, achieving a speedup in latency and throughput of up to 2.28 times.
2023
Analyzing and Predicting Persistence of News Tweets
Maggie Liu | Jing Wang | Daniel Preotiuc-Pietro
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
2022
Modeling Aspect Correlation for Aspect-based Sentiment Analysis via Recurrent Inverse Learning Guidance
Longfeng Li | Haifeng Sun | Qi Qi | Jingyu Wang | Jing Wang | Jianxin Liao
Proceedings of the 29th International Conference on Computational Linguistics
Aspect-based sentiment analysis (ABSA) aims to distinguish the sentiment polarity of every specific aspect in a given sentence. Previous research has recognized the importance of interactive learning with context and aspects. However, these methods are ill-suited to learning complex sentences with multiple aspects because the polarity features overlap, and they do not consider the correlation between aspects to distinguish the overlapping features. To solve this problem, we propose a new method called Recurrent Inverse Learning Guided Network (RILGNet). RILGNet improves the modeling of aspect correlation and the selection of aspect features in two ways. First, we use a Recurrent Mechanism to improve the joint representation of aspects, which enhances the aspect correlation modeling iteratively. Second, we propose Inverse Learning Guidance to improve the selection of aspect features by considering aspect correlation, which provides more useful information to determine polarity. Experimental results on the SemEval 2014 datasets demonstrate the effectiveness of RILGNet, and we further show that RILGNet is a state-of-the-art method in multi-aspect scenarios.
2020
Multi-Domain Named Entity Recognition with Genre-Aware and Agnostic Inference
Jing Wang | Mayank Kulkarni | Daniel Preotiuc-Pietro
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Named entity recognition is a key component of many text processing pipelines and it is thus essential for this component to be robust to different types of input. However, domain transfer of NER models with data from multiple genres has not been widely studied. To this end, we conduct NER experiments in three predictive setups on data from: a) multiple domains; b) multiple domains where the genre label is unknown at inference time; c) domains not encountered in training. We introduce a new architecture tailored to this task by using shared and private domain parameters and multi-task learning. This consistently outperforms all other baseline and competitive methods on all three experimental setups, with differences ranging from +1.95 to +3.11 average F1 across multiple genres when compared to standard approaches. These results illustrate the challenges that need to be taken into account when building real-world NLP applications that are robust to various types of text and the methods that can help, at least partially, alleviate these issues.
2015
A Sense-Topic Model for Word Sense Induction with Unsupervised Data Enrichment
Jing Wang | Mohit Bansal | Kevin Gimpel | Brian D. Ziebart | Clement T. Yu
Transactions of the Association for Computational Linguistics, Volume 3
Word sense induction (WSI) seeks to automatically discover the senses of a word in a corpus via unsupervised methods. We propose a sense-topic model for WSI, which treats sense and topic as two separate latent variables to be inferred jointly. Topics are informed by the entire document, while senses are informed by the local context surrounding the ambiguous word. We also discuss unsupervised ways of enriching the original corpus in order to improve model performance, including using neural word embeddings and external corpora to expand the context of each data instance. We demonstrate significant improvements over the previous state-of-the-art, achieving the best results reported to date on the SemEval-2013 WSI task.