Qiang Chen

2025

Multi-style outline controllable generation is crucial for multiple applications, including document semantic structuring and retrieval-augmented generation.The great success of preference alignment approaches encourages their application in controllable generation tasks.However, these attempts encounter several limitations: (1) response pair requirements, (2) substantial computation costs, and (3) insufficient exploitation of fine-grained preference signals.To address these problems, we propose a token-level preference self-alignment optimization, named TKPO, for outline controllable generation. TKPO extends the Bradley-Terry model from pair-wise to list-wise comparison, which is further applied at the token level for fine-grained preference signal utilization. In comparison to the representative methods, e.g., DPO, TKPO does not require response pairs; instead, we propose a controllable attributes-driven method to construct reject samples for self-alignment. Additionally, TKPO optimizes only the base model, thereby avoiding additional memory usage and substantial computational costs.We curate two outline controllable generation datasets with regard to language style and level-of-detail.Extensive experiments demonstrate that TKPO outperforms DPO by up to 19.28% in performance while requiring only 56.25% in training time.We release the code and datasets resources at https://github.com/WHUIR/TKPO.

2023

pdf bib abs

Dynamic early exit has demonstrated great potential in coping with the sharply increasing number of pre-trained language model parameters, which can achieve a good trade-off between performance and efficiency. The existing early exit paradigm relies on training parametrical internal classifiers at each intermediate layer to complete specific tasks. Based on the predictions of these internal classifiers, different methods are designed to decide when to exit. Under this circumstance, each intermediate layer takes on both generic language representation learning and task-specific feature extraction, which makes each intermediate layer struggle to balance two types of backward loss signals during training. To break this dilemma, we propose an adapter method to decouple the two distinct types of representation and further introduce a non-parametric simplex equiangular tight frame classifier (ETF) for improvement. Extensive experiments on monolingual and multilingual tasks demonstrate that our method gains significant improvements over strong PLM backbones and early exit methods.

2022

pdf bib abs

The conventional success of textual classification relies on annotated data, and the new paradigm of pre-trained language models (PLMs) still requires a few labeled data for downstream tasks. However, in real-world applications, label noise inevitably exists in training data, damaging the effectiveness, robustness, and generalization of the models constructed on such data. Recently, remarkable achievements have been made to mitigate this dilemma in visual data, while only a few explore textual data. To fill this gap, we present SelfMix, a simple yet effective method, to handle label noise in text classification tasks. SelfMix uses the Gaussian Mixture Model to separate samples and leverages semi-supervised learning. Unlike previous works requiring multiple models, our method utilizes the dropout mechanism on a single model to reduce the confirmation bias in self-training and introduces a textual level mixup training strategy. Experimental results on three text classification benchmarks with different types of text show that the performance of our proposed method outperforms these strong baselines designed for both textual and visual data under different noise ratios and noise types. Our anonymous code is available at https://github.com/noise-learning/SelfMix.

pdf bib abs

In e-commerce, the salience of commonsense knowledge (CSK) is beneficial for widespread applications such as product search and recommendation. For example, when users search for “running” in e-commerce, they would like to find products highly related to running, such as “running shoes” rather than “shoes”. Nevertheless, many existing CSK collections rank statements solely by confidence scores, and there is no information about which ones are salient from a human perspective. In this work, we define the task of supervised salience evaluation, where given a CSK triple, the model is required to learn whether the triple is salient or not. In addition to formulating the new task, we also release a new Benchmark dataset of Salience Evaluation in E-commerce (BSEE) and hope to promote related research on commonsense knowledge salience evaluation. We conduct experiments in the dataset with several representative baseline models. The experimental results show that salience evaluation is a hard task where models perform poorly on our evaluation set. We further propose a simple but effective approach, PMI-tuning, which shows promise for solving this novel problem. Code is available in https://github.com/OpenBGBenchmark/OpenBG-CSK.

2019

pdf bib abs

Meta Relational Learning for Few-Shot Link Prediction in Knowledge Graphs
Mingyang Chen | Wen Zhang | Wei Zhang | Qiang Chen | Huajun Chen
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Link prediction is an important way to complete knowledge graphs (KGs), while embedding-based methods, effective for link prediction in KGs, perform poorly on relations that only have a few associative triples. In this work, we propose a Meta Relational Learning (MetaR) framework to do the common but challenging few-shot link prediction in KGs, namely predicting new triples about a relation by only observing a few associative triples. We solve few-shot link prediction by focusing on transferring relation-specific meta information to make model learn the most important knowledge and learn faster, corresponding to relation meta and gradient meta respectively in MetaR. Empirically, our model achieves state-of-the-art results on few-shot link prediction KG benchmarks.

Qiang Chen

2025

2023

2022

2019

2015

2014

Co-authors

Venues