Hailong Yang

Also published as: 海龙


2025

Extensive LLM applications demand efficient structured generations, particularly for LR(1) grammars, to produce outputs in specified formats (e.g., JSON). Existing methods primarily parse LR(1) grammars into a pushdown automaton (PDA), leading to runtime execution overhead for context-dependent token processing, especially inefficient under large inference batches.To address these issues, we propose Pre3 that exploits deterministic pushdown automata (DPDA) to optimize the constrained LLM decoding efficiency.First, by **pre**computing **pre**fix-conditioned edges during the **pre**processing, Pre3 enables ahead-of-time edge analysis and thus makes parallel transition processing possible.Futher, leveraging the prefix-conditioned edges, Pre3 introduces a novel approach that transforms LR(1) transition graphs into DPDA, eliminating the need for runtime path exploration and achieving edge transitions with minimal overhead.Pre3 can be seamlessly integrated into standard LLM inference frameworks, improving time per output token (TPOT) by up to 40% and throughput by up to 36% in our experiments. Our code is available at https://github.com/ModelTC/lightllm.
Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks but their performance in complex logical reasoning tasks remains unsatisfactory. Although some prompting methods, such as Chain-of-Thought, can improve the reasoning ability of LLMs to some extent, they suffer from an unfaithful issue where derived conclusions may not align with the generated reasoning chain. To address this issue, some studies employ the approach of propositional logic to further enhance logical reasoning abilities of LLMs. However, the potential omissions in the extraction of logical expressions in these methods can cause information loss in the logical reasoning process, thereby generating incorrect results. To this end, we propose Logic-of-Thought (LoT) prompting which employs propositional logic to generate expanded logical information descriptions and utilizes them as an additional augmentation to original contexts, thereby ensuring information completeness and enhancing logical reasoning ability. LoT is orthogonal to existing prompting methods and can be seamlessly integrated with them. Extensive experiments demonstrate that LoT boosts the performance of various prompting methods with a striking margin across five logical reasoning tasks. In particular, LoT enhances Chain-of-Thought’s performance on the ReClor dataset by +4.35%, improves Chain-of-Thought with Self-Consistency’s performance on the RuleTaker dataset by +3.52%, and boosts performance of Tree-of-Thoughts on the ProofWriter dataset by +8%.

2023

“诈骗案件分类问题是打击电信网络诈骗犯罪过程中的关键一环,根据不同的诈骗方式、手法等将其分类,通过对不同案件进行有效分类能够便于统计现状,有助于公安部门掌握当前电信网络诈骗案件的分布特点,进而能够对不同类别的诈骗案件作出针对性的预防、监管、制止、侦查等措施。诈骗案件分类属于自然语言处理领域的文本分类任务,传统的基于LSTM和CNN等分类模型能在起到一定的效果,但是由于它们模型结构的参数量的限制,难以达到较为理想的效果。本文基于预训练语言模型Nezha,结合对抗扰动和指数移动平均策略,有助于电信网络诈骗案件分类任务取得更好效果,充分利用电信网络诈骗案件的数据。我们队伍未采用多模型融合的方法,并最终在此次评测任务中排名第三,评测指标分数为0.8625。”