Lucas Cecchi


2025

LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
William Watson | Nicole Cho | Nishan Srishankar | Zhen Zeng | Lucas Cecchi | Daniel Scott | Suchetha Siddagangappa | Rachneet Kaur | Tucker Balch | Manuela Veloso
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track

Legal contracts in the custody and fund services domain govern critical aspects such as key provider responsibilities, fee schedules, and indemnification rights. However, it is challenging for an off-the-shelf Large Language Model (LLM) to ingest these contracts due to the lengthy unstructured streams of text, limited LLM context windows, and complex legal jargon. To address these challenges, we introduce LAW (Legal Agentic Workflows for Custody and Fund Services Contracts). LAW features a modular design that responds to user queries by orchestrating a suite of domain-specific tools and text agents. Our experiments demonstrate that LAW, by integrating multiple specialized agents and tools, significantly outperforms the baseline. LAW excels particularly in complex tasks such as calculating a contract’s termination date, surpassing the baseline by 92.9 percentage points. Furthermore, LAW offers a cost-effective alternative to traditional fine-tuned legal LLMs by leveraging reusable, domain-specific tools.

2024

ReportGPT: Human-in-the-loop Verifiable Table-to-Text Generation
Lucas Cecchi | Petr Babkin
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

Recent developments in the quality and accessibility of large language models have precipitated a surge in user-facing tools for content generation. Motivated by the necessity of human quality control for these systems, we introduce ReportGPT: a pipeline framework for verifiable human-in-the-loop table-to-text generation. ReportGPT is based on a domain-specific language, which acts as a proof mechanism for generating verifiable commentary. This allows users to quickly check the relevancy and factuality of model outputs. User selections then become few-shot examples for improving the performance of the pipeline. We configure three approaches to our pipeline, and find that the use of language models in ReportGPT’s components trades off precision for more insightful downstream commentary. Furthermore, ReportGPT learns from human feedback in real time, needing only a few samples to improve performance.