Manali Sharma
2026
PatentVision: A multimodal method for drafting patent applications
Ruo Yang | Sai Krishna Reddy Mudhiganti | Manali Sharma
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
Patent drafting is complex due to its need for detailed technical descriptions, legal compliance, and visual elements. Although Large Vision-Language Models (LVLMs) show promise across various tasks, their application in automating patent writing remains underexplored. In this paper, we present PatentVision, a multimodal framework that integrates textual and visual inputs—such as patent claims and drawings—to generate complete patent specifications. Built on advanced LVLMs, PatentVision enhances accuracy by combining fine-tuned vision-language models with domain-specific training tailored to patents. Experiments reveal it surpasses text-only methods, producing outputs with greater fidelity and alignment with human-written standards. Its incorporation of visual data allows it to better represent intricate design features and functional connections, leading to richer and more precise results. This study underscores the value of multimodal techniques in patent automation, providing a scalable tool to reduce manual workloads and improve consistency. PatentVision not only advances patent drafting but also lays groundwork for broader use of LVLMs in specialized areas, potentially transforming intellectual property management and innovation processes.
2024
Patentformer: A Novel Method to Automate the Generation of Patent Applications
Juanyan Wang | Sai Krishna Reddy Mudhiganti | Manali Sharma
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
In recent years, Large Language Models (LLMs) have demonstrated impressive performances across various NLP tasks. However, their potential for automating the task of writing patent documents remains relatively unexplored. To address this gap, in this work, we propose a novel method, Patentformer, for generating patent specification by fine-tuning the generative models with diverse sources of information, e.g., patent claims, drawing text, and brief descriptions of the drawings. To enhance the generative models’ comprehension of the complex task of writing patent specification, we introduce a new task, claim+drawing-to-specification, and release a new dataset. We evaluate our proposed method on thousands of patents from the USPTO and show that our method can generate human-like patent specification in legal writing style. Human evaluations by four patent experts further affirm that our proposed method has the potential to generate correct specification, and the quality of generated specification may sometimes be better than the actual specification.
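The abstract's claim+drawing-to-specification task pairs each target specification passage with claims, drawing text, and brief drawing descriptions as input. A minimal sketch of assembling one such fine-tuning example is below; the field names, section headers, and prompt layout are illustrative assumptions, not the paper's actual data format.

```python
def build_training_example(claims, drawing_text, brief_description, specification):
    """Assemble one claim+drawing-to-specification training pair.

    claims: list of claim strings
    drawing_text: text extracted from a patent drawing (e.g. figure labels)
    brief_description: the brief description of that drawing
    specification: the target specification passage the model should generate

    The prompt structure here is a hypothetical format for illustration only.
    """
    prompt = "\n\n".join([
        "CLAIMS:\n" + "\n".join(claims),
        "DRAWING TEXT:\n" + drawing_text,
        "BRIEF DESCRIPTION OF DRAWING:\n" + brief_description,
    ])
    # Input/target pair in the style expected by seq2seq fine-tuning loops.
    return {"input": prompt, "target": specification}
```

A dataset of such pairs can then be fed to a standard generative fine-tuning loop, with the specification passage as the decoding target.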
2022
Ranking-Constrained Learning with Rationales for Text Classification
Juanyan Wang | Manali Sharma | Mustafa Bilgic
Findings of the Association for Computational Linguistics: ACL 2022
We propose a novel approach that jointly utilizes the labels and elicited rationales for text classification to speed up the training of deep learning models with limited training data. We define and optimize a ranking-constrained loss function that combines cross-entropy loss with ranking losses as rationale constraints. We evaluate our proposed rationale-augmented learning approach on three human-annotated datasets, and show that our approach provides significant improvements over classification approaches that do not utilize rationales as well as other state-of-the-art rationale-augmented baselines.
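The ranking-constrained loss described above combines cross-entropy on the label with ranking penalties that push the model's importance scores for rationale tokens above those for non-rationale tokens. A minimal pure-Python sketch of that idea follows; the hinge form, the `margin` and `lam` parameters, and all names are illustrative assumptions rather than the paper's exact formulation.

```python
import math

def ranking_constrained_loss(probs, label, scores, rationale_mask,
                             margin=0.1, lam=1.0):
    """Cross-entropy plus a pairwise ranking penalty on rationale tokens.

    probs: predicted class probabilities for one example
    label: index of the gold class
    scores: per-token importance scores produced by the model
    rationale_mask: 1 for tokens annotated as rationales, 0 otherwise
    """
    # Standard cross-entropy on the gold label.
    ce = -math.log(probs[label])

    # Ranking constraint: each rationale token's score should exceed each
    # non-rationale token's score by at least `margin` (hinge penalty).
    rat = [s for s, m in zip(scores, rationale_mask) if m]
    non = [s for s, m in zip(scores, rationale_mask) if not m]
    pairs = [(r, n) for r in rat for n in non]
    rank = sum(max(0.0, margin - (r - n)) for r, n in pairs) / max(len(pairs), 1)

    return ce + lam * rank
```

When the rationale tokens already outrank the rest by the margin, the ranking term vanishes and the loss reduces to plain cross-entropy, so the constraint only steers training while it is violated.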