Suman Kalyan Maity

2024

An Experimental Analysis on Evaluating Patent Citations
Rabindra Nath Nandi | Suman Kalyan Maity | Brian Uzzi | Sourav Medya
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Patent citation count is a good indicator of patent quality, which often translates into monetary value for inventors and organizations. However, the factors that lead a patent to receive high citations over the years are still not well understood. Using patents from the past two decades, we study the problem of patent citation prediction and formulate it as a binary classification problem. We create a semantic graph of patents based on their semantic similarities, enabling the use of Graph Neural Network (GNN)-based approaches for predicting citations. Our experimental results demonstrate the effectiveness of our GNN-based methods when applied to the semantic graph, showing that they can accurately predict patent citations using only patent text. More specifically, these methods achieve up to 94% recall for patents with high citations and outperform existing baselines. Furthermore, we leverage the constructed graph to gain insights into, and explanations for, the predictions made by the GNNs.
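The idea of a semantic graph built from text similarity can be illustrated with a minimal sketch. The abstract does not say how the texts are embedded or how edges are chosen, so everything here is an assumption made for illustration: bag-of-words counts stand in for a learned text embedding, edges connect pairs whose cosine similarity meets a hypothetical `threshold`, and the function names and toy texts are invented.

```python
import math
from collections import Counter
from itertools import combinations


def bow_vector(text):
    """Lowercased bag-of-words counts; a stand-in for a learned text embedding."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_semantic_graph(texts, threshold=0.5):
    """Return an adjacency dict linking texts whose similarity >= threshold.

    The threshold rule is an assumption; the paper may use a different
    edge-selection scheme.
    """
    vectors = [bow_vector(t) for t in texts]
    graph = {i: set() for i in range(len(texts))}
    for i, j in combinations(range(len(texts)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


# Toy patent-like texts (hypothetical, not taken from the paper's data).
patents = [
    "battery cell with lithium anode",
    "lithium battery cell with improved anode",
    "method for brewing coffee",
]
graph = build_semantic_graph(patents, threshold=0.5)
```

In this toy input the two battery texts share most of their words and become neighbors, while the coffee text stays isolated. A graph of this shape is what a GNN would then consume, with each node carrying its text features and a high-citation or low-citation label.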

2017


Adapting predominant and novel sense discovery algorithms for identifying corpus-specific sense differences
Binny Mathew | Suman Kalyan Maity | Pratip Sarkar | Animesh Mukherjee | Pawan Goyal
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing

Word senses are not static and may have temporal, spatial or corpus-specific scopes. Identifying such scopes could substantially benefit existing WSD systems. In this paper, while studying corpus-specific word senses, we adapt three existing predominant and novel-sense discovery algorithms to identify these corpus-specific senses. We use text data available in the form of millions of digitized books and newspaper archives as two different sources of corpora and propose automated methods to identify corpus-specific word senses at various time points. We conduct an extensive human judgement experiment to rigorously evaluate and compare the performance of these approaches. Post adaptation, the outputs of the three algorithms are in the same format and the accuracy results are also comparable, with roughly 45-60% of the reported corpus-specific senses being judged as genuine.

Co-authors

Pratip Sarkar 1

Brian Uzzi 1

Venues

EMNLP 1
TextGraphs 1
