Title: De-Identification of Sensitive Personal Data in Datasets Derived from IIT-CDIP
Authors: Stefan Larson, Nicole Cornehl Lima, Santiago Pedroza Diaz, Amogh Manoj Joshi, Siddharth Betala, Jamiu Tunde Suleiman, Yash Mathur, Kaushal Kumar Prajapati, Ramla Alakraa, Junjie Shen, Temi Okotore, Kevin Leach
Date: 2024-11
Type: conference publication
Published in: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Publisher: Association for Computational Linguistics
Location: Miami, Florida, USA
Pages: 21494-21505
Anthology ID: larson-etal-2024-de
DOI: 10.18653/v1/2024.emnlp-main.1198
URL: https://aclanthology.org/2024.emnlp-main.1198/