Bing Lim

2024

Recently, LLMs have significantly improved code generation, making it increasingly accessible to users. As a result, LLM-powered code generation applications have sprung up, vastly boosting user productivity. This paper mainly explores how to improve the efficiency and experience of users in formatting the document. Specifically, we propose an automatic document formatting method, Text-to-Format, which is driven by various prompting strategies. Text-to-Format takes the user’s formatting instructions and then generates code that can be run in Microsoft Word to format the content in a document. Further, to evaluate automatic document formatting approaches and advance the document formatting task, we built an evaluation specification including a high-quality dataset DocFormEval data, a code runtime environment, and evaluation metrics. Extensive experimental results on data reveal that the prompting strategy’s effect positively correlates with how much knowledge it introduces related to document formatting task. We believe the constructed DocFormEval data and the exploration about Text-to-Format can help developers build more intelligent tools for automatic document formatting, especially in offline scenarios, where the data privacy is the top priority.

Co-authors

Guan Weixin 1

Venues

emnlp1

Fix data