Comparing Supervised Machine Learning Techniques for Genre Analysis in Software Engineering Research Articles

Felipe Araújo de Britto; Thiago Castro Ferreira; Leonardo Pereira Nunes; Fernando Silva Parreiras

Abstract

Written communication is of utmost importance to the progress of scientific research. The speed of that progress, however, may be affected by the scarcity of reviewers available to referee the quality of research articles. In this context, automatic approaches that can query linguistic segments in written contributions by detecting the presence or absence of common rhetorical patterns have become a necessity. This paper compares supervised machine learning techniques for genre analysis in the Introduction sections of software engineering articles. A semi-supervised approach was used to augment the number of annotated sentences in SciSents (Available at: ANONYMOUS). Two supervised approaches, using SVM and logistic regression, were then evaluated by F-score on genre analysis in the corpus. A technique based on logistic regression and BERT was found to perform genre analysis highly satisfactorily, with an average F-score of 88.25 when retrieving patterns at an overall level.

Anthology ID: 2021.ranlp-1.8
Volume: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month: September
Year: 2021
Address: Held Online
Editors: Ruslan Mitkov, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd.
Pages: 63–72
URL: https://aclanthology.org/2021.ranlp-1.8/
Cite (ACL): Felipe Araújo de Britto, Thiago Castro Ferreira, Leonardo Pereira Nunes, and Fernando Silva Parreiras. 2021. Comparing Supervised Machine Learning Techniques for Genre Analysis in Software Engineering Research Articles. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 63–72, Held Online. INCOMA Ltd.
Cite (Informal): Comparing Supervised Machine Learning Techniques for Genre Analysis in Software Engineering Research Articles (Araújo de Britto et al., RANLP 2021)
PDF: https://aclanthology.org/2021.ranlp-1.8.pdf
