Forecasting Emerging Trends from Scientific Literature

Kartik Asooja, Georgeta Bordea, Gabriela Vulcu, Paul Buitelaar


Abstract
Text analysis methods for automatically identifying emerging technologies from scientific publications are gaining attention because of their socio-economic impact. Approaches so far have focused mainly on retrospective analysis, mapping the evolution of scientific topics over time. We propose regression-based approaches to predict future keyword distributions. The prediction is based on historical keyword data, which in our case is drawn from LREC conference proceedings. Because the LREC proceedings provide too few data points, we do not employ standard time series forecasting methods. We form a dataset by extracting keywords from the proceedings of previous years and quantifying their yearly relevance using tf-idf scores. The dataset additionally contains ranked lists of related keywords and experts for each keyword.
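The pipeline the abstract outlines (yearly tf-idf relevance per keyword, then a regression over the yearly scores) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: each year's proceedings are treated as a single document, the smoothed idf formula is one common variant, an ordinary least-squares line stands in for the regression models, and the function names and toy keywords are hypothetical.

```python
import math
from collections import Counter


def yearly_tfidf(keywords_by_year):
    """Return {year: {keyword: tf-idf}}, treating each year as one document.

    Assumption: smoothed idf = ln((1 + N) / (1 + df)) + 1, where N is the
    number of years and df is the number of years containing the keyword.
    """
    n_years = len(keywords_by_year)
    doc_freq = Counter()
    for keywords in keywords_by_year.values():
        doc_freq.update(set(keywords))

    scores = {}
    for year, keywords in keywords_by_year.items():
        counts = Counter(keywords)
        total = len(keywords)
        scores[year] = {
            kw: (count / total)
            * (math.log((1 + n_years) / (1 + doc_freq[kw])) + 1.0)
            for kw, count in counts.items()
        }
    return scores


def linear_forecast(years, values, target_year):
    """Fit y = a + b * x by ordinary least squares and evaluate at target_year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("need at least two distinct years to fit a line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * target_year


def forecast_keyword(scores, keyword, target_year):
    """Extrapolate one keyword's yearly tf-idf score to target_year.

    Years in which the keyword does not occur contribute a score of 0.0.
    """
    years = sorted(scores)
    values = [scores[year].get(keyword, 0.0) for year in years]
    return linear_forecast(years, values, target_year)


# Toy input invented for illustration only; it is not data from the paper.
toy_keywords_by_year = {
    2010: ["parsing", "parsing", "corpus"],
    2012: ["parsing", "corpus", "embeddings"],
    2014: ["corpus", "embeddings", "embeddings"],
}
toy_scores = yearly_tfidf(toy_keywords_by_year)
toy_prediction = forecast_keyword(toy_scores, "embeddings", 2016)
```

With only a handful of conference editions per keyword, a straight-line fit can extrapolate to negative scores for declining keywords, so a real system would likely clip or otherwise constrain the prediction.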
Anthology ID:
L16-1066
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
417–420
URL:
https://aclanthology.org/L16-1066
Cite (ACL):
Kartik Asooja, Georgeta Bordea, Gabriela Vulcu, and Paul Buitelaar. 2016. Forecasting Emerging Trends from Scientific Literature. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 417–420, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Forecasting Emerging Trends from Scientific Literature (Asooja et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1066.pdf