Vana Kalogeraki


First Story Detection using Entities and Relations
Nikolaos Panagiotou | Cem Akkaya | Kostas Tsioutsiouliklis | Vana Kalogeraki | Dimitrios Gunopulos
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

News portals, such as Yahoo News or Google News, collect large numbers of documents from a variety of sources on a daily basis. Only a small portion of these documents can be selected and displayed on the homepage. Thus, there is a strong preference for major, recent events. In this work, we propose a scalable and accurate First Story Detection (FSD) pipeline that identifies fresh news. In contrast to other FSD systems, our method relies on relation extraction methods that exploit entities and their relations. We evaluate our pipeline using two distinct datasets from Yahoo News and Google News. Experimental results demonstrate that our method improves over state-of-the-art systems on both datasets with constant space and time requirements.