Olena Medelyan
2010
SemEval-2010 Task 5 : Automatic Keyphrase Extraction from Scientific Articles
Su Nam Kim | Olena Medelyan | Min-Yen Kan | Timothy Baldwin
Proceedings of the 5th International Workshop on Semantic Evaluation
2009
Human-competitive tagging using automatic keyphrase extraction
Olena Medelyan | Eibe Frank | Ian H. Witten
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
2007
Computing Lexical Chains with Graph Clustering
Olena Medelyan
Proceedings of the ACL 2007 Student Research Workshop
2006
Language Specific and Topic Focused Web Crawling
Olena Medelyan | Stefan Schulz | Jan Paetzold | Michael Poprat | Kornél Markó
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
We describe an experiment on automatically collecting large language- and topic-specific corpora by using a focused Web crawler. Our crawler combines efficient crawling techniques with a common text classification tool. Given a sample corpus of medical documents, we automatically extract query phrases and then acquire seed URLs with a standard search engine. Starting from these seed URLs, the crawler builds a new large collection consisting only of documents that satisfy both the language and the topic model. The manual analysis of the acquired English and German medical corpora reveals the high accuracy of the crawler. However, there are significant differences between the two languages.
Co-authors
- Stefan Schulz 1
- Jan Paetzold 1
- Michael Poprat 1
- Kornél Markó 1
- Su Nam Kim 1