Fang Chen


2018

A Unified Neural Network Model for Geolocating Twitter Users
Mohammad Ebrahimi | Elaheh ShafieiBavani | Raymond Wong | Fang Chen
Proceedings of the 22nd Conference on Computational Natural Language Learning

Locations of social media users are important to many applications such as rapid disaster response, targeted advertisement, and news recommendation. However, many users do not share their exact geographical coordinates for reasons such as privacy concerns. The lack of explicit location information has motivated a growing body of research in recent years on automatic ways of determining a user’s primary location. In this paper, we propose a unified user geolocation method which relies on a fusion of neural networks. Our joint model incorporates different types of available information, including tweet text, user network, and metadata, to predict users’ locations. Moreover, we utilize a bidirectional LSTM network augmented with an attention mechanism to identify the most location-indicative words in the textual content of tweets. The experiments demonstrate that our approach achieves state-of-the-art performance on two Twitter benchmark geolocation datasets. We also conduct an ablation study to evaluate the contribution of each type of information to user geolocation performance.
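
As a rough illustration of how an attention mechanism can single out location-indicative words, the sketch below applies dot-product attention to per-token hidden states. It is not the authors' model: the hidden states stand in for BiLSTM outputs and are toy values, and the `context` vector and the function name `attention_pool` are assumptions made for this example.

```python
import math

def attention_pool(hidden_states, context):
    """Return (weights, pooled) for dot-product attention over token states."""
    # Score each token's hidden state against the context vector.
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    # Softmax, subtracting the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of hidden states is the pooled tweet representation.
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled

# Toy stand-ins for BiLSTM outputs (assumed values, one 2-d state per token).
tokens = ["lunch", "in", "sydney", "today"]
states = [[0.1, 0.2], [0.0, 0.1], [0.9, 0.8], [0.2, 0.1]]
context = [2.0, 2.0]  # hypothetical context vector; learned in a real model

weights, pooled = attention_pool(states, context)
top_token = tokens[weights.index(max(weights))]
```

In a trained model the context vector and hidden states would be learned jointly, so tokens with high weights are the ones the network has found useful for predicting location.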

Summarization Evaluation in the Absence of Human Model Summaries Using the Compositionality of Word Embeddings
Elaheh ShafieiBavani | Mohammad Ebrahimi | Raymond Wong | Fang Chen
Proceedings of the 27th International Conference on Computational Linguistics

We present a new summary evaluation approach that does not require human model summaries. Our approach exploits the compositional capabilities of corpus-based and lexical resource-based word embeddings to develop features reflecting the coverage, diversity, informativeness, and coherence of summaries. The features are then used to train a learning model for predicting summary content quality in the absence of gold models. We evaluate the proposed metric in replicating the human-assigned scores for summarization systems and summaries on data from the query-focused and update summarization tasks in TAC 2008 and 2009. The results show that our feature combination provides reliable estimates of summary content quality when model summaries are not available.
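
One simple way to compose word embeddings into a reference-free feature is to average them and compare the summary vector with the source vector. The sketch below does this for a coverage-style score. Averaging and cosine similarity are assumptions chosen for illustration, not necessarily the composition or features used in the paper, and the `EMB` table holds made-up toy vectors.

```python
import math

# Hypothetical toy embeddings; a real system would load pretrained vectors.
EMB = {
    "storm": [0.9, 0.1, 0.0],
    "flood": [0.8, 0.2, 0.1],
    "city":  [0.1, 0.9, 0.1],
    "music": [0.0, 0.1, 0.9],
}

def compose(words):
    """Average the embeddings of known words; None if no word is known."""
    vecs = [EMB[w] for w in words if w in EMB]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity of two vectors, 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def coverage_feature(source_words, summary_words):
    """Similarity between composed source and summary vectors."""
    src, summ = compose(source_words), compose(summary_words)
    if src is None or summ is None:
        return 0.0
    return cosine(src, summ)

source = ["storm", "flood", "city"]
on_topic = coverage_feature(source, ["flood", "city"])
off_topic = coverage_feature(source, ["music"])
```

A summary that stays on the source's topic scores higher than an unrelated one, and no human model summary is involved in the computation.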

A Graph-theoretic Summary Evaluation for ROUGE
Elaheh ShafieiBavani | Mohammad Ebrahimi | Raymond Wong | Fang Chen
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

ROUGE is one of the first and most widely used evaluation metrics for text summarization. However, its assessment relies merely on surface similarities between peer and model summaries. Consequently, ROUGE is unable to fairly evaluate summaries that include lexical variations and paraphrasing. We propose a graph-based approach adopted into ROUGE to evaluate summaries based on both lexical and semantic similarities. Experimental results on the TAC AESOP datasets show that exploiting the lexico-semantic similarity of the words used in summaries significantly helps ROUGE correlate better with human judgments.
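
The limitation described here can be seen in a small sketch that computes unigram recall in the style of ROUGE-1, first on surface forms and then after mapping words to shared concepts. The concept mapping is a hand-written lookup table standing in for the paper's graph-based similarity, which it does not reproduce; `CONCEPT`, `to_concept`, and `rouge1_recall` are names invented for this example.

```python
from collections import Counter

def rouge1_recall(peer, model, normalize=lambda w: w):
    """Clipped unigram recall of the peer summary against the model summary."""
    peer_counts = Counter(normalize(w) for w in peer.lower().split())
    model_counts = Counter(normalize(w) for w in model.lower().split())
    # Each model unigram is matched at most as often as it occurs in the peer.
    overlap = sum(min(count, peer_counts[w]) for w, count in model_counts.items())
    total = sum(model_counts.values())
    return overlap / total if total else 0.0

# Hypothetical synonym table; a real system would use a lexical resource.
CONCEPT = {"automobile": "car", "purchased": "bought"}

def to_concept(word):
    """Map a word to its canonical concept, or leave it unchanged."""
    return CONCEPT.get(word, word)

model = "the man bought a car"
peer = "the man purchased an automobile"

surface = rouge1_recall(peer, model)                         # exact matches only
semantic = rouge1_recall(peer, model, normalize=to_concept)  # synonyms count
```

On this pair, surface matching credits only "the" and "man", while the concept-aware variant also credits the paraphrased verb and noun.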

2016

Appraising UMLS Coverage for Summarizing Medical Evidence
Elaheh ShafieiBavani | Mohammad Ebrahimi | Raymond Wong | Fang Chen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

When making clinical decisions, practitioners need to rely on the most relevant evidence available. However, accessing a vast body of medical evidence and coping with information overload can be challenging and time-consuming. This paper proposes an effective summarizer for medical evidence that utilizes both UMLS and WordNet. Given a clinical query and a set of relevant abstracts, our aim is to generate a fluent, well-organized, and compact summary that answers the query. Analysis via ROUGE metrics shows that using WordNet as a general-purpose lexicon helps to capture the concepts not covered by the UMLS Metathesaurus, and hence significantly increases performance. The effectiveness of our proposed approach is demonstrated by a set of experiments over a specialized evidence-based medicine (EBM) corpus that has been gathered and annotated for the purpose of biomedical text summarization.