2021
Persian SemCor: A Bag of Word Sense Annotated Corpus for the Persian Language
Hossein Rouhizadeh
|
Mehrnoush Shamsfard
|
Mahdi Dehghan
|
Masoud Rouhizadeh
Proceedings of the 11th Global Wordnet Conference
Supervised approaches usually achieve the best performance on the Word Sense Disambiguation (WSD) problem. However, the unavailability of large sense-annotated corpora for many low-resource languages makes these approaches inapplicable to them in practice. In this paper, we mitigate this issue for the Persian language by proposing a fully automatic approach for obtaining Persian SemCor (PerSemCor), a Persian Bag-of-Word (BoW) sense-annotated corpus. We evaluated PerSemCor both intrinsically and extrinsically and showed that it can be effectively used as a training set for Persian supervised WSD systems. To encourage future research on Persian Word Sense Disambiguation, we release PerSemCor at http://nlp.sbu.ac.ir.
2019
Knowledge-Based Word Sense Disambiguation with Distributional Semantic Expansion
Hossein Rouhizadeh
|
Mehrnoush Shamsfard
|
Masoud Rouhizadeh
Proceedings of the 2019 Workshop on Widening NLP
In this paper, we present a WSD system that uses LDA topics for semantic expansion of document words. Our system also uses sense frequency information from SemCor to give higher priority to the more probable senses.
2018
Identifying Locus of Control in Social Media Language
Masoud Rouhizadeh
|
Kokil Jaidka
|
Laura Smith
|
H. Andrew Schwartz
|
Anneke Buffone
|
Lyle Ungar
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Individuals express their locus of control, or “control”, in their language when they identify whether or not they are in control of their circumstances. Although control is a core concept underlying rhetorical style, it is not clear whether control is expressed by how or by what authors write. We explore the roles of syntax and semantics in expressing users’ sense of control (i.e., being “controlled by” or “in control of” their circumstances) in a corpus of annotated Facebook posts. We present rich insights into these linguistic aspects and find that while the language signaling control is easy to identify, it is more challenging to label it as internally or externally controlled, with lexical features outperforming syntactic features at the task. Our findings could have important implications for studying self-expression in social media.
2017
Assessing Objective Recommendation Quality through Political Forecasting
H. Andrew Schwartz
|
Masoud Rouhizadeh
|
Michael Bishop
|
Philip Tetlock
|
Barbara Mellers
|
Lyle Ungar
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Recommendations are often rated for their subjective quality, but few researchers have studied comment quality in terms of objective utility. We explore recommendation quality assessment with respect to both subjective (i.e., users’ ratings) and objective (i.e., did it influence? did it improve decisions?) metrics in a massive online geopolitical forecasting system, ultimately comparing linguistic characteristics of each quality metric. Using a variety of features, we predict all types of quality with better accuracy than the simple yet strong baseline of comment length. Looking at the most predictive content illustrates rater biases; for example, forecasters are subjectively biased in favor of comments mentioning business transactions or dealings as well as material things, even though such comments do not in fact prove any more useful objectively. Additionally, more complex sentence constructions, as evidenced by subordinate conjunctions, are characteristic of comments leading to objective improvements in forecasting.
Detecting Personal Medication Intake in Twitter: An Annotated Corpus and Baseline Classification System
Ari Klein
|
Abeed Sarker
|
Masoud Rouhizadeh
|
Karen O’Connor
|
Graciela Gonzalez
BioNLP 2017
Social media sites (e.g., Twitter) have been used for surveillance of drug safety at the population level, but studies that focus on the effects of medications on specific sets of individuals have had to rely on other sources of data. Mining social media data for this information would require the ability to distinguish indications of personal medication intake in these media. Towards that end, this paper presents an annotated corpus that can be used to train machine learning systems to determine whether a tweet that mentions a medication indicates that the individual posting has taken that medication at a specific time. To demonstrate the utility of the corpus as a training set, we present baseline results of supervised classification.
2016
Using Syntactic and Semantic Context to Explore Psychodemographic Differences in Self-reference
Masoud Rouhizadeh
|
Lyle Ungar
|
Anneke Buffone
|
H. Andrew Schwartz
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
2015
Similarity Measures for Quantifying Restrictive and Repetitive Behavior in Conversations of Autistic Children
Masoud Rouhizadeh
|
Richard Sproat
|
Jan van Santen
Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality
Measuring idiosyncratic interests in children with autism
Masoud Rouhizadeh
|
Emily Prud’hommeaux
|
Jan van Santen
|
Richard Sproat
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
2014
Detecting linguistic idiosyncratic interests in autism using distributional semantic models
Masoud Rouhizadeh
|
Emily Prud’hommeaux
|
Jan van Santen
|
Richard Sproat
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality
2013
Distributional semantic models for the evaluation of disordered language
Masoud Rouhizadeh
|
Emily Prud’hommeaux
|
Brian Roark
|
Jan van Santen
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
2012
Annotation Tools and Knowledge Representation for a Text-To-Scene System
Bob Coyne
|
Alex Klapheke
|
Masoud Rouhizadeh
|
Richard Sproat
|
Daniel Bauer
Proceedings of COLING 2012
2011
Collecting Semantic Data from Mechanical Turk for a Lexical Knowledge Resource in a Text to Picture Generating System
Masoud Rouhizadeh
|
Margit Bowler
|
Richard Sproat
|
Bob Coyne
Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)