<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W17">
  <paper id="5800">
    <title>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</title>
    <editor>Jitendra Jonnagaddala</editor>
    <editor>Hong-Jie Dai</editor>
    <editor>Yung-Chun Chang</editor>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://www.aclweb.org/anthology/W17-58</url>
    <bibtype>book</bibtype>
    <bibkey>DDDSM:2017</bibkey>
  </paper>

  <paper id="5801">
    <title>Automatic detection of stance towards vaccination in online discussion forums</title>
    <author><first>Maria</first><last>Skeppstedt</last></author>
    <author><first>Andreas</first><last>Kerren</last></author>
    <author><first>Manfred</first><last>Stede</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;8</pages>
    <url>http://www.aclweb.org/anthology/W17-5801</url>
    <abstract>A classifier for automatic detection of stance towards vaccination in online
	forums was trained and evaluated. Debate posts from six discussion threads on
	the British parental website Mumsnet were manually annotated for stance
	'against' or 'for' vaccination, or as 'undecided'.  A support vector machine,
	trained to detect the three classes, achieved a macro F-score of 0.44, while a
	macro F-score of 0.62 was obtained by the same type of classifier on the binary
	classification task of distinguishing stance 'against' vaccination from stance
	'for' vaccination. These results show that vaccine stance detection in online
	forums is a difficult task, at least for the type of model investigated and for
	the relatively small training corpus that was used. Future work will therefore
	include an expansion of the training data and an evaluation of other types of
	classifiers and features.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>skeppstedt-kerren-stede:2017:DDDSM</bibkey>
  </paper>

  <paper id="5802">
    <title>Analysing the Causes of Depressed Mood from Depression Vulnerable Individuals</title>
    <author><first>Noor Fazilla</first><last>Abd Yusof</last></author>
    <author><first>Chenghua</first><last>Lin</last></author>
    <author><first>Frank</first><last>Guerin</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>9&#8211;17</pages>
    <url>http://www.aclweb.org/anthology/W17-5802</url>
    <abstract>We develop a computational model to
	discover the potential causes of depression
	by analysing the topics in a user-generated
	text. We show the most prominent
	causes, and how these causes evolve
	over time. Also, we highlight the differences
	in causes between students with low
	and high neuroticism. Our studies demonstrate
	that the topics reveal valuable clues
	about the causes contributing to depressed
	mood. Identifying causes can have a significant
	impact on improving the quality of
	depression care, thereby providing greater
	insights into a patient’s state for pertinent
	treatment recommendations. Hence, this
	study significantly expands the ability to
	discover the potential factors that trigger
	depression, making it possible to increase
	the efficiency of depression treatment.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>abdyusof-lin-guerin:2017:DDDSM</bibkey>
  </paper>

  <paper id="5803">
    <title>Multivariate Linear Regression of Symptoms-related Tweets for Infectious Gastroenteritis Scale Estimation</title>
    <author><first>Ryo</first><last>Takeuchi</last></author>
    <author><first>Hayate</first><last>Iso</last></author>
    <author><first>Kaoru</first><last>Ito</last></author>
    <author><first>Shoko</first><last>Wakamiya</last></author>
    <author><first>Eiji</first><last>Aramaki</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>18&#8211;25</pages>
    <url>http://www.aclweb.org/anthology/W17-5803</url>
    <abstract>To date, various Twitter-based event detection systems have been proposed.
	Most of their targets, however, share common characteristics. They are seasonal
	or global events such as earthquakes and flu pandemics.
	In contrast, this study targets unseasonal and local disease events.
	Our system investigates the frequencies of disease-related words such as
	"nausea", "chill", and "diarrhea" and estimates the number of patients using
	regression of these word frequencies.
	Experiments conducted using 47 Japanese areas from January 2017 to April 2017
	revealed that the detection of small and unseasonal events is extremely
	difficult (overall performance: 0.13).
	However, we found that the event scale and the detection performance show high
	correlation in specific cases (in phases where the number of patients is
	increasing or decreasing).
	The results also suggest that when 150 or more patients appear in a
	high-population area, we can expect our social sensors to detect the outbreak.
	Based on these results, we can infer that social sensors can reliably detect
	unseasonal and local disease events under certain conditions, just as they can
	for seasonal or global events.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>takeuchi-EtAl:2017:DDDSM</bibkey>
  </paper>

  <paper id="5804">
    <title>Incorporating Dependency Trees Improve Identification of Pregnant Women on Social Media Platforms</title>
    <author><first>Yi-Jie</first><last>Huang</last></author>
    <author><first>Chu Hsien</first><last>Su</last></author>
    <author><first>Yi-Chun</first><last>Chang</last></author>
    <author><first>Tseng-Hsin</first><last>Ting</last></author>
    <author><first>Tzu-Yuan</first><last>Fu</last></author>
    <author><first>Rou-Min</first><last>Wang</last></author>
    <author><first>Hong-Jie</first><last>Dai</last></author>
    <author><first>Yung-Chun</first><last>Chang</last></author>
    <author><first>Jitendra</first><last>Jonnagaddala</last></author>
    <author><first>Wen-Lian</first><last>Hsu</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>26&#8211;32</pages>
    <url>http://www.aclweb.org/anthology/W17-5804</url>
    <abstract>The increasing popularity of social media leads users to share enormous
	amounts of information on the internet. This information has various
	applications; for example, it can be used to develop models to understand or
	predict user behavior on social media platforms. A few online retailers, for
	instance, have studied shopping patterns to predict a shopper’s pregnancy
	stage. Another interesting application is to use social media platforms to
	analyze users’ health-related information. In this study, we developed a new
	corpus from the popular social media platform Twitter and a tree kernel-based
	model to classify tweets conveying pregnancy-related information using this
	corpus. The developed pregnancy classification model achieved an accuracy of
	0.847 and an F-score of 0.565. In the future, we would like to improve this
	corpus by reducing noise such as retweets.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>huang-EtAl:2017:DDDSM</bibkey>
  </paper>

  <paper id="5805">
    <title>Using a Recurrent Neural Network Model for Classification of Tweets Conveyed Influenza-related Information</title>
    <author><first>Chen-Kai</first><last>Wang</last></author>
    <author><first>Onkar</first><last>Singh</last></author>
    <author><first>Zhao-Li</first><last>Tang</last></author>
    <author><first>Hong-Jie</first><last>Dai</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>33&#8211;38</pages>
    <url>http://www.aclweb.org/anthology/W17-5805</url>
    <abstract>Traditional disease surveillance systems depend on outpatient reporting and
	virological test results released by hospitals. These data contain valid and
	accurate information about emerging outbreaks, but they are often not timely.
	In recent years, the exponentially growing number of users connected to social
	media has provided immense knowledge about epidemics through the sharing of
	related information. Social media can now flag more immediate concerns related
	to outbreaks in real time. In this paper, we apply the long short-term memory
	recurrent neural network (RNN) architecture to classify tweets conveying
	influenza-related information and compare its performance with baseline
	algorithms including support vector machine (SVM), decision tree, naive Bayes,
	simple logistics, and naive Bayes multinomial. The developed RNN model achieved
	an F-score of 0.845 on the MedWeb task test set, which outperforms the F-score
	of SVM without the synthetic minority oversampling technique by 0.08. The
	F-score of the RNN model is within 1% of the highest score achieved by SVM with
	the oversampling technique.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wang-EtAl:2017:DDDSM</bibkey>
  </paper>

  <paper id="5806">
    <title>ZikaHack 2016: A digital disease detection competition</title>
    <author><first>Dillon C</first><last>Adam</last></author>
    <author><first>Jitendra</first><last>Jonnagaddala</last></author>
    <author><first>Daniel</first><last>Han-Chen</last></author>
    <author><first>Sean</first><last>Batongbacal</last></author>
    <author><first>Luan</first><last>Almeida</last></author>
    <author><first>Jing Z</first><last>Zhu</last></author>
    <author><first>Jenny J</first><last>Yang</last></author>
    <author><first>Jumail M</first><last>Mundekkat</last></author>
    <author><first>Steven</first><last>Badman</last></author>
    <author><first>Abrar</first><last>Chughtai</last></author>
    <author><first>C Raina</first><last>MacIntyre</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>39&#8211;46</pages>
    <url>http://www.aclweb.org/anthology/W17-5806</url>
    <abstract>Effective response to infectious disease
	outbreaks relies on the rapid and
	early detection of those outbreaks. Unvalidated,
	yet timely and openly available
	digital information can be used for
	the early detection of outbreaks. Public
	health surveillance authorities can exploit
	these early warnings to plan and
	co-ordinate rapid surveillance and
	emergency response programs. In
	2016, a digital disease detection competition
	named ZikaHack was
	launched. The objective of the competition
	was for multidisciplinary teams
	to design, develop and demonstrate innovative
	digital disease detection solutions
	to retrospectively detect the 2015-16
	Brazilian Zika virus outbreak earlier
	than traditional surveillance methods.
	In this paper, an overview of the ZikaHack
	competition is provided. The
	challenges and lessons learned in organizing
	this competition are also discussed
	for use by other researchers interested
	in organizing similar competitions.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>adam-EtAl:2017:DDDSM</bibkey>
  </paper>

  <paper id="5807">
    <title>A Method to Generate a Machine-Labeled Data for Biomedical Named Entity Recognition with Various Sub-Domains</title>
    <author><first>Juae</first><last>Kim</last></author>
    <author><first>Sunjae</first><last>Kwon</last></author>
    <author><first>Youngjoong</first><last>Ko</last></author>
    <author><first>Jungyun</first><last>Seo</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>47&#8211;51</pages>
    <url>http://www.aclweb.org/anthology/W17-5807</url>
    <abstract>Biomedical Named Entity (NE) recognition is a core technique for various tasks
	in the biomedical domain. In previous studies, machine learning algorithms have
	shown better performance than dictionary-based and rule-based approaches
	because there are too many terminological variations of biomedical NEs and new
	biomedical NEs are constantly generated. To achieve high performance with a
	machine learning algorithm, good-quality corpora are required. However, it is
	difficult to obtain good-quality corpora because annotating a biomedical
	corpus for machine learning is extremely time-consuming and costly. In
	addition, most previous corpora are insufficient for high-level tasks because
	they cannot cover various domains. Therefore, we propose a method for
	generating a large amount of machine-labeled data that covers various domains.
	We first generate initial machine-labeled data by using a chunker and MetaMap.
	The chunker is developed with manually annotated data to extract only
	biomedical NEs. MetaMap is used to annotate the category of each biomedical
	NE. Then we apply a self-training approach to bootstrap the performance of the
	initial machine-labeled data. In our experiments, the biomedical NE
	recognition system trained with our proposed machine-labeled data achieves
	much higher performance. As a result, our system outperforms a biomedical NE
	recognition system that uses MetaMap only, with an improvement of 26.03
	percentage points in F1-score.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kim-EtAl:2017:DDDSM</bibkey>
  </paper>

  <paper id="5808">
    <title>Enhancing Drug-Drug Interaction Classification with Corpus-level Feature and Classifier Ensemble</title>
    <author><first>Jing Cyun</first><last>Tu</last></author>
    <author><first>Po-Ting</first><last>Lai</last></author>
    <author><first>Richard Tzong-Han</first><last>Tsai</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>52&#8211;56</pages>
    <url>http://www.aclweb.org/anthology/W17-5808</url>
    <abstract>The study of drug-drug interaction (DDI) is important in drug discovery.
	Both PubMed and DrugBank are rich resources for retrieving DDI information,
	which is usually represented in plain text. Automatically extracting DDI pairs
	from text improves the quality of drug discovery. In this paper, we present a
	study that focuses on DDI classification. We normalized the drug names and
	developed both sentence-level and corpus-level features for DDI classification.
	A classifier ensemble approach is used to address the imbalanced DDI label
	problem. Our approach achieved an F-score of 65.4% on the SemEval 2013 DDI
	test set. The experimental results also show the effects of the proposed
	corpus-level features in the DDI task.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>tu-lai-tsai:2017:DDDSM</bibkey>
  </paper>

  <paper id="5809">
    <title>Chemical-Induced Disease Detection Using Invariance-based Pattern Learning Model</title>
    <author><first>Neha</first><last>Warikoo</last></author>
    <author><first>Yung-Chun</first><last>Chang</last></author>
    <author><first>Wen-Lian</first><last>Hsu</last></author>
    <booktitle>Proceedings of the International Workshop on Digital Disease Detection using Social Media 2017 (DDDSM-2017)</booktitle>
    <month>November</month>
    <year>2017</year>
    <address>Taipei, Taiwan</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>57&#8211;64</pages>
    <url>http://www.aclweb.org/anthology/W17-5809</url>
    <abstract>In this work, we introduce a novel feature engineering approach named
	"algebraic invariance" to identify discriminative patterns for learning
	relation pair features for the chemical-disease relation (CDR) task of
	BioCreative V. Our method exploits the existing structural similarity of the
	key concepts of relation descriptions from the CDR corpus to generate robust
	linguistic patterns for SVM tree kernel-based learning. Preprocessing of the
	training data classifies the entity pairs as either related or unrelated to
	build instance types for both inter-sentential and intra-sentential scenarios.
	An invariant function is proposed to process and optimally cluster similar
	patterns for both positive and negative instances. The learning model for CDR
	pairs is based on the SVM tree kernel approach, which generates feature trees
	and vectors and is modeled on suitable invariance-based patterns, bringing
	brevity, precision and context to the identifier features. Results demonstrate
	that our method outperformed other compared approaches, achieved a high recall
	rate of 85.08%, and averaged an F1-score of 54.34% without the use of any
	additional knowledge bases.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>warikoo-chang-hsu:2017:DDDSM</bibkey>
  </paper>

</volume>

