Jenq-Haur Wang


2025

The rapid development of social networks, coupled with the prevalence of Generative AI (GAI) in society today, has led to a sharp increase in fake tweets and fake news on social media platforms, which in turn has prompted more in-depth research on fake news detection. At present, there are two mainstream approaches to detecting fake news: content-based detection and propagation/network-based detection. Early content-based methods take an article's content as input and use a similarity algorithm to identify fake news. These methods were later improved by using single-modality features, such as images or text, as input features. However, existing research shows that single-modality features alone cannot identify fake news efficiently. The most recent methods therefore fuse multimodal features, such as images and text, and feed them into a model for classification. Propagation/network-based methods instead build graphs or decision trees from social networks and use them as features for classification. In this study, we propose a multimodal fake news detection framework that combines these two mainstream approaches. The framework uses not only images and text as input features but also social metadata features such as comments. It extracts the comments and builds them into a tree structure from which features are obtained. Furthermore, we propose different feature fusion methods that achieve better results than existing methods. Finally, we conducted ablation experiments and showed that each module contributes to the framework's overall performance, which demonstrates the effectiveness of the proposed approach.
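The comment-tree step described above can be illustrated with a minimal sketch. The abstract does not specify how the tree is encoded or how features are fused, so everything here is an assumption made for illustration: the `(comment_id, parent_id)` input format, the three structural features (size, depth, maximum branching), and fusion by plain concatenation are hypothetical stand-ins, not the authors' method.

```python
from collections import defaultdict


def build_comment_tree(comments):
    """Map each parent id to the ids of its direct replies.

    `comments` is an iterable of (comment_id, parent_id) pairs; this input
    format is an assumption, not taken from the paper.
    """
    children = defaultdict(list)
    for comment_id, parent_id in comments:
        children[parent_id].append(comment_id)
    return children


def tree_features(children, root_id):
    """Return (node count, max depth, max branching) for the tree under root_id."""
    size, max_depth, max_width = 0, 0, 0
    stack = [(root_id, 0)]
    while stack:
        node, depth = stack.pop()
        size += 1
        max_depth = max(max_depth, depth)
        replies = children.get(node, [])
        max_width = max(max_width, len(replies))
        stack.extend((child, depth + 1) for child in replies)
    return size, max_depth, max_width


def fuse(text_vec, image_vec, tree_vec):
    """Fuse modalities by concatenation (one simple option among many)."""
    return list(text_vec) + list(image_vec) + list(tree_vec)


# Hypothetical post with four comments; "post" is the root of the tree.
comments = [("c1", "post"), ("c2", "post"), ("c3", "c1"), ("c4", "c3")]
tree = build_comment_tree(comments)
features = tree_features(tree, "post")
fused = fuse([0.1, 0.2], [0.3], features)
```

Concatenation is chosen here only because it is the simplest fusion to show; the abstract states that several fusion methods were compared but does not name them.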
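This paper proposes CWSMN (Capture Writing Style Multi-Graph Network), a graph neural network-based method for early fake news detection. By capturing writing style, it overcomes the limitations that conventional semantic-content and propagation-feature methods face under label scarcity and poor cross-domain generalization. CWSMN combines stylistic analysis, semantic embeddings, and multi-graph fusion: it uses a Bi-GRU for contextual initialization, a GAT for attention-guided graph aggregation, and LDA to construct a topic graph, and it outputs predictions with a lightweight feed-forward classifier. Experiments on multiple datasets show that CWSMN consistently outperforms strong baselines such as BERT, ALBERT, and GraphSAINT. The gains are especially pronounced in the Source-CV setting with unseen sources, which demonstrates robust generalization in low-resource and cross-domain environments and enables early detection that does not rely on propagation. The experimental results confirm that the method achieves effective early detection even when samples are scarce and sources are unseen.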

2024

2021

Conventional opinion polls are usually conducted via questionnaires or phone interviews, which are time-consuming and error-prone. With the advances in social networking platforms, it is easier for the general public to express their opinions on popular topics. Given the huge amount of user opinions, it would be useful to automatically collect and aggregate the overall stance on a specific topic. In this paper, we propose to predict topical stances from social media by concept expansion, sentiment classification, and stance aggregation based on word embeddings. For concept expansion of a given topic, related posts are collected from social media and clustered by word embeddings. Then, major keywords are extracted by word segmentation and named entity recognition methods. For sentiment classification and aggregation, machine learning methods are used to train a sentiment lexicon with word embeddings. Then, the sentiment scores from user-centric and post-centric views are aggregated as the total stance on the topic. In the experiments, we evaluated the performance of our proposed approach using social media data from online forums. The experimental results for the 2016 Taiwan Presidential Election showed that our proposed method can effectively expand keywords and aggregate topical stances from the public for accurate prediction of election results. The best performance is 0.52% in terms of mean absolute error (MAE). Further investigation is needed to evaluate the performance of the proposed method at larger scales.
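The difference between the post-centric and user-centric views, and the MAE metric used for evaluation, can be sketched as follows. The abstract does not give the aggregation formulas, so the definitions below are assumptions made for illustration: the post-centric view averages over all posts, the user-centric view averages each user's posts first so that prolific users do not dominate, and the sample data is hypothetical.

```python
from collections import defaultdict


def post_centric(posts):
    """Average sentiment over all posts; posts is a list of (user, score)."""
    return sum(score for _, score in posts) / len(posts)


def user_centric(posts):
    """Average each user's posts first, then average across users."""
    by_user = defaultdict(list)
    for user, score in posts:
        by_user[user].append(score)
    user_means = [sum(scores) / len(scores) for scores in by_user.values()]
    return sum(user_means) / len(user_means)


def mean_absolute_error(predicted, actual):
    """Mean of |predicted - actual| over paired values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical data: user "a" posts three positive messages, user "b" one negative.
posts = [("a", 1.0), ("a", 1.0), ("a", 1.0), ("b", -1.0)]
post_view = post_centric(posts)   # weighted toward the prolific user
user_view = user_centric(posts)   # each user counts once

# Hypothetical predicted vs. actual vote shares for two candidates.
mae = mean_absolute_error([0.5, 0.3], [0.4, 0.4])
```

In this toy data the two views disagree (0.5 versus 0.0), which is the kind of gap that motivates aggregating both rather than relying on one.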

2020

2019

2018

2013