Nikos Salamanos, Pantelitsa Leonidou, Nikolaos Laoutaris, Michael Sirivianos, Maria Aspri, Marius Paraschiv. “HyperGraphDis: Leveraging Hypergraphs for Contextual and Social-Based Disinformation Detection.” 18th International AAAI Conference on Web and Social Media (ICWSM’24) (In press). Preprint available at http://arxiv.org/abs/2310.01113
In the paper “HyperGraphDis: Leveraging Hypergraphs for Contextual and Social-Based Disinformation Detection,” we introduce HyperGraphDis, a novel approach for detecting disinformation on Twitter that improves both detection accuracy and computational efficiency. The method employs a hypergraph-based representation to capture (i) the intricate social structures arising from retweet cascades, (ii) relational features among users, and (iii) semantic and topical nuances. In the initial phase of hypergraph construction, we apply a graph partitioning algorithm to the Twitter social network, where nodes represent users and edges represent social connections. Once the user clusters are identified, we replace each user in a cluster with the list of Twitter cascades in which they have participated. This transformation reshapes the problem space: it turns disinformation classification on Twitter from an intricate, multi-variable problem into a more straightforward node classification problem within the hypergraph. Below, we present a toy example of the hypergraph construction.
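The construction steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation from the paper. It assumes that cascades act as the hypergraph nodes and that each user cluster becomes one hyperedge, a mapping the summary implies but does not state outright. The partition is hard-coded in place of a real graph partitioning algorithm, and all names (`user_cluster`, `cascades`, `build_hyperedges`, `incidence_matrix`) and data are hypothetical.

```python
from collections import defaultdict

# Hypothetical output of a graph-partitioning step: user -> cluster id.
user_cluster = {
    "u1": 0, "u2": 0, "u3": 0,
    "u4": 1, "u5": 1,
    "u6": 2,
}

# Hypothetical retweet cascades: cascade id -> users who took part in it.
cascades = {
    "c1": {"u1", "u2"},
    "c2": {"u2", "u4"},
    "c3": {"u4", "u5"},
    "c4": {"u3", "u6"},
}


def build_hyperedges(user_cluster, cascades):
    """Replace each user in a cluster by the cascades they participated in.

    Returns a dict: cluster id -> set of cascade ids (one hyperedge per cluster).
    """
    # Invert the cascade map: user -> cascades the user participated in.
    user_cascades = defaultdict(set)
    for cascade_id, users in cascades.items():
        for user in users:
            user_cascades[user].add(cascade_id)

    # Union the cascade lists of all users that share a cluster.
    hyperedges = defaultdict(set)
    for user, cluster_id in user_cluster.items():
        hyperedges[cluster_id] |= user_cascades[user]
    return dict(hyperedges)


def incidence_matrix(hyperedges, cascade_ids):
    """Rows are cascades (assumed nodes), columns are clusters (assumed hyperedges)."""
    edge_ids = sorted(hyperedges)
    return [[1 if c in hyperedges[e] else 0 for e in edge_ids] for c in cascade_ids]


hyperedges = build_hyperedges(user_cluster, cascades)
H = incidence_matrix(hyperedges, sorted(cascades))
for cascade_id, row in zip(sorted(cascades), H):
    print(cascade_id, row)
```

In this toy data, cascade `c2` ends up in two hyperedges because its participants `u2` and `u4` fall into different clusters. Under the stated assumption, classifying a cascade as disinformation or not then amounts to labelling a row of this incidence structure, which is the node classification view described above.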
We evaluate our approach on four datasets: (i) an extensive dataset on the 2016 U.S. presidential election; (ii) a substantial collection of tweets related to the COVID-19 pandemic; (iii) the Health Release dataset; and (iv) the Health Story dataset. HyperGraphDis performs strongly. On MM-COVID, it achieves an F1 score of around 89.5%, outperforming the Meta-graph method by approximately 4% and Cluster-GCN by 33%. Additionally, HyperGraphDis outperforms HyperGraph for Fake News Detection (HGFND) by 6.4%.
Model training is also considerably faster: HyperGraphDis trains 2.3 to 7.6 times faster than the second-best method on each dataset (ranked by F1 score).
Funding from:
MedDMO (Grant Agreement no. 101083756)
INCOGNITO (Grant Agreement no. 824015)