Unsupervised Rumor Detection on Twitter using Topic Homogeneity

paper, specified "short paper"
Authorship
  1. David Lawrence Shepard

    University of California, Los Angeles (UCLA)

  2. Takako Hashimoto

    Chiba University of Commerce

  3. Kilho Shin

    Hyogo University

  4. Takeaki Uno

    National Institute of Informatics

  5. Tetsuji Kuboyama

    Gakushuin University

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

Rumor detection on Twitter is a well-studied topic. Our prior work showed that it is possible to identify rumors (stories about events that have no basis in fact) on Twitter by observing the relationship between the number of tweets about an event and the number of topics within those tweets (Shepard et al., 2019). In that paper, we presented case studies, but lacked a quantitative measure of the difference between rumors and facts. This paper builds on that work by presenting a formula for making that distinction: given a dataset of tweet texts with timestamps, our method automatically discovers rumors in those tweets. In contrast to most recent work on rumor detection, our method is completely unsupervised. As a case study, we demonstrate the method's effectiveness on a sample dataset of tweets from the 3/11 earthquake in Japan.

Background

At least seven new methods have been proposed for rumor detection on Twitter in the past two years, including (Wang et al., 2017; Singh et al., 2017; Chen et al., 2017; Yoshida and Aritsugi, 2019; Ma et al., 2018; Poddar et al., 2018; Lin et al., 2019). One major constraint of these approaches, however, is that all of them require supervised learning. Our major intervention is that our approach is unsupervised and language-independent: our algorithm assumes only that the text has been tokenized into words.

Method

Our approach results from the observation that little new information emerges about rumors. For example, if a chemical spill is announced on Twitter, people often retweet the announcement. People are also likely to be affected in some way, such as seeing emergency vehicles or evacuating the area, so new information will be created as the event progresses. If the spill is just a rumor, however, people will have no new information to add, and so will simply retweet the original rumor. We call the amount of information surrounding an event its "topic diversity." Calculating an event's topic diversity allows us to classify it as a rumor or a fact: a true event will have both high tweet counts and high topic diversity, while a rumor will have high tweet counts but low topic diversity.

The first step in calculating topic diversity is separating our data into what we call word-window tweet lists, or WWTLs. These are lists of tweets that mention a word, split into 30-minute windows. We generate WWTLs for every word in our dataset of tweets, after removing stopwords. Next, we compute each WWTL's topic diversity by performing topic modeling on it and counting the number of topics produced. We use a novel method of topic modeling that determines the number of topics automatically based on maximal clique enumeration (MCE). To prepare a WWTL for MCE, we first generate a graph of all of its tweets: each node represents one tweet, and edges are generated by computing the Jaccard similarity of each pair of tweets based on shared vocabulary. Then, we perform data polishing (Uno et al., 2015; Uno et al., 2017) and MCE using nysol_python (nysol, 2019; Nakamoto and Hamuro, 2018) to produce a clique count for each WWTL. A WWTL with a large number of cliques has high topic diversity. The two sketches below illustrate this pipeline.
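A minimal sketch of WWTL construction follows, assuming tweets are given as (timestamp, token list) pairs and that a stopword set is supplied; the function name build_wwtls and the data layout are illustrative rather than taken from our implementation.

    from collections import defaultdict
    from datetime import timedelta

    WINDOW = timedelta(minutes=30)  # 30-minute windows, as in the paper

    def build_wwtls(tweets, stopwords):
        """Build word-window tweet lists: for each non-stopword,
        group the tweets mentioning it into 30-minute windows."""
        start = min(ts for ts, _ in tweets)
        wwtls = defaultdict(lambda: defaultdict(list))  # word -> window -> tweets
        for ts, tokens in tweets:
            window = int((ts - start) / WINDOW)
            for word in set(tokens) - stopwords:
                wwtls[word][window].append(tokens)
        return wwtls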
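The topic-diversity step can be sketched as below. Our method performs data polishing and MCE with nysol_python; since that pipeline cannot be reproduced in a few lines, networkx's maximal-clique enumeration stands in for it here, and the Jaccard threshold of 0.3 for drawing an edge is an assumed parameter, not a value from the paper.

    import networkx as nx

    def jaccard(a, b):
        """Jaccard similarity of two token lists, via their vocabularies."""
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if (a | b) else 0.0

    def topic_diversity(wwtl, threshold=0.3):
        """Count maximal cliques in the tweet-similarity graph of one WWTL.
        A high clique count indicates high topic diversity."""
        g = nx.Graph()
        g.add_nodes_from(range(len(wwtl)))
        for i in range(len(wwtl)):
            for j in range(i + 1, len(wwtl)):
                if jaccard(wwtl[i], wwtl[j]) >= threshold:
                    g.add_edge(i, j)
        return sum(1 for _ in nx.find_cliques(g))

The per-window clique counts produced this way feed directly into the keyword-rise analysis described next.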
Finally, we examine whether increases in tweet counts in WWTLs correlate with topic diversity. For each word, we find the WWTL with the maximum tweet count (T_max). Then, we find the minimum non-zero tweet count in any window prior to that maximum (T_min), and call the period between those two windows a "keyword rise." For each keyword rise, we find the number of communities (cliques) at its beginning and end (C_Tmin and C_Tmax). We then compute what we call the topic homogeneity factor (THF), using the following formula:

THF = ((T_max - T_min) / T_min) / ((C_Tmax - C_Tmin) / C_Tmin)

We find that keywords with a THF of 25 or greater are likely to be rumors.
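In code, the THF computation for a single keyword might look like the sketch below, assuming tweet_counts and clique_counts map each window index to that window's tweet count and clique count for the keyword; the guard against an unchanged clique count is an illustrative addition.

    def topic_homogeneity_factor(tweet_counts, clique_counts):
        """Compute THF for one keyword from per-window tweet and clique counts.
        Assumes clique counts are non-zero at the windows used."""
        # Window holding the maximum tweet count (T_max).
        peak = max(tweet_counts, key=tweet_counts.get)
        # Window with the minimum non-zero tweet count before the peak (T_min).
        earlier = [w for w in tweet_counts if w < peak and tweet_counts[w] > 0]
        low = min(earlier, key=tweet_counts.get)
        t_rise = (tweet_counts[peak] - tweet_counts[low]) / tweet_counts[low]
        c_rise = (clique_counts[peak] - clique_counts[low]) / clique_counts[low]
        # If topic diversity did not grow at all, homogeneity is maximal.
        return float("inf") if c_rise == 0 else t_rise / c_rise

    def is_likely_rumor(thf, threshold=25.0):
        """Keywords with THF of 25 or greater are likely to be rumors."""
        return thf >= threshold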
Experiment

To validate our method, we performed an experiment on a dataset of around 200 million tweets sent in the three weeks after the 3/11 earthquake in Japan, gathered by the social media company Hottolink (https://www.hottolink.co.jp/english). We knew that a number of rumors had spread during the disaster but were later corrected in official news sources.

Our test focused on identifying keywords related to two rumors. The first was about a chemical leak at the Cosmo Oil plant. The second was a widely retweeted message purporting to be the last words of a dying system administrator, referred to as the "Geek House Asakusa" tweet.

To compare the patterns of these known rumor keywords against words not associated with rumors, we arbitrarily selected 150 other words that we expected to exhibit a diverse range of frequencies and topic diversities. We included users' Twitter handles, government agencies, and locations in the affected area. We expected that Twitter handles would be mentioned infrequently and have low topic diversity, while government agencies and affected areas would be mentioned frequently and have high topic diversity. Given the variety among these words' frequencies and topic diversities, we expected that many of them would have either word-frequency or topic-diversity measures that could confuse our method.

Results

Our method easily differentiated words related to rumors from other words. The term "Cosmo" had a THF of 39, and "Cosmo Oil" had a THF of 73. Similar results were obtained for "geek," with a THF of 41.73. Of the 150 non-rumor keywords, with a threshold of 25.0, only one word was incorrectly classified as a rumor. Given this success, we consider our method to be effective.

Conclusion

We have shown that our method can effectively detect keywords likely to be associated with rumors. We acknowledge that it was tested on a historical dataset rather than a live tweet stream; in future work, we plan to experiment on a live stream. We also plan to experiment with political rumors, which are likely to generate discussion even when they are not true and would therefore produce higher topic diversity. Overall, though, we found our method to be effective at detecting rumors in our dataset, and we anticipate that it will be effective at discovering rumors in similar datasets.

Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.

Conference website: https://dh2020.adho.org/

References: https://dh2020.adho.org/abstracts/

Series: ADHO (15)

Organizers: ADHO