In a recent study published in PLOS ONE, researchers analyzed coronavirus disease 2019 (COVID-19) disinformation on Twitter.
The widespread use of social media during the COVID-19 pandemic has resulted in an ‘infodemic’ of dis- and misinformation regarding COVID-19, with potentially fatal consequences. Understanding the magnitude and impact of this false information is essential for public health agencies to anticipate the behavior of the general population with respect to vaccine uptake and non-pharmaceutical interventions (NPIs) like social distancing and masking.
About the study
In the present study, researchers assessed tweets circulating on Twitter containing the hashtags #Plandemic and #Scamdemic.
On 3 January 2021, the team used Twint, a Twitter scraping tool, to collect English-language tweets containing the hashtags #Plandemic or #Scamdemic posted between 1 January and 31 December 2020. On 15 January 2021, the team used the Twitter application programming interface (API) to retrieve the same tweets by their tweet identifiers (IDs). The team provided descriptive statistics for the selected tweets, including tweet content and user profiles, and used the Twitter API status codes to determine which tweets were still available in both datasets.
Sentiment analysis was performed by first tokenizing and cleaning the tweets. The tokens were then reduced to their root forms using natural language processing techniques, including lemmatization and stemming, and stop words were removed. Python’s VADER library was used to categorize the sentiment of each tweet as neutral, positive, or negative; the subjectivity of each tweet (subjective or objective) was assessed separately. VADER applies a rule-based sentiment analysis with a polarity scale ranging from -1 to 1.
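The mapping from a polarity score to a sentiment category can be sketched as follows. This is a minimal illustration, not the study's code: the function name `label_sentiment` is hypothetical, and the cutoffs of ±0.05 are the conventional ones suggested in VADER's documentation, which the study may or may not have used. In practice, the score would come from VADER's compound output, for example `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`.

```python
def label_sentiment(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER-style compound score in [-1, 1] to a sentiment label.

    The default threshold of 0.05 is an assumption (the conventional
    VADER cutoff), not a value reported in the article.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical compound scores, for illustration only.
example_scores = [0.62, 0.0, -0.31]
labels = [label_sentiment(score) for score in example_scores]
```

Scores that fall strictly between the two cutoffs are treated as neutral, so only tweets with a clear polarity are labeled positive or negative.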
Subjectivity analysis was performed using TextBlob, which scored each tweet on a scale from zero (objective) to one (subjective). Objective tweets were considered to state facts, while subjective tweets communicated an opinion or a belief. The team plotted histograms of the subjectivity scores for the #Plandemic and #Scamdemic hashtags. A Python library was also used to label the primary emotion associated with each tweet as fear, anticipation, anger, surprise, trust, sadness, joy, disgust, positive, or negative.
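A histogram of subjectivity scores of this kind could be built roughly as below. This is a sketch under assumptions: the function name, the choice of ten equal-width bins, and the example scores are all hypothetical. In practice, the scores would come from TextBlob, for example `TextBlob(text).sentiment.subjectivity`.

```python
def subjectivity_histogram(scores, n_bins=10):
    """Count scores in [0, 1] per equal-width bin.

    The bin count is an illustrative assumption; a score of exactly 1.0
    is placed in the last bin.
    """
    counts = [0] * n_bins
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError("subjectivity score must lie in [0, 1]")
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical subjectivity scores, for illustration only.
example_scores = [0.0, 0.05, 0.5, 0.55, 0.95, 1.0]
histogram = subjectivity_histogram(example_scores)
```

The resulting list of counts can be passed to any plotting tool; a separate histogram would be computed for each hashtag.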
The predominant topics in the collected tweets were identified using a machine-learning algorithm, which grouped the tweets into clusters, each characterized by a representative group of words. The words with the highest weights in each cluster were used to define the content of each topic.
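The last step, picking the highest-weighted words in each cluster, can be sketched as follows. The article does not name the topic-modeling algorithm, so this example starts from word weights that are assumed to have been produced already; the function name, topic names, words, and weights are all hypothetical placeholders.

```python
def top_words(topic_weights, n=3):
    """Return the n highest-weighted words for each topic."""
    return {
        topic: [
            word
            for word, _ in sorted(
                weights.items(), key=lambda item: item[1], reverse=True
            )[:n]
        ]
        for topic, weights in topic_weights.items()
    }


# Hypothetical word weights per cluster, for illustration only.
example_weights = {
    "topic_0": {"mask": 0.12, "lockdown": 0.09, "mandate": 0.15, "open": 0.02},
    "topic_1": {"media": 0.11, "lie": 0.14, "fear": 0.05},
}
summary = top_words(example_weights)
```

A researcher would then read the top words for each cluster and assign a descriptive topic label by hand.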
The study identified a total of 420,107 tweets containing the hashtags #Plandemic or #Scamdemic. After removing retweets, replies, non-English tweets, and duplicates, the team retained 227,067 tweets from approximately 40,081 users. About 74.4% of the tweets were posted by the 78.4% of users whose accounts remained active, while 25.6% were posted by the 21.6% of users whose accounts had been suspended by 15 January 2021. The team noted that users with suspended profiles tended to tweet more. Use of both hashtags was associated with a 29.2% chance of suspension, compared with 25.9% for #Plandemic and 13.2% for #Scamdemic.
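The observation that suspended users tweeted more can be checked with a quick back-of-the-envelope calculation from the reported figures. This sketch assumes that the percentages apply to the 227,067 retained tweets and the roughly 40,081 users; the variable names are illustrative.

```python
total_tweets = 420_107      # tweets collected before filtering
retained_tweets = 227_067   # after removing retweets, replies, etc.
users = 40_081              # approximate number of users

# Share of collected tweets that survived filtering.
retained_share = retained_tweets / total_tweets

# Assumption: the reported percentages refer to the retained set.
active_tweets = retained_tweets * 0.744
active_users = users * 0.784
suspended_tweets = retained_tweets * 0.256
suspended_users = users * 0.216

tweets_per_active_user = active_tweets / active_users
tweets_per_suspended_user = suspended_tweets / suspended_users

print(f"retained share: {retained_share:.1%}")
print(f"tweets per active user: {tweets_per_active_user:.2f}")
print(f"tweets per suspended user: {tweets_per_suspended_user:.2f}")
```

Under these assumptions, suspended users posted about 6.7 tweets each compared with about 5.4 for active users, and roughly 54% of the collected tweets were retained after filtering.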
The team found that most users were aged 40 years or older. The suspended users were mostly males and users aged 18 years or younger or 30 to 39 years. Almost 88% of active users and 79% of suspended users tweeted from personal accounts. Notably, almost 65% of the analyzed tweets were classified as objective.
Emotion analysis of the tweets revealed that fear was the predominant emotion, followed by sadness, trust, and anger. Surprise, disgust, and joy were the least expressed emotions, although tweets from suspended accounts were more likely to display disgust, surprise, and anger.
The overall sentiment expressed by tweets containing the #Plandemic and #Scamdemic hashtags was negative. The mean weekly sentiment scores were -0.05 for #Plandemic and -0.09 for #Scamdemic, on a scale where 1 and -1 denote completely positive and completely negative sentiment, respectively.
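A mean weekly sentiment of this kind could be computed roughly as below. This is a sketch under assumptions: the article does not say how weeks were defined, so ISO calendar weeks are used here, and the function name, dates, and scores are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def mean_weekly_sentiment(scored_tweets):
    """Average sentiment scores per (ISO year, ISO week).

    `scored_tweets` is an iterable of (date, score) pairs. Grouping by
    ISO week is an assumption, not a detail reported in the article.
    """
    by_week = defaultdict(list)
    for day, score in scored_tweets:
        iso = day.isocalendar()
        by_week[(iso[0], iso[1])].append(score)
    return {week: mean(scores) for week, scores in sorted(by_week.items())}


# Hypothetical (date, score) pairs, for illustration only.
example_tweets = [
    (date(2020, 5, 4), -0.2),
    (date(2020, 5, 6), 0.1),
    (date(2020, 5, 11), -0.4),
]
weekly_means = mean_weekly_sentiment(example_tweets)
```

Running the same function separately on the tweets for each hashtag would give one weekly series per hashtag, which can then be summarized or plotted over the year.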
The most frequently observed tweet topic was ‘complaints against mandates introduced during the COVID-19 pandemic’, which included complaints against face masks, closures, and social distancing. This was followed by the topics ‘downplaying the dangers of COVID-19’, ‘lies and brainwashing by politicians and the media’, and ‘corporations and global agenda.’
Overall, the study findings showed that tweets containing the #Plandemic and #Scamdemic hashtags displayed a negative sentiment. While several tweets expressed anger against the restrictions imposed during the pandemic, a significant proportion of tweets also presented disinformation.