Data and Misinformation Part 3:

How can one detect misinformation, disinformation networks, bots, influencers, and other fake news perpetrators on social media to help fight the social media infodemic?

The growing number of social media platforms and users has amplified the fake news problem around the world. With access to social media, each of us can create and disseminate content to the masses in an uncontrolled manner. In addition, social media has become a major channel through which reputable institutions, high-status individuals, and journalists share important information with the public. For example, in Uganda today, the Ministry of Health publishes daily COVID cases, COVID deaths, and other COVID-related information through its official social media accounts. All this happens alongside the pandemic of fake news spreading across social media.

According to an article published by the BBC, 90 percent of the posts spreading misinformation about Covid-19 on Facebook and Twitter remained visible online, with no warning attached, even after being reported. This suggests that the social media platforms themselves are failing to tackle the problem, so perhaps we all need to equip ourselves with the skills needed to investigate fake news.

So, how would one collect and analyze social media data to identify the masterminds behind the disinformation and social media bots?

Many proposed techniques for detecting fake news rely on the news content itself. They include advising people to read past the headline, check the publisher or author, look at the links and sponsors used, check the publication date and time, and search for other outlets reporting the same story. These techniques alone won’t help us eradicate the fake news problem.

Techniques such as social network analysis can be used to fight disinformation by looking beyond the news being spread and paying more attention to the spreaders and the relationships between them. With these network-based approaches, data can be obtained from social media platforms such as Twitter and used to build networks that outline the relationships between accounts. These networks can expose groups of organized individuals who may be working together to spread disinformation.
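To make the network idea concrete, here is a minimal sketch in Python. It assumes you have already collected retweet data as (retweeter, original author) pairs. The account names, the sample data, and the helper functions are all hypothetical and only illustrate the approach; a real investigation would use far more data and a dedicated graph library.

```python
from collections import defaultdict

def build_retweet_graph(retweets):
    """Build a directed graph: retweeter -> set of accounts they retweeted."""
    graph = defaultdict(set)
    for retweeter, original_author in retweets:
        graph[retweeter].add(original_author)
    return graph

def amplification_counts(graph):
    """Count how many distinct accounts retweeted each author (in-degree)."""
    counts = defaultdict(int)
    for authors in graph.values():
        for author in authors:
            counts[author] += 1
    return dict(counts)

def shared_targets(graph, a, b):
    """Authors that both `a` and `b` retweeted.

    Heavy overlap between two accounts is only a hint of possible
    coordination, not proof of it.
    """
    return graph.get(a, set()) & graph.get(b, set())

# Hypothetical sample data: (retweeter, original_author) pairs.
sample_retweets = [
    ("user_a", "source_1"),
    ("user_b", "source_1"),
    ("user_c", "source_1"),
    ("user_a", "source_2"),
    ("user_b", "source_2"),
    ("user_d", "source_3"),
]
graph = build_retweet_graph(sample_retweets)
```

Accounts with a high amplification count are the ones being pushed hardest, and pairs of accounts that keep retweeting the same sources are candidates for a closer look.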

According to a report by Reuters, although the majority of fake news comes from ordinary people, posts by celebrities (including politicians and influencers) created more engagement on social media, substantially boosting the reach of fake news. So another way to fight disinformation and misinformation is to seek out the posts with the most influence. This can be done by looking at the number of retweets or shares, likes or favorites, and comments or replies on a particular post or tweet. After spotting the most popular post or tweet on a social media platform, one can investigate further by answering questions such as:

Who originally produced this tweet or post?
Who has retweeted or shared the post?
What do the followings of these users look like? (What does the network analysis of these followings look like?)
What hashtags or URLs have been used?
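Finding the most influential posts can be sketched in a few lines of Python. The example below assumes each post has already been collected as a dictionary with share, like, and reply counts. The field names, the sample posts, and the choice to simply add the three counts together are assumptions made for illustration.

```python
def engagement(post):
    """Total engagement: shares plus likes plus replies (equal weights assumed)."""
    return post["shares"] + post["likes"] + post["replies"]

def top_posts(posts, n=3):
    """Return the `n` posts with the highest total engagement."""
    return sorted(posts, key=engagement, reverse=True)[:n]

# Hypothetical sample posts.
sample_posts = [
    {"id": "p1", "author": "celebrity_x", "shares": 5200, "likes": 18000, "replies": 950},
    {"id": "p2", "author": "ordinary_y", "shares": 12, "likes": 40, "replies": 3},
    {"id": "p3", "author": "influencer_z", "shares": 800, "likes": 2500, "replies": 120},
]

for post in top_posts(sample_posts, n=2):
    print(post["id"], post["author"], engagement(post))
```

The posts at the top of this ranking are the ones worth tracing back to their original author and sharers.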

One can then identify the particular hashtags used and collect all the data linked to a given hashtag for further analysis. For example, one can look for the most commonly occurring hashtags, find the most popular accounts using a particular hashtag, or split up hashtag citations by time.
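As a rough illustration of these three hashtag analyses, the sketch below uses only the Python standard library. The post structure (author, created_at, text), the sample posts, and the function names are hypothetical; the regular expression is a simplification and will not match every hashtag a platform accepts.

```python
import re
from collections import Counter
from datetime import datetime

HASHTAG = re.compile(r"#(\w+)")

def extract_hashtags(text):
    """Return the hashtags in a post, lowercased so variants are counted together."""
    return [tag.lower() for tag in HASHTAG.findall(text)]

def hashtag_counts(posts):
    """Most commonly occurring hashtags across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(extract_hashtags(post["text"]))
    return counts

def accounts_using(posts, hashtag):
    """How many posts each account made with the given hashtag."""
    counts = Counter()
    for post in posts:
        if hashtag.lower() in extract_hashtags(post["text"]):
            counts[post["author"]] += 1
    return counts

def hashtag_by_hour(posts, hashtag):
    """Split uses of a hashtag into hourly buckets."""
    buckets = Counter()
    for post in posts:
        if hashtag.lower() in extract_hashtags(post["text"]):
            created = datetime.fromisoformat(post["created_at"])
            buckets[created.strftime("%Y-%m-%d %H:00")] += 1
    return buckets

# Hypothetical sample posts.
posts = [
    {"author": "acct_1", "created_at": "2021-03-01T09:05:00", "text": "Drink this! #FakeCure #health"},
    {"author": "acct_2", "created_at": "2021-03-01T09:40:00", "text": "It works #fakecure"},
    {"author": "acct_1", "created_at": "2021-03-01T10:15:00", "text": "Share now #FakeCure"},
]
```

A sudden spike in one hourly bucket, driven by a handful of accounts, is the kind of pattern worth investigating further.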

With URLs, one can establish the origin of the information in cases where that origin lies beyond social media. One can begin by authenticating the site with tools such as Whois Lookup, which provide detailed information about the site: the server name, the registrant country, the creation date, the updated date, etc. However, URLs can be tricky to spot at times. In some cases they are shortened, or the same domain is cited numerous times using different strings of text, making it difficult to detect. Tools like urlex.org and checkshorturl.com can be used to expand shortened URLs, revealing their original full form.
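Once shortened links have been expanded, grouping URLs by domain helps reveal when the same site is being cited under many different strings of text. The sketch below is one simple way to do that with Python's standard library. The sample URLs are made up, and stripping a leading "www." is an assumption about what should count as the same site.

```python
from collections import Counter
from urllib.parse import urlsplit

def normalized_domain(url):
    """Return the lowercased host of a URL without a leading 'www.'.

    URLs without a scheme (for example 'example.com/page') have no
    parsable host here and yield an empty string.
    """
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def domain_counts(urls):
    """Count how often each normalized domain is cited."""
    return Counter(normalized_domain(u) for u in urls)

# Hypothetical sample of already-expanded URLs.
sample_urls = [
    "https://www.example.com/story?id=1&utm_source=twitter",
    "http://example.com/story?id=1",
    "https://EXAMPLE.com/other-page#top",
    "https://news.example.org/report",
]
```

The domains cited most often are good candidates for a Whois lookup.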

Other tools one can use to verify content include TinEye and Google’s reverse image search for obtaining information about images, and the “Amnesty International tool” and “GitHub YouTube Geo-search tool” for obtaining information on videos being shared.

Social media bots also play a key role in spreading disinformation online. According to a study, between 9% and 15% of the active accounts on Twitter are bots, and Facebook is said to have deactivated 6.5 billion fake accounts in 2019. So how can one differentiate between a bot and a real account? In most cases, one can detect a bot by noticing abnormal account activity. The timing of engagements on a post or tweet is one sign: when comments appear as soon as the influencer posts, they may be auto-generated. The ratio of engagements is another: a post with thousands of likes but no comments can indicate automated bot activity. Other signs include a large number of posts put out in a short time span and an account that is always posting about a specific trend using the same hashtags.
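These warning signs can be turned into a simple rule-based check, sketched below in Python. Every field name and every threshold here is an assumption chosen for illustration, not an established cut-off, and a flagged account should be treated as a lead to investigate rather than a confirmed bot.

```python
def bot_signals(account):
    """Return the list of warning signs an account triggers.

    All thresholds below are illustrative assumptions, not validated values.
    """
    signals = []
    # Many posts in a short time span.
    if account["posts_last_hour"] > 30:
        signals.append("high posting rate")
    # Thousands of likes but no comments.
    if account["avg_likes"] >= 1000 and account["avg_comments"] == 0:
        signals.append("likes without comments")
    # Comments that appear almost instantly after a post goes out.
    if account["median_reply_delay_seconds"] < 5:
        signals.append("instant replies")
    # Nearly all posts reuse the same hashtag.
    if account["top_hashtag_share"] > 0.9:
        signals.append("repetitive hashtags")
    return signals

# Hypothetical accounts.
suspicious = {
    "posts_last_hour": 120,
    "avg_likes": 4000,
    "avg_comments": 0,
    "median_reply_delay_seconds": 2,
    "top_hashtag_share": 0.97,
}
ordinary = {
    "posts_last_hour": 2,
    "avg_likes": 35,
    "avg_comments": 4,
    "median_reply_delay_seconds": 600,
    "top_hashtag_share": 0.2,
}
```

An account that triggers several signals at once deserves more attention than one that triggers a single signal.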

Machine learning has also been used to create models that predict the likelihood of an account being a bot, relying on features such as followers, post time, the ratio of engagement, follower origin, and the number of posts in a given time span. Common bot detection tools that use machine learning include Botometer, tweetbotornot, botcheck.me, and Bot Sentinel.
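To give a feel for how such a model works, here is a toy logistic regression written in plain Python. It is not how any of the tools named above are implemented. The two features (a scaled posting rate and a hashtag repetition score, both between 0 and 1), the six training accounts, and their labels are all invented for illustration; a real model would need many labeled accounts and more features.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, x):
    """Estimated probability that the account with features `x` is a bot."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, labels, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n = len(samples)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Invented training data: [scaled posting rate, hashtag repetition score].
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95],   # labeled as bots
     [0.1, 0.2], [0.2, 0.1], [0.05, 0.3]]   # labeled as real accounts
y = [1, 1, 1, 0, 0, 0]

weights, bias = train(X, y)
bot_like = predict(weights, bias, [0.85, 0.9])
human_like = predict(weights, bias, [0.1, 0.15])
```

The output is a probability rather than a verdict, which is why tools of this kind usually report a score and leave the final judgment to the investigator.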

All the tools mentioned above are open source, but other tools exist that provide additional features at a cost. Ultimately, the tools mentioned here serve to guide a beginner into tackling the disinformation problem online, and they can explore other options later. Our team at Pollicy has attempted to tackle the fake news problem with a choose-your-own-adventure game in which players navigate the world of fake news in an African context to understand the dynamics of how it spreads.

Play this game here!

Written by Arthur Kakande, Communications Lead at Pollicy.
