A multidisciplinary Lafayette duo has been collecting tweets related to the novel coronavirus pandemic to gather insights that may prove valuable in mitigating the outbreak and managing future pandemics.

Millions of tweets from around the world and in multiple languages, often amounting to several gigabytes of data a day, have been collected since Jan. 22, when the total number of reported COVID-19 cases stood below 600. Tweets about the pandemic carry opinions, information, and misinformation, reflecting how people are responding to the crisis, their altered lives, and official directives.

Christian López, assistant professor of computer science, who is affiliated with the mechanical engineering department, and Caleb Gallemore, assistant professor of international affairs, are capturing an international discourse on the pandemic by collecting these vast datasets of tweets. They are assisted by Malolan Vasu ’23 (computer science and math). Their real-time survey of reactions to the crisis may be of interest to policymakers and elected officials as they consider ways to address the pandemic.

By analyzing tweets, the team can identify the most common responses to the pandemic and how those responses vary over time, across countries, and in reaction to official policies.

“You can get in real time the sentiment of people about certain policies,” López says. “So let’s say that a governor orders everybody to stay at home, and then everybody is complaining about that. For politicians and policymakers, they can see the immediate responses to their decisions and make modifications or adjust their messaging as needed.”

The team has used Twitter’s application programming interface (API) to collect the data. López says most people would be surprised by how much data Twitter collects from users, including geographical information.

Keywords used for searches have included virus, coronavirus, ncov19, ncov2019, and covid. The number of retweets declined in February but rose abruptly as the global health crisis expanded in Europe in early March. English-language tweets remain the most prominent in the dataset, accounting for more than 50% of the total.
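
For readers curious how a keyword search and a per-language tally of this kind might look in practice, here is a minimal Python sketch. It is not the team's code: the record layout (`text` and `lang` fields), the function names, and the sample tweets are hypothetical, and the case-insensitive substring matching is an assumption rather than a description of how Twitter's API matches search terms. Only the keyword list comes from the article.

```python
import re
from collections import Counter

# Search keywords listed in the article.
KEYWORDS = ["virus", "coronavirus", "ncov19", "ncov2019", "covid"]

# Assumption: case-insensitive substring matching, so "covid" also
# matches "#COVID19". Twitter's own matching rules may differ.
KEYWORD_RE = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def matches_keywords(text):
    """Return True if the text contains any of the search keywords."""
    return KEYWORD_RE.search(text) is not None


def language_shares(tweets):
    """Return {language: fraction} over the tweets that match a keyword."""
    counts = Counter(t["lang"] for t in tweets if matches_keywords(t["text"]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {lang: n / total for lang, n in counts.items()}


# Hypothetical sample records, invented purely for illustration.
SAMPLE = [
    {"text": "Staying home because of #COVID19", "lang": "en"},
    {"text": "El coronavirus cambia todo", "lang": "es"},
    {"text": "New virus cases reported today", "lang": "en"},
    {"text": "Nice weather this morning", "lang": "en"},
]

shares = language_shares(SAMPLE)
for lang, share in sorted(shares.items()):
    print(f"{lang}: {share:.0%}")
```

On real data, the same tally grouped by date rather than by language would give the kind of rise-and-fall trend the team describes for retweets.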

“You can see that people are tweeting more often as the pandemic evolves because people are staying at home for longer periods of time,” López says.

“We are able to do an in-depth text analysis of the tweet itself, to look at the sentiment of the tweets,” he says.

It’s a direct window into how people react to policies that governments put in place to control the spread of the virus. Tweets convey whether people believe the policies are too stringent, whether they should be delayed, or whether they should be implemented at all. The analysis also reveals the spread of false information on Twitter and how it propagates across the vast social network, López notes.

“We can not only see if they’re related to the pandemic but what they are talking about, whether it’s about a specific policy, a company, or person and by geographic area,” he says. “By looking at the text of the tweet itself, we could tap into that kind of detailed information.”