A study of the “geography of Twitter” compares, among many other data sets, what it calls “Twitter versus mainstream news media.” This part of Kalev Leetaru, Shaowen Wang, Guofeng Cao, Anand Padmanabhan and Eric Shook’s study compared the geographic data from about five weeks’ worth of tweets with locations from more than 3 million Google News articles during the same period.
“Does Twitter cover the same locations as the mainstream media, or do they discuss very different areas of the world?” they asked.
In one map, they compared “georeferenced Twitter Decahose (blue) and English Google News (red) geographic coverage” between Oct. 23, 2012, and Nov. 30, 2012.
“Areas that are blue have stronger Twitter representation,” they write, “while red areas are covered more closely by mainstream media, and white areas have an equal balance.”
Here’s a crop of that map. Many major U.S. cities appear to be tweeted about as much as they’re covered. And the rest of the country?
“These results indicate there is a strong difference in the geographic profiles of Twitter and mainstream media and that the intensity of discourse mentioning a country does not necessarily match the intensity of discourse emanating from that country in social media,” the authors write. “It also suggests that Twitter is not simply a mirror of mainstream media, but rather has a distinct geographic profile.”
Another finding from the analysis: “On average, the distance for a retweet was 1,115 miles and 1,118 for a reference,” Joseph Stromberg writes in Smithsonian’s Surprising Science blog.
But, counterintuitively, there was a positive relationship between the number of times a given user retweeted or referenced another user and their distance: Pairs of users with just a handful of interactions, on the whole, were more likely to be closer together (500-600 miles apart) than those with dozens of retweets and references between them.
One Twitter mapping project may leave you feeling a little less groovy: Humboldt State University students mapped hate tweets across the U.S. They searched for tweets containing certain terms, then students “manually read and coded the sentiment of each tweet to determine if the given word was used in a positive, negative or neutral manner.”
“This allowed us to avoid using any algorithmic sentiment analysis or natural language processing, as many algorithms would have simply classified a tweet as ‘negative’ when the word was used in a neutral or positive way. For example the phrase ‘dyke’, while often negative when referring to an individual person, was also used in positive ways (e.g. ‘dykes on bikes #SFPride’). The students were able to discern which were negative, neutral, or positive. Only those tweets used in an explicitly negative way are included in the map.”
Tweets featuring the n-word “are not concentrated in any single place or region in the United States,” they write. “Instead, quite depressingly, there are a number of pockets of concentration that demonstrate heavy usage of the word.”