Social Media: News about social media that matters to journalism. Written by Jeff Sonderman.

Bird words

How to do Twitter research on a shoestring

Twitter’s increasingly influential role in journalism has prompted an accompanying upsurge in academic research, particularly around the ways in which journalists and media organizations have integrated Twitter into their norms and practices.

With 500 million tweets a day, Twitter offers researchers a potentially deep and rich stream of social media data. However, unlike historical newspaper content, which is readily available via library microfiches or databases like LexisNexis, much of the historical data on Twitter (what’s called the Twitter firehose) is walled off in costly private archives.

Information may want to be free, but accessing and analyzing that information can be costly.

The Library of Congress signed a deal with Twitter in 2010 to build an on-site research archive but that system has still not been finalized. A progress update is expected this summer, but the archive, which now houses more than 170 billion tweets, poses major logistical challenges for the Library and the firehose reseller Gnip, which is delivering the data for Twitter. For example, a single search of the 21 billion tweets in the fixed 2006-10 archive was taking 24 hours just last year. Twitter acquired Gnip in April, prompting hopes that the archive may be operational in 2014-15, but even so, the archive will only be accessible on-site at the Library in Washington, D.C.

A source of reliable, inexpensive online access to the Twitter firehose has become almost a Holy Grail for the journalism professors in the U.S. and Canada whom I surveyed this June using a Google form.

Kathleen Culver, assistant professor at the University of Wisconsin-Madison, says she would like to see “portals for academics into Twitter, supported by Twitter,” and an easy user interface for research. Alf Hermida, associate professor at the University of British Columbia, agrees. Hermida, who has just published a new book on social media, says such a portal could contain “shared archives of Twitter data, best practices and approaches.”

Mike Reilley, online journalism instructor at DePaul University in Chicago, says he wants something that will let him go deep. “I’m looking for that ‘super tool’ and am hoping someone at JS Knight Stanford comes up with one. An all-in-one tool – scrapes, archive, great search, everything. I’m tired of ‘tool-hopping’ to get work done,” he said.

Flawed early Twitter research tools

Early efforts such as the freemium TwapperKeeper service offered that “all-in-one” functionality, albeit with some restrictions. TwapperKeeper, which allowed users to create and download .csv files of Twitter archives, limited historical searches to 7 days earlier or 3,500 tweets (whichever came sooner). However, TwapperKeeper, which was launched in 2009, was taken over by HootSuite in 2011 and became a premium subscription product.

Some researchers then shifted to Topsy Pro, which offered trial accounts for researchers or a single annual license for $12,000, but the datasets were often incomplete.

Hermida says Topsy used its own criteria to delete tweets from the archive. “Topsy removed tweets that had been deleted from the Twitter firehose, and tweets without at least six retweets or a retweet by an influential user were removed from the search index after 30 days,” he said.

Robert Hernandez, associate professor of professional practice at USC Annenberg School for Communication & Journalism, says he has long suspected such archives could be flawed. “I always have an uneasy feeling that the archive – whether I get it formally or not – doesn’t feel as accurate or complete as one thinks,” he said.

‘Divide between data-rich and data-poor researchers’

A more robust historical search, such as analyzing the #Newtown tweets to examine journalists’ behavior in the hours immediately following the school massacre, would require a subscription to a certified Twitter firehose service such as Gnip or DataSift.

These services can retrieve an unlimited number of tweets from practically any time in Twitter’s history. But the resellers’ main focus is businesses seeking more data on how consumers view them. Thus, their pricing is aimed at institutions or corporations. The pricing plans for both companies are difficult to decipher (both sites ask users to submit a Web form for a quote) but a 2014 article put DataSift at about $3,000 a month with Gnip starting at $500 for each one-off search. Licensing fees cost an additional $0.10 per 1,000 tweets and are paid to Twitter. These licensing fees accounted for $32 million of Twitter’s earnings in the first half of 2013.

However, a $3,000-a-month subscription level or even a $500 search would be too expensive for most academics unless they were able to make arrangements with their institutions.
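
To make the arithmetic concrete, here is a minimal Python sketch that combines the figures quoted above. It assumes the per-1,000-tweet licensing fee is prorated per tweet and that a one-off Gnip search costs exactly the $500 starting price. The function names and the 100,000-tweet example are hypothetical, and real quotes from either reseller may differ.

```python
from decimal import Decimal

# Figures as quoted in the article; actual reseller quotes may differ.
GNIP_ONE_OFF_SEARCH = Decimal("500")     # dollars per one-off search (starting price)
DATASIFT_MONTHLY = Decimal("3000")       # dollars per month (approximate)
LICENSE_FEE_PER_1000 = Decimal("0.10")   # dollars per 1,000 tweets, paid to Twitter


def licensing_fee(tweet_count: int) -> Decimal:
    """Licensing fee for a number of tweets, prorated per tweet (an assumption)."""
    return Decimal(tweet_count) / Decimal(1000) * LICENSE_FEE_PER_1000


def one_off_search_cost(tweet_count: int) -> Decimal:
    """Estimated cost of a single one-off search plus licensing fees."""
    return GNIP_ONE_OFF_SEARCH + licensing_fee(tweet_count)


def subscription_cost(months: int, tweet_count: int) -> Decimal:
    """Estimated cost of a monthly subscription plus licensing fees."""
    return DATASIFT_MONTHLY * months + licensing_fee(tweet_count)


if __name__ == "__main__":
    # Hypothetical project: 100,000 tweets retrieved in one search.
    print("One-off search:", one_off_search_cost(100_000))
    print("One month of subscription:", subscription_cost(1, 100_000))
```

On these assumptions the licensing fee is a small part of the bill. The fixed search or subscription charge is what prices out individual researchers.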

Hermida’s university uses Crimson Hexagon, which charges $5,000 a year for 50 search terms or “monitors” as part of its Social Research Grant Program. Elizabeth Breese, senior content and digital marketing strategist, said by email that the program seeks researchers who are “a good fit… that the research is non-commercial in nature, and that the results will be made public in some way.” The grant program provides 50 “simultaneous monitors” which can be deleted to provide for a new query, giving researchers more flexibility.

Meanwhile, a freemium Twitter scraping tool from the British company ScraperWiki could be the solution for cash-poor researchers. Users set a hashtag, keyword or user name as a search term and then let ScraperWiki monitor Twitter for all new occurrences. Like TwapperKeeper before it, ScraperWiki can only create new archives rather than search for historical data, but the drawbacks of the service are mitigated by the price.

Pricing starts at $9 a month for “Explorer” access to three datasets and tops out at $29 a month for “Data Scientist” access to 100 datasets. The main drawback, as discussed, is the lack of historical data. But the tool is incredibly robust and a well-planned project could return tens of thousands of rows of data for analysis and visualization as there is no maximum limit on tweets.

For example, using the Data Scientist package, a researcher could easily track the ongoing output from up to 100 users for a project such as a comparative analysis on a constructed week. In my own research, I used ScraperWiki to retrieve approximately 22,000 tweets tracing the social media development of the Tuam babies story in Ireland in May/June 2014.
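
A constructed week can be drawn from a downloaded archive with a few lines of Python. The sketch below is one reading of the method, not a description of ScraperWiki’s own features: it picks one random calendar date for each weekday present in the collection period and keeps only the tweets from those dates. The `(created_at, text)` tuple layout and the function name are assumptions made for illustration, so the export would need to be reshaped to match.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta


def constructed_week(tweets, seed=None):
    """Sample a constructed week from a list of (created_at, text) pairs.

    Returns (chosen, sample): `chosen` maps weekday number (0 = Monday) to the
    randomly selected date, and `sample` holds the tweets from those dates.
    """
    rng = random.Random(seed)

    # Group the distinct calendar dates in the archive by weekday.
    dates_by_weekday = defaultdict(set)
    for created_at, _text in tweets:
        dates_by_weekday[created_at.weekday()].add(created_at.date())

    # Pick one date per weekday; sorting keeps the choice reproducible per seed.
    chosen = {wd: rng.choice(sorted(dates)) for wd, dates in dates_by_weekday.items()}

    sample = [(ts, text) for ts, text in tweets if chosen[ts.weekday()] == ts.date()]
    return chosen, sample


if __name__ == "__main__":
    # Hypothetical archive: two tweets a day for four weeks.
    start = datetime(2014, 5, 26, 9, 0)
    archive = [
        (start + timedelta(days=d, hours=h), f"tweet {d}-{h}")
        for d in range(28)
        for h in (0, 6)
    ]
    chosen, sample = constructed_week(archive, seed=42)
    for weekday, day in sorted(chosen.items()):
        print(weekday, day)
    print(len(sample), "tweets in the constructed week")
```

Passing a fixed `seed` makes the sample reproducible, which helps when the same constructed week has to be applied to several accounts for comparison.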

ScraperWiki CEO Francis Irving says the scraper is easy to use. “Once you know the search term or user you want to archive you can create the dataset and let the tool run. Once you have enough tweets, you can download the data for analysis,” he said via Skype. The software can also visualize and summarize the data, saving much work for the researcher.

A similar freemium tool is Simply Measured’s RowFeeder, but the datasets are more expensive and the results are more limited than ScraperWiki’s. For example, RowFeeder’s cheapest product, which costs $35 a month, includes just three datasets for a maximum of 5,000 tweets per month.

In the absence of an archive from the Library of Congress, ScraperWiki seems a reasonable solution to the ongoing problem of how to collect and analyze meaningful Twitter data. Its relatively inexpensive pricing partially addresses the growing two-tier divide in academic research caused by the high cost of data analysis. As Hermida said in the survey, “Paying for access means that there could be a divide between data-rich and data-poor researchers.”

Correction: A previous version of this story misspelled Mike Reilley’s name.

Kelly Fincham, an assistant professor at Hofstra University, has been using Twitter for research since 2010.


Thursday, May 29, 2014


How Muck Rack’s social media tool lets journalists track content sharing

Muck Rack

Here’s a handy social media tool you might not be aware of: Muck Rack’s Who Shared My Link feature. Simply paste any link, and it shows you how many times it was shared on Facebook, Twitter, LinkedIn and more. There’s even a button for your bookmarks bar so you can instantly see social shares for whatever page you’re on.

Sara Morrison wrote about the feature last year for CJR. As Muck Rack CEO Gregory Galant told her: “Since pageviews are known only to the publishers, who usually embellish the numbers before releasing them, ‘shares’ is one of the few metrics that are public and equal across the Web.”

Muck Rack announced on Tuesday that it added the ability to generate PDF summaries of how a link performed on social media (you have to be a Muck Rack Pro member or a verified journalist to access the PDF reports — and a list of Muck Rack users who shared your link). The new PDF reports are potentially a useful way to pass the data on to a boss or coworkers, provided your newsroom doesn’t already track social shares closely — and PDF attachments and shameless bragging won’t annoy your boss.

More interesting is that Who Shared My Link allows you to check out your competitors’ social performance, too. For instance, as of about 4:30 p.m. on Wednesday, The New York Times obituary for Maya Angelou was much more widely shared than The Wall Street Journal’s, according to Muck Rack:

The feature is also fascinating when it comes to tracking how media organizations differ in terms of where their audiences discover content. The vast majority of people who shared BuzzFeed’s story about Angelou passed it along on Facebook, while Vox’s take was actually shared more frequently on Twitter than on Facebook:

It’s a useful tool for social media editors (not to mention media reporters) who want an easy, one-click way to track sharing activity.


Friday, Mar. 14, 2014

Shown are the main offices of the San Francisco Chronicle newspaper in San Francisco, Friday, March 13, 2009.(AP Photo/Eric Risberg)

S.F. Chronicle social ‘boot camp’ changing culture, practices

The 148-year-old San Francisco Chronicle has invested in an off-site incubator for its journalists to learn about and experiment with a variety of digital tools, including social media. PBS Media Shift explored goals of the “boot camp” in January.

Now that the effort is underway, I reached out to Marcus Gilmer, newsroom social media manager at the Chronicle. (He and I worked together at the Chicago Sun-Times last year.) Gilmer joined the Chronicle in December and has spent time at the incubator teaching social media skills and tools to reporters and editors. (This interview has been edited and condensed for clarity.)


Monday, Feb. 17, 2014


Mental Floss a big winner after Facebook’s mysterious ‘high quality’ algorithm change

When Facebook announced in December that it was altering its News Feed algorithm to focus on “high quality content,” speculation centered on which sites might be in danger of excommunication as Facebook took aim at the viral bubble.

Was BuzzFeed’s silly clickbait a target, or would the site’s growing commitment to real news and longform save the domain from banishment? (It’s doing just fine.) What about Upworthy, the viral site that ruled Facebook in November with its widely mocked and mimicked “you won’t believe ____” headlines? (Business Insider declared it “crushed” after a December traffic dip, but a wider view of Quantcast data leads to a less dramatic conclusion.)

Meanwhile, some sites stood to gain, and one winner seems to be Mental Floss, a source of eminently shareable trivia, historical facts and answers to hundreds of questions you didn’t know you had.


Tuesday, Feb. 04, 2014


3 ways Facebook’s Paper app outperforms other news aggregators (and 3 ways it doesn’t)

Paper, the first app from Facebook’s Creative Labs, available now for iPhones, could challenge Flipboard, Zite and Feedly in the business of aggregating news on mobile devices. Not only does it beautify your Facebook newsfeed, but it also links to content from major news sources in various sections like Headlines (news), Score (sports), Exposure (photos) and Planet (science and sustainability). Here are some reasons Paper might be the news reader for you (or not):

Pictures feel bigger (but not always better)

Almost all screens, from movie theaters to TVs to computers to tablets, are horizontal for a reason (tablet users seem to prefer the landscape orientation to portrait, but of course it’s used both ways). So it’s often frustrating to view our horizontal world through the tiny vertical window of a phone. Pinch-to-zoom works OK for seeing more detail, but the multitouch gesture is a little cumbersome and, of course, zooming makes it impossible to see the entire image at once.

Paper’s tilt-to-pan function sometimes misses the mark, as in this photo of Philip Seymour Hoffman that isn’t improved by automatic zooming.

Paper offers an interesting solution to the problem of awkward mobile photo exploration by automatically zooming in on images and allowing users to pan left or right by tilting their phones. That makes for a cool immersive experience when viewing photos of scenes such as this one, with Kenyan police raiding a mosque.

But other times the feature feels gimmicky, disorienting and arbitrary. Is it really necessary for me to tilt my phone if I want to see either of Philip Seymour Hoffman’s ears? A simple tap of the photo brings up the full, letterboxed view, but I’m not convinced a zoomed-in, full-screen image is always the best way to come across new photos, even on a small screen. In the future, hopefully Paper can develop a way to employ the tilt-to-pan feature only when it makes sense.

Navigation is fun and mostly intuitive (but a little slow)

Paper’s lengthy, audio-narrated guide when first opening the app made me worry about how complex the app’s navigation would be, but the layered navigation was easy to get the hang of. There are no “X” buttons or “done” buttons to get in the way of viewing content, just swipes to dive deeper into content or swipes to dismiss it. Exploring the app’s layers was intuitive in ways exploring for the first time wasn’t.

While this view in the Paper app allows readers to see more than one story at once, zoomed-out story cards at the bottom of the screen are practically unreadable.

Yet the story-selection process itself isn’t as pleasurable as it is in other apps. In single-story view, for instance, you lose the quick-browsing advantage of flicking your finger to scroll through your newsfeed in Facebook’s primary app. Each story has to be evaluated and considered in isolation before you flip to the next one, slowing down the process of zeroing in on the content you really want.

Each piece of content, from a status update to a shared photo to a link to a news story, gets its own story card taking up the entire screen. Jumping back a layer in Paper does allow you to see a carousel of zoomed-out story cards (see screenshot), but the photos and type are hopelessly tiny. Feedly, Zite and Flipboard all allow more than one legible piece of content on the screen at once, providing more on-screen choice and requiring less thumb action.

It’s very social (but only when it comes to Facebook)

One beauty of Zite, the smart aggregator owned by CNN, is that I can thumbs-up or thumbs-down stories without worrying about anyone but the algorithm knowing what a sucker I am for fake Apple product mock-ups or statistical analyses of Peyton Manning’s legacy. At the same time, if I want to share what I read via Facebook, Twitter, or Google+, it’s easy to do so. But Paper, naturally, is all about Facebook, so likes are public and you can’t even tweet from the app.

(To be fair, the official Twitter app doesn’t exactly facilitate posting to Facebook, either. You can always link your Facebook account to Twitter and vice versa, but the platforms often demand different types of sharing, limiting the usefulness of posting the same content simultaneously.)

That Paper is so intensely Facebook-centric brings all the advantages of in-app commenting on stories, engaging with friends and seeing which news stories are most popular according to more than a billion users. But as a pure news aggregator it falls short of the multi-platform sharing functionality of Zite, Flipboard, Feedly and Inside. If you’re a Facebook junkie and want a little bit of aggregated news on the side, Paper could become the only Facebook app and only news app you need. But it’s no major threat to Flipboard and the like yet.



Wednesday, Jan. 15, 2014


3 lessons from BuzzFeed’s Twitter swarm during the Golden Globes

BuzzFeed wants to own the Twitter conversation when events of national interest take place, and Sunday’s airing of the Golden Globes gave the social news site another chance to hone its craft.

I spoke with BuzzFeed social masters Mike Hayes and Samir Mezrahi via phone about their strategy for covering awards shows and Super Bowls. Here are some lessons:


Tuesday, Dec. 17, 2013


Viral strategy behind WaPo’s Know More blog won’t blow your mind; read this anyway

A two-month-old viral blog by The Washington Post (y’know, the venerable 136-year-old newspaper and venerable 17-year-old website) seems to have tapped into the shareable content trend of the moment.

And even if viral content’s a bubble bound to burst — thanks to Facebook interrupting its business model via algorithm changes or otherwise — the Post hardly has much to lose if Know More, a Wonkblog spinoff, doesn’t work out.

But if BuzzFeed and Upworthy manage to maintain full steam ahead, so too should Know More, which has adopted many of the two viral sites’ strategies, including engaging images, click-bait headlines (not necessarily pejorative), and a social media presence summed up in three words: Facebook, Facebook, Facebook.

“The most obvious similarity there is in targeting Facebook rather than Twitter,” said Dylan Matthews, the main reporter behind the blog, via phone. “If you look at any site that does well socially, there’s just a handful that get their traffic from Twitter. Journalists sometimes forget this because we tend to really like Twitter.”

(Ezra Klein explored journalists’ obsession with Twitter recently on Wonkblog.)

Indeed, Know More has a modest Twitter following — about 1,800 at @knowmorewp and 150 at @GnomeOar. On Facebook, it’s nearing 6,100 likes, far fewer than Klein’s 192,500 followers and the Post’s nearly 1.2 million likes. Matthews runs the Facebook page himself but leaves the Twitter feeds to update automatically.

Of course, Know More doesn’t operate independently from the Post; it draws lots of content directly from Wonkblog and other Post blogs. Meanwhile, the institutional social media accounts bring some valuable exposure, and the institution itself provides some instant credibility — credibility that BuzzFeed and Upworthy often unfairly lack, Matthews said. (He and Klein praised the sites over at Nieman Lab for understanding what readers want.)

Yet the site is also forging an identity distinct from the Post, evident from how the image-heavy grid look departs from the design of other Post blogs, which look more like the main site.

“There are certainly Post readers who go to and expect the physical newspaper as a website,” Matthews said. “Know More is very different from that, so I think there’s an appropriate separation.”

Matthews said he hasn’t yet cracked the code for what content will go viral — we can’t all be Gawker’s Neetzan Zimmerman, after all — but even the least successful posts provide loyal readers with some cool lessons about poverty rates or climate change, with referrals to outside sources, often experts in their fields.

A WaPo memo obtained in November by BuzzFeed, naturally, indicated Know More was the news organization’s most-read blog for the third week of October. Among the blog’s bigger early viral successes, which can account for a major share of an entire day’s traffic, according to Matthews:

“I haven’t studied Upworthy and BuzzFeed’s numbers with the talmudic precision I probably should,” Matthews said. “But the thing that’s been surprising us is how much a single thing can do.”

Correction: A previous version of this story misspelled Dylan Matthews’ name.

Related: Is viral content the next bubble? | Is Facebook’s latest News Feed algorithm really intended to save us from ourselves?


Monday, Nov. 25, 2013

Twitter, Time team up for ‘person of the year’

Time | Twitter

Seven years after bizarrely naming you its person of the year for your ability to, I dunno, tweet and stuff, Time wants you to help with its selection this year. And it has formally enlisted Twitter as your official means of weighing in:


Wednesday, Nov. 13, 2013


Twitter’s custom timelines won’t kill Storify but could become robust filters

Twitter announced Tuesday a “custom timelines” feature that seems to mimic many of Storify’s functions. But is it a Storify killer?

All Tweetdeck users will soon be able to drop individual tweets into a “custom timelines” column with a name and short description. Then, those curated timelines are publicly accessible and can be embedded and shared.


Friday, Nov. 01, 2013

Globe and Mail falls for hoax tweet, falsely reports ex-NSA chief’s death

A fake tweet by a Twitter account resembling that of the popular site Breaking News led at least one news organization, the Globe and Mail, to falsely report the death of ex-NSA chief Michael Hayden at today’s shooting at Los Angeles International Airport.

The apparent hoax account @HeadIineNews, which only showed one tweet at the time, looked like Breaking News’ @BreakingNews account, which has more than 6.3 million followers. The fake account used the same profile photo as Breaking News, and its Twitter handle substituted a capital “I” for a lower-case “L” in @HeadIineNews.
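
The character swap described above is the kind of thing a newsroom script could flag automatically. The following Python sketch is a rough illustration, not an established verification tool: it collapses a few easily confused characters into a common form and reports when an unfamiliar handle matches a trusted one only after that collapsing. The confusables table is deliberately tiny, the function names are hypothetical, and `HeadlineNews` appears in the trusted list only as the spelling the hoax handle imitated.

```python
from typing import Optional, Sequence

# Characters that look alike in many sans-serif fonts, mapped to one
# representative. Illustrative only; a real table would be far larger.
CONFUSABLES = str.maketrans({"I": "l", "1": "l", "O": "o", "0": "o"})


def skeleton(handle: str) -> str:
    """Collapse confusable characters so lookalike handles compare equal."""
    return handle.lstrip("@").translate(CONFUSABLES).lower()


def imitated_handle(candidate: str, trusted: Sequence[str]) -> Optional[str]:
    """Return the trusted handle that `candidate` visually imitates, or None.

    Twitter handles are case-insensitive, so a candidate that matches a
    trusted handle apart from letter case is treated as the same account.
    """
    name = candidate.lstrip("@").lower()
    trusted_names = {t.lstrip("@").lower() for t in trusted}
    if name in trusted_names:
        return None  # it is one of the trusted accounts, not an imitation

    for handle in trusted:
        if skeleton(handle) == skeleton(candidate):
            return handle
    return None


if __name__ == "__main__":
    # Hypothetical watch list for illustration.
    watch_list = ["BreakingNews", "HeadlineNews"]
    for seen in ["@HeadIineNews", "@BreakingNews"]:
        match = imitated_handle(seen, watch_list)
        print(seen, "imitates", match if match else "nothing on the watch list")
```

A check like this only narrows the field. It would not have caught the reused profile photo, and it is no substitute for checking an account’s history and finding a second source.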

John Stackhouse, the Globe and Mail’s editor-in-chief, tells Poynter via phone that the “embarrassing” mistake was an occasion to reiterate the newsroom’s policies for verifying breaking news. “It was an unfortunate human error made by people not following the practices and procedures we have in place,” he said.

Stackhouse said a reporter initially came across the hoax tweet and forwarded it via email to the editorial Web team. From there, a homepage editor notified a senior editor, who immediately made the call to send out breaking news alerts. Along the way, newsroom personnel failed to verify the legitimacy of the source, and the senior editor failed to confer with the news editor in charge before blasting out the information and adding it to the story.

The first tweet, sent at 1:40 p.m. ET, was quickly deleted. The second tweet at 1:45 p.m., labeled an “update,” should have been labeled a correction, Stackhouse said. At 1:54 p.m., a third tweet correctly identified the report of Hayden’s death as a hoax. Stackhouse said deleting erroneous tweets isn’t a written policy, but it’s what editors do in practice.

The Globe and Mail’s mobile news alerts aren’t timestamped, Stackhouse said, but they followed the same sequence as the tweets. The newspaper’s written guidelines offer clear advice on the dangers of alerting without verification: “Once sent, an alert cannot be deleted or changed. So if in doubt, wait.”

Stackhouse added that any information added to a wire story should be properly attributed and that a lengthier correction was forthcoming.

A screenshot of the story on the Globe and Mail’s website before it recognized the hoax shows the false report of Hayden’s death under a Reuters and Associated Press byline. But those two outlets never reported it:

“At this point I would think there is some serious fence-mending to be done by the Globe with Reuters and AP after the paper wrongly attributed the Hayden information to them on its website,” says Poynter’s Craig Silverman, a former Globe and Mail columnist. “And I expect to see a report from the paper’s public editor about how this happened, and why the information was wrongly attributed. One thing that also should have happened but hasn’t: a correction tweet from the Globe that acknowledges the paper’s mistake.”

By email, Director of AP Media Relations Paul Colford said, “It was unfortunate that AP and Reuters were mistakenly made to appear in the wrong.” But he said he was pleased to see a number of subsequent tweets from other users correct the report.

Meanwhile, Cory Bergman, general manager and co-founder of Breaking News, tweeted that he asked Twitter to remove the hoax account.

Another Canadian outlet, the Sun News Network, also appears to have tweeted the fake report, deleting the tweet after recognizing it as a hoax:

Some lessons here are obvious: Check the history of a Twitter feed before quoting it, and always seek a second and third source. In this case, the damage was fairly contained and the wrong information didn’t appear to spread far.
