Megan Taylor


How journalists can use Mechanical Turk to organize data, transcribe notes

Some of what journalists do is tedious, repetitive, time-consuming and expensive to outsource. Transcribing interviews, for instance, can take up a lot of time that reporters don’t have.

Amazon’s Mechanical Turk (MTurk) is a tool that can help journalists better manage these kinds of time-consuming tasks. It’s sort of like eBay for work: Post a task, decide how much you’re willing to pay and gain access to thousands of workers worldwide.

I talked to a few journalists who have used MTurk to transcribe notes, search for data and verify URLs and other information. ProPublica helped start this conversation when it published a guide for journalists looking to use Mechanical Turk.

Using MTurk for transcriptions

Andy Baio, a journalist/programmer in Oregon who created, first used MTurk in 2008 to transcribe a 36-minute interview. Read more


How to Use TimeFlow to Manage, Analyze Chronological Data

As a reporter at The Washington Post, Sarah Cohen was frequently frustrated with the dearth of tools for working with chronological data. Now the Knight Professor of the Practice of Journalism and Public Policy at Duke University, Cohen looks for ways to help journalists be more efficient.

TimeFlow, a free and open-source data analysis tool, is the first version (still alpha) of a project that she has been working on to make it easier for reporters to look at data over periods of time. Unlike some of the alternatives, such as the SIMILE Timeline and Dipity, TimeFlow is not built to present the data online.

Cohen worked with programmers Fernanda Viégas and Martin Wattenberg (who previously worked on Many Eyes and are now at Google) to describe what features the tool would need. Read more


How Poligraft Can Help Journalists and Consumers Discover Connections in the News

Poligraft is a new tool released by the Sunlight Foundation that tries to add political context to news stories. It scans news articles for the names of donors, corporations, lobbyists and politicians and shows how they are connected by contributions.

It’s easy to use: Just submit the URL or text of a news article, and Poligraft will create a sidebar containing the relevant information from data provided by the Center for Responsive Politics and the National Institute for Money in State Politics.

The sidebar shows the aggregated contributions from an organization to a politician (for instance, from various employees of one company). The second section, “points of influence,” shows campaign contributions received by politicians, as well as contributions made by organizations. You can click on the names of people or organizations to learn more about them, such as who their contributors are or what lobbying firms they’ve hired. Read more
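The aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not Poligraft's actual method: the record fields (`donor`, `employer`, `recipient`, `amount`), the sample names and the simple substring matching are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_contributions(article_text, records):
    """Total contributions per (organization, politician) pair named in an article.

    Assumption: each record is a dict with hypothetical keys
    'donor', 'employer', 'recipient' and 'amount'.
    """
    text = article_text.lower()
    totals = defaultdict(int)
    for rec in records:
        org, politician = rec["employer"], rec["recipient"]
        # Naive case-insensitive substring match; a real tool would need
        # proper entity extraction to handle nicknames and abbreviations.
        if org.lower() in text and politician.lower() in text:
            totals[(org, politician)] += rec["amount"]
    return dict(totals)


# Hypothetical data, invented for illustration only.
records = [
    {"donor": "A. Smith", "employer": "Acme Corp", "recipient": "Jane Doe", "amount": 500},
    {"donor": "B. Jones", "employer": "Acme Corp", "recipient": "Jane Doe", "amount": 250},
    {"donor": "C. Lee", "employer": "Globex", "recipient": "Jane Doe", "amount": 1000},
]
article = "Rep. Jane Doe praised Acme Corp for opening a new plant in her district."

print(aggregate_contributions(article, records))
```

Here the two Acme Corp employees are rolled up into a single organization-level total, while the Globex donation is left out because Globex is not mentioned in the article.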


How Journalists Can Incorporate Computational Thinking into Their Work

Over the last few years, the journalism community has discussed mindset, skillset, journalist-programmers, and other ideas aimed not just at “saving journalism,” but making journalism better. Perhaps now it’s time to discuss how we think about journalism.

Greg Linch, the news innovation manager at Publish2, has been spreading an idea he calls “Rethinking Our Thinking.” The core of this idea is that journalists should explore other disciplines for concepts that they can use to do better journalism.

Linch begins this process by reading and writing about “computational thinking.” He asks, “What from the field of computation can we use to do better journalism?”

Jeannette Wing, a professor of computer science at Carnegie Mellon University, described computational thinking in the 2006 article that sparked Linch’s interest:

“Computational thinking involves solving problems, designing systems, and understanding human behavior, by drawing on the concepts fundamental to computer science. Read more

How to Deal with Web Browser ‘Fingerprints’

A few years ago, The New York Times exposed how “anonymous” search data isn’t anonymous by using saved AOL search terms to track down an elderly widow in Georgia. Now, the Electronic Frontier Foundation has revealed that Web browsers leave information on websites you visit, which could be used to track your digital movements.

Volunteers for an EFF experiment visited a test website, which logged the data that are automatically collected when you visit most sites: the configuration and version of a user’s operating system, browser and plug-ins.

That information was compared with a database of configurations from other visitors.

EFF found that 84 percent of the configuration combinations ended up identifying unique browsers — essentially acting as fingerprints. Browsers installed with Adobe Flash or Java plug-ins were unique and trackable 94 percent of the time. Read more
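To see why a combination of ordinary settings can act as a fingerprint, here is a minimal Python sketch. It is not EFF's code: the attribute names, the sample visitors and the choice of a SHA-256 hash are assumptions made for illustration. It hashes each configuration and reports the share of visitors whose hash nobody else shares.

```python
import hashlib
from collections import Counter


def fingerprint(config):
    """Hash a browser configuration into a short, stable identifier."""
    # Sort the keys so the same settings always produce the same string.
    canonical = "|".join(f"{key}={config[key]}" for key in sorted(config))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def unique_share(configs):
    """Return the fraction of visitors whose fingerprint is shared by no one else."""
    prints = [fingerprint(c) for c in configs]
    counts = Counter(prints)
    unique = sum(1 for p in prints if counts[p] == 1)
    return unique / len(configs)


# Hypothetical visitors, invented for illustration only.
visitors = [
    {"os": "Windows 7", "browser": "Firefox 3.6", "plugins": "Flash 10.0"},
    {"os": "Windows 7", "browser": "Firefox 3.6", "plugins": "Flash 10.0"},
    {"os": "Mac OS X 10.6", "browser": "Safari 4", "plugins": "Flash 10.1, Java 1.6"},
    {"os": "Ubuntu 9.10", "browser": "Firefox 3.5", "plugins": "none"},
]

print(unique_share(visitors))  # 0.5: two of the four configurations are one of a kind
```

The more attributes a site records, the fewer visitors share any one combination, which is consistent with the higher uniqueness rate the article reports for browsers with Flash or Java installed.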


Jay Rosen’s ExplainThis Would Have Journalists Answer Users’ Questions

If you listen to Rebooting the News, a podcast done by Jay Rosen, a journalism professor at NYU, and Dave Winer, often described as the father of blogging and RSS, you’ve heard their ongoing discussion about the importance of context and explanation in a new system for news.

Building on those ideas and several existing projects, Rosen has developed an idea that could make journalism better by allowing more people to participate in the process: ExplainThis.

ExplainThis has two parts. One is an open system through which anyone can ask and answer questions and vote on them. The second part involves “journalists standing by.” Journalists would monitor questions, looking for ones that meet three conditions:

  • Many people are asking the same thing.
  • The question can’t be answered well via search.
Read more


Washington Post’s ‘Post Alert’ Offers Breaking News, Special Projects Updates to Users

The Washington Post has released a site-wide notification system that delivers notices on breaking news and special reports to users of the Web site.

Steven King, who is overseeing the project, told me in a phone interview that editors at the Post can choose to promote stories site-wide or within a section. Anyone who is on the Web site during that time will see a Post Alert. Internally, this project is known as Toast because, as the “Innovations in News” blog said, “it came up from the bottom of your browser like a piece of toast coming out of a toaster.”

The Washington Post is able to track the number of people who click on links, as well as those who opt out. King said that although it has been only a week since Post Alert launched, he is “very happy with what’s happening.”

The opt-out rate has been low, and the Alert links are being clicked on, driving traffic to special sections, King said. Read more


‘Apps For America’ Shows Innovative Ways to Display Government Data

The Sunlight Foundation, a nonprofit dedicated to greater government openness and transparency via the Internet, recently announced the winners of the “Apps for America 2: The Challenge” development contest. There is a lot to learn from the winners: Datamasher, GovPulse and ThisWeKnow.

News organizations have been putting data online for years, but not many of them have been doing it well. (Think data ghettos.) As government agencies and third parties place a high priority on sharing information that’s key to public discourse, news organizations may benefit from observing how they put data online.

Apps for America 2 was a direct response to the launch of, which makes federal data sets available to the public. The goal of the development contest, according to Clay Johnson, director of Sunlight Labs, was to show that when the federal government releases data, “it makes itself more accountable and creates more trust and opportunity in its actions.”

Developers had to create a Web application that used at least one data source from Read more


How AP’s News Registry Will (and Won’t) Work

The Associated Press’s announcement of a news registry to “track and tag all AP content” to “assure compliance with terms of use” has stirred a lot of discussion. For techies and journalists alike, it’s unclear how the registry will work, whether it will do what AP claims, and how it will fit in with copyright law and the culture of the Web.

The news registry was announced as part of the AP’s initiative to “protect news content from misappropriation online.” Bloggers worried that AP was after them, spurred by AP CEO Tom Curley’s statement to The New York Times that the registry would be used to regulate even the use of a headline and a link to an article. Others at the AP, however, have said that the news organization has no problem with people quoting its content in the course of blogging. Read more