ProPublica’s DocDiver helps users collaborate in document-based investigations

October 4, 2011

A ProPublica reporting project published today turns primary source documents into a platform for crowdsourcing and reader collaboration.

Readers’ findings are displayed in a sidebar next to the relevant portion of the document.

The investigative reporting nonprofit built a tool called DocDiver, a plugin for DocumentCloud that creates an annotation layer on top of document pages. Readers can make notes on the document, and journalists and other readers can see those notes threaded in a sidebar.

“The tool enables much closer collaboration between journalists and their readers in real time,” said Amanda Michel, ProPublica’s director of distributed reporting (and a member of Poynter’s National Advisory Board). In this case, it supplements reporter Paul Kiel’s latest report on the failed oversight of mortgage lenders by inviting people to annotate three government audits of GMAC.

Of course, many news organizations have taken to posting source documents online. DocumentCloud’s built-in annotation tool enables journalists to annotate the documents they upload.

But until now, most routes for reader feedback were disorganized — packed in a single, separate comment thread — or hidden from public view. When The New York Times published Sarah Palin’s emails in June, for example, readers could privately send the Times messages about notable passages; journalists reviewed them and posted annotations if they decided it was warranted.

DocDiver improves on the status quo by making crowdsourced contributions both public and organized, Michel said. It enables readers “to move through a document alongside a reporter, and for a reporter to be able to publicly address and answer people’s questions and note great findings.”

To use DocDiver, a reader logs in with her Facebook profile, clicks a relevant section of the document and writes a comment (which ProPublica calls a “finding”). The comment immediately goes live so others can read and respond. Readers also have the option to post a finding and a link to it on their Facebook wall, a good example of how social integration can drive more attention and participation.

ProPublica staff can post their own findings and highlight “key findings” among those submitted by readers. Readers can also upvote the ones they find useful. (ProPublica also blogged about more technical details on its Nerd Blog.)
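The workflow described above (findings anchored to a spot in a document, threaded replies, reader upvotes and staff-chosen "key findings") can be pictured with a small data model. The following is only an illustrative sketch in Python, not ProPublica's actual code: the `Finding` class, its field names and the `sidebar_by_page` helper are all hypothetical, and the ordering rule (key findings first, then most upvoted) is an assumption rather than something the article confirms.

```python
from dataclasses import dataclass, field
from collections import defaultdict


@dataclass
class Finding:
    """One reader or staff note anchored to a document page (hypothetical model)."""
    author: str
    page: int                 # page of the document the note is attached to
    text: str
    upvotes: int = 0          # count of readers who found the note useful
    is_key: bool = False      # set by staff to mark a "key finding"
    replies: list = field(default_factory=list)  # threaded responses


def sidebar_by_page(findings):
    """Group findings by page, listing key findings first, then by upvotes.

    The ordering is an assumption made for illustration only.
    """
    pages = defaultdict(list)
    for finding in findings:
        pages[finding.page].append(finding)
    for page in pages:
        pages[page].sort(key=lambda f: (not f.is_key, -f.upvotes))
    return dict(pages)


findings = [
    Finding("reader_a", 3, "This figure looks inconsistent.", upvotes=2),
    Finding("staff", 3, "Confirmed with the agency.", is_key=True),
    Finding("reader_b", 3, "See the footnote here.", upvotes=5),
    Finding("reader_c", 7, "Who signed this page?"),
]
sidebar = sidebar_by_page(findings)
page_three_authors = [f.author for f in sidebar[3]]
```

In this sketch the sidebar for page 3 would list the staff-highlighted note first, followed by reader notes in descending order of upvotes.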

What they hope will emerge is a pool of intelligence and analysis that not only engages readers but also informs journalists and can lead to follow-up stories, said Al Shaw, the news applications developer who built DocDiver.

The tool probably isn’t appropriate for every document or story, Shaw said. DocDiver will be best “for stories that are ‘document dives’ — where there’s more information there than we have bandwidth to go through — and so we want to galvanize our users to go through and find stuff that we’re going to miss.”

A good example came when the Guardian obtained 458,000 pages of expense records for members of Parliament: the news outlet built a Web system that let the public help review, categorize and find newsworthy details among them.

The Guardian custom-built the interface for that story. The DocDiver tool can be used on any set of records that ProPublica uploads to DocumentCloud. Over time, Shaw hopes collaboration with readers will enable ProPublica journalists to take on more document-based stories and surface the most newsworthy facts buried in the pages.

DocumentCloud is building its own reader annotation tool with this year’s $320,000 grant from the Knight News Challenge. Among the problems DocumentCloud is trying to solve: how to organize a large number of annotations in one part of a document, how to make those notes usable by the news organization, and how to let the news organization moderate posted notes.

ProPublica’s approach to these issues is to thread comments next to the document, let users vote up the most helpful comments, and have journalists highlight the best comments in a document overview section. We’ll see if DocumentCloud comes up with different solutions.