A group weblog about the intersection of news & technology

How Journalists Can Incorporate Computational Thinking into Their Work

Over the last few years, the journalism community has discussed mindset, skillset, journalist-programmers, and other ideas aimed not just at “saving journalism,” but making journalism better. Perhaps now it’s time to discuss how we think about journalism.

Greg Linch, the news innovation manager at Publish2, has been spreading an idea he calls “Rethinking Our Thinking.” The core of this idea is that journalists should explore other disciplines for concepts that they can use to do better journalism.

Linch begins this process by reading and writing about “computational thinking.” He asks, “What from the field of computation can we use to do better journalism?”

Jeannette Wing, a professor of computer science at Carnegie Mellon University, described computational thinking in the 2006 article that sparked Linch’s interest:

“Computational thinking involves solving problems, designing systems, and understanding human behavior, by drawing on the concepts fundamental to computer science. Computational thinking includes a range of mental tools that reflect the breadth of the field of computer science.”

The three major areas that Wing outlines are automation, algorithms and abstraction.

Automation: How can we automate things that need to be done manually each time?

Good examples of automation applied to journalism include acquiring data through an API, aggregating links with Publish2 or pushing RSS feeds through Twitter. Projects like StatSheet, Neighborhood Watch and NPRbackstory show this kind of automation at work.
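The RSS-to-Twitter idea can be sketched in a few lines. The feed items and short links below are hypothetical; a real script would parse an actual RSS feed and post through Twitter's API rather than printing.

```python
# A minimal sketch of feed-to-Twitter automation: format new feed items
# as tweet-length posts instead of writing each one by hand.

def format_tweet(title, url, limit=140):
    """Truncate a headline so headline + space + link fits within the limit."""
    room = limit - len(url) - 1  # reserve one space between headline and link
    if len(title) > room:
        title = title[:room - 1].rstrip() + "…"
    return f"{title} {url}"

# Hypothetical feed items; a real script would parse an RSS feed instead.
items = [
    ("City council approves new budget after marathon session", "http://exm.pl/a1"),
    ("Road closures announced for weekend parade", "http://exm.pl/b2"),
]

for title, url in items:
    print(format_tweet(title, url))
```

The point is not the formatting itself but that, once written, the step runs unattended for every new item.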

Derek Willis recently wrote about how The New York Times uses APIs to “cut down on repetitive manual work and bring new ideas to fruition.”

The Times’ APIs make it easier to build applications and graphics that use some of the same information, such as “How G.O.P. Senators Plan to Vote on Kagan” and “Democrats to Watch on the Health Care Vote.”

Algorithms: How can we outline steps we should take to accomplish our goals, solve problems and find answers?

For example, journalists have a process for verifying facts through reporting. We ask sources for background information, sort through data, do our own research and conclude whether a statement is a fact.

A cops reporter’s call sheet is another algorithm: It’s a list of police and fire department phone numbers that the reporter is supposed to call at specified times to see whether there’s any news. Similarly, some news organizations have outlined processes on how to get background information on candidates, such as educational history, arrest records and business holdings.

Does your organization have a flowchart or a list for situations like these? Many reporters don’t like rules, but algorithms help make information-gathering more reliable and consistent.
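A call sheet like the one above can be written down as data plus a rule, which is all an algorithm is. The agencies, phone numbers and hours in this sketch are invented.

```python
# A hypothetical cops-reporter call sheet expressed as an explicit algorithm:
# each agency has a phone number and the hours (24-hour clock) it gets called.

CALL_SHEET = [
    {"agency": "City Police", "phone": "555-0101", "hours": [6, 12, 18, 23]},
    {"agency": "Fire Department", "phone": "555-0102", "hours": [7, 19]},
    {"agency": "County Sheriff", "phone": "555-0103", "hours": [8, 20]},
]

def calls_due(hour):
    """Return the agencies that should be called at the given hour."""
    return [entry["agency"] for entry in CALL_SHEET if hour in entry["hours"]]

print(calls_due(7))  # the 7 a.m. round
```

Writing the routine down this way makes it repeatable by any reporter on any shift, which is exactly the reliability the flowchart question is getting at.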

Abstraction: At what different levels can we view this story or idea?

PolitiFact started out as a way to examine candidates’ claims during the 2008 presidential campaign. It now examines statements made in national politics, keeps track of President Obama’s campaign promises, and has branched out to cover politics in certain states. Earlier this year, PolitiFact teamed up with ABC’s “This Week” to fact-check guests on the show. PolitiFact could easily cover international politics as well.

In 2008, The New York Times built a document viewer to show Hillary Clinton’s past White House schedules. Programmers saw that the document viewer could be used for other stories, so they kept improving it.

Then a few people realized how the viewer could be part of a repository of documents, and DocumentCloud was born. The service builds on the Times’ code to create a space where journalists can share, search and analyze documents. DocumentCloud is an abstraction of the Times’ original document viewer.

Using computational thinking to improve corrections

Finally, an example in which all the aspects of computational thinking can make journalism better: corrections.

Scott Rosenberg of MediaBugs recently wrote about how badly news organizations handle corrections online. Rosenberg suggested some best practices for corrections: Make it easy for readers to report mistakes to you; review and respond to all error reports; make corrections forthright and accessible; make fixing mistakes a priority.

Some of these things can be automated. An online error report goes straight to someone who can manage it. Maybe the reader gets an automated “thank you” e-mail.

There could be an algorithm for investigating the error — or for fact-checking — and another algorithm to handle typos differently than factual errors.
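A minimal sketch of that triage, assuming two hypothetical queue names and a naive keyword test; a real system would need something more robust than word matching.

```python
# Sketch of automated triage for reader error reports: typos go to the
# copy desk, everything else to the reporting desk for verification.

TYPO_KEYWORDS = {"typo", "spelling", "misspelled", "grammar"}

def route_report(report_text):
    """Return the queue an error report should be sent to."""
    words = set(report_text.lower().split())
    if words & TYPO_KEYWORDS:
        return "copy-desk"
    return "reporting-desk"  # factual claims get the full verification process

def acknowledge(reader_email):
    """Build the automated thank-you note mentioned above."""
    return f"To: {reader_email}\nThank you for your report. We are looking into it."

print(route_report("There is a typo in the third paragraph"))
```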

Along the way, those readers who help your organization fix errors might become sources and contributors.


The point of “Rethinking Our Thinking,” Linch told me, is “not to try to fit things into the computational thinking box, but to consider the applications of computational thinking to improve the process of journalism.”

Perhaps we can apply methods of thinking used in other disciplines in the same way we apply “critical thinking” to journalism — less a conscious act and more a general awareness of concepts that can improve the practice.


Monday, July 12, 2010

Chat Replay: How Can Journalists and Programmers Collaborate More Effectively?

It’s oversimplified to call it a right-brain, left-brain difference, but it’s clear that while programmers and journalists need each other, they don’t always find it easy to work together. Differences in project needs and personal styles can add to the disconnect.

Below, you can replay a chat we held about the practical ways to help journalists and programmers collaborate. Here are the folks we talked to:

<a href="http://www.coveritlive.com/mobile.php/option=com_mobile/task=viewaltcast/altcast_code=2402af06b0">Live Chat: How Can Journalists and Programmers Collaborate More Effectively?</a>


Sunday, July 04, 2010

From Open Mics to Buzz Brokers, ‘Content Farms’ are Not all Created Equal

They go by a variety of euphemisms, from “content mills” and “content farms” to “content creation houses” and the Fifth Estate, but make no mistake: sites that specialize in the production and distribution of user-generated content are influencing the news industry and journalism.

The evergreen content produced by Demand Media, Helium.com, and Associated Content finds its way from these platforms to a variety of media partners, including newspapers, magazines and online news providers seeking to add local or evergreen content to their sites.

These partnerships generate low-cost content for publications and revenue for the content provider. And for some writers, these opportunities provide them with credibility and a small amount of regular income.

In a recent webinar hosted by Poynter’s News University, Mitch Gelman, Vice President of Special Projects at Examiner.com — a relatively recent addition to the stable of content creation houses — discussed the differences between these sites.

Gelman introduced three basic models that describe the writers drawn to the sites and the content that they provide.

  • Open Mic sites have their roots in the “Speaker’s Corner.” People drive the content production on these sites. Both Associated Content and Helium.com have Open Mic components to their content production models. Demand Media may be adding this to their offerings in the near future as well.
  • Buzz Brokers analyze search trends and put out calls for stories. Associated Content has incorporated this model, and this is the primary content model at Demand Media.
  • Pro-Am sites reach out to people in neighborhoods who can contribute. The model’s roots are in the stringer model of local newspapers, and these sites seek to develop their contributors’ skills. Helium and Examiner.com make use of aspects of this model.
The three pieces in this series touch on elements of each model. The partnership between Demand Media and USA Today for travel content reflects some buzz brokering. The purchase of Associated Content by Yahoo! may demonstrate the value of a local open mic. And Helium’s strategy of credentialing writers underlines a distinct pro-am influence.

Key to how each of these develops will be the evolution of their contributor communities. Their respective page views make it clear there is a huge demand for the content produced. And if contributor participation and collaboration continue, maybe these upstart disruptors will begin to replicate a virtual newsroom experience while expanding their business models.


Friday, June 25, 2010

How the Semantic Web Can Connect News and Make Stories More Accessible

Tom Tague isn’t content to let an article just be an article. “How do I take a chunk of text,” he asked, “and turn it into a chunk of data?”

He was speaking Thursday night at a panel discussion hosted by Hacks/Hackers, a San Francisco-based group that bridges the worlds of journalism and engineering. Coinciding with the 2010 Semantic Technology Conference, Thursday’s presentation dealt with the Web’s evolution from a tangle of text to a database capable of understanding its own content.

Tague, vice president for platform strategy with Thomson Reuters, was joined by New York Times Semantic Technologist Evan Sandhaus, allVoices CEO Amra Tareen, and Read It Later creator Nate Weiner. The semantic Web is already here, they explained, and it’s getting smarter.

Make news worth more

Simply put, the semantic Web is a strategy for enabling communication between independent databases on the Web.

For example, Sandhaus said, there’s a wealth of priceless data in databases at Amazon, the Environmental Protection Agency, the Census Bureau, Twitter and Wikipedia. “But they don’t know anything about one another,” he said, so there’s no way to answer questions like, “What is the impact of pollution on population?” or “What do people tweet about on smoggy days?” (Sandhaus said he did not do his presentation as a representative of the Times.)

This is a particular problem for news publishers, said Tague. Publishers need to monetize content, engage with users and launch new products; since news articles lie in a “sweet spot” between fleeting tweets and durable scientific journals, they have the most potential to grab and retain readers.

In other words, it’s possible for publishers to improve the value and shelf life of news. All that’s required is rich metadata.

Metadata, Tague said, improves reader engagement by linking together related media. For readers, that means more context on each story and a more personalized experience. And for advertisers, it means better demographic data than ever before.

But there’s a problem: Currently, the economics of online news doesn’t support the manual creation of metadata.

Let algorithms curate

Tague’s solution to the Internet’s overwhelming volume of news is OpenCalais, a Thomson Reuters tool that can examine any news article, understand what it’s about, and connect it to related media.

This is more than a simple keyword search. OpenCalais extracts “named entities,” analyzing sentence structure to determine the topic of the article. It is able to understand facts and events. For example, when fed a short article about a hurricane forming near Mexico, an OpenCalais demo tool recognized locations like Acapulco, facilities like the National Hurricane Center and even occupations like “hurricane specialist.” It also understood facts, synthesizing a subject-verb-object phrase to express that a hurricane center had predicted a hurricane.

OpenCalais has already been put to work at a wide range of news organizations, including The Nation, The New Republic, Slate and Al Jazeera. Each site’s implementation is unique; for example, DailyMe uses semantic data to monitor each user’s reading habits, presenting the user with personalized reading suggestions.

Both The Nation and The New Republic saw immediate benefits from OpenCalais, Tague said: its adoption coincided with significant gains in time-on-site, and the tool automatically generates pages dedicated to a single topic, which had been a labor-intensive process for editors.

Overcome overwhelming content

As OpenCalais frees editors from the minutiae of searching for complementary stories, Nate Weiner’s software facilitates the gathering of reading material. Read It Later integrates with browsers and RSS readers; when users see something that they want to read later, they simply flag the page and the application gathers it for later consumption.

Unfortunately, users can sometimes wind up with an overwhelming, disorganized collection of articles. So Weiner decided to teach the application how to group similar items, making them easier to skim and select.

Initial experiments with manual tagging didn’t work out, since users weren’t interested in taking the time to add tags to every article they collected. So Weiner turned to semantic applications that could automatically analyze each article and organize related topics. His tool of choice: OpenCalais, which turned Read It Later’s “Digest” view from an unwieldy list into a magazine-like layout.

Organize the organizing

Sandhaus described the alchemy of the semantic Web as “graphs of triples,” which drew furrowed brows from his audience. But it turned out not to be as complicated as it sounds; the “triples” are just simple subject-verb-object sentences, chained together. For example, if a tool detects “Barack Obama” in an article, it will scan nearby words to create a relationship like “Barack Obama is the President.” Then it can build on its knowledge of “the President” to branch further out: “The President lives in the White House,” “The White House was burned in 1814,” and so on.
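That chaining can be sketched with a toy triple store. The traversal below assumes the graph has no cycles and simply follows the first matching fact at each step; real triple stores are far more capable.

```python
# A toy "graph of triples": each fact is a subject-verb-object sentence,
# and facts chain when one triple's object is another triple's subject.

triples = [
    ("Barack Obama", "is", "the President"),
    ("the President", "lives in", "the White House"),
    ("the White House", "was burned in", "1814"),
]

def chain_from(subject, facts):
    """Follow object-to-subject links to walk outward from a starting entity."""
    path = []
    current = subject
    while True:
        nxt = next((t for t in facts if t[0] == current), None)
        if nxt is None:
            break
        path.append(nxt)
        current = nxt[2]  # the object becomes the next subject to look up
    return path

for s, v, o in chain_from("Barack Obama", triples):
    print(s, v, o)
```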

These relationships are derived from massive databases that grow larger and larger by the day. For example, DBpedia has turned Wikipedia into a database of 2.6 million entities; Freebase is a database of databases with 11 million topics; GeoNames tracks 8 million place names; and MusicBrainz can recognize 9 million songs.

But the real magic happens when the databases come together, such as when the BBC wanted to create a comprehensive resource for information about bands. By merging its own information with entries from Wikipedia and MusicBrainz, the BBC created a website that seems to know everything about music.

Trust algorithms, but trust humans more

As smart as the semantic Web can be, it’s still not as smart as a human editor. “Our algorithms can never be perfect,” said allVoices CEO Amra Tareen. Her company provides citizen journalists with their own news platform, incentivizing high-quality reporting with payments based on page views.

Since its launch in 2008, allVoices has scanned articles to generate what Tareen called a “bag of words” that connects each story to complementary reporting. Depending on a reporter’s algorithmically calculated reputation and users’ engagement with the story, the story can work its way up from a local section to national or even global focus on the site.

Tareen estimates that the curating of news on the site is about 20 percent human and 80 percent algorithmic.

Expect to see more semantic Web tools

Expect to see more semantic Web technology — lots more, and soon. “There’s growing momentum in this space,” said Sandhaus, gesturing to a slide showing exponential growth of connected databases. “The more that you put yourself out there and people point back to you, the easier you are to find.”

Fortunately for journalists, the semantic Web will work for humans, not the other way around. “We don’t want to get in the way of the journalistic process,” said OpenCalais’ Tague. That’s welcome news to any reporter who has been frustrated by a clunky content management system, a labyrinthine tagging and categorization system or manual photo management.

Semantic Web developers’ goal, Tague said, is to free journalists to report, rather than sentencing them to generate endless metadata for the sake of SEO. “I hate the idea of journalists writing for searchability,” he said. “That’s a problem we should solve on the tech side.”


Weiner of Read It Later agreed. Speaking on behalf of developers, he advised journalists, “Keep doing what you’re doing. We’ll try to adapt.”


Monday, June 21, 2010

Why USA Today Partnered with Demand Media

As more news organizations begin to consider integrating user-generated content into their daily offerings, several traditional news publishers, such as Hearst, have started using various forms of user-generated content from content production sites like Helium.com and Associated Content. Demand Media is the newest and perhaps most closely watched of the content production sites.

Concern over Demand stems not just from its 2008 merger with blog syndicator and aggregation-software developer Pluck, but also from its proprietary algorithm, which is said to help content producers generate keyword-rich content that reaches the first pages of Google and other search results.

In the deal between Demand Media and USA Today, Demand provides 4,000-plus keyword-rich “Travel Tips” articles and other types of content that will be cached in USA Today’s Travel Section. Demand Media will also provide keyword-rich advertising to accompany the content. While the article content will be free to USA Today, the revenue generated from the ads will be split between the news organization and Demand.

Recently, I had the opportunity to correspond with Victoria Borton, General Manager of the Travel section at USA Today, on the decision to partner with Demand Media and the benefits.

Tish Grier: Were other content outlets considered before Demand Media was chosen?

Victoria Borton: USA TODAY had a long-standing relationship with Pluck through our integration of their social media tools on USATODAY.com, enabling community as part of our Network Journalism launch in 2007. Pluck introduced us to Demand Studios about extending our relationship around their search-optimized content model, and we agreed travel was a category where creating a co-branded section using their approach made sense.

… We’ve had a positive relationship with Pluck since the launch of Network Journalism on USATODAY.com in 2007.

How important was Demand’s ad production and placement plan to the deal?

Borton: Demand Media’s system to create content based on search trends and the corresponding advertising model provided a strong business case for entering into this relationship.

Why was the Travel section chosen over other USAT sections that feature evergreen content? Is there an expectation that Demand’s content will help the USAT Travel section become a “destination site” known for its travel info?

Borton: Travel is an area where consumers are always looking for functional, actionable tips and information around a wide variety of topics. It’s an ideal area to offer travel tips. USA TODAY Travel is already a popular destination site for original, trusted travel information, and the addition of Travel Tips is one way to broaden our overall content offering.

What might be expected earnings from the travel section now that this deal is in place?

Borton: We are most excited about the demand-driven, search friendliness of this content, and its ability to bring new users to our site. As traffic increases over time, advertising revenues will follow those traffic increases.

Are there any plans to extend Demand content to other sections in USA Today, or to use them for any news or investigative reporting?

Borton: We will watch the performance of the section over time and make further decisions on whether to extend to other areas if it makes sense for both our audience and our business. While Demand Media’s co-branded content expands our overall offering to our audience, there has been no thought that it would replace our existing content coverage, news and investigative reporting in any way.

Much has been made of the possibility of Demand’s content not meeting with prevailing journalistic standards. Could you comment on Demand’s standards and how those standards relate to the journalistic standards maintained by USA Today?

Borton: We worked with Demand Media to share our overall editorial guidelines, and they selected their top writers with existing travel experience for our project. USA TODAY reserves the right to remove content we don’t feel is up to our standards. For this type of consumer service content, we are happy with the quality to date.

How might you describe the relationship between “content” and “journalism”?

Borton: Journalism is core to the USA TODAY brand — it’s our unique investigation and reporting around timely events and items of interest. Content can be anything consumed by a user: data, information, listings, photos, videos, maps and so on.

Note: On June 14, USA Today announced a partnership with location-based social network Gowalla. I asked Borton if any of the Demand Media content would be served on the three Gowalla applications. She responded: “All three of the USA TODAY content features appearing on the Gowalla application are written by USA TODAY staffers and freelance columnists.”


Helium Hopes Credentialing Sets it Apart from other Social Content Producers

For those who are concerned about the future of news, the notion that a “content mill” could produce quality journalism seems to be anathema.

But Mark Ranalli, CEO of Helium.com, has been working towards building the kind of online community that could do that.

In a recent conversation with Ranalli, he explained that since its launch in 2006, Helium has been growing as both a content platform and community in many different ways. One of the significant changes is Helium’s Credentialed Professional Program.

As more professionals have come to Helium, some via its partnership with the Society of Professional Journalists, Helium needed a system that brought their offline credentials into the online community.

For example, a journalist or SPJ member can apply to Helium’s credentialing board with all the necessary information, and the board will check those credentials. If the writer is credentialed as a journalist, then he will receive the appropriate site badge, and a four-star ranking.

A paramedic who is writing on health issues might apply to be credentialed as a medical professional. Credentialing and badges let others know that the writers are people who have substantial experience in particular fields and that their work can be trusted.

Helium has also assembled a credentialed Editorial Team. Potential editors must apply for a position and pass what Ranalli describes as a very stringent test of their editorial skills before being considered for the team. “I know that the people on our editorial team are top-notch,” Ranalli said, “because even I can’t pass our editor’s test.”

Since the implementation of credentialing and the introduction of the Editorial Team, Ranalli noted that more magazine and online publishers are turning to Helium’s content rather than to freelancers. The pay, however, is lower than what freelancers may once have made. Ranalli sees a downward trend for wages: “People might not get paid the same amounts as in the past, but they will be paid and published.”

Credentialing may also be important as Helium considers doing investigative journalism. In December of 2009, Helium News was introduced to encourage more news-style reporting, as well as collaboration between contributors. Ranalli believes these changes begin creating an online newsroom experience, where seasoned reporters mingle and exchange ideas with new writers.

This “virtual newsroom” community does not fast-track publishing on Helium. All articles, whether or not they come from a credentialed writer, are first submitted to a blind peer-review process. This process has always been part of the Helium model of editorial oversight, partly because it lessens the likelihood that stories will be approved based on a writer’s popularity.

Ranalli also considers the blind review process an important way to bring forth new voices that might otherwise never be heard and that have the potential to make a strong contribution to journalism.

Helium, over the years, has created partnerships with prestigious organizations in order to raise the profile of its writers. A partnership with the National Press Club has opened the doors of that 100-year-old organization to Helium contributors who have earned a five-star ranking.

And an ongoing relationship with The Pulitzer Center for Crisis Reporting brings the Global Issues/Citizen Voices essay contest on under-reported topics to the Helium community. The current and ninth contest is focused on global maternal health.

This partnership and others have led to the Citizen Journalism Awards, which cover a broad range of topics and are sponsored by organizations as diverse as The Sunshine Foundation, the Knight Center for International Media and ITVS (for the 1H2O Project), and PETA.

Helium hopes these partnerships and its editorial processes elevate it above other content-creation companies. At least some believe it has. The Massachusetts firm was recently named one of the “Hottest Boston Companies.”


How Associated Content Helps Yahoo Go Local

Since 2004, Associated Content — “The People’s Media Company” — has grown a stable of more than 380,000 loyal content producers who have contributed over two million pieces of text, audio, video and photographic content to its distribution platform. In mid-May, it was announced that Associated Content had been sold to Yahoo! for a little more than $100 million, and that the Associated Content website would be shut down when the sale is complete in the third quarter of this year.

How will Associated Content continue to court the loyalty of its contributors while the sale and shutdown are pending? And what — beyond the obvious advantages of loyal contributors and a huge cache of money-earning, evergreen content — does Yahoo get? I posed these questions to Patrick Keane, CEO of Associated Content.

But first, here’s how it works now. Associated Content’s writers create self-selected and assignment-based content. Most of what is produced is evergreen content, but there are also personal essays, product reviews, and the like. While some content is paid at scale or “upfront,” Keane explained that various types of content are often valued individually, according to the form (text, video, etc.) and potential earnings.

Since monetization happens over the lifetime of an article, and articles are considered annuities for both Associated Content and the producers, potential earnings are determined by a number of factors, including Web search results and AdSense metrics.

Content contributors have a number of options to help them distribute and promote their work across social networking sites and blogs. Contributors can also rely on the site’s search engine optimization and how-to guides for creating headlines and leads with search-friendly keywords. The combination of self-promotion and search engine optimization helps producers maximize what is available to them beyond the upfront payment system.

Keane said that “no immediate changes” would be made to the payment process. He elaborated: “We remain committed to the people that produce content. The acquisition by Yahoo! brings a great deal of opportunity for them and this will increase our contributor base. Contributors will continue to create and upload content onto Associated Content’s platform. They will now be supported with a much larger distribution – 600 million unique monthly visitors.

“There may be tweaks and changes to the process of content creation in the future,” he continued, “but both Yahoo! and Associated Content are committed to maintaining the standards our contributors are used to in order to produce the most useful, original content by the people, for the people. Yahoo! plans to leverage content from our contributors across its leading media properties including Yahoo! News, Yahoo! Sports and Yahoo! Finance.”

Contributors will also have the opportunity to produce for Shine, Yahoo! Movies, OMG, and most of the Yahoo! network.

Associated Content currently partners with media organizations including Thomson Reuters, Cox Newspapers and CNN. Keane said the sale is viewed positively by them. “Yahoo! partners and collaborates with publishers and they view the acquisition of Associated Content as an opportunity to extend those partnerships,” he told me. “We envision that this agreement will open new opportunities to partner with other companies that share the same mission of producing high-quality original content at scale. No specific changes have been made at this point.”

There may be a battle brewing over who will produce fresh, news-style content, though. Even though its focus until now has been on the production of evergreen content, with less than 10 percent considered “news,” there are a number of seasoned journalists who contribute news-style content to several of AC’s verticals, including Sports and Society.

Prior to the sale, I had asked Keane about the potential for Associated Content to create local news. “Using the virtual assignment desk, we can activate any audience in any ZIP code,” he responded. “So, then, we could potentially have someone follow the story of a plane crash. We can activate people in any community to create news stories if we’d like to do that. But that’s not our focus.”

Yahoo!, however, will now be able to throw its hat in the ring, alongside others such as AOL, in the fight to produce content for local portals.

When asked after the sale whether Associated Content might begin to produce more locally-focused content, Keane responded: “The local section on Associated Content’s site already has a library full of locally-focused content across several topics. Yahoo! intends to leverage the Associated Content platform to generate content across their properties, including local content. This deal will help Yahoo! provide useful local content, as Associated Content has the unique ability to tap 380,000 ‘man-on-the-street’ contributors who are experts in their locale and can produce high-quality content in real time from any DMA.”


Thursday, June 10, 2010

Miami Herald Marks Anniversary of Mariel Boatlift with Database of Passengers, Vessels

A Miami Herald database has publicized in-depth information on one of the most important events in Cuban emigration. A reporter, a data analyst and a Web developer worked for months to digitize and organize little-known data about the 1980 Mariel boatlift; the database was published in late May to commemorate the 30th anniversary of the vessels’ arrival in the United States.

The data sets are more than mere numbers and names; every record hints at the story of someone beginning a new chapter of his or her life. It’s a powerful demonstration that data-driven projects can be much more than stark, emotionless series of numbers.

The project tracks more than 125,000 passengers of the 1980 Mariel boatlift from Cuba to Florida, one of three post-Castro exoduses. The idea behind the database was to create a master list of people who arrived during the boatlift, culled from raw, unstandardized logs obtained from an unnamed government source. The Herald planned to encourage people who were part of the boatlift to help create a comprehensive list of vessels that made the trip and to match people to vessels.

For the reporter who compiled the data, this was more than a special assignment; it was an opportunity to bring in-depth coverage to an experience relevant to her own life.

Staff writer Luisa Yanez came to the U.S. on the Freedom Flights, another exodus from Cuba to Florida. “Today, there is no master list, no Ellis Island-type record to mark the arrival of Cubans in Miami,” Yanez wrote in an e-mail. “The goal of the Mariel Database is to fill that hole for one of our best-known exoduses by creating a passenger list for each vessel.”

Cleaning the list of refugee names, which mostly meant double-checking every record for accuracy and removing obvious errors, took Yanez about five months. She said she was freed from her daily deadlines to work with the data.
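The kind of record-by-record cleanup Yanez describes can be partially automated before the manual checking begins. A minimal sketch in Python, assuming the raw log arrives as a plain list of name strings (the sample entries here are hypothetical, not from the Herald’s data):

```python
import re

def clean_names(raw_names):
    """Normalize a list of passenger names and drop obvious errors.

    Collapses stray whitespace, standardizes capitalization, removes
    blank or letterless entries (e.g. stray codes), and de-duplicates
    exact repeats while preserving order.
    """
    seen = set()
    cleaned = []
    for name in raw_names:
        name = re.sub(r"\s+", " ", name).strip().title()
        if not name or not re.search(r"[A-Za-z]", name):
            continue
        if name not in seen:
            seen.add(name)
            cleaned.append(name)
    return cleaned

raw = ["  luisa   YANEZ ", "Luisa Yanez", "", "1234", "jose  perez"]
print(clean_names(raw))  # ['Luisa Yanez', 'Jose Perez']
```

A pass like this only catches mechanical problems; judgment calls about misspelled or ambiguous names still take the months of human review the project required.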

As part of her research, Yanez said she had hoped to find more complete information about who was on which boat. About four months into the project, she requested records related to the Mariel boatlift from a U.S. Coast Guard historian. He mentioned a document called the Marine Safety Log, a list of boat manifests.

At the time, it was only available in handwritten form, although it was scheduled to be digitized. To expedite the process, Yanez hired a researcher in Washington, D.C., to copy and send the data to her.

After ensuring the information was relevant, Yanez and a group of transcribers hired for the project digitized the boat names. The process took about two weeks.

While not comprehensive, the Marine Safety Log provided more information than Yanez, Database Editor Rob Barry and Web Developer Stephanie Rosenblatt originally expected to be able to provide.

In its final form, the Herald’s list aggregates, and makes searchable, two data sets. “One is a list of more than 130,000 names of Cubans who arrived in Key West via Cuba’s Mariel Harbor between late April and late September 1980,” Yanez wrote. “The other is a list of the names of more than 1,600 boats used during that very boatlift.”
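Under the hood, a project like this amounts to two tables joined on a vessel identifier, with a search layer on top. A simplified sketch of how a passenger search might tie the two lists together (the schema, field names and sample records are assumptions for illustration, not the Herald’s actual data):

```python
# Hypothetical records: vessel names keyed by boat_id,
# and passengers carrying a boat_id (or None if unmatched).
boats = {1: "Valley Chief", 2: "Red Diamond"}
passengers = [
    {"name": "Maria Gomez", "boat_id": 1},
    {"name": "Jose Perez", "boat_id": None},  # not yet assigned to a vessel
]

def search(query):
    """Case-insensitive substring search over passengers, joined to boats."""
    q = query.lower()
    results = []
    for p in passengers:
        if q in p["name"].lower():
            vessel = boats.get(p["boat_id"], "unknown")
            results.append((p["name"], vessel))
    return results

print(search("gomez"))  # [('Maria Gomez', 'Valley Chief')]
```

The `None` boat_id is the interesting case: it is exactly the gap the Herald asked readers to fill by matching passengers to vessels.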

The design of the site, which Yanez said transforms the data into a community project, encourages readers to contribute missing records and assign or remove anyone from a boat list. People can also share their anecdotes and memories.

Yanez said public reaction both online and in person has been strong and emotional, which reinforces the idea that historical databases are more than numbers.

“We had people burst into tears at the simple sight of their name on our database,” said Yanez. “I like to call this ‘the power of the list.’ There is something tremendously moving about experiencing a traumatic event in your life — war, migration, persecution — then seeing your name among all the other survivors or veterans. It’s affirmation that I was there, that I counted, that I mattered.”


Wednesday, June 09, 2010

‘All Facebook’ Blogger Explains Reporting Process, Decision to Unpublish Erroneous Post

Last week I saw a refreshingly honest post on a site called All Facebook, which provides reporting and analysis of, you guessed it, all things Facebook. The post that attracted my attention was a follow-up to something that had been posted on the site the day before. Nick O’Neill, blogger and founder of the site, wrote:

“Yesterday I posted an article on here which suggested Facebook or Google had accidentally ‘leaked’ user emails, through Facebook’s opt-out system. The logic we used at the time to deduce this was completely off. …

“Since the article was so off base, we decided to pull it altogether. While the logic we used was a round-about logic, we weren’t the only ones confused. However rather than updating a post which has practically become useless, we’ve pulled it altogether.”

Talk about transparency being the new objectivity. O’Neill’s approach struck me as more forthright than that of many bloggers, who use the term “update” when they really mean “correction.” And it was more up-front than that of news organizations that are willing to correct a minor factual error but won’t acknowledge when the premise of a story is “completely off,” to use O’Neill’s words.

The ‘unpublishing’ question

While some areas of online publishing have developed rapidly — how user comments are handled, for example — the “unpublishing” dilemma is as perplexing as ever. A decision to unpublish a story is wrapped up in how a site handles corrections, just as newspaper and TV retractions are reserved for extreme cases in which a correction isn’t enough. And, as I learned from my conversation with O’Neill, it has a lot to do with your standards for publishing in the first place.

When my colleague Bill Mitchell wrote about unpublishing a couple of years ago, he concluded that pulling a story should be the last resort, not the first. (Many, if not most, requests we get for unpublishing — or hear about — come from people who say they’ve been harmed by some piece of ever-Googleable news, not an error by the news organization.)

I bet most online publishers, whether All Facebook or the Chicago Tribune, would agree with Mitchell. In fact, O’Neill told me that he rarely removes content from his site. (We’re learning more at Poynter about how others handle this, and as we do, we want to help you learn more.)

All Facebook is an example of what we’ve taken to calling the Fifth Estate at Poynter — an enterprise that creates journalism, but without the trappings or conventions of a traditional news outlet. More and more, these niche sites are tracking the ripples of news caused by companies like Facebook. When the ripples become a big wave, their work quickly moves from the Fifth Estate to the more established Fourth Estate.

All of this makes it important to understand how such sites gather, publish and, in some cases, unpublish. This is how O’Neill described his process and values.

He’s comfortable with reporting the truth as it develops

Last Thursday, O’Neill found a link on a site that aggregates news of interest to programmers. Someone had blogged that his e-mail address was exposed by a Facebook page that was apparently visible to the public and had been indexed by Google.

It appeared that someone had uncovered yet another Facebook privacy lapse, and O’Neill tried to figure out what was going on.

O’Neill said he contacted Facebook’s communications department, which usually responds quickly to his inquiries. When he didn’t hear back after 30 minutes or so, he posted the item. He temporarily pulled the item when the company asked for more time to respond, and he posted it again with a comment from the company.

Already, this looks very different from the traditional newsgathering process.

“Writing on a blog,” O’Neill said, “I’m pretty flexible with the truth evolving. Because that happens with a news story sometimes, that we get a half-piece of the story and you wonder, do you post the article or not?”

That was not the end of the process. He and someone with Facebook’s communications department started debating the accuracy of his post. He thought the company’s disagreement amounted to a technical argument that didn’t invalidate his post.

But after discussing the issue with a Facebook developer who also voiced his disagreement, O’Neill updated the post about 10 times, striking through sentences and adding new information. One of the last updates said, “This post is effectively destroyed,” and O’Neill decided to just take down the whole thing.

O’Neill felt he could effectively argue the post was accurately updated, but beneath the strikethroughs and other changes “it still showed all this false information.”

He believes there’s merit in publishing information even if it turns out to be incorrect

Despite the multiple changes and corrections and eventual removal of the post, O’Neill stands by the process. “Eventually the truth comes out in one form or another,” he said.

“I honestly think that publishing information reveals the truth quicker, as it creates an opportunity for others to come forward with more information,” he told me in a follow-up e-mail after our conversation. “In this instance it was Google and Facebook that needed to provide information; however that’s not always the case. Sometimes it’s a tipster or someone else that can help create the truth.”

He noted that he was right about one important element: If someone publicized the link to this particular Facebook page (for instance, by linking to it on a blog), Google would index that page and reveal the e-mail addresses on that page. Facebook and Google quickly worked to fix that.

You can’t really un-ring the bell online

Some bloggers have said that readers understand the information they read is often in the process of being reported. O’Neill believes this. He also believes it’s the readers’ job to make sure they’re properly informed, but he didn’t make it easy for readers to get the correct information in this case.

The first people to see the post, he acknowledged, saw incorrect information. If they checked back, the link to that post returned a 404 or “page not found” error. He didn’t explain why he removed the post until the next day.

Likewise, he couldn’t pull the story from his site’s RSS feed, so those readers didn’t know the story had been unpublished. Again, clicking on the links wouldn’t have told them anything.

And one commenter pointed out that unpublishing didn’t help to inform readers of the hundreds of blogs that had cited and linked to his post.

O’Neill said he would have liked to redirect the unpublished URL to the post in which he explained his retraction, but he doesn’t have an easy way of doing that and didn’t have the time. “I justified it out of the perspective that it’s ultimately the user’s job to become educated.”
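The redirect O’Neill wanted amounts to a server rule mapping the dead URL to the explanation post with an HTTP 301. A minimal sketch using Python’s standard library — the paths here are placeholders, and this is an illustration of the mechanism, not how All Facebook’s platform actually works:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Unpublished paths mapped to their explanation posts (hypothetical URLs)
REDIRECTS = {
    "/facebook-email-leak": "/retraction-facebook-email-leak",
}

def redirect_target(path):
    """Return (status, location) for a requested path."""
    if path in REDIRECTS:
        return 301, REDIRECTS[path]
    return 404, None

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = redirect_target(self.path)
        self.send_response(status)
        if location:
            # 301 tells browsers and search crawlers the move is permanent
            self.send_header("Location", location)
        self.end_headers()

# To serve: HTTPServer(("localhost", 8000), RedirectHandler).serve_forever()
print(redirect_target("/facebook-email-leak"))
```

A permanent redirect would also have steered RSS readers and the blogs linking to the original post toward the retraction, addressing the un-rung bell described below.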

He knows that how he deals with these issues affects his credibility, reputation and survival

O’Neill started his site three years ago, sold it to the company that owns Mediabistro.com, and acknowledges “large ambitions” for the site (as well as related ones covering social media). He attempts to balance his desire for more traffic with his desire for credibility — among his readers and by the people who work at Facebook.

“At the end of the day, the more content I have, the more traffic I get. So I need to publish as much information as possible about Facebook,” he said.

However, he said he’s conscious about maintaining his relationship with Facebook, which he described as being cooperative over the years, even back when he was starting out and his work wasn’t as accurate. (Facebook had 15 million users when he first started blogging about it. Now it has more than 400 million.)

“If you’re going to post something damaging, then you better have the facts right,” he said. “Because the last thing I want to be doing is not only damaging their brand, but damaging their brand without any sort of legitimate backing.”

So he won’t bury the lead if he learns of a security glitch, he said, but he won’t sensationalize it, either. “That will drive traffic, but it’s not good traffic, and it’s going to hurt my reputation.”

That brings us to a value that he shares with journalists old and new: He wants to be seen as an unbiased source of news about the social network. “If you want to maintain a trustworthy reputation with your readers, you need to publish the truth eventually, right?” he asked. “As close as you can come to truth.”


Saturday, June 05, 2010

How to Be a Social Climber on the Digital Ladder

There are a variety of ways to participate in or experience news via social media. Twitter, Facebook, LinkedIn, FriendFeed, Yelp, Foursquare, Gowalla… the list goes on. But in what ways should a journalist utilize social technology?

A few years ago, Forrester researchers Charlene Li (a Poynter National Advisory Board member) and Josh Bernoff created the Social Technographics Ladder. This graphic (below) defines the behaviors and interactions associated with social media by placing users into overlapping categories. Each rung on the ladder represents a specific set of behaviors, and people can move up and down these rungs. (The most recent addition to the ladder is the “Conversationalists” category.)

How many of these rungs should today’s journalist climb? I say every rung above “Inactive.” Why? Because while there may be a learning curve for using specific tools, these categories describe behaviors that defined journalism before social media became the “it girl.” Here’s how each rung relates to journalism, from the top of the ladder to the bottom.

  • Creators author a story.
  • Conversationalists talk to people about stories, find sources, break news.
  • Critics review, offer opinion pieces.
  • Collectors research, create contacts and read publications on a regular basis.
  • Joiners are part of a community, professional or personal group.
  • Spectators keep up with competitors and other publications.

I am a Creator, Conversationalist, Critic, Collector, Joiner and Spectator. But I’m not all of these things on every social network. I focus on the networks that I see being used heavily in my Lawrence, Kansas, community: Twitter, Facebook, Foursquare and Gowalla.

LinkedIn, MySpace and FriendFeed are not used as often by our audience at the Lawrence Journal-World, so I’m more of a Joiner/Spectator when it comes to those. But our websites have an active presence on them all.

Being an active part of these networks keeps us in touch with a tech-savvy, information-hungry portion of our audience. They’re willing to participate in and share our content on a daily basis. On Twitter alone, if a handful of people retweet a link, it could reach hundreds of thousands of users new to LJWorld.com.

Where are you on this ladder of social interaction today? Have you been a social climber over the last few years?

Hint: If you’re reading this, you’re at least a Spectator. If you have an account on Facebook, you’re a Joiner. If you leave me a comment, you’re a Critic.