The Web is a chaotic jumble of billions of connected pages. Without some clues and context -- dates, for instance -- we can't make sense of all that information. And neither, apparently, can the Google News crawler.
The South Florida Sun-Sentinel and Google now agree on how a six-year-old story about United Airlines declaring bankruptcy surfaced, and how that story was made easily available to a reporter working for an investors reading service.
The article in question (a Chicago Tribune article that also ran on the Tribune Co.'s Sun-Sentinel site) did not have any date of publication associated with it, as visible on this Google image. Google News determined that it was a new article -- new since it had visited the site 20 minutes earlier -- and placed the date of Sept. 6 on the story in its index, according to a screen image provided by Gary Weitman, Tribune Co.'s senior vice president for corporate relations.
That made the story readily available for the reporter whose job it is to find and summarize news stories about distressed companies. She wrote a summary on Monday morning. "We put it up on Bloomberg and within two minutes the phones started ringing," said Richard Lehmann, president of Income Securities Advisors Inc.
Lehmann looked on Bloomberg, saw that his service was the only one reporting a United bankruptcy, and quickly pulled the story down and issued a correction. "Something that big would have 10 stories on Bloomberg by that time. So we knew this was a bad story," Lehmann said. "But we didn't realize how bad it was."
The first reports Monday were that the Sun-Sentinel had mistakenly published an old story.
Sun-Sentinel Editor Earl Maucker was adamant that his site was not responsible for an archived story being unearthed and spread through a business wire service. "We didn't alter or touch or update or change that story at all," he said.
But late Monday, Google released an image of SunSentinel.com's business page. In the lower right corner, a link to "UAL files for bankruptcy" is listed at the bottom of the most-viewed business stories.
So how did old news appear to be new?
Google spokesman Gabriel Stricker said the date that appears on a Google News listing is determined by information associated with the article -- when it was published or updated.
Google says its crawler indexed the site at 10:36 p.m. PDT Saturday (Sept. 6), which was 1:36 a.m. EDT Sunday. The only date on the United article was the one just below the SunSentinel.com logo, which merely reflects the current date.
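The timing and the dating behavior described above can be sketched in a few lines of Python. This is only an illustration, not Google's actual logic: the `listing_date` function and its fall-back-to-crawl-time rule are assumptions made for the example, the year 2008 is inferred from the article's "six-year-old" 2002 story, and the date given to the dated article is a placeholder.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Fixed offsets for the two daylight-saving zones named in the article.
PDT = timezone(timedelta(hours=-7), "PDT")
EDT = timezone(timedelta(hours=-4), "EDT")


def listing_date(article_date: Optional[datetime], crawl_time: datetime) -> datetime:
    """Hypothetical rule: prefer the article's own date, else use the crawl time."""
    return article_date if article_date is not None else crawl_time


# Crawl time reported by Google: 10:36 p.m. PDT Saturday, Sept. 6 (year assumed).
crawl_time = datetime(2008, 9, 6, 22, 36, tzinfo=PDT)

# The same instant on the East Coast, where the Sun-Sentinel is based.
crawl_time_eastern = crawl_time.astimezone(EDT)

# An undated page ends up carrying the crawl date under this assumed rule.
undated_listing = listing_date(None, crawl_time)

# A page with its own date keeps it (placeholder date; the source gives only the year).
dated_listing = listing_date(datetime(2002, 1, 1, tzinfo=EDT), crawl_time)

print(crawl_time_eastern.strftime("%Y-%m-%d %H:%M %Z"))
print(undated_listing.date(), dated_listing.date())
```

Under these assumptions, a visible publication date on the page would have been enough to keep the 2002 story from being listed as Sept. 6.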
"I don't think it's unreasonable to clearly note what the correct date is for that story, in some easy-to-find, prominent location," Stricker said Tuesday.
Maucker acknowledged that the story was undated. He said that no stories on his site are dated, but that's not so. Other articles have a date and time stamp below the headline. "This was a six-year-old story that got picked up and nobody bothered to verify it as being current," he said.
Once Google News added the story to its index, Monday's events were possible. That morning, Lehmann said, one of his employees did a Google search for "bankruptcy 2008." She found the 2002 article, wrote up a summary and sent it to an editor. The item was posted late Monday morning to Income Securities Advisors' wire service, which is available to Bloomberg subscribers. (The wire service is different from Bloomberg News.) Lehmann said he pulled the story down 13 minutes after it was posted, but investors had already started selling off United shares.
The Sun-Sentinel and the Tribune Co. have blamed the reporter for reading what she should have known was an old story and for not verifying her facts. "It's hard for me to imagine how any wire service or investors' specialist would just pick up content off the Web without verifying its veracity -- and something that big," Maucker said. "Even if it's not dated properly, that's no excuse."
But Lehmann said his employee isn't expected to verify the stories she finds. In essence, she does what Google and other scrapers do with "bots": find stories, summarize them, provide a link. As long as the story appears in a credible newspaper, he said, "we go with it. We're only saying what appeared in that story. ... I can't fault her."
Until late Tuesday, how the story appeared in the "most viewed" section of SunSentinel.com's business page was a source of disagreement.
In a Chicago Tribune article, Weitman said that there was no activity on the article until right around the time that the Google crawler indexed the page early Sunday morning.
Google's Stricker said the search engine could not have been responsible for the story appearing in "most viewed" because its first referral came three minutes after the page was indexed: 10:39 p.m. PDT Saturday. And Google says the story was not seen during the crawler's previous visit at 10:17 p.m.
Late Tuesday afternoon, the Tribune Co. issued a news release stating that traffic volume to the 2002 story increased in the half-hour before Google's visit, which moved its headline to the automatically updated "most viewed" portion of the business page. Its appearance there caused Google to re-index the article.
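The sequence both sides now describe, a headline absent at one crawl and present at the next, amounts to comparing two snapshots of the page's links. The sketch below is a simplified illustration, not Google's crawler: `newly_seen_links` is a hypothetical helper, and every headline other than "UAL files for bankruptcy" is invented for the example.

```python
def newly_seen_links(previous_crawl: set[str], current_crawl: set[str]) -> set[str]:
    """Return headlines present in the current crawl but absent from the previous one."""
    return current_crawl - previous_crawl


# Snapshot at the 10:17 p.m. visit (hypothetical headlines).
crawl_1017 = {"Hypothetical story A", "Hypothetical story B"}

# Snapshot at the 10:36 p.m. visit, after traffic pushed the 2002 story into "most viewed".
crawl_1036 = {"Hypothetical story A", "Hypothetical story B", "UAL files for bankruptcy"}

new_links = newly_seen_links(crawl_1017, crawl_1036)
print(new_links)
```

A comparison like this only shows that a link is new to the page, not that the story behind it is new, which is why an undated article could be mistaken for fresh news.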
Google wasn't the only news scraper to find the story. The news aggregation service SmartBrief.com, based in Washington, D.C., indexed it about 11 hours after Google. Chris McNeilly, the site's vice president for technology, said the site searches hundreds of news sites, including SunSentinel.com's home page and business home page, for various industry-specific newsletters.
On Monday, a SmartBrief.com editor decided not to include the United story in its aviation industry e-mail newsletter, said Jennifer McNally, director of editorial operations. "Thankfully, our editor in that space thought this story sounded too similar to a story she had seen previously."
CORRECTION: Sun-Sentinel Editor Earl Maucker's name was spelled incorrectly in the original version of this article.