Texas Tribune databases drive majority of site’s traffic, help citizens make sense of government data
When The Texas Tribune launched, Matt Stiles said the site was the .10 version of what it would be six months out. Sixteen months later, the site's traffic and audience have grown tremendously, in large part because of its work with data.
The Tribune has created more than 50 data-driven projects that readers are using to locate their lawmakers in the Capitol, access information about prison inmates, and see how minorities have driven population growth in Texas.
Stiles said by phone that in addition to driving about two-thirds of The Texas Tribune's traffic, the databases have attracted new audiences and provided readers with an interactive way to access information that's public but not always easy to find.
"We're sort of like an OpenSecrets slash online news organization," said Stiles, reporter and data applications editor. "We're not nearly as good as OpenSecrets, because they've been around longer, but I hope that we can be that resource for people. We're getting there."
The Texas Tribune's database of annual salaries for more than 550,000 public employees has generated a lot of attention among taxpayers. The database is designed so that users can search for salaries by entering a public figure's name, job title or the agency for which the public figure works.
Some public figures requested that the information be removed after finding out that their salary information was the first result that appeared when they Googled themselves. In response, Texas Tribune CEO and Editor-in-Chief Evan Smith wrote a piece explaining that the Tribune creates such databases because it values transparency, open government, and greater access to information -- and because it believes people have a right to know where their tax dollars are being spent.
The Tribune is working on a new data project involving Texas Senate and Texas House floor proceedings. With a $150,000 grant from the Open Society Institute, The Texas Tribune hired a legal transcription company to produce same-day transcripts of the proceedings.
Later this month, the Tribune plans to provide live video streams of the proceedings, which will make it easier for people who are researching a bill to hear lawmakers' arguments for or against it. Stiles is optimistic that, eventually, the site will be able to marry the transcript to the video.
"I've never not been able to do any project because we didn't have the software or the technical ability," he said. "It seems like everything we think of, we can do."
Databases as public service journalism
Smith said the databases fulfill the Tribune's mission of giving Texans greater access to government information; he even nominated them for this year's Pulitzer Prize for public service.
"We understand that access to information is what makes people more thoughtful and productive and engaged citizens," Smith told me by phone. "Without access to information you have disengagement from the political process and the policy process."
Part of the databases' value also lies in their timeliness and relevance. Responding to news late last month about whether public schools are spending too much on administration, Stiles helped create a database showing the salaries of superintendents -- some of whom make more than $200,000. Users can sort the records in the database by district enrollment, salary and pay per student, and they can see how each superintendent ranks.
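For readers curious how that kind of sorting and ranking might work under the hood, here is a minimal sketch in Python. It is not the Tribune's code: the record layout, the `rank_by` helper and the three districts are hypothetical, and "pay per student" is assumed to mean salary divided by district enrollment.

```python
from dataclasses import dataclass


@dataclass
class SuperintendentRecord:
    district: str    # hypothetical district name
    enrollment: int  # students enrolled in the district
    salary: float    # annual superintendent salary, in dollars

    @property
    def pay_per_student(self) -> float:
        # Assumed definition: salary divided by enrollment.
        # A district with zero enrollment would raise ZeroDivisionError.
        return self.salary / self.enrollment


def rank_by(records, key):
    """Return (rank, record) pairs, highest value first, ranks starting at 1."""
    ordered = sorted(records, key=key, reverse=True)
    return list(enumerate(ordered, start=1))


# Made-up records for illustration only; these are not Tribune data.
records = [
    SuperintendentRecord("District A", 50_000, 250_000.0),
    SuperintendentRecord("District B", 2_000, 120_000.0),
    SuperintendentRecord("District C", 10_000, 180_000.0),
]

by_salary = rank_by(records, key=lambda r: r.salary)
by_pay_per_student = rank_by(records, key=lambda r: r.pay_per_student)

for rank, record in by_pay_per_student:
    print(rank, record.district, round(record.pay_per_student, 2))
```

In this made-up data, the highest-paid superintendent ranks last on pay per student, which is the kind of contrast a sortable table lets users spot for themselves.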
Several student-run publications at colleges and universities in Texas have featured the Tribune's data work on their sites. Stiles has also heard from local bloggers and journalists who use the information to find ideas and advance their reporting.
"Everything we do is under a Creative Commons license, so we encourage people to use the databases as a reporting tool," Stiles said. "Sometimes instead of directly using the database, people will e-mail me with questions. I'll send them back a spreadsheet with the data they want and then they run it on their site and attribute it to us."
Stiles often creates explanatory blog posts or videos to show people how to use the databases. The more familiar they are with how the databases work, he said, the more likely they are to stay on the site and use them. He explained that the site's bounce rate has decreased from 70 percent to 50 percent since the site launched, in part because of the databases.
Eventually, Stiles said, he'd like to make more of the databases embeddable and create APIs so that people can access the data directly.
Working with other news organizations
As part of its efforts to share data, The Texas Tribune has worked with other news organizations that don't have the resources to place as great an emphasis on data-driven projects. Recently, Stiles helped the Austin American-Statesman advance a series it produced on the Texas Lottery. The Statesman had built a database to go along with the series, but the data wasn't downloadable.
"That's the kind of thing that's frustrating if you wanted to take that data and play with it," Stiles said. "When you can make the data available in an interactive way, it empowers readers to make their own conclusions about data, assuming you navigate them through it so they understand what's happening."
Stiles asked Statesman online projects editor Christian McDonald, who made the database, if he could have the data and map it. Stiles ended up creating interactive maps that broke down lottery sales by ZIP code and per-capita income levels. Both the Statesman and the Tribune linked to the maps on their sites.
McDonald, one of two Statesman staffers who regularly build databases using Caspio, said a developer on staff had built an app for a reporter to use internally while writing the lottery series. The developer left the paper before the series was finished, though, and has not been replaced.
Other news organizations have also lost key developers in recent months, prompting the question: How much are traditional outlets willing to invest in development work, if at all?
"I wish I had a mentor who I could work with on stuff like this, but we don't have any developers in the newsroom right now," McDonald said by phone. "I know that it's an area that this paper really believes in, and that's one of the reasons why they want me to do this more and learn more about it so we can build skills from within."
Staffing for database development
The Tribune has just two newsroom employees who work on databases -- Stiles and Data Assistant Ryan Murphy. It also has three developers and is hiring a few more who can help build news apps and improve the Tribune's content management system.
Ideally, Smith would like to find a donor to underwrite the cost of adding a full-time developer.
"It would be great if there's a foundation or somebody out there who has $50,000 burning a hole in his pocket," Smith said. "I think it would be valuable for us to add a developer whose entire focus could be churning out news applications."
The Tribune has already received some funding for its database work -- including $50,000 in grant money from the Ethics and Excellence in Journalism Foundation and $100,000 from the Hobby Family Foundation. And overall, it's doing well financially. The site ended 2009 with nearly $4 million pledged and has raised a total of $8.2 million to date.
"After 16 months, it's working to the point where I can say we have exceeded our expectations and done extremely well in pulling in a steady stream," Smith said. "We're on a path to sustainability. Between years two and three -- the end of 2010 and the end of 2011 -- we will be cash-flow positive."
Moving forward, The Texas Tribune will continue to place a heavy emphasis on data-driven work.
"Any journalistic organization is theoretically in a position to do journalism," Smith pointed out. "But the most forward-thinking ones are the ones who say, 'How do we apply the tools of technology to solve the problems of disengagement and low voter turnout?' "
There's enormous value, he said, in answering that question.