October 11, 2016

The recent emergence of the bombshell “Access Hollywood” clip that sat in the NBC vaults for over a decade before wreaking havoc on the Trump campaign illustrates the value of keeping archival material searchable and retrievable.

The Washington Post’s Margaret Sullivan recently listed several “journalism lessons” that reporters could take from the current presidential campaign. Among her many takeaways was this: “Research your own company’s archives. If you find something extraordinarily newsworthy there, don’t sit on it at a crucial moment in the nation’s history.”

I’ve written before about the importance and value of archives for news organizations, as have many others, including Josh Stearns, then at the Dodge Foundation, Associated Press corporate archivist Valerie Komor, and Library of Congress Program Officer Abbey Potter.

But for many news organizations, archiving material for easy retrieval is easier said than done. In 2014, a survey released by the Missouri School of Journalism’s Donald W. Reynolds Journalism Institute revealed that “27 percent of hybrid news organizations and 17 percent of online-only enterprises said they’ve experienced a significant loss of news content due to technical failure.”

It’s hard to keep digital archives for a number of technical and financial reasons. But we have an imperative to do so, says Edward McCain of the Reynolds Institute and University of Missouri Libraries. McCain leads the Journalism Digital News Archive program and thinks a lot about issues around access and preservation of digital news. I reached out to him for a conversation about archives in this election cycle, and how newsrooms might prepare themselves for future election cycles.

Donald Trump and Billy Bush’s 2005 bus ride to the set of “Days of our Lives” has completely upended the presidential campaign, and will likely be one of the biggest political stories of the year. But it’s also a story about archives and preservation. How do you see the value of the tapes in those terms?

The biggest explosion that I’ve seen in social media in a long time has been around this Billy Bush archival footage and audio of Donald Trump, and it really does, I think, speak to the value of keeping things. You never know what the value of a piece might be. You know that Billy Bush and Donald Trump weren’t thinking at that time that this was something that might be pulled down during a future presidential campaign. Who would know? But as it turns out, these however-many-seconds may have turned the tide of the election.

This points out that even relatively small, seemingly insignificant kinds of content can be incredibly valuable and important. People can judge for themselves but I would argue that this kind of archival material provides us with insight about the character of a candidate.

And we can think about these tapes in terms of other seemingly-mundane stuff that’s recorded all the time. That’s the stuff we don’t think about afterwards, but could be important. What was said in the city council meeting? What promises were made when we passed a bond for a sewage plant? What was the public told? What was the understanding at that time?

Those mundane things might become important over the long haul. I think that’s one of the qualities that journalists can use to distinguish ourselves, if we have archives. We can say: “We’re not just taking someone else’s stuff and blogging about it. We took pictures, we interviewed people, and we kept that footage because we think it’s important to our communities and our nation that we are accurately informed about this topic.”

So much of what news organizations record and tape now is digital, so there likely aren’t tapes sitting on a shelf. They’re on a hard drive somewhere. How do you see the current state of things across the industry?

News organizations are doing all kinds of things with existing content. As we have moved into the digital age, I know of some cases where the content that was created for the digital side for online or social media — that material is just being basically tossed. Preservation of that material is just being left to chance, and there are two issues with that: whether or not it’s going to survive, first of all, and secondly, whether anybody can actually find it. It’s so easy with the tools that we have to create content that we just have an oversupply of it, and there’s so much clutter that it’s hard to find the valuable stuff when you need it.

I realize some newsrooms are doing a very good job of preserving their content, but these are larger newsrooms with the resources to have librarians and archivists on staff. If they don’t have those resources, what can they do?

First of all, I think they need to try to incorporate archiving in the process, as material is being created because that’s the best time and place to add information useful for retrieval. You want to at least keyword your notes, your photos — to add archival value.

And that’s not always easy, but at the Missouri School of Journalism, I’m working with the photojournalism department to teach young journalists to think more like a librarian.

We tell them: “It’s not enough to say ‘This is a picture of a city council meeting.’” That might be helpful but it might not get you where you need to go in the future. So if it’s a picture of the mayor, what is the mayor’s name? That kind of keywording is not necessarily something that comes naturally to journalists. I think there’s a lot of overlap between information science and journalism, but the idea of how we can reuse this content, that’s not something we’ve focused on, on a daily basis. I think we can benefit from doing that because the information then becomes easier to retrieve. I think the public can certainly benefit.

I see the most archival material currently used in obituaries or in stories about anniversary events. I’m wondering what newsrooms might be able to do to get more archival footage into everyday stories.

First they have to start with the technology they’re using to create and retrieve the stories they’re making. The CMSes we have built to push out content are not particularly good at helping us hold onto that stuff for future use. You might be backing it up somewhere, but in the long term, that’s not going to save it or necessarily make it easy to retrieve.

It’s likely the formats are going to change, and other parts of the system are going to change. What that means is that fairly quickly, you’re not going to be able to access that stuff. A whole chain of events needs to be considered to retrieve the stories for use. Maybe that’s the role of another type of system, but to make that happen, the demand has to come first from the journalists who can see the value in using the archives.

I think they also need to play up their archives as a value for their readers. They have cultivated their archive. And so they’re authorities, they’re experts. They’re not just shooting from the hip — which happens in many cases today.

To help people be good information consumers, there could even be something in stories that feature archival material that says “We pulled this from our archives at some expense and effort,” and that’s why you should attach some higher value to content that has been thoroughly researched, checked, sourced, etc. I think news consumers right now, they’re not really guided in that direction, to understand the value of archives and why they might be useful.

That’s a missed opportunity. If you’re out buying a product, let’s say coffee beans, you might ask yourself “What’s the difference between this coffee and this coffee?” It could be the source, it could be the labor, it could be the pesticide usage. As news producers, it doesn’t seem like we’re explaining the difference between our blend and somebody else’s blend to the audience.

It’s relatively easy to archive paper but much harder to archive data-driven journalism applications. Could you talk a little about how journalists can think about archiving their data apps?

We have gotten people who are interested in it, like Ben Welsh at the LATimes and Scott Klein at ProPublica and Meredith Broussard and Katherine Boss at NYU.

We’ve established a working group here at Dodging the Memory Hole of people who are pushing forward on how to find a practical way to preserve news apps. News apps are hard because they’re bespoke creations. So newsrooms might be using this tool over here and mixing it together with something else over there and tying it together with Python, and then there’s a database, which could be ongoing and changing. They’re very difficult. I don’t think we have good answers for how to do that. I think we’re trying to get a grasp on it. I’m excited to have people together later this week at UCLA for Dodging the Memory Hole: Saving Online News to talk more about this.

How can freelancers and smaller newsrooms think about archiving?

This is true for everyone: at the very least, archive your own stuff. I don’t think as journalists we can depend on the corporate structure, the existing paradigm, to keep our stuff.

These stories are our babies, these are our creations. We need to find ways of keeping them safe. One thing people can do is just to keep a copy of it. I’m beginning to think that on a practical level, places like Dropbox or unlimited backup services where you pay 80 bucks a year that suck up whatever you’re working on, those are a pretty good deal.

You want to keep those things in a format that will still open in the future. I’ve been working with a Missouri journalism librarian named Dorothy Carner to teach students this. We call it Journalism Archive Management or JAM. We go into all of the journalism intro classes at Mizzou and take them through a digital preservation lifecycle — creating it, labeling it, and then storing it — and then we give them tips on how to do that.

Going back to the election for a second, I’m wondering if you could talk a little bit about the importance of archival information to our democratic processes.

As we’ve seen in the election, having an accurate source of information about what people have said and when they said it, and perhaps what they did, is really valuable. It helps us make better decisions. It tells us: What were they like 20 years, 30 years ago? What was their decision process like? What were they doing with their time? How did they invest their energy and their whole life? I think it’s really important when you think about who you’re going to vote for, the promises they make, what kind of country we want to have.

We can take a hard look at their word, which is difficult these days. I think it’s up to us as consumers of information — how do we decide who to vote for if we don’t know what the truth is? So we either say something and mean it or we don’t, and that tells us something about an individual.

And archives are really helpful with that. What’s scary to me is that there are already chunks of the internet where we’re losing access to that content for one reason or another. Technology systems are not designed for keeping information for long periods of time, just for getting it out there. In the old days, you could keep papers in a cool, dry and dark place and they would last for a long while. The days of benign neglect are over. If you want to see something digital survive, you have to take care of it forever.

Because we all record everything, everywhere now, I can imagine that in four or five election cycles a surprise tape might come from an individual and not a news organization. If people want to preserve their own videos, what do you suggest?

At least make sure that most of the stuff is tagged. Some stuff on your phone will be tagged with date and geolocation. But the question to think about is this: What are you going to look for in the future?

If you’re taking a picture of an election, were these Democratic candidates? Republicans? How are you going to search for this? Why was it important? If you have the dates, names, and location — those pieces you can triangulate. We’re getting better but there are no algorithms that can go into a photo or video and tell you why it’s important. So you have to make a note to yourself in the future. How am I going to find this?

It’s kind of like those pictures of your family, old pictures, where nobody wrote anything on them. Wouldn’t it be great if they had? This photo could be someone’s birthday. But we don’t know which birthday. Where was it taken? There are all these potential mysteries that could be resolved and their meaning could be passed on to future generations with a minimal amount of extra effort.

Mel leads audience growth and development for the Wikimedia Foundation and frequently works with journalism organizations on projects related to audience development, engagement, and analytics.…
Melody Kramer
