We have become so accustomed to accessing information online whenever and wherever we want that we mostly behave as if all information will be preserved forever. We are dead wrong.
Journalists need to do more to preserve their own work, but they also need to do much more to safeguard the information they are linking to in their articles.
Dead links, a phenomenon known as "link rot," are a familiar malady of many a Web post: Clicking on a link to something you'd like more information about often leads to a page that is no longer there. The website may have closed down or its structure might have been changed. Alternatively, the URL may function but lead to content that was edited or updated (a phenomenon sometimes called "reference rot").
A new tool aims to fix this. Amber is a project of the Berkman Center for Internet and Society at Harvard University. The WordPress plugin takes a snapshot of each linked page at the moment of publication. It then cycles through the links regularly to test whether they are still active. If a link is not, Amber shows readers the snapshot of the webpage when they hover over the link.
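For readers who want to see the mechanics, the snapshot, check and fallback cycle described above can be sketched in a few lines of Python. This is an illustration only, not Amber's actual code: the names `Snapshot`, `LinkArchive`, `http_status` and `fallback_for` are hypothetical, and treating any status from 200 to 399 as "alive" is an assumption. The sketch also catches only dead links, not pages whose content has quietly changed.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Snapshot:
    url: str
    body: bytes          # copy of the page captured at publication time
    alive: bool = True   # result of the most recent availability check


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for url, or None if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code      # the server answered, but with an error (e.g. 404)
    except (urllib.error.URLError, OSError, ValueError):
        return None          # DNS failure, timeout, malformed URL, ...


class LinkArchive:
    """Hypothetical stand-in for the plugin's snapshot store."""

    def __init__(self, check: Callable[[str], Optional[int]] = http_status):
        self._check = check
        self._snapshots: Dict[str, Snapshot] = {}

    def publish(self, url: str, body: bytes) -> None:
        # Store a copy of the linked page when the article goes live.
        self._snapshots[url] = Snapshot(url, body)

    def recheck_all(self) -> None:
        # Meant to run on a schedule: cycle through every saved link.
        for snap in self._snapshots.values():
            status = self._check(snap.url)
            # Assumption: 2xx and 3xx responses count as a live link.
            snap.alive = status is not None and 200 <= status < 400

    def fallback_for(self, url: str) -> Optional[bytes]:
        # Return the stored copy only when the live link is known to be dead.
        snap = self._snapshots.get(url)
        if snap is None or snap.alive:
            return None
        return snap.body
```

Passing the checker in as a parameter keeps the sketch testable offline: a fake function that returns 404 for selected URLs is enough to exercise the fallback without touching the network.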
Amber does seem to offer two great advantages: it is nimble to operate and it allows for local storage. I gave it a test-run on WordPress and found it extremely easy to install and configure (it is also available for Drupal and is open source).
The Amber dashboard gives you an overview of all the links you've saved, their status and the size of each snapshot. Individual links don't occupy much space; the snapshots I preserved were below 1 MB each. Of course this scales up and could become unsustainable, so there are different ways to configure the storage to avoid exhausting your own capacity. Amber allows you to save the snapshots on the Internet Archive, Perma.cc or Amazon Web Services.
This plugin aspires to be a solution to a pervasive problem. A 2014 study by Jonathan Zittrain, Kendra Albert and Lawrence Lessig found that "more than 70% of the URLs within the Harvard Law Review and other journals, and 50% of the URLs within United States Supreme Court opinions, do not link to the originally cited information."
Dead links are common on Wikipedia, too. The online encyclopedia lists 100,355 pages as having dead links.
This is a problem for all journalists, but one particularly crucial for fact-checkers. Fact-checking is entirely evidence-based and therefore link-driven. A fundamental appeal of fact-checking is the explicit reference to sources, meant to make it more accountable to readers who can in turn verify them.
By removing the source of a fact, broken links hinder fact-checkers' future work and readers' capacity to be informed. They are a central threat to the shelf life of fact checks.
Fact-checkers are aware of this. In a short survey I sent out to members of the International Fact-Checking Network, three out of four respondents rated this a top concern.
Factcheck.org, which launched in 2004, now has almost 6,000 dead links. Roughly one third of all the links on Pagella Politica, the Italian fact-checking website I edited before joining Poynter, are currently broken. At the same time, trying to manually keep tabs on the state of a site's links is too time-consuming to be feasible.
These woes won't be completely fixed by Amber. I flagged a couple of bugs that I noticed while doing research for this piece. Also, not all websites may allow the app to index their content. Some may block it, either deliberately or automatically, because they mistake Amber's snapshot requests for automated phishing activity.
There could also be copyright issues. Genève Campbell of the Berkman Center told me that fact-checking likely qualifies as "fair use" in the United States, but this varies from country to country. Campbell recommends the Digital Media Law Project and the Electronic Frontier Foundation as resources to find more information on these issues.
Still, Amber seems to bring us one step closer to solving the scourge of dead links. Provided it can sort out current bugs (and it will need all the feedback it can get from interested organizations to do so), it could be a solution for link rot.
There is also some hope for journalists and fact-checkers looking to save (some of) their old broken links. Campbell envisages a scenario in which a tool like Memento is used to crawl old posts to find and save a version of the page at the moment of publication. "The technology is there," she says.
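To make that scenario concrete: the Memento protocol (RFC 7089) lets a client ask a "TimeGate" for the archived copy of a page closest to a given date by sending an `Accept-Datetime` header. The Python sketch below shows what such a lookup might look like for a single link in an old post. It is a rough illustration, not the workflow Campbell describes. The TimeGate address used here (the public Time Travel aggregator) is an assumption and can be swapped for any other TimeGate, and the function names are hypothetical.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumption: the public Memento "Time Travel" aggregator. Any TimeGate works.
TIMEGATE = "http://timetravel.mementoweb.org/timegate/"


def accept_datetime(published: datetime) -> str:
    """Format a timezone-aware datetime as the HTTP date Memento expects."""
    return format_datetime(published.astimezone(timezone.utc), usegmt=True)


def memento_request(url: str, published: datetime) -> urllib.request.Request:
    """Build a TimeGate request for the copy of url closest to published."""
    headers = {"Accept-Datetime": accept_datetime(published)}
    return urllib.request.Request(TIMEGATE + url, headers=headers)


def find_memento(url: str, published: datetime, timeout: float = 20.0) -> str:
    """Follow the TimeGate's redirect and return the archived copy's URL.

    Performs a network call and raises urllib.error.URLError (or its
    subclass HTTPError) if no archived copy can be located.
    """
    request = memento_request(url, published)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.geturl()
```

A crawler over old posts would call `find_memento` once per broken link, passing the post's publication date, and store whatever archived address comes back next to the original URL.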
The time for journalists in general and fact-checkers in particular to start addressing this problem is now. Even if Amber does not end up being the silver bullet, producers of linked online content need to be at the forefront of the quest for storing links better.
Let's start preserving our links now before we bequeath posterity the Internet equivalent of a massive library where every book has had its bibliography torn out.