What does the future of automated fact-checking look like?
DURHAM, N.C. — There’s nothing new about trying to correct the record in real time. Even a couple of decades ago, campaign aides would walk through the press sections at debates, circulating freshly printed pages that sought to debunk claims the opposing candidate had just made.
That approach, even if it represented an attempt by campaigns to spin the facts in their own favor, anticipated the goals of modern fact-checkers: drawing attention to questionable statements and correcting them in as close to real time as possible.
Fact-checking organizations are able to do this under similar conditions: debates and other occasions when a politician's talking points can easily be anticipated. Corrective copy in such cases has already been written and can be spread quickly. It's harder to do instant fact-checking when conditions are less predictable. It's also difficult to keep up with the flood of statements and information constantly pouring into both old and new media.
Getting closer to the goal of providing fact checks almost as quickly as claims are made, preferably along the same communication channels, will require a great deal of automation. Last week, a conference at Duke University brought together journalists and computer scientists to discuss how to envision and create such tools. The “Tech & Check” conference was presented by Poynter's International Fact-Checking Network and the Duke Reporters' Lab.
Automating the processes involved in fact-checking claims would help not only full-time fact-checking sites, but journalists in general. Even as fact-checking sites have proliferated, jobs for internal fact-checkers and copy editors have largely been eliminated, leaving reporters and editors on their own to guarantee the accuracy of their own content. If news organizations don't have the time to do quality control the old-fashioned way, more automated methods would be a boon when it comes to ensuring accuracy.
Already, developers have come up with a number of tools and prototypes to help automate aspects of the fact-checking process. RumorLens, a system developed at the University of Michigan, can be fed rumors and then will check for tweets that are either spreading or correcting them. FactMinder is a browser extension created by French scientists that allows users to easily extract data from the web and provide context about individuals mentioned on a webpage.
IBM's Watson group is weeks away from releasing the beta version of an app called Watson Angles, which checks stories against a trove of 55 million previously published news articles. It assembles basic facts relevant to the topic at hand and offers context, a timeline and key quotes. "The ultimate goal is to make anyone a fact-checker," IBM's Ben Fletcher said at the conference.
Exploring ways to overcome the boundaries that separate readers and viewers from journalists — as well as the barriers that keep technologists and reporters apart — was a central theme of the conference. Everyone is swimming against torrents of news and misinformation, and separating fact from fiction will have to be a group effort.
Annotating content in more robust ways is clearly a viable next step. News outlets will soon be using widgets to embed published fact-checks in their copy, just as they already embed links. Having fact checks pop up alongside claims on the web or during broadcasts could go a long way not only toward making politicians think twice about repeating known falsehoods, but also toward encouraging the public to remain vigilant in its skepticism. "Getting fact-checks in front of people who want to see them, or should see them, could really help," says Angie Drobnic Holan, editor of PolitiFact. "If we're only reaching the people who are checking our stuff, we're missing a lot of people."
Conferees tried to blue-sky ideas about other tools that could push information toward readers and viewers, or allow them to fact-check statements for themselves. Lawmakers and interest groups could be awarded "parrot scores," a visual cue that would show readers whether a given speaker is prone to repeating dubious claims. Before sharing a link via social media, users could hit a button that would automatically check whether that content has already been found to be false. Similarly, reporters and editors could run their own stories through a database that would check for accuracy before they're published or put on air, the way academics already use plagiarism checkers to see if their students are cheating.
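To make the "parrot score" idea concrete, here is a minimal sketch in Python. Nothing like it was presented at the conference: the scoring rule (the share of a speaker's statements that repeat a claim already rated false), the function names and the sample statements are all assumptions made up for illustration.

```python
import re
from collections import Counter


def normalize(claim):
    """Lowercase a claim and drop punctuation so trivial differences don't matter."""
    return " ".join(re.findall(r"[a-z0-9]+", claim.lower()))


def parrot_scores(statements, debunked_claims):
    """Return {speaker: share of that speaker's statements repeating a debunked claim}.

    `statements` is a list of (speaker, claim) pairs. `debunked_claims` is a
    list of claims already rated false. Both are hypothetical inputs.
    """
    debunked = {normalize(c) for c in debunked_claims}
    totals, repeats = Counter(), Counter()
    for speaker, claim in statements:
        totals[speaker] += 1
        if normalize(claim) in debunked:
            repeats[speaker] += 1
    return {speaker: repeats[speaker] / totals[speaker] for speaker in totals}


# Invented sample data, not real statements or real fact checks.
statements = [
    ("Speaker A", "The bridge cost $10 million."),
    ("Speaker A", "The bridge opened in May."),
    ("Speaker B", "The bridge is closed."),
]
debunked_claims = ["the bridge cost $10 million"]
scores = parrot_scores(statements, debunked_claims)
```

A sketch like this only catches near-verbatim repetition. A real score would also need a reliable archive of rated claims and a way to recognize paraphrases.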
Before misinformation goes viral, in other words, new tools could provide the disinfectant of truth. It's a worthy goal, but all of these ideas run right into large technological hurdles. For one thing, checking new material against databases requires having those databases in place. There already are huge archives of published news articles, but no repository is anywhere near complete, let alone one that also contains the government documents, think tank studies and other material fact-checkers rely on.
Even if you manage to gather all known information that's relevant to a topic, you still face a difficult task in teaching computers how to compare claims against prior statements. How do you automate nuance? There are ways and ways of saying the same thing. Computers are great at spotting exact repetitions of prior claims, but they could miss remarks that paraphrase, in slightly different words, something that's already been said.
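A small Python sketch shows why paraphrase is the hard part. It uses word-overlap (Jaccard) similarity, a deliberately simple stand-in rather than the method of any system discussed at the conference. The stopword list, the 0.5 threshold and the sample claims are all invented for illustration.

```python
import re

# A tiny, hand-picked stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "has", "have", "by", "that", "been"}


def tokens(text):
    """Split text into a set of lowercase words, ignoring common filler words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Word-overlap similarity: shared words divided by all distinct words."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(claim, checked_claims, threshold=0.5):
    """Return (closest previously checked claim or None, its similarity score)."""
    scored = [(jaccard(claim, c), c) for c in checked_claims]
    score, match = max(scored, default=(0.0, None))
    return (match, score) if score >= threshold else (None, score)


# Invented claims, not real statements or real fact checks.
checked = ["Unemployment has doubled in the last four years"]
reordered = "In the last four years, unemployment doubled"
reworded = "Joblessness has doubled over four years"

found, _ = best_match(reordered, checked)   # same words, different order
missed, _ = best_match(reworded, checked)   # synonym and new preposition
```

The reordered sentence is matched even though it is not an exact repetition, because it uses the same words. The reworded one falls below the threshold, since "joblessness" and "unemployment" share no characters a word counter can see. Closing that gap is where more sophisticated language technology would be needed.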
The longer the phrase, the more complicated it is to match it to prior statements. That's why it's key for automated tools to be able to recognize what parts of a sentence make up a checkable fact, trimming away the rhetorical fat. "If we want to do it, we can really do this kind of system," says Alessandro Moschitti, a computer scientist at the Qatar Computing Research Institute and the University of Trento in Italy. "The technology is there, but we also need resources."
That's always a key hurdle. Once you figure out what it is you want to do, you have to figure out how to pay for it. Given the amount of data collecting and massaging involved in fact-checking, it's going to require a fair amount of money to automate parts of the process. Tech & Check made it clear there's interest among journalists, academics and tech giants in finding better and faster ways of helping people navigate the vast amounts of information constantly thrown their way. There doubtless will be breakthroughs, if somebody will pay for them.