Here's how close automated fact-checking is to reality
A new fact sheet from the Reuters Institute takes stock of automated fact-checking around the world — and the future looks bright.
Released today, the report draws upon interviews with fact-checkers and computer scientists, as well as an overview of existing technology, to detail how automated fact-checking could change the practice in the immediate future.
“The last year has seen growing attention among journalists, policymakers, and technology companies to the problem of finding effective, large-scale responses to online misinformation,” senior research fellow Lucas Graves writes in the report. “However, deciding the truth of public claims and separating legitimate views from misinformation is difficult and often controversial work … challenges that carry over into (automated fact-checking).”
Among those challenges, Graves notes that fully automated fact-checking isn’t even close to being capable of the judgment that journalists apply on a day-to-day basis. Additionally, support from foundations, universities and platforms is essential in developing better capabilities and large-scale systems.
But the potential for automation is great — and it’s already happening in some newsrooms.
The short document offers a categorization of the latest developments in automated fact-checking technology and research:
(Automated fact-checking) initiatives and research generally focus on one or more of three overlapping objectives: to spot false or questionable claims circulating online and in other media; to authoritatively verify claims or stories that are in doubt, or to facilitate their verification by journalists and members of the public; and to deliver corrections instantaneously, across different media, to audiences exposed to misinformation. End-to-end systems aim to address all three elements — identification, verification, and correction.
British fact-checking charity Full Fact has developed a tool that automatically scans media and Parliament transcripts for claims and matches them against existing fact checks. The Duke Reporters’ Lab and Chequeado have both built similar tools that scan media transcripts for checkable claims, then alert fact-checkers to potential fact checks. (Disclosure: The Reporters’ Lab helps pay for the Global Fact-Checking Summit.)
Full Fact and the Reporters’ Lab are featured in the International Fact-Checking Network’s third “Check It” video.
That methodology, which automatically scans transcripts for claims and then matches them against libraries of existing fact checks like Share the Facts, is the most effective approach so far and a product of successful research, according to Graves. But the technology still isn’t perfect.
However, so far these systems can only identify simple declarative statements, missing implied claims or claims embedded in complex sentences which humans recognise easily. This is a particular challenge with conversational sources, like discussion programmes, in which people often use pronouns and refer back to earlier points.
The technology can also misinterpret paraphrasing and subtle changes in wording, timing and context. Beyond that, verification itself remains outside the scope of existing automated fact-checking tools, which still rely on humans to sift through potential fact checks, so expectations should be kept modest, according to the report.
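To make the claim-matching idea concrete, here is a minimal sketch in Python. It is not how Full Fact, the Reporters’ Lab or Chequeado actually build their tools; the fact-check library, the stopword list, the 0.5 threshold and the function names are all hypothetical, and simple word overlap (Jaccard similarity) stands in for whatever matching those systems really use.

```python
import re

# Hypothetical library of already-published fact checks (illustrative data only).
FACT_CHECKS = [
    {"claim": "Unemployment has fallen to its lowest level in 40 years", "verdict": "true"},
    {"claim": "Crime has doubled in the capital since 2010", "verdict": "false"},
]

# Small, assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "in", "to", "of", "has", "have", "its", "is", "since", "at"}


def tokens(text):
    """Lowercase a sentence and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_claim(sentence, fact_checks=FACT_CHECKS, threshold=0.5):
    """Return the best-matching fact check, or None if nothing clears the threshold."""
    sent_tokens = tokens(sentence)
    best, best_score = None, 0.0
    for fc in fact_checks:
        score = jaccard(sent_tokens, tokens(fc["claim"]))
        if score > best_score:
            best, best_score = fc, score
    return best if best_score >= threshold else None


# A near-verbatim repeat of a checked claim is matched...
print(match_claim("Unemployment has fallen to the lowest level in 40 years"))
# ...but a paraphrase with no shared wording is missed.
print(match_claim("The jobless rate is at a four-decade low"))
```

In this toy version the near-verbatim sentence is matched to the existing fact check, while the paraphrase shares no words with it and goes unmatched. That is the kind of miss the report warns about when wording changes.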
Going forward, a persistent challenge for automation is finding ways to match claims against official sources of information, which is essentially what fact-checkers do manually. Graves wrote that artificial intelligence researchers could explore how automated fact-checking systems might identify which sources of data are appropriate for any given claim.
But that poses other problems. Data isn’t always available and, even when it is, correctly discerning what the data means for the veracity of a claim is hard, as one highlighted study shows:
… a way to test the claim that ‘Lesotho is the smallest country in Africa’ without logically interpreting it is to search for similar language across a large textual source, or across the entire Web. In experiments using Wikipedia as a trusted source and a dataset of 125,000 claims, for example, a team led by one of [Andreas Vlachos’] students can predict correctly whether a single-predicate claim is supported or refuted (or whether there is not enough evidence) about 25% of the time (Thorne et al. 2018).
In many ways, that kind of academic insight has proved essential in helping practitioners develop automated platforms.
“(Automated fact-checking) has been an area of unusually close collaboration between researchers and practitioners,” Graves wrote. “Further progress will depend mainly on two factors: continued financial support for both basic research and real-world experiments, and progress by government and civil society groups in establishing open data standards.”