Fact-checkers usually roll their eyes when they need to verify an audio file extracted from WhatsApp. They know it’s a time-consuming task and there is a lack of tools to help them reach a verdict about the voice they hear. This scenario, however, has just changed. Forensia is up and running in Buenos Aires, and ready to work in Saxon and Romance languages — but not for free.
Launched by the Laboratory of Sensory Research (LIS), part of Argentina’s National Council of Scientific and Technical Research (Conicet), Forensia is, as the name suggests, forensic software. It was used for fact-checking for the first time last week.
At the beginning of the month, the Argentinian fact-checking organization Chequeado saw an audio file going viral on WhatsApp and decided to spend 10,000 pesos (because Chequeado is a national NGO) on a report from LIS.
Laura Zommer, the platform’s director, was amazed by the results her team obtained and decided to share this experience with the International Fact-Checking Network.
“We always receive many audio files to be fact-checked and although we always want to verify them, we have never been able to do so,” she said. “Forensia isn’t a cheap solution, but should definitely be used to verify important topics and when important characters are involved.”
The case Chequeado had in hand last week was a strong one. In the audio file they wanted to verify, a specific politician was supposedly accusing the entire non-white community of having voted for Kirchnerism because “black* people want barbecue, cheap wine, beer, lots of beer, pot, and cocaine.” Chequeado needed to confirm whether the voice was actually that of congressman Guillermo Montenegro, as was being claimed on WhatsApp and other social media channels.
Jorge Gurlekian, the research scientist who directs LIS, received the WhatsApp file and asked fact-checkers to provide other, authentic recordings of Montenegro’s voice. He plugged all of them into Forensia and, in a few minutes, concluded there was very little chance of that voice being Montenegro’s. On a scale that ranges from -5 to +5, the audio clip was graded -1.
“We first compare the questioned file to the ones that carry the real voice of the candidate. Then we compare the questioned file to a huge database of voices and sounds from people who were born and who live in the region where the candidate is from,” explained Gurlekian. “Our final answer is never binary. It is a probability and we strongly suggest that fact-checkers use Forensia as one more piece of evidence in their work, just like a judge uses a DNA test.”
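The article does not describe Forensia’s internals, so the following is only a rough sketch of how a two-step comparison like the one Gurlekian describes could produce a graded, non-binary answer. It assumes a log-likelihood-ratio approach, a common technique in forensic speaker comparison but not confirmed here as Forensia’s method. The function names, the Gaussian score model and the similarity scores are all hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_log_pdf(x, mu, sigma):
    """Natural log of the normal density, computed in the log domain to avoid underflow."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def graded_answer(score, same_speaker_scores, population_scores, low=-5.0, high=5.0):
    """Hypothetical grading: log10 likelihood ratio, clamped to the -5..+5 range.

    score: similarity between the questioned clip and the known voice.
    same_speaker_scores: similarities among authentic recordings of the speaker.
    population_scores: similarities between the questioned clip and other
        speakers from the same region (the reference database).
    """
    log_p_same = gaussian_log_pdf(score, mean(same_speaker_scores), stdev(same_speaker_scores))
    log_p_pop = gaussian_log_pdf(score, mean(population_scores), stdev(population_scores))
    log10_lr = (log_p_same - log_p_pop) / math.log(10)
    return max(low, min(high, log10_lr))


# Made-up similarity scores, for illustration only.
authentic = [0.80, 0.85, 0.90]
regional = [0.20, 0.30, 0.40]

supports_same_speaker = graded_answer(0.85, authentic, regional)
supports_other_speaker = graded_answer(0.30, authentic, regional)
```

Under these assumptions, a positive grade means the clip looks more like the known speaker than like the regional population, and a negative grade means the opposite. The size of the number says how strong that support is, which is why the result reads as a probability rather than a yes or no.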
Gurlekian has been studying voice recognition for decades and has helped security forces and the judiciary system in his country for a long time. He is now excited to see his knowledge — and his tool — can serve in the battle against online misinformation.
“My team, composed of Miguel Martinez Soler, Pedro Univaso and me, is 100% ready to work with fact-checkers and experiment in this field,” he said. “We only need to keep in mind that some technical requirements are necessary. Questioned audios, for example, must be at least 15 seconds long to be verified and must be phonetically complex. A file where you only hear a person saying ‘yes, yes, yes’, for example, wouldn’t be suitable for Forensia.”
The ideal file format for the software is .wav, but Gurlekian knows this is unrealistic in the false news universe and is ready to deal with WhatsApp recordings. About 90 different indicators are checked in every single file and most of them have nothing to do with the content of what is being said, which is why the tool can deal with many languages.
Forensia can be licensed and installed on computers, but Gurlekian suggests the international fact-checking community take a baby step for now and allow him to run the tests.
To fully understand the report offered by the software, some training is needed. It would also be important to load local databases of voices into Forensia to get a more precise report from the machine, and only Gurlekian and his team can do that.
“The most developed countries in the world already have public databases on the sound of citizens’ voices. Some of them are even divided into regions and some of these databases are public. But others aren’t,” Gurlekian said.
Forensia can also point out edits made to a file and help identify when a change was made in a sentence.
So what are its limitations?
“The limitation could be aging. The voices of boys and girls change over time. If we question an audio recorded when they were young, we will need examples of their real voices from that time to compare, and that could be hard.”
* The term used in Spanish was “los negros”. That often refers to non-white, poor and immigrant communities.
Cristina Tardáguila is the associate director of the International Fact-Checking Network. She can be reached at firstname.lastname@example.org.