Welcome to The Observer Innovation section's new interview series, Changing the World. Every week, we'll talk to an innovator, entrepreneur or company that is making the world better through science, business and technology.
The technology of fabricating information online has come a long way. Nowadays, not only do we have fake Twitter accounts and photoshopped news images, but we also have fake videos, including deepfakes: ultra-realistic videos synthesized with machine learning algorithms.
Most deepfake videos are created with a subset of machine learning algorithms called generative neural network architectures, such as autoencoders or generative adversarial networks (GANs). They have already been used for nefarious purposes, and as a serious security risk they have inspired a new class of startups focused on identifying and taking them down. One such firm is Cyabra, an 11-person team of veteran cybersecurity experts from the Israel Defense Forces' Special Operations Department. The company currently has dozens of corporate clients in the U.S.
“What we are giving the world is a better way to distinguish real and fake information campaigns and communities online,” said Yossef Daar, Cyabra’s cofounder and chief product officer.
In a recent interview with Observer, Daar explained how Cyabra’s algorithm works, why deepfake videos are increasingly difficult to detect, and the threat of disinformation “pyramids” we constantly face on the internet.
Deepfake is a term that comes up constantly in discussions of misinformation nowadays. Can you briefly explain how deepfake technology has evolved over time? What's the main challenge in detecting it?
In 2017, the internet saw the first appearance of deepfakes. These manipulated videos were created with a family of algorithms called autoencoders. The best-known application of this technique is the face swap.
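For readers curious how such a face swap works under the hood, here is a minimal sketch of the shared-encoder, two-decoder autoencoder design commonly associated with early deepfakes. All architecture sizes and tensor shapes are illustrative assumptions, not any production system.

```python
# A minimal sketch (not any production system) of the shared-encoder,
# two-decoder autoencoder behind early face-swap deepfakes: one encoder
# learns a common face representation, and each decoder learns to
# reconstruct one identity. Swapping happens at inference time by routing
# person A's encoding through person B's decoder.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1),   # 64x64 -> 32x32
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, latent_dim),
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 64 * 16 * 16)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # 16 -> 32
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),   # 32 -> 64
            nn.Sigmoid(),
        )
    def forward(self, z):
        return self.net(self.fc(z).view(-1, 64, 16, 16))

encoder = Encoder()
decoder_a, decoder_b = Decoder(), Decoder()  # one decoder per identity

# Training reconstructs each identity through its own decoder;
# the "swap" is simply decoder_b(encoder(face_of_a)).
face_of_a = torch.rand(1, 3, 64, 64)  # placeholder image tensor
swapped = decoder_b(encoder(face_of_a))
print(swapped.shape)  # torch.Size([1, 3, 64, 64])
```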
One year ago, as the threat grew, Cyabra began developing advanced technology to keep pace with the ever-evolving methods behind deepfakes. Detecting face-swap and reenactment deepfakes was a direct extension of Cyabra's existing detection technology. In 2018, we began seeing more complex videos made with generative adversarial networks (GANs), a class of machine learning models that bad actors use to generate realistic synthetic images.
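As a rough illustration of that adversarial dynamic, the toy sketch below pits a generator against a discriminator on a one-dimensional dummy distribution. The models, data and hyperparameters are placeholder assumptions chosen for brevity, not a real image-synthesis setup.

```python
# A toy sketch of the adversarial setup behind GAN-generated imagery,
# using simple MLPs and a made-up 1-D "real" data distribution.
# The generator learns to fool the discriminator; the discriminator
# learns to separate real samples from generated ones.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 1) * 0.5 + 2.0   # "real" data: N(2, 0.5)
    fake = G(torch.randn(64, 8))            # generated samples

    # Discriminator step: push real -> 1, fake -> 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: make the discriminator label fakes as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(f"final D loss {d_loss.item():.3f}, G loss {g_loss.item():.3f}")
```

The point of the back-and-forth loop is exactly what makes GAN output hard to catch: every improvement in the discriminator becomes a training signal for the generator.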
In late 2019, Facebook released the Deepfake Detection Challenge Dataset, which includes more than 100,000 deepfake videos, for developers worldwide to test and improve their detection technology. Those videos cover the widest variety of generation methods and parameters available today.
How does Cyabra’s detection tool work? How accurate can it get?
Our solution focuses on detecting generative adversarial networks (GANs). We created an algorithm called Facial Reenactment Manipulation (FAM) that can differentiate between a real photo and a fake photo created with a GAN. The mechanism extracts additional parameters from the video as a whole, instead of relying on frame-by-frame analysis alone.
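Cyabra has not published FAM's internals, but the generic sketch below illustrates the broader idea Daar describes: score individual frames, then aggregate features across time so temporal inconsistencies can also inform the verdict. Every model detail here is an assumption for illustration, not Cyabra's algorithm.

```python
# A generic illustration (not Cyabra's proprietary FAM) of going beyond
# frame-by-frame detection: a CNN extracts per-frame features, and a GRU
# aggregates them across time into a single real-vs-fake score per clip.
# All layer sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class VideoDeepfakeDetector(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        self.frame_cnn = nn.Sequential(       # per-frame feature extractor
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.temporal = nn.GRU(feat_dim, 64, batch_first=True)  # across frames
        self.head = nn.Linear(64, 1)          # real-vs-fake logit

    def forward(self, video):                 # video: (batch, frames, 3, H, W)
        b, t = video.shape[:2]
        feats = self.frame_cnn(video.flatten(0, 1)).view(b, t, -1)
        _, h = self.temporal(feats)           # final hidden state summarizes clip
        return self.head(h[-1])               # one score for the whole video

detector = VideoDeepfakeDetector()
clip = torch.rand(2, 16, 3, 64, 64)           # 2 clips of 16 frames each
print(torch.sigmoid(detector(clip)))          # P(fake) per clip, untrained
```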
In terms of accuracy, Cyabra's deepfake detection technology is able to identify 91 percent of fake videos and images, and 98 percent of real ones.
As for the campaign authenticator, we have a 90 percent accuracy rate for specific profile analysis, depending on the social media platform.
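To make concrete what rates like those quoted above would imply, here is a back-of-the-envelope calculation on a hypothetical, evenly split sample of 2,000 videos. The sample and all derived figures are invented for illustration; they are not Cyabra's measurements.

```python
# Illustrative arithmetic only: applying a 91% catch rate on fakes and a
# 98% pass rate on reals to a made-up sample of 1,000 fake and 1,000 real
# videos, to show what those two numbers mean in practice.
fakes, reals = 1000, 1000
tp = int(0.91 * fakes)   # fakes correctly flagged (sensitivity, 91%)
tn = int(0.98 * reals)   # reals correctly passed (specificity, 98%)
fn = fakes - tp          # fakes that slip through
fp = reals - tn          # reals wrongly flagged

precision = tp / (tp + fp)
print(f"missed fakes: {fn}, false alarms: {fp}, precision: {precision:.2%}")
# -> missed fakes: 90, false alarms: 20, precision: 97.85%
```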
About a year ago, I interviewed the founder of a U.S.-based deepfake detection firm called Amber Videos. He said fighting deepfakes is ultimately a losing battle, because the nature of the "adversarial network" creating them in the first place is to constantly improve in order to circumvent detection. Do you agree?
I think there's some truth to that. The attacker will always have some kind of advantage over the defending side. That's true for almost anything in life. But I think there's a way to do it.
I like to use a fishing analogy to explain our approach, comparing it to two different kinds of fishing nets. The first is a small net designed to catch a specific type of fish; it catches that fish in large numbers, but other fish that it isn't geared toward slip away. The other is a large net that can catch a wide variety of fish, but small fish pass through it more easily.
The clear solution is to use both nets, which is our approach. Our “small net” is a set of supervised algorithms that can provide targeted responses to specific categories of deepfakes. And the “large net” is a mix of unsupervised and semi-supervised algorithms that act as a defense against deepfakes broadly.
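One way to picture the two-net approach is the hedged sketch below, which pairs a supervised classifier (the "small net") with an unsupervised anomaly detector (the "large net") and flags a sample if either objects. The stand-in models and random features are assumptions chosen for clarity, not Cyabra's actual stack.

```python
# A sketch of the "two nets" idea using stand-in models: a supervised
# classifier gives targeted catches of known deepfake categories, while
# an unsupervised anomaly detector (trained on genuine samples) gives a
# broad catch of anything unusual. Features here are random placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, IsolationForest

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 16))      # placeholder labeled features
y_train = rng.integers(0, 2, size=500)    # 1 = known deepfake category

small_net = RandomForestClassifier().fit(X_train, y_train)
large_net = IsolationForest(random_state=0).fit(X_train[y_train == 0])

def flag(x):
    """Flag a sample if either net objects."""
    known_fake = small_net.predict(x)[0] == 1      # targeted catch
    anomalous = large_net.predict(x)[0] == -1      # broad catch
    return known_fake or anomalous

sample = rng.normal(size=(1, 16))
print("flagged" if flag(sample) else "looks clean")
```

The design trade-off mirrors the analogy: the supervised model is precise only on categories it was trained on, while the anomaly detector trades precision for coverage of deepfake types no one has labeled yet.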
Cyabra is not yet a consumer product. When we browse the internet, what should we know about the main categories of disinformation out there?
Think of it as a pyramid. If you were to create a disinformation campaign, the lowest level of the pyramid would be fake users on social media, set up to spread and amplify a message. A level higher up would be avatars. Those have the look and feel of real people and are harder to detect. Their purpose is usually to manipulate or change the public's opinion about something.
And the tip of the pyramid would actually be real people. They are usually paid by someone to create content or amplify a message.
Every good disinformation campaign has all of these ingredients. What we’ve found is that some of them can be very, very effective.
Would you say the 2016 U.S. presidential election was an example of that?
That was before Cyabra was founded, so I don’t have the right metrics for that. But I can tell you that, in the trade war between China and the U.S., there are a lot of misinformation actors on both sides.
Has COVID-19 affected your business in any negative way?
Yes and no. Yes, because there's a lot of uncertainty in the market, and it's very hard to find investors. If you want to close a deal with someone now, it will take a long time.
But on the other hand, now that we are in a crisis, I think the need for genuine and accurate information is higher than ever, which is where we come into play.