“What I cannot create, I do not understand.” Richard Feynman*
One of the past decade’s most important developments in deep learning is generative adversarial networks, developed by Ian Goodfellow. This new technology can also be used for ill intent, such as for generating fake images and videos.
Generative Adversarial Networks
Figure: Images generated by a computer using a GAN. The above images look real, but more than that, they look familiar.* They resemble a famous actress or actor that you may have seen on television or in the movies. They are not real, however. A new type of neural network created them.
A GAN, or generative adversarial network, is a class of machine learning framework in which two neural networks play a cat-and-mouse game. One creates fake images that look like the real ones fed into it, and the other decides whether they are real.
Generative adversarial networks (GANs), sometimes called generative networks, created these fake images. The Nvidia research team applied this new technique by feeding thousands of photos of celebrities to a neural network. The neural network in turn produced thousands of pictures, like the ones above, that resemble the famous faces. They look real, but machines created them. GANs allow researchers to build images that look like real ones because they share many features of the images the neural network was fed. A GAN can be fed photographs of objects from tables to animals, and after being trained, it produces pictures that resemble the originals.
Figure: Of the two images above, can you tell the real from the fake?*
To generate these images, the Nvidia team set up two neural networks: one that produced the pictures and another that determined whether they were real or fake. Combining these two neural networks produced a GAN, or generative adversarial network. They play a cat-and-mouse game, where one creates fake images that look like the real ones fed into it, and the other decides whether they are real. Does this remind you of anything? The Turing test. Think of the networks as playing a guessing game over whether the images are real or fake.
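The cat-and-mouse game can be written down as a single objective. The formula below is a sketch in the notation of the original GAN paper, not something specific to the Nvidia system: G is the network that creates images from random noise z, D is the network that outputs the probability that an image is real, and p_data and p_z (the distributions of real images and of noise) are symbols introduced here for illustration.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
$$

D tries to make this value large by scoring real images near 1 and generated images near 0, while G tries to make it small by producing images that D scores as real.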
After the GAN has been trained, one of the neural networks creates fake images that look like the real ones used in training. The resulting pictures look exactly like pictures of real people. This technique can generate large amounts of fake data that can help researchers predict the future or even construct simulated worlds. That is why for Yann LeCun, Director of Facebook AI Research, “Generative Adversarial Networks is the most interesting idea in the last ten years in machine learning.”* GANs will be helpful for creating images and perhaps for creating software simulations of the real world, where developers can train and test other types of software. For example, companies writing self-driving software for cars can train and check their software in simulated worlds. I discuss this in detail later in this book.
These simulated worlds and situations are now handcrafted by developers, but some believe that these scenarios will all be created by GANs in the future. GANs generate new images and videos from very compressed data. Thus, you could use a GAN’s two neural networks to save data and then reinstate it. Instead of zipping your files, you could use one neural network to compress them and the other to regenerate the original videos or images. It is no coincidence that in the human brain some of the apparatus used for imagination is the same as that used for memory recall. Demis Hassabis, a co-founder of DeepMind, published a paper* that “showed systematically for the first time that patients with damage to their hippocampus, known to cause amnesia, were also unable to imagine themselves in new experiences.”* The finding established a link between the constructive process of imagination* and the reconstructive process of episodic memory recall.* There are more details regarding this later in this book.
Figure: Increasingly realistic synthetic faces generated by variations on generative adversarial networks through the years.
The Creator of GANs
Ian Goodfellow, the creator of GANs, came up with the idea at a bar in Montreal when he was with fellow researchers discussing what goes into creating photographs. The initial plan was to understand the statistics that determine what makes up a photo, and then feed them to a machine so that it could produce pictures. Goodfellow thought that the idea would never work because too many statistics would be needed. So, he thought about using a tool, a neural network. He could teach neural networks to figure out the underlying characteristics of the pictures fed to the machine and then generate new ones.
Figure: Ian Goodfellow, creator of generative adversarial networks.
Goodfellow then set two neural networks against each other so that together they could build realistic photographs. One created fake images, and the other determined whether they were real. The idea was that one of the adversarial networks would teach the other how to produce images that could not be distinguished from the real ones.
On the same night that he came up with the idea, he went home, a little bit drunk, and stayed up coding the initial concept of a GAN on his laptop. It worked on the first try. A few months later, he and a few other researchers published the seminal paper on GANs at a conference.* The GAN was trained on handwritten digits from a well-known training image set called MNIST.*
In the following years, hundreds of papers were published using the idea of GANs to produce not only images but also videos and other data. Now at Google Brain, Goodfellow leads a group that is working to make the training of these two neural networks much more reliable. The result of this work is services that are far better at generating images and learning sounds, among other things. “The models learn to understand the structure of the world,” Goodfellow says. “And that can help systems learn without being explicitly told as much.”
Figure: Synthetically generated word images.*
GANs could eventually help neural networks learn with less data, generating more synthetic images that are then used to identify and create better neural networks. Recently, a group of researchers at Dropbox improved their mobile document scanner by using synthetically generated images. GANs produced new word images that, in turn, were used to train the neural network.
And, that is just the start. Researchers believe that the same technique can be applied to develop artificial data that can be shared openly on the internet while not revealing the primary source, making sure that the original data stays private. This would allow researchers to create and share healthcare information without sharing sensitive data about patients.
GANs also show promise for predicting the future. It may sound like science fiction now, but that might change over time. LeCun is working on writing software that can generate video of future situations based on current video. He believes that human intelligence lies in the fact that we can predict the future, and therefore, GANs will be a powerful force for artificial intelligence systems in the future.*
The Birthday Paradox Test
Even though GANs are generating new images and sounds, some people ask if GANs generate new information. Once a GAN is trained on a collection of data, can it produce data that contains information outside of its training data? Can it create images that are entirely different from the ones fed to it?
One way to analyze that is with what is called the Birthday Paradox Test. The test takes its name from the birthday paradox: if you put 23 random people in a room (two soccer teams plus a referee), the chance that two of them share a birthday is more than 50%.
This effect happens because, with 365 days in a year, you only need a number of people around the square root of that to have a good chance of seeing a duplicate birthday. The Birthday Paradox says that for a discrete distribution with support of size N, a random sample of size about √N is likely to contain a duplicate. What does that mean? Let me break it down.
If there are 365 days in a year, then you need roughly the square root of 365 (√365), or about 19 people, to have a reasonable chance of two sharing a birthday. This also works in the other direction. If you do not know the number of days in a year, you can select a fixed number of people and ask them for their birthdays. If two people share a birthday, you can infer the number of days in a year with high probability based on the number of people. If you have 22 people in the room, then the estimated number of days in a year is the square of 22 (22²), about 484 days per year, a rough approximation of the actual number of days in a year.
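The arithmetic above can be checked directly. The following is a minimal sketch, assuming birthdays are independent and uniformly spread over the year; the function names are made up for this illustration. It computes the exact chance of a shared birthday and finds the smallest group for which that chance passes 50%.

```python
import math


def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumes every day is equally likely and birthdays are independent.
    """
    if people > days:
        return 1.0  # pigeonhole: more people than days forces a repeat
    p_all_distinct = 1.0
    for i in range(people):
        # The (i + 1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def smallest_group_for(threshold: float, days: int = 365) -> int:
    """Smallest group size whose shared-birthday chance reaches `threshold`."""
    people = 1
    while shared_birthday_probability(people, days) < threshold:
        people += 1
    return people


if __name__ == "__main__":
    print("square root of 365:", round(math.sqrt(365), 1))
    print("chance with 19 people:", round(shared_birthday_probability(19), 3))
    print("chance with 23 people:", round(shared_birthday_probability(23), 3))
    print("smallest group above 50%:", smallest_group_for(0.5))
```

Under these assumptions the square root is a rule of thumb for the order of magnitude rather than the exact 50% point: a group of √365 ≈ 19 people already has a substantial chance of a match, and the chance first passes 50% at 23 people.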
The same test can check the size of the distribution that a GAN’s generated images are drawn from. If the result reveals that a set of K generated images contains duplicates with reasonable probability, then you can suspect that the number of distinct images the GAN produces is about K². So, if a test shows that it is very likely to find a duplicate in a set of 20 images, then the size of the underlying set of images is approximately 400. This test can be run by selecting subsets of images and checking how often we find duplicates in these subsets. If we find duplicates in more than 50% of the subsets of a certain size, we can use that size for our approximation.
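The procedure can be sketched in a few lines of code. This is an illustration under strong simplifying assumptions, not the researchers' actual setup: the "generator" is a toy stand-in that draws integers uniformly from a hidden set, and two outputs count as duplicates only when they are exactly equal, whereas a test on real GAN images would need some way to spot near-identical pictures. All names here are hypothetical.

```python
import random
from typing import Callable, List, Optional


def has_duplicate(items: List[int]) -> bool:
    """True if any value appears more than once."""
    return len(set(items)) < len(items)


def estimate_support(
    sampler: Callable[[int], List[int]],
    max_k: int = 200,
    trials: int = 2000,
) -> Optional[int]:
    """Birthday-paradox estimate of how many distinct outputs a generator has.

    Tries batch sizes K = 2, 3, ... and returns K squared for the first K
    at which more than half of the sampled batches contain a duplicate.
    Returns None if no batch size up to `max_k` reaches that rate.
    """
    for k in range(2, max_k + 1):
        hits = sum(has_duplicate(sampler(k)) for _ in range(trials))
        if hits / trials > 0.5:
            return k * k
    return None


def make_toy_generator(support: int, seed: int = 0) -> Callable[[int], List[int]]:
    """Toy stand-in for a GAN: draws uniformly from `support` distinct outputs."""
    rng = random.Random(seed)

    def sampler(k: int) -> List[int]:
        return [rng.randrange(support) for _ in range(k)]

    return sampler


if __name__ == "__main__":
    hidden_support = 400  # unknown to the test in a real setting
    estimate = estimate_support(make_toy_generator(hidden_support))
    print("hidden support:", hidden_support, "estimate:", estimate)
```

The K² rule gives an order-of-magnitude estimate rather than an exact count. For a uniform generator, the 50% point falls somewhat above √N, so the estimate tends to land a little above the true number of distinct outputs.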
With that test in hand, researchers have shown that images generated by famous GANs do not generalize beyond what the training data provides. Now, what is left to prove is whether GANs can be improved to generalize beyond the training data or if there are ways of generalizing beyond the original images by using other methods to improve the training dataset.
GANs and Bad Intent
There are concerns that people can use the GAN technique with ill intent.* With so much attention on fake media, we could face an even broader range of attacks with fake data. “The concern is that these methods will rise to the point where it becomes very difficult to discern truth from falsity,” said Tim Hwang, who previously oversaw AI policy at Google and is now director of the Ethics and Governance of Artificial Intelligence Fund, an organization supporting ethical AI research. “You might believe that accelerates problems we already have.”*
Even though this technique cannot create still images of high quality, researchers believe that the same technology could produce videos, games, and virtual reality. The work to start generating videos has already begun. Researchers are also using a wide range of other machine learning methods to generate faux data. In August 2017, a group of researchers at the University of Washington made headlines when they built a system that could put words in Barack Obama’s mouth in a video. An app using this technique is already on Apple’s App Store.* The results were not completely convincing, but the rapid progress in GANs and other techniques points to a future where it becomes tough for people to differentiate between real videos and generated ones. Some researchers claim that GANs are just another tool that, like others, can be used for good or evil, and that there will be more technology to figure out whether newly created videos and images are real.
Not only that, but researchers have uncovered ways of using GANs to generate audio that sounds like one thing to humans but something else to machines.* For example, you can develop audio that sounds to humans like “Hi, how are you?” and to machines, “Alexa, buy me a drink.” Or, audio that sounds like a Bach symphony to a human, but for the machine, it sounds like “Alexa, go to this website.”
The future has unlimited potential. Digital media may surpass analog media by the end of this decade. We are starting to see examples of this with companies like Synthesia.* Another example is Imma, an Instagram model that is completely generated by computers and has around 350k followers.* It won’t be surprising to see more and more digital media in the world as the cost of creating such content goes down and as creators can make it into anything they want.