StyleGAN: What’s going to happen when computers start to imagine?

Ashenkalana
6 min read · Oct 12, 2020

Would you believe me if I said that all these portraits are computer-generated? In other words, these people do not exist in real life! This is the ‘imagination’ of a computer.

I know, it is cool and haunting at the same time when you think about how much computers, and machine learning in particular, have evolved over the years. This ‘miracle’ is possible thanks to a relatively new technology called the generative adversarial network (GAN).

What is machine learning?

Machine learning is the use of algorithms and statistical models, such as neural networks, that let a computer system improve its performance on a task by learning from data. This is not a new technology. The roots of machine learning date back to 1949, when Donald Hebb described a model of brain cell interaction in his book ‘The Organization of Behavior’.

But in recent years machine learning has become more useful and relevant than ever, for three main reasons: data is more accessible than ever, computers keep getting more powerful, and machine learning algorithms have improved.

To talk about StyleGAN, we first need to know two types of learning used in machine learning.

  1. Supervised learning

When you watch a video on YouTube, it suggests related videos, and when you watch several movies on Netflix, your suggestions are drawn from what other people who watched the same movies went on to watch. This kind of learning, where a model finds patterns in categorized/labeled data, is supervised learning.

  2. Unsupervised learning

Here the computer looks for similarities in unlabeled data and arranges the data into groups. In 2012, Google researchers built a neural network that learned the concept of a cat just by ‘looking’ through millions of unlabeled images, without being given any labels or instructions.
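To make the difference concrete, here is a minimal sketch on a handful of made-up numbers. The function names, the data, and the "short"/"long" labels are all hypothetical. The first part learns from labels (a nearest-centroid classifier), while the second discovers two groups without any labels (a tiny two-cluster k-means).

```python
def nearest_centroid_fit(points, labels):
    """Supervised: the labels tell us which group each point belongs to."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Pick the label whose learned centre is closest to x."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def two_means(points, iterations=10):
    """Unsupervised: no labels, the two groups are discovered from the data."""
    lo, hi = min(points), max(points)  # initial guesses for the two centres
    group_lo, group_hi = list(points), []
    for _ in range(iterations):
        group_lo = [x for x in points if abs(x - lo) <= abs(x - hi)]
        group_hi = [x for x in points if abs(x - lo) > abs(x - hi)]
        if group_lo:
            lo = sum(group_lo) / len(group_lo)
        if group_hi:
            hi = sum(group_hi) / len(group_hi)
    return sorted(group_lo), sorted(group_hi)


# Made-up data: video lengths in minutes, with hypothetical labels.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["short", "short", "short", "long", "long", "long"]

centroids = nearest_centroid_fit(points, labels)
print(nearest_centroid_predict(centroids, 4.5))  # uses the labels it was given
print(two_means(points))                         # never sees the labels
```

Both parts end up with the same two groups here, but only the supervised one knows what the groups are called.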

A GAN is a combination of these two. The incoming data for a GAN is unlabeled, but the GAN sets up a supervised learning problem to deal with it: it produces fake data and then tries to determine whether a given sample is fake or real.

Generative Adversarial Network (GAN)

Simply put, GANs function like a game. There are two neural networks, a Generator and a Discriminator, and an unlabeled set of data. The Generator tries to produce data that looks like the real thing, and the Discriminator’s job is to determine whether the data it receives is real or fake. Each network trains the other. In the beginning, the Generator produces poor fakes, and the Discriminator can quickly dismiss them. But if the training goes well, the Generator starts producing data that looks more and more like the real data, until the Discriminator can no longer distinguish fake from real. In other words, when trained on portraits, the computer starts generating realistic human images.
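The game described above can be sketched in a few lines of plain Python. This is only a toy under my own assumptions, not how StyleGAN is built: the "real data" is just numbers drawn around 4.0, the Generator is a straight line `a * z + b`, the Discriminator is a single logistic unit, and every name and setting here is hypothetical.

```python
import math
import random


def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to keep math.exp from overflowing
    return 1.0 / (1.0 + math.exp(-x))


def train_toy_gan(steps=2000, batch=64, lr=0.01,
                  real_mean=4.0, real_std=0.5, seed=0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # Generator: fake = a * z + b
    w, c = 0.0, 0.0  # Discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(real_mean, real_std) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: minimise -log D(real) - log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            grad_w += -(1.0 - d) * x
            grad_c += -(1.0 - d)
        for x in fake:
            d = sigmoid(w * x + c)
            grad_w += d * x
            grad_c += d
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: minimise -log D(fake), i.e. try to fool D.
        grad_a = grad_b = 0.0
        for z in noise:
            d = sigmoid(w * (a * z + b) + c)
            d_loss_d_fake = -(1.0 - d) * w
            grad_a += d_loss_d_fake * z
            grad_b += d_loss_d_fake
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
    return (a, b), (w, c)


generator, discriminator = train_toy_gan()
print("generator offset b:", generator[1], "(real data is centred on 4.0)")
```

If the training goes well, the Generator’s offset `b` should drift from 0 toward the real mean, although this kind of adversarial training can oscillate rather than settle.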

StyleGAN and Nvidia

If you are a gamer, you already know what Nvidia is. It’s a company that designs graphics processing units (GPUs) for the gaming and professional markets, as well as chips for smartphones and automobiles. In other words, it is one of the dominant players in the global GPU market. Nvidia has a whole unit devoted to research. It covers algorithms and numerical methods, applied research, circuits and VLSI design, computational photography and imaging, computer architecture, computer graphics, computer vision, display technology, high-performance computing, and many other areas. One of these areas is, of course, machine learning and artificial intelligence.

As I said before, a conventional GAN tries to replicate existing unlabeled data without any instructions given. But Nvidia researchers Tero Karras, Samuli Laine, and Timo Aila altered this mechanism and developed a new GAN, StyleGAN, that can extract specific characteristics from different photos and produce a brand new photo that is a blend/fusion of those characteristics.

The generator made by Nvidia treats an image as a collection of styles: coarse, middle, and fine. Each of them shapes the output image at a different level of detail.

  • Coarse styles — Pose, Hair, Face shape
  • Middle styles — Facial features, Eyes
  • Fine styles — Color scheme
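Blending photos at these three levels can be sketched as picking which generator layers take their style from which source photo. The layer ranges below follow my reading of the StyleGAN paper's 18-layer, 1024x1024 generator. The function name and the placeholder "A"/"B" styles are hypothetical, and a real model would use style vectors instead of strings.

```python
# Assumed layer ranges for an 18-layer generator (two layers per resolution).
STYLE_RANGES = {
    "coarse": range(0, 4),   # 4x4 to 8x8: pose, hair, face shape
    "middle": range(4, 8),   # 16x16 to 32x32: facial features, eyes
    "fine": range(8, 18),    # 64x64 to 1024x1024: color scheme
}


def mix_styles(styles_a, styles_b, level):
    """Keep photo A's per-layer styles, but take one level from photo B."""
    if len(styles_a) != len(styles_b):
        raise ValueError("both sources need one style per layer")
    mixed = list(styles_a)
    for layer in STYLE_RANGES[level]:
        mixed[layer] = styles_b[layer]
    return mixed


styles_a = ["A"] * 18  # placeholder styles for photo A
styles_b = ["B"] * 18  # placeholder styles for photo B
print(mix_styles(styles_a, styles_b, "coarse"))
```

Taking only the coarse layers from photo B would, in this picture, give a face with B's pose and face shape but A's facial features and colors.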

StyleGAN 2

StyleGAN proved to be an excellent way of producing high-resolution images, but it had its defects: some of the images it produced showed artifacts. Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila did further research on the subject and developed StyleGAN 2. Some of the problems with the original StyleGAN were:

  • Droplet artifacts

Some of the images showed a blob-shaped artifact resembling a water droplet. The researchers redesigned the normalization used in the generator, which removed this artifact.

  • Phase artifacts

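Details like teeth and eyes showed a strong preference for certain locations: the generator kept fixating on those positions instead of letting the details move smoothly with the rest of the face. The researchers traced this to progressive growing, and StyleGAN 2 proposed an alternative design that retains the benefits of progressive growing without its drawbacks.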

In addition to these major changes, StyleGAN 2 produced images faster, and the quality of the images was significantly better.
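As I understand the StyleGAN 2 paper, the redesigned normalization behind the droplet fix scales the convolution weights themselves instead of normalizing the feature maps, a step the authors call weight demodulation. Here is a rough sketch of that idea on plain nested lists. The function name and the tiny example weights are hypothetical, and a real implementation would work on tensors.

```python
import math


def modulate_demodulate(weights, styles, eps=1e-8):
    """weights[o][i][k]: output channel o, input channel i, kernel tap k.
    styles[i]: per-input-channel scale that comes from the style."""
    result = []
    for out_filter in weights:
        # Modulate: scale every input channel by its style.
        scaled = [[styles[i] * w for w in taps]
                  for i, taps in enumerate(out_filter)]
        # Demodulate: rescale so each output channel's weights have unit norm.
        norm = math.sqrt(sum(w * w for taps in scaled for w in taps) + eps)
        result.append([[w / norm for w in taps] for taps in scaled])
    return result


# Made-up weights: 2 output channels, 2 input channels, 2 kernel taps each.
weights = [[[1.0, 2.0], [3.0, 4.0]],
           [[0.5, 0.5], [0.5, 0.5]]]
styles = [2.0, 0.5]
print(modulate_demodulate(weights, styles))
```

Because the scaling happens on the weights, no single feature map has its statistics rewritten after the fact, which is the behavior the paper links to the droplet artifact.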

For what?

Now you might be wondering about the applications of this. Producing human-like images is fun, but is there any real use for it? Not much, for the most part, but something I did not mention is that GANs are not all about photos. A GAN can generate other types of data too; it just needs good training. The ability to create fake data that is so similar to the real thing has a lot of potential.

GANs’ ability to create photos could be used by police to create portraits of missing people. Scientists are trying to develop GANs to the point where they can produce photos from a text or voice description. This could be used to design buildings and new clothing, and even to produce wiring or plumbing drawings for a house (3D modeling). Some researchers are trying to use this technology to create more detailed and realistic computer games.

Regarding healthcare, GANs can be used to identify anomalies in lab results, which could lead to better and quicker diagnoses. GANs are also being explored for analyzing how medications can be altered and combined, including the order of mixing, to treat previously incurable conditions.

There is ongoing research into using GANs to carry out complex organic chemistry conversions, which could lead to novel drug discoveries and help identify compounds that are worth further research.

Autonomous driving is another interesting application of GANs. This paper shows the potential of self-driving cars developed using GANs; the authors test their algorithms using the GTA V computer game as the simulation environment.

Obstacles for developing GANs

A complex technology like this takes a lot of work, time, and brainpower to develop, and it also requires a large budget. So the introduction of business applications is crucial for the future of GANs.

Ian Goodfellow, the inventor of GAN. Also known as the man who’s given machines the gift of imagination.
