I have been experimenting with various machine learning technologies recently, and I was fascinated by generative AI, especially Generative Adversarial Networks (GANs). The more I explored this technology, the more I understood its potential to transform many industries, especially health, art, and forensics. For example, researchers have trained GANs on digital breast tomosynthesis images of only healthy patients so that the model can identify abnormalities and help detect cancer. Research like this is capable of revolutionizing the medical industry.
A Generative Adversarial Network is a machine learning framework, designed by Ian Goodfellow and his colleagues, that makes use of both supervised and unsupervised learning techniques.

GANs consist of two parts, a Generator (unsupervised) and a Discriminator (supervised), that compete with each other as if they were playing a game. The Generator creates images from scratch, while the Discriminator judges whether an image is real or fake based on a dataset provided to it. Whenever the Generator is caught, it improves and creates better-looking images so that it can fool the Discriminator. Similarly, the Discriminator improves whenever it keeps getting images wrong, making its identification process more accurate. The goal is for the Generator to create images so realistic that the Discriminator can no longer tell they are fake. One way of building these models is with Convolutional Neural Networks (CNNs). A CNN is a type of neural network that makes predictions by sliding small filters over its input to pick out patterns. For example, if the network takes a picture of a dog as input, one filter might respond only to the legs. CNNs are used in both the Generator and the Discriminator to identify patterns, leading to a better, more accurate model.
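
To make the idea of a filter more concrete, here is a minimal sketch in plain Python with no libraries. The function name `apply_filter`, the tiny 4x4 image, and the hand-written edge filter are all made up for illustration. A real CNN learns its filter values during training instead of having them written by hand.

```python
def apply_filter(image, kernel):
    """Slide a small kernel over a 2D image and record how strongly each patch matches."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the patch under the kernel element by element and add it up.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical grayscale image: dark (0) on the left, bright (9) on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written filter that responds to a dark-to-bright vertical edge.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = apply_filter(image, edge_kernel)
print(feature_map)  # [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
```

The output is large only in the middle column, where the dark half meets the bright half. That is the sense in which a filter "only looks at" one kind of feature.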

Generative Adversarial Networks are heavily used in the field of computer vision, which covers any form of artificial intelligence that learns from images or videos.


A simple example of computer vision, and the kind of task a GAN’s discriminator performs, is image identification. This is when an image is imported and, based on its characteristics, the program determines what the image is.

I worked on a project utilizing a more advanced version of GANs called StyleGAN2, which I will talk about in my next article. StyleGAN2 can be trained with a technique known as Differentiable Augmentation (DiffAug). This is the idea that individual images can be adjusted during training to keep the model’s perspective diverse. For example, instead of always showing the model a single version of an image, DiffAug will change it a little bit. It might shift the image, cut out a few parts, or alter its colors.

This all serves to force the discriminator and generator to keep on learning and to master many different elements of an image rather than memorizing the dataset, which leads to a significantly higher-performing network.
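
As a rough illustration of "changing an image a little bit", here is a minimal sketch of two simple changes, a horizontal shift and a cutout, in plain Python. The function names and the 4x4 image are hypothetical, and this is not the DiffAug implementation itself. As far as I understand the real method, it applies changes like these on the fly to both real and generated images, in a way that gradients can still flow through, which this sketch does not attempt.

```python
import random


def shift_right(image, pixels):
    """Shift every row to the right by `pixels`, filling the gap with 0."""
    width = len(image[0])
    return [[0] * pixels + row[:width - pixels] for row in image]


def cutout(image, top, left, size):
    """Return a copy with a size x size square set to 0 (clipped at the borders)."""
    result = [list(row) for row in image]
    for r in range(top, min(top + size, len(image))):
        for c in range(left, min(left + size, len(image[0]))):
            result[r][c] = 0
    return result


def random_augment(image, rng):
    """Apply a small random shift and a random cutout; the original stays untouched."""
    shifted = shift_right(image, rng.randint(0, 1))
    top = rng.randrange(len(image))
    left = rng.randrange(len(image[0]))
    return cutout(shifted, top, left, size=2)


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
variants = [random_augment(image, rng) for _ in range(3)]
for variant in variants:
    print(variant)
```

Each call produces a slightly different version of the same picture, so the model never gets to see exactly the same input over and over.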
Libraries
A key part of writing ML algorithms is using libraries that simplify code and make it much more efficient. Libraries commonly used by ML programmers include TensorFlow (TF) and Keras. What is TensorFlow? TF is a library that lets programmers zoom out from the low-level details of a neural network and work with larger building blocks instead.

Neural networks are made up of many connected units called nodes. Without TF, machine learning practitioners would have to write and manage large mathematical computations by hand, node by node. TensorFlow lets them describe the network at a higher level so they can program more efficiently.
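
To give a feel for the math that a library takes off your hands, here is a minimal sketch of one small layer of nodes computed by hand in plain Python. The function name `dense_layer` and all of the numbers are made up for illustration. Libraries like TF and Keras wrap this kind of arithmetic into ready-made layers and run it far more efficiently.

```python
import math


def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """Compute one layer by hand: each node is a weighted sum plus a bias, then sigmoid."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(weighted_sum))
    return outputs


inputs = [0.5, -1.0, 2.0]      # three input values
weights = [
    [0.1, 0.2, 0.3],           # weights of node 1
    [0.0, 0.0, 0.0],           # weights of node 2 (ignores its inputs)
]
biases = [0.0, 0.0]

activations = dense_layer(inputs, weights, biases)
print(activations)
```

This is only two nodes with three inputs each. A real image model has millions of these weights, which is why nobody wants to manage them one at a time.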
The Generator
It is a common misconception that generators in GANs can immediately produce high-quality, realistic images. In reality, during the early stages of training, generators often produce unidentifiable or low-quality images. This is because the generator lacks information and understanding of what is expected of it. In the beginning, a generator can be compared to a baby, as it is new to the world and still learning what to do.
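
To see why the first outputs look like nothing in particular, here is a minimal sketch of an untrained generator in plain Python. The names `make_generator` and `generate`, the 8-number noise vector, and the 4x4 image size are assumptions chosen for illustration. Real generators are much deeper networks, but they start out in the same state: random weights.

```python
import math
import random


def make_generator(noise_size, image_side, rng):
    """Build a one-layer generator with random (untrained) weights."""
    n_pixels = image_side * image_side
    weights = [
        [rng.gauss(0, 1) for _ in range(noise_size)]
        for _ in range(n_pixels)
    ]

    def generate(noise):
        pixels = []
        for pixel_weights in weights:
            # Each pixel is a weighted sum of the noise, squashed to a 0..1 brightness.
            total = sum(w * z for w, z in zip(pixel_weights, noise))
            pixels.append(1.0 / (1.0 + math.exp(-total)))
        # Reshape the flat list of pixels into rows of the image.
        return [pixels[i * image_side:(i + 1) * image_side] for i in range(image_side)]

    return generate


rng = random.Random(42)  # fixed seed so the sketch is repeatable
generate = make_generator(noise_size=8, image_side=4, rng=rng)

noise = [rng.gauss(0, 1) for _ in range(8)]
fake_image = generate(noise)
for row in fake_image:
    print([round(p, 2) for p in row])
```

The result is a grid of arbitrary brightness values. Training is the process of nudging those random weights until the grid starts to resemble the real images.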
The role of the generator in a GAN is to create images that trick the discriminator into believing they are real. If the discriminator correctly identifies an image as fake, it indicates that the generator must improve in order to deceive the discriminator. If a GAN were a single-part system, it wouldn’t be very effective. This brings us to…
The Discriminator
The discriminator’s function is to determine whether a given image is real or generated by the generator. It is a supervised learning model that is trained on a dataset of real images. When given an image produced by the generator, the discriminator must classify it as either real or fake. If the model makes an incorrect assumption, it can learn from this mistake and adjust its algorithms accordingly.
The way the discriminator trains is very different from the way the generator trains. The discriminator is given a dataset of real images and also receives the images that the generator creates. The discriminator then has to identify each image as real or fake. If the model fails, it learns from its mistake and adjusts its identification algorithm. This process continues iteratively, with the generator and discriminator continually improving their abilities to outdo one another. The goal is for the generator to produce images that are indistinguishable from real ones. Now we know a GAN is a two-part system. What connects them, though?
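
Before looking at what connects the two models, here is the back-and-forth itself as a minimal sketch in plain Python. To keep it short, the "images" are single numbers: real data is clustered around 4, the generator is `a * z + b`, and the discriminator is a single node `sigmoid(w * x + c)`. All of these names and values are assumptions for illustration, and the update rules are ordinary gradient descent on a standard cross-entropy objective, not the method of any particular GAN library.

```python
import math
import random


def sigmoid(x):
    """Numerically safe sigmoid: the discriminator's 'probability this is real'."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_step(p, x_real, z, lr=0.05):
    """One round of the game. p holds generator (a, b) and discriminator (w, c) values."""
    a, b, w, c = p["a"], p["b"], p["w"], p["c"]

    # Discriminator turn: score one real sample and one generated sample.
    x_fake = a * z + b
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    # Gradients of -log(d_real) - log(1 - d_fake) with respect to w and c.
    grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
    grad_c = (d_real - 1.0) + d_fake
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator turn: try to make the updated discriminator say "real".
    d_fake = sigmoid(w * x_fake + c)
    # Gradients of -log(d_fake) with respect to a and b.
    grad_a = (d_fake - 1.0) * w * z
    grad_b = (d_fake - 1.0) * w
    a -= lr * grad_a
    b -= lr * grad_b

    return {"a": a, "b": b, "w": w, "c": c}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
params = {"a": 1.0, "b": 0.0, "w": 0.1, "c": 0.0}
for _ in range(500):
    x_real = rng.gauss(4.0, 0.5)  # "real data": numbers clustered around 4
    z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
    params = train_step(params, x_real, z)
print(params)
```

Each round has two turns: the discriminator adjusts itself using one real and one fake sample, then the generator adjusts itself using the discriminator's verdict. Even in a toy like this the two can keep chasing each other instead of settling, which is one reason real GANs are known to be tricky to train.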
Loss
Loss can be thought of as a penalty for an incorrect prediction. There are two types of loss in a GAN: discriminator loss and generator loss.

Loss is a single number that can go up or down. The direction it goes depends on the validity of the output that one of the systems, generator or discriminator, just produced. For example, if the discriminator predicts that an image from the generator is real, the discriminator’s loss goes up. However, its loss will go down if it successfully identifies an image as real or fake. For the generator it is the opposite: its loss goes down if its image fools the discriminator, and goes up if the image is exposed as fake. The higher the number, the worse the model is doing. Essentially, loss tells each model whether its current tactic is successful or whether it needs to try something new.
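
Here is a minimal sketch of the two losses using binary cross-entropy, a common choice for GANs, in plain Python. The function names and the example scores are assumptions for illustration. `d_real` and `d_fake` stand for the discriminator's "probability this is real" for a real image and a generated image. Note that in this formulation the generator's loss is low when it fools the discriminator and high when it is caught.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Low when real images score near 1 and generated images score near 0."""
    return -math.log(d_real) - math.log(1.0 - d_fake)


def generator_loss(d_fake):
    """Low when the discriminator scores the generated image near 1 (it was fooled)."""
    return -math.log(d_fake)


# Discriminator is right about both images.
disc_when_right = discriminator_loss(d_real=0.9, d_fake=0.1)
# Discriminator is fooled: it thinks the generated image is probably real.
disc_when_fooled = discriminator_loss(d_real=0.9, d_fake=0.8)

# The same two situations from the generator's point of view.
gen_when_caught = generator_loss(d_fake=0.1)
gen_when_fooling = generator_loss(d_fake=0.8)

print(disc_when_right < disc_when_fooled)  # True
print(gen_when_fooling < gen_when_caught)  # True
```

The two losses pull in opposite directions on the same score `d_fake`, which is what makes the setup adversarial.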
Uses of GANs
My GAN, when compared to the work being done by researchers around the world, can be seen as basic. Here are some of the ways GANs are being used today!
Image-to-image translations

Text-to-image translations

Photograph Editing

Photo Blending

3D Object Generation
Automobile Industry
ChatGPT

“ChatGPT is a variant of the GPT (Generative Pre-training Transformer) language model that has been specifically designed for generating human-like text in a conversational context. ChatGPT is trained on a large dataset of human conversations and is able to generate responses to prompts in a way that is similar to how a human might respond. It is intended to be used as a conversational chatbot or as a tool for generating responses to questions in a chat or messaging context. ChatGPT can be fine-tuned for specific tasks or domains, such as customer service or technical support, to make its responses more relevant and accurate for those contexts.”
– Written by ChatGPT
Conclusion
As you can see, Generative Adversarial Networks (GANs) are a revolutionary and highly modular tool in the field of artificial intelligence. Their ability to generate synthetic data that closely resembles real data gives them the potential to transform a wide range of industries and applications, from computer graphics and data augmentation to text and audio generation. With the ability to continuously learn and adapt to new data, GANs have the potential to shape the future of machine learning and the way we interact with technology. As the use of GANs continues to grow and evolve, it is clear that they will have a lasting impact on the way we live and work in the digital age.
