GANs in Action
Jakub Langr and Vladimir Bok
  • MEAP began July 2018
  • Publication in September 2019 (estimated)
  • ISBN 9781617295560
  • 276 pages (estimated)
  • printed in black & white

GANs in Action strikes that rare balance between an applied programming book, an academic book heavy on theory, and a conversational blog post on machine learning techniques.

Dr. Erik Sapper, California Polytechnic State University
Deep learning systems have gotten really great at identifying patterns in text, images, and video. But applications that create realistic images, natural sentences and paragraphs, or native-quality translations have proven elusive. Generative Adversarial Networks, or GANs, offer a promising solution to these challenges by pairing two competing neural networks—one that generates content and the other that rejects samples that are of poor quality.
Table of Contents

1 Introduction to GANs

1.1 What Are Generative Adversarial Networks?

1.2 How do GANs Work?

1.3 GANs in Action

1.3.1 GAN Training

1.3.2 GAN Training Visualized

1.3.3 Reaching Equilibrium

1.4 Why Study GANs?

1.5 Summary

2 Intro to Generative Modeling with Autoencoders

2.1 Introduction to Generative Modeling

2.2 How do encoders function on a high level?

2.3 So what are autoencoders to GANs?

2.4 What is an autoencoder made of?

2.5 Usage of autoencoders

2.6 Unsupervised learning

2.6.1 New take on an old idea

2.6.2 Variational autoencoder (VAE)

2.7 Code is life

2.8 Why did we try a GAN?

2.9 Summary

3 Your First GAN: Generating Handwritten Digits

3.1 Foundations of GANs: Adversarial Training

3.1.1 Cost Functions

3.1.2 Training Process

3.2 The Generator and the Discriminator

3.2.1 Conflicting Objectives

3.2.2 Confusion Matrix

3.3 GAN Training Algorithm

3.4 Tutorial: Generating Handwritten Digits

3.4.1 Import Statements

3.4.2 The Generator

3.4.3 The Discriminator

3.4.4 Build the Model

3.4.5 Training

3.4.6 Outputting Sample Images

3.4.7 Run the Model

3.4.8 Inspect the Results

3.5 Conclusion

3.6 Summary

4 Deep Convolutional GAN (DCGAN)

4.1 Convolutional Neural Networks (ConvNets)

4.1.1 Convolutional Filters

4.1.2 Parameter Sharing

4.1.3 ConvNets Visualized

4.1.4 ConvNets in Depth

4.2 Brief History of the DCGAN

4.3 Batch Normalization

4.4 Tutorial: Generating Handwritten Digits with DCGAN

4.4.1 Import Statements

4.4.2 The Generator

4.4.3 The Discriminator

4.4.4 Build & Run the DCGAN

4.4.5 Model Output

4.5 Conclusion

4.6 Summary

5 Training & Common Challenges: GANing for Success

5.1 Evaluation

5.1.1 Inception Score

5.1.2 Fréchet Inception Distance

5.2 Training challenges

5.2.1 Network depth

5.2.2 Game set-ups

5.2.3 Min-Max GAN (MM-GAN)

5.2.4 Non-Saturating GAN (NS-GAN)

5.3 Summary of game setups

5.4 Training hacks

5.4.1 Normalizations of inputs

5.4.2 Batch Normalization

5.4.3 Gradient penalties

5.4.4 Train Discriminator more

5.4.5 Avoid sparse gradients

5.4.6 Soft and noisy labels

5.5 Summary

6 Progressing with GANs

6.1 Latent space interpolation

6.1.1 They grow up so fast

6.1.2 Progressive Growing & Smoothing in of Higher Resolution Layers

6.1.3 Example implementation

6.1.4 Minibatch Standard Deviation

6.1.5 Equalized Learning Rate

6.1.6 Pixel-wise Feature Normalization in the Generator

6.2 Summary of key innovations

6.3 TensorFlow Hub and hands-on

6.4 Practical Applications

6.5 Summary

7 Semi-Supervised GAN

7.1 Semi-Supervised GAN (SGAN)

7.1.1 What is Semi-Supervised GAN?

7.2 Tutorial: Implementing Semi-Supervised GAN

7.2.1 Architecture Diagram

7.2.2 Implementation

7.2.3 Setup

7.2.4 The Dataset

7.2.5 The Generator

7.2.6 The Discriminator

7.2.7 Build the Model

7.2.8 Training

7.2.9 Train the Model

7.2.10 Model Training and Test Accuracy

7.3 Comparison to a Fully-Supervised Classifier

7.4 Conclusion

7.5 Summary

8 Conditional GAN

8.1 Motivation

8.2 What is Conditional GAN?

8.2.1 CGAN Generator

8.2.2 CGAN Discriminator

8.2.3 Summary Table

8.2.4 Architecture Diagram

8.3 Tutorial: Implementing Conditional GAN

8.3.1 Implementation

8.3.2 Setup

8.3.3 The Generator

8.3.4 The Discriminator

8.3.5 Build the Model

8.3.6 Training

8.3.7 Outputting Sample Images

8.3.8 Train the Model

8.3.9 Inspecting the Output: Targeted Data Generation

8.4 Conclusion

8.5 Summary

9 CycleGAN

9.1 Image to Image Translation

9.2 Cycle Consistency Loss: There and Back aGAN

9.3 Adversarial Loss

9.4 Identity Loss

9.5 Architecture

9.6 CycleGAN architecture: building the network

9.7 Generator architecture

9.8 Discriminator architecture

9.9 Object Oriented Design of GANs

9.10 Tutorial: CycleGAN

9.10.1 Building the network

9.10.2 Building the Discriminator

9.10.3 Running CycleGAN

9.10.4 Expansions, augmentations and applications

9.11 Applications

9.12 Summary

10 Adversarial Examples

10.1 Context of Adversarial Examples

10.2 Lies, Damned Lies and Distributions

10.3 Use and abuse of training

10.4 Signal and the noise

10.5 Not all hope is lost

10.5.1 Adversaries to GANs

10.6 Conclusion

10.7 Summary

11 Practical Applications of GANs

11.1 GANs in Medicine

11.1.1 Using GANs to Improve Diagnostic Accuracy

11.1.2 Methodology

11.1.3 Results

11.2 GANs in Fashion

11.2.1 Using GANs to Design Fashion

11.2.2 Methodology

11.2.3 Creating New Items Matching Individual Preferences

11.2.4 Adjusting Existing Items to Better Match Individual Preferences

11.3 Conclusion

11.4 Summary

12 Looking Ahead

12.1 Ethics

12.2 GAN Innovations

12.3 Relativistic GAN (RGAN)

12.3.1 Application

12.4 Self-Attention GAN (SAGAN)

12.4.1 Application

12.5 BigGAN

12.5.1 Application

12.6 Further reading

12.7 Looking Back & Closing Thoughts

12.8 Summary


Appendix A: Technical/deployments

About the Technology

GANs have already achieved remarkable results that were long thought impossible for artificial systems, such as generating realistic faces, turning a scribble into a photograph-like image, or turning video footage of a horse into a running zebra. Most importantly, GANs learn quickly without the need for vast troves of painstakingly labeled training data.

Invented by Google’s Ian Goodfellow in 2014, Generative Adversarial Networks (GANs) are one of the most important innovations in deep learning. In GANs, one neural network (the generator) generates content, such as images or sentences, and another (the discriminator) determines whether each sample comes from the generator, and is therefore “fake,” or from the training dataset, and is therefore “real.” In the interplay between the two systems, the generator creates more realistic output as it attempts to fool the discriminator into believing the “fakes” are real. The result is a generator that can produce photorealistic images or natural text and speech, and a well-trained discriminator that can precisely identify and categorize that type of content.
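
As a rough illustration of the interplay described above, the following sketch computes the two competing objectives from a handful of discriminator scores. It is not code from the book: the function names are hypothetical, the scores are made-up numbers rather than outputs of a trained network, and the generator objective is assumed to be the non-saturating variant (listed as NS-GAN in chapter 5).

```python
import math


def bce(p, label, eps=1e-12):
    """Binary cross-entropy for one discriminator score p in (0, 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def mean(values):
    return sum(values) / len(values)


def discriminator_loss(scores_real, scores_fake):
    """The discriminator wants real samples scored 1 and generated ones scored 0."""
    real_term = mean([bce(p, 1.0) for p in scores_real])
    fake_term = mean([bce(p, 0.0) for p in scores_fake])
    return real_term + fake_term


def generator_loss(scores_fake):
    """Assumed non-saturating objective: the generator wants its fakes scored 1."""
    return mean([bce(p, 1.0) for p in scores_fake])


if __name__ == "__main__":
    # Hypothetical scores: probability the discriminator assigns to "real".
    scores_real = [0.9, 0.8, 0.95]
    scores_fake = [0.2, 0.1, 0.3]
    print("discriminator loss:", discriminator_loss(scores_real, scores_fake))
    print("generator loss:    ", generator_loss(scores_fake))
```

The objectives conflict by construction: scores on generated samples that lower the discriminator's loss raise the generator's loss, and vice versa. In a real GAN each network would be updated by gradient descent on its own loss in alternating steps.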

About the book

GANs in Action: Deep learning with Generative Adversarial Networks teaches you how to build and train your own generative adversarial networks. First, you’ll get an introduction to generative modeling and how GANs work, along with an overview of their potential uses. Then, you’ll start building your own simple adversarial system as you explore the foundation of GAN architecture: the generator and discriminator networks.

As you work through the book’s captivating examples and detailed illustrations, you’ll learn to train different GAN architectures for different scenarios. You’ll explore generating high-resolution images, image-to-image translation, targeted data generation, and adversarial learning as you grow your system to be smart, effective, and fast.

What's inside

  • Understanding GANs and their potential
  • Hands-on code tutorials to build GAN models
  • Common challenges for your GANs
  • Advanced GAN architectures and techniques like Cycle-Consistent Adversarial Networks
  • Handling the progressive growing of GANs
  • Practical applications of GANs

About the reader

Written for data scientists and data analysts with intermediate Python knowledge. Knowing the basics of deep learning will also be helpful.

About the authors

Jakub Langr graduated from Oxford University where he also taught at OU Computing Services. He has worked in data science since 2013, most recently as a data science Tech Lead at and as a data science consultant at Mudano. Jakub also designed and teaches Data Science courses at the University of Birmingham and is a fellow of the Royal Statistical Society.

Vladimir Bok is a Senior Product Manager at Intent Media, a data science company for leading travel sites, where he helps oversee the company’s machine learning research and infrastructure teams. Prior to that, he was a Program Manager at Microsoft. Vladimir graduated cum laude with a degree in Computer Science from Harvard University. He has worked as a software engineer at early-stage FinTech companies, including one founded by PayPal co-founder Max Levchin, and as a Data Scientist at a Y Combinator startup.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
MEAP combo
$35.00 $49.99 pBook + eBook + liveBook
MEAP eBook
$25.00 $39.99 pdf + ePub + kindle + liveBook
