Deep Learning and the Game of Go
Max Pumperla and Kevin Ferguson
Foreword by Thore Graepel
  • January 2019
  • ISBN 9781617295324
  • 384 pages
  • printed in black & white

Using the game of Go as a way to teach machine learning is inspired and inspiring. Highly recommended!

Burk Hufnagel, Daugherty Business Solutions

Deep Learning and the Game of Go teaches you how to apply the power of deep learning to complex reasoning tasks by building a Go-playing AI. After exposing you to the foundations of machine and deep learning, you'll use Python to build a bot and then teach it the rules of the game.

About the Technology

The ancient strategy game of Go is an incredible case study for AI. In 2016, a deep learning–based system shocked the Go world by defeating a world champion. Shortly after that, the upgraded AlphaGo Zero crushed the original bot by using deep reinforcement learning to master the game. Now, you can learn those same deep learning techniques by building your own Go bot!

About the book

Deep Learning and the Game of Go introduces deep learning by teaching you to build a Go-winning bot. As you progress, you’ll apply increasingly complex training techniques and strategies using the Python deep learning library Keras. You’ll enjoy watching your bot master the game of Go, and along the way, you’ll discover how to apply your new deep learning skills to a wide range of other scenarios!

Table of Contents

Part 1: Foundations

1 Toward deep learning: a machine-learning introduction

1.1 What is machine learning?

1.1.1 How does machine learning relate to AI?

1.1.2 What you can and cannot do with machine learning

1.2 Machine learning by example

1.2.1 Using machine learning in software applications

1.2.2 Supervised learning

1.2.3 Unsupervised learning

1.2.4 Reinforcement learning

1.3 Deep learning

1.4 What you will learn in this book

1.5 Summary

2 Go as a machine-learning problem

2.1 Why games?

2.2 A lightning introduction to the game of Go

2.2.1 The board

2.2.2 Placing and capturing stones

2.2.3 Ending the game and counting

2.2.4 Ko

2.3 Handicaps

2.4 Where to learn more

2.5 What can we teach a machine?

2.5.1 Selecting moves in the opening

2.5.2 Searching game states

2.5.3 Reducing the number of moves to consider

2.5.4 Evaluating game states

2.6 How to measure our Go AI’s strength

2.6.1 Traditional Go ranks

2.6.2 Benchmarking our Go AI

2.7 Summary

3 Implementing your first Go bot

3.1 Representing a game of Go in Python

3.1.1 Implementing the Go Board

3.1.2 Connected groups of stones in Go: Strings

3.1.3 Placing and capturing stones on a Go board

3.2 Go game state and checking for illegal moves

3.2.1 Self-capture

3.2.2 Ko

3.3 Ending a game

3.4 Creating your first bot: the weakest Go AI imaginable

3.5 Speeding up gameplay with Zobrist hashing

3.6 Playing against your bot

3.7 Summary

Part 2: Machine learning and game AI

4 Playing games with tree search

4.1 Classifying games

4.3 Solving tic-tac-toe: a minimax example

4.4 Reducing search space with pruning

4.4.1 Reducing search depth with position evaluation

4.4.2 Reducing search width with alpha-beta pruning

4.5 Evaluating game states with the Monte Carlo tree search algorithm

4.5.1 Implementing Monte Carlo tree search in Python

4.5.2 How to select which branch to explore

4.5.3 Practical considerations for applying Monte Carlo tree search to Go

4.6 Summary

5 Getting started with neural networks

5.1 A simple use case: Classifying handwritten digits

5.1.1 The MNIST data set of handwritten digits

5.1.2 MNIST data preprocessing

5.2 The basics of neural networks

5.2.1 Logistic regression as simple artificial neural network

5.2.2 Networks with more than one output dimension

5.3 Feed-forward networks

5.4 How good are our predictions? Loss functions and optimization

5.4.1 What is a loss function?

5.4.2 Mean-squared error

5.4.3 Finding minima in loss functions

5.4.4 Gradient descent to find minima

5.4.5 Stochastic gradient descent for loss functions

5.4.6 Propagate gradients back through our network

5.5 Training a neural network step-by-step in Python

5.5.1 Neural network layers in Python

5.5.2 Activation layers in neural networks

5.5.3 Dense layers in Python as building block for feed-forward networks

5.5.4 Sequential neural networks with Python

5.5.5 Applying our network to handwritten digit classification

5.6 Summary

6 Designing a neural network for Go data

6.1 Encoding a Go game position for neural networks

6.2 Generating tree-search games as network training data

6.3 The Keras deep learning library

6.3.1 Keras design principles

6.3.2 Installing the Keras deep-learning library

6.3.3 Running a familiar first example with Keras

6.3.4 Go move prediction with feed-forward neural networks in Keras

6.4 Analyzing space with convolutional networks

6.4.1 What convolutions do intuitively

6.4.2 Building convolutional neural networks with Keras

6.4.3 Reducing space with pooling layers

6.5 Predicting Go move probabilities

6.5.1 Using the softmax activation function in the last layer

6.5.2 Cross-entropy loss for classification problems

6.6 Building deeper networks with dropout and rectified linear units

6.6.1 Dropping neurons for regularization

6.6.2 The rectified linear unit activation function

6.7 Putting it all together for a stronger Go move prediction network

6.8 Summary

7 Learning from data: a deep learning bot

7.1 Importing Go game records

7.1.1 The SGF file format

7.1.2 Downloading and replaying Go game records from KGS

7.2 Preparing Go data for deep learning

7.2.1 Replaying a Go game from an SGF record

7.2.2 Building a Go data processor

7.2.3 Building a Go data generator to load data efficiently

7.2.4 Parallel Go data processing and generators

7.3 Training a deep-learning model on human game-play data

7.4 Building more-realistic Go data encoders

7.5 Training efficiently with adaptive gradients

7.5.1 Decay and momentum in SGD

7.5.2 Optimizing neural networks with Adagrad

7.5.3 Refining adaptive gradients with Adadelta

7.6 Running your own experiments and evaluating performance

7.6.1 A guideline to testing architectures and hyperparameters

7.6.2 Evaluating performance metrics for training and test data

7.7 Summary

8 Deploying bots in the wild

8.1 Creating a move prediction agent from a deep neural network

8.2 Serving your Go bot to a web front-end

8.2.1 An end-to-end Go bot example

8.3 Training and deploying a Go bot in the cloud

8.4 Talking to other bots: the Go Text Protocol

8.5 Competing against other bots locally

8.5.1 When a bot should pass or resign

8.5.2 Let your bot play against other Go programs

8.6 Deploying a Go bot at an online Go server

8.6.1 Registering a bot at the Online Go Server (OGS)

8.7 Summary

9 Learning by practice: reinforcement learning

9.1 The reinforcement-learning cycle

9.2 What goes into experience?

9.3 Building an agent that can learn

9.3.1 Sampling from a probability distribution

9.3.2 Clipping a probability distribution

9.3.3 Initializing an agent

9.3.4 Loading and saving your agent from disk

9.3.5 Implementing move selection

9.4 Self-play: how a computer program practices

9.4.1 Representing experience data

9.4.2 Simulating games

9.5 Summary

10 Reinforcement learning with policy gradients

10.1 How random games can identify good decisions

10.2 Modifying neural network policies with gradient descent

10.3 Tips for training with self-play

10.3.1 Evaluating your progress

10.3.2 Measuring small differences in strength

10.3.3 Tuning a stochastic gradient descent optimizer

10.4 Summary

11 Reinforcement learning with value methods

11.1 Playing games with Q-learning

11.2 Q-learning with Keras

11.2.1 Building two-input networks in Keras

11.2.2 Implementing the ϵ-greedy policy with Keras

11.2.3 Training an action-value function

11.3 Summary

12 Reinforcement learning with actor-critic methods

12.1 Advantage tells you which decisions are important

12.1.1 What is advantage?

12.1.2 Calculating advantage during self-play

12.2 Designing a neural network for actor-critic learning

12.3 Playing games with an actor-critic agent

12.4 Training an actor-critic agent from experience data

12.5 Summary

Part 3: Greater than the sum of its parts

13 AlphaGo: Bringing it all together

13.1 Training deep neural networks for AlphaGo

13.1.1 Network architectures in AlphaGo

13.1.2 The AlphaGo board encoder

13.1.3 Training AlphaGo-style policy networks

13.2 Bootstrapping self-play from policy networks

13.3 Deriving a value network from self-play data

13.4 Better search with policy and value networks

13.4.1 Using neural networks to improve Monte Carlo rollouts

13.4.2 Tree search with a combined value function

13.4.3 Implementing AlphaGo’s search algorithm

13.5 Practical considerations for training your own AlphaGo

13.6 Summary

14 AlphaGo Zero: Integrating tree search with reinforcement learning

14.2.1 Walking down the tree

14.2.2 Expanding the tree

14.2.3 Selecting a move

14.3 Training

14.4 Improving exploration with Dirichlet noise

14.5 Modern techniques for deeper neural networks

14.5.1 Batch normalization

14.5.2 Residual networks

14.6 Additional resources

14.7 Wrapping up

14.8 Summary

Appendix A: Mathematical foundations

A.1 Vectors, matrices and beyond: a linear algebra primer

A.1.1 Vectors: one-dimensional data

A.1.2 Matrices: two-dimensional data

A.1.3 Rank 3 tensors

A.1.4 Rank 4 tensors

A.2 Calculus in five minutes: derivatives and finding maxima

Appendix B: The backpropagation algorithm

B.1 A bit of notation

B.2 The backpropagation algorithm for feed-forward networks

B.3 Backpropagation for sequential neural networks

B.4 Backpropagation for neural networks in general

B.5 Computational challenges with backpropagation

Appendix C: Go programs and servers

C.1 Go programs

C.1.1 GNU Go

C.1.2 Pachi

C.2 Go servers



C.5 Tygem

Appendix D: Training and deploying bots using Amazon Web Services

D.1 Model training on AWS

D.2 Hosting a bot on AWS over HTTP

Appendix E: Submitting a bot to the Online Go Server (OGS)

E.1 Registering and activating your bot at OGS

E.2 Testing your OGS bot locally

E.3 Deploying your OGS bot on AWS

What's inside

  • Build and teach a self-improving game AI
  • Enhance classical game AI systems with deep learning
  • Implement neural networks for deep learning

About the reader

All you need are basic Python skills and high school–level math. No deep learning experience required.

About the authors

Max Pumperla and Kevin Ferguson are experienced deep learning specialists skilled in distributed systems and data science. Together, Max and Kevin built the open source bot BetaGo.

We interviewed Kevin as a part of our Six Questions series.
