Math and Architectures of Deep Learning
Krishnendu Chaudhury
  • MEAP began March 2020
  • Publication in Early 2021 (estimated)
  • ISBN 9781617296482
  • 450 pages (estimated)
  • printed in black & white

This is a book that will reward your patience and perseverance with a clear and detailed knowledge of deep learning mathematics and associated techniques.

Tony Holdroyd
The mathematical paradigms that underlie deep learning typically start out as hard-to-read academic papers, often leaving engineers in the dark about how their models actually function. Math and Architectures of Deep Learning bridges the gap between theory and practice, laying out the math of deep learning side by side with practical implementations in Python and PyTorch. Written by deep learning expert Krishnendu Chaudhury, the book lets you peer inside the “black box” to understand how your code is working, and teaches you to comprehend cutting-edge research you can turn into practical applications.

About the Technology

It’s important to understand how your deep learning models work, so that you can both maintain them efficiently and explain them to other stakeholders. Learning mathematical foundations and neural network architecture can be challenging, but the payoff is big. You’ll be free from blind reliance on prepackaged DL models and able to build, customize, and re-architect for your specific needs. And when things go wrong, you’ll be glad you can quickly identify and fix problems.

About the book

Math and Architectures of Deep Learning sets out the foundations of DL in a way that’s both useful and accessible to working practitioners. Each chapter explores a new fundamental DL concept or architectural pattern, explaining the underpinning mathematics and demonstrating how it works in practice with well-annotated Python code. You’ll start with a primer on basic algebra, calculus, and statistics, and work your way up to state-of-the-art DL paradigms taken from the latest research. By the time you’re done, you’ll have the combined theoretical insight and practical skills to identify and implement DL architectures for almost any real-world challenge.
Table of Contents

Introduction: Importance of mathematical principles underlying deep learning

1 An overview of machine learning and deep learning

1.1 A first look at machine/deep learning - a paradigm shift in computation

1.2 A Function Approximation View of Machine Learning: Models and their Training

1.3 A simple machine learning model - the cat brain

1.4 Geometrical View of Machine Learning

1.5 Regression vs Classification in Machine Learning

1.6 Linear vs Nonlinear Models

1.7 Higher Expressive Power through multiple non-linear layers: Deep Neural Networks

1.8 Summary

2 Introduction to Vectors, Matrices and Tensors from Machine Learning and Data Science point of view

2.1 Vectors and their role in Machine Learning and Data Science

2.1.1 Geometric View of Vectors and its significance in Machine Learning and Data Science

2.2 Python code to create and access vectors and sub-vectors, slice and dice vectors, via Numpy and PyTorch parallel code

2.2.1 Python Numpy code for introduction to Vectors

2.2.2 PyTorch code for introduction to Vectors

2.3 Matrices and their role in Machine Learning and Data Science

2.4 Python Code: Introduction to Matrices, Tensors and Images via Numpy and PyTorch parallel code

2.4.1 Python Numpy code for introduction to Tensors, Matrices and Images

2.4.2 PyTorch code for introduction to Tensors and Matrices

2.5 Basic Vector and Matrix operations in Machine Learning and Data Science

2.5.1 Matrix and Vector Transpose

2.5.2 Dot Product of two vectors and its role in Machine Learning and Data Science

2.5.3 Matrix Multiplication and Machine Learning, Data Science

2.5.4 Length of a Vector aka L2 norm and its role in Machine Learning

2.5.5 Geometric intuitions for Vector Length - Model Error in Machine Learning

2.5.6 Geometric intuitions for the Dot Product - Feature Similarity in Machine Learning and Data Science

2.6 Orthogonality of Vectors and its physical significance

2.7 Python code: Basic Vector and Matrix operations via Numpy

2.7.1 Python numpy code for Matrix Transpose

2.7.2 Python numpy code for Dot product

2.7.3 Python numpy code for Matrix vector multiplication

2.7.4 Python numpy code for Matrix Matrix Multiplication

2.7.5 Python numpy code for Transpose of Matrix Product

2.7.6 Python numpy code for Matrix Inverse

2.8 Multidimensional Line and Plane Equations and their role in Machine Learning

2.8.1 Multidimensional Line Equation

2.8.2 Multidimensional Planes and their role in Machine Learning

2.9 Linear Combination, Linear Dependence, Vector Span and Basis Vectors, their Geometrical Significance, Collinearity Preservation

2.10 Linear Transforms - Geometric and Algebraic interpretations

2.11 Multidimensional Arrays, Multi-linear Transforms and Tensors

2.11.1 Array View: Multidimensional arrays of numbers

2.12 Linear Systems and Matrix Inverse

2.12.1 Linear Systems with zero or near zero Determinants; Ill Conditioned Systems

2.12.2 Over and Under Determined Linear Systems in Machine Learning and Data Science

2.12.3 Moore Penrose Pseudo-Inverse of a Matrix: solving Over or Under Determined Linear Systems

2.12.4 Pseudo Inverse of a Matrix: A Beautiful Geometric Intuition

2.12.5 Python numpy code to solve over-determined systems

2.13 Eigenvalues and Eigenvectors - swiss army knives in Machine Learning and Data Science

2.13.1 Python numpy code to compute eigenvectors and eigenvalues

2.14 Orthogonal (Rotation) Matrices and their Eigenvalues and Eigenvectors

2.14.1 Python numpy code for orthogonality of rotation matrices

2.15 Matrix Diagonalization

2.15.1 Python Numpy code for Matrix diagonalization

2.15.2 Solving Linear Systems without Inverse via Diagonalization

2.15.3 Python Numpy code for Solving Linear Systems via diagonalization

2.15.4 Matrix powers using diagonalization

2.16 Spectral Decomposition of a Symmetric Matrix

2.16.1 Python numpy code for Spectral Decomposition of Matrix

2.17 An application relevant to Machine Learning - finding the axes of a hyper-ellipse

2.17.1 Python numpy code for Hyper Ellipses

2.18 Summary

3 Introduction to Vector Calculus from Machine Learning point of view

3.1 Significance of the sign of the separating surface in binary classification

3.2 Estimating Model Parameters: Training

3.3 Minimizing Error during Training a Machine Learning Model: Gradient Vectors

3.3.1 Derivatives, Partial Derivatives, Change in function value and Tangents

3.3.2 Level Surface representation and Loss Minimization

3.4 Python numpy and PyTorch code for Gradient Descent, Error Minimization and Model Training

3.4.1 Numpy and PyTorch code for Linear Models

3.4.2 Non-linear Models in PyTorch

3.4.3 A Linear Model for the cat-brain in PyTorch

3.5 Convex, Non-convex functions; Global and Local Minima

3.6 Multi-dimensional Taylor series and Hessian Matrix

3.6.1 1D Taylor Series recap

3.6.2 Multi-dimensional Taylor series and Hessian Matrix

3.7 Convex sets and functions

3.8 Chapter Summary

4 Linear Algebraic Tools in Machine Learning and Data Science

4.1 Quadratic Forms and their Minimization

4.1.1 Symmetric Positive (Semi)definite Matrices

4.2 Spectral and Frobenius Norm of a Matrix

4.3 Principal Component Analysis

4.3.1 Application of PCA in Data Science: Dimensionality Reduction

4.3.2 Python Numpy code: PCA and dimensionality reduction

4.3.3 Drawback of PCA from Data Science viewpoint

4.3.4 Application of PCA in Data Science: Data Compression

4.4 Singular Value Decomposition

4.4.1 Application of SVD: PCA computation

4.4.2 Application of SVD: Solving arbitrary Linear System

4.4.3 Rank of a Matrix

4.4.4 Python numpy code for linear system solving via SVD

4.4.5 Python numpy code for PCA computation via SVD

4.4.6 Application of SVD: Best low rank approximation of a matrix

4.5 Machine Learning Application: Document Retrieval

4.5.1 TF-IDF and Cosine Similarity in Machine Learning based Document Retrieval

4.5.2 Latent Semantic Analysis (LSA)

4.5.3 Python/Numpy code to compute LSA on a toy dataset

4.5.4 Python/Numpy code to compute and visualize LSA/SVD on a 500 × 3 dataset

4.6 Summary

5 Probability Distributions for Machine Learning and Data Science

5.1 Probability - the classical frequentist view

5.1.1 Random Variables

5.1.2 Population Histograms

5.2 Probability Distributions

5.3 Impossible and certain events, Sum of probabilities of exhaustive, mutually exclusive events, Independent events

5.3.1 Probabilities of Impossible and Certain Events

5.3.2 Exhaustive and mutually exclusive events

5.3.3 Independent Events

5.4 Joint Probabilities and their distributions

5.4.1 Marginal Probabilities

5.4.2 Dependent Events and their Joint Probability Distribution

5.5 Geometrical View: Sample point distributions for dependent and independent variables

5.5.1 Python Numpy code to draw random samples from a discrete joint probability distribution

5.6 Continuous Random Variables and Probability Density

5.7 Properties of distributions - Expected Value, Variance and Covariance

5.7.1 Expected Value aka Mean

5.7.2 Variance, Covariance, Standard Deviation

5.8 Sampling from a Distribution

5.9 Some famous probability distributions

5.9.1 Uniform Random Distributions

5.9.2 Gaussian (aka Normal) Distribution

5.9.3 Binomial Distribution

5.9.4 Multinomial Distribution

5.9.5 Bernoulli Distribution

5.9.6 Categorical Distribution and one-hot vectors

5.10 Chapter Summary

6 Neural Networks Basics

7 Neural Network Optimizers

8 Non Fully Connected Layers in Neural Networks

9 Deep Learning based Object Recognition and Detection

10 Deep Learning based Image Digests and Image Similarity Estimation

11 Towards self training

12 Spatio Temporal Deep Neural Networks

Appendix A: Automatic Differentiation - forward and reverse mode

Conclusion: Future Directions

What's inside

  • Math, theory, and programming principles side by side
  • Linear algebra, vector calculus and multivariate statistics for deep learning
  • The structure of neural networks
  • Implementing deep learning architectures with Python and PyTorch
  • Troubleshooting underperforming models
  • Working code samples in downloadable Jupyter notebooks

About the reader

For Python programmers with algebra and calculus basics.

About the author

Krishnendu Chaudhury is a deep learning and computer vision expert with decade-long stints at both Google and Adobe Systems. He is presently CTO and co-founder of Drishti Technologies. He has a PhD in computer science from the University of Kentucky at Lexington.

Manning Early Access Program (MEAP): Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
print book $29.99 $49.99 pBook + eBook + liveBook
Additional shipping charges may apply

eBook $24.99 $39.99 3 formats + liveBook
