An important step in moving probabilistic programming from research laboratories out into the real world.

*Practical Probabilistic Programming* introduces the working programmer to probabilistic programming (PP). In it, you'll learn how to use the PP paradigm to model application domains and then express those probabilistic models in code. Although PP can seem abstract, in this book you'll immediately work on practical examples, like using the Figaro language to build a spam filter and applying Bayesian and Markov networks to diagnose computer system data problems and recover digital images.

# Part 1: Introducing probabilistic programming and Figaro

## 1. Probabilistic programming in a nutshell

### 1.1. What is probabilistic programming?

#### 1.1.1. How do we make judgment calls?

#### 1.1.2. Probabilistic reasoning systems help make decisions

#### 1.1.3. Probabilistic reasoning systems can reason in three ways

#### 1.1.4. Probabilistic programming systems: probabilistic reasoning systems expressed in a programming language

### 1.2. Why probabilistic programming?

#### 1.2.1. Better probabilistic reasoning

#### 1.2.2. Better simulation languages

### 1.3. Introducing Figaro: a probabilistic programming language

#### 1.3.1. Figaro vs. Java: building a simple probabilistic programming system

### 1.4. Summary

### 1.5. Exercises

## 2. A quick Figaro tutorial

### 2.1. Introducing Figaro

### 2.2. Creating models and running inference: Hello World revisited

#### 2.2.1. Building your first model

#### 2.2.2. Running inference and answering a query

#### 2.2.3. Building up models and making observations

#### 2.2.4. Understanding how the model is built

#### 2.2.5. Understanding repeated elements: when are they the same and when are they different?

### 2.3. Working with basic building blocks: atomic elements

#### 2.3.1. Discrete atomic elements

#### 2.3.2. Continuous atomic elements

### 2.4. Combining atomic elements by using compound elements

#### 2.4.1. If

#### 2.4.2. Dist

#### 2.4.3. Compound versions of atomic elements

### 2.5. Building more-complex models with Apply and Chain

#### 2.5.1. Apply

#### 2.5.2. Chain

### 2.6. Specifying evidence by using conditions and constraints

#### 2.6.1. Observations

#### 2.6.2. Conditions

#### 2.6.3. Constraints

### 2.7. Summary

### 2.8. Exercises

## 3. Creating a probabilistic programming application

### 3.1. Understanding the big picture

### 3.2. Running the code

### 3.3. Exploring the architecture of a spam-filter application

#### 3.3.1. Reasoning component architecture

#### 3.3.2. Learning component architecture

### 3.4. Designing an email model

#### 3.4.1. Choosing the elements

#### 3.4.2. Defining the dependencies

#### 3.4.3. Defining the functional forms

#### 3.4.4. Using numerical parameters

#### 3.4.5. Working with auxiliary knowledge

### 3.5. Building the reasoning component

### 3.6. Creating the learning component

### 3.7. Summary

### 3.8. Exercises

# Part 2: Writing probabilistic programs

## 4. Probabilistic models and probabilistic programs

### 4.1. Probabilistic models defined

#### 4.1.1. Expressing general knowledge as a probability distribution over possible worlds

#### 4.1.2. Exploring probability distributions further

### 4.2. Using a probabilistic model to answer queries

#### 4.2.1. Conditioning on the evidence to produce the posterior probability distribution

#### 4.2.2. Answering queries

#### 4.2.3. Using probabilistic inference

### 4.3. The ingredients of probabilistic models

#### 4.3.1. Variables

#### 4.3.2. Dependencies

#### 4.3.3. Functional forms

#### 4.3.4. Numerical parameters

### 4.4. Generative processes

### 4.5. Models with continuous variables

#### 4.5.1. Using the beta-binomial model

#### 4.5.2. Representing continuous variables

### 4.6. Summary

### 4.7. Exercises

## 5. Modeling dependencies with Bayesian and Markov networks

### 5.1. Modeling dependencies

#### 5.1.1. Directed dependencies

#### 5.1.2. Undirected dependencies

#### 5.1.3. Direct and indirect dependencies

### 5.2. Using Bayesian networks

#### 5.2.1. Bayesian networks defined

#### 5.2.2. How a Bayesian network defines a probability distribution

#### 5.2.3. Reasoning with Bayesian networks

#### 5.2.4. Designing a computer system diagnosis model

#### 5.2.5. Reasoning with the computer system diagnosis model

### 5.3. Exploring a Bayesian network example

#### 5.3.1. Designing a computer system diagnosis model

#### 5.3.2. Reasoning with the computer system diagnosis model

### 5.4. Using probabilistic programming to extend Bayesian networks: predicting product success

#### 5.4.1. Designing a product success prediction model

#### 5.4.2. Reasoning with the product success prediction model

### 5.5. Using Markov networks

#### 5.5.1. Markov networks defined

#### 5.5.2. Representing and reasoning with Markov networks

### 5.6. Summary

### 5.7. Exercises

## 6. Using Scala and Figaro collections to build up models

### 6.1. Using Scala collections

#### 6.1.1. Modeling dependence of many variables on a single variable

#### 6.1.2. Creating hierarchical models

#### 6.1.3. Modeling simultaneous dependence on two variables

### 6.2. Using Figaro collections

#### 6.2.1. Understanding why Figaro collections are useful

#### 6.2.2. Revisiting the hierarchical model with Figaro collections

#### 6.2.3. Using Scala and Figaro collections together

### 6.3. Modeling situations with an unknown number of objects

#### 6.3.1. Open universe situations with an unknown number of objects

#### 6.3.2. Variable-size arrays

#### 6.3.3. Operations on variable-size arrays

#### 6.3.4. Example: predicting sales of an unknown number of new products

### 6.4. Working with infinite processes

#### 6.4.1. The Process trait

#### 6.4.2. Example: a temporal health process

#### 6.4.3. Using the process

### 6.5. Summary

### 6.6. Exercises

## 7. Object-oriented probabilistic modeling

### 7.1. Using object-oriented probabilistic models

#### 7.1.1. Understanding elements of object-oriented modeling

#### 7.1.2. Revisiting the printer model

#### 7.1.3. Reasoning about multiple printers

### 7.2. Extending OO probability models with relations

#### 7.2.1. Describing general class-level models

#### 7.2.2. Describing a situation

#### 7.2.3. Representing the social media model in Figaro

### 7.3. Modeling relational and type uncertainty

#### 7.3.1. Element collections and references

#### 7.3.2. Social media model with relational uncertainty

#### 7.3.3. Printer model with type uncertainty

### 7.4. Summary

### 7.5. Exercises

## 8. Modeling dynamic systems

### 8.1. Dynamic probabilistic models

### 8.2. Types of dynamic models

#### 8.2.1. Markov chains

#### 8.2.2. Hidden Markov models

#### 8.2.3. Dynamic Bayesian networks

#### 8.2.4. Models with variable structure over time

### 8.3. Modeling systems that go on indefinitely

#### 8.3.1. Understanding Figaro universes

#### 8.3.2. Using universes to model ongoing systems

#### 8.3.3. Running a monitoring application

### 8.4. Summary

### 8.5. Exercises

# Part 3: Inference

## 9. The three rules of probabilistic inference

### 9.1. The chain rule: building joint distributions from conditional probability distributions

### 9.2. The total probability rule: getting simple query results from a joint distribution

### 9.3. Bayes rule: inferring causes from effects

#### 9.3.1. Understanding cause, effect, and inference

#### 9.3.2. Bayes rule in practice

### 9.4. Bayesian modeling

#### 9.4.1. Estimating the bias of a coin

#### 9.4.2. Predicting the next coin toss

### 9.5. Summary

### 9.6. Exercises

## 10. Factored inference algorithms

### 10.1. Factors

#### 10.1.1. What is a factor?

#### 10.1.2. Factoring a probability distribution by using the chain rule

#### 10.1.3. Defining queries with factors by using the total probability rule

### 10.2. The variable elimination algorithm

#### 10.2.1. Graphical interpretation of VE

#### 10.2.2. VE as algebraic operations

### 10.3. Using VE

#### 10.3.1. Figaro-specific considerations for VE

#### 10.3.2. Designing your model to support efficient VE

#### 10.3.3. Applications of VE

### 10.4. Belief propagation

#### 10.4.1. The essential idea of BP

#### 10.4.2. Properties of loopy BP

### 10.5. Using BP

#### 10.5.1. Figaro-specific considerations for BP

#### 10.5.2. Designing your model to support effective BP

#### 10.5.3. Applications of BP

### 10.6. Summary

### 10.7. Exercises

## 11. Sampling algorithms

### 11.1. The sampling principle

#### 11.1.1. Forward sampling

#### 11.1.2. Rejection sampling

### 11.2. Importance sampling

#### 11.2.1. How importance sampling works

#### 11.2.2. Using importance sampling in Figaro

#### 11.2.3. Making importance sampling work for you

#### 11.2.4. Applications of importance sampling

### 11.3. Markov chain Monte Carlo sampling

#### 11.3.1. How MCMC works

#### 11.3.2. Figaro's MCMC algorithm: Metropolis-Hastings

### 11.4. Getting MH to work well

#### 11.4.1. Customized proposals

#### 11.4.2. Avoiding hard conditions

#### 11.4.3. Applications of MH

### 11.5. Summary

### 11.6. Exercises

## 12. Solving other inference tasks

### 12.1. Computing joint distributions

### 12.2. Computing the most probable explanation

#### 12.2.1. Computing and querying the MPE in Figaro

#### 12.2.2. Using algorithms for solving MPE queries

#### 12.2.3. Exploring applications of MPE algorithms

### 12.3. Computing the probability of evidence

#### 12.3.1. Observing evidence for probability-of-evidence computation

#### 12.3.2. Running probability-of-evidence algorithms

### 12.4. Summary

### 12.5. Exercises

## 13. Dynamic reasoning and parameter learning

### 13.1. Monitoring the state of a dynamic system

#### 13.1.1. Mechanics of monitoring

#### 13.1.2. The particle-filtering algorithm

#### 13.1.3. Applications of filtering

### 13.2. Learning model parameters

#### 13.2.1. Bayesian learning

#### 13.2.2. Maximum likelihood and MAP learning

### 13.3. Going further with Figaro

### 13.4. Summary

### 13.5. Exercises

# Appendixes

## Appendix A: Obtaining and installing Scala and Figaro

## Appendix B: A brief survey of probabilistic programming systems

## About the technology

The data you accumulate about your customers, products, and website users can help you not only interpret your past but also predict your future. Probabilistic programming uses code to draw probabilistic inferences from data. By applying specialized algorithms, your programs assign degrees of probability to conclusions. This means you can forecast future events such as sales trends, computer system failures, and experimental outcomes.

## About the book

*Practical Probabilistic Programming* introduces the working programmer to probabilistic programming. In this book, you'll immediately work on practical examples like building a spam filter, diagnosing computer system data problems, and recovering digital images. You'll discover probabilistic inference, where algorithms help make extended predictions about issues like social media usage. Along the way, you'll learn to use functional-style programming for text analysis, object-oriented models to predict social phenomena like the spread of tweets, and open universe models to gauge real-life social media usage. The book also has chapters on how probabilistic models can help in decision making and in modeling dynamic systems.

## What's inside

- Introduction to probabilistic modeling
- Writing probabilistic programs in Figaro
- Building Bayesian networks
- Predicting product lifecycles
- Decision-making algorithms