Overview

1 Why you should care about statistics

Statistics matters because data is now part of nearly every profession, product, and decision-making process. The chapter frames statistics as the practice of describing data and using samples to infer truths about larger populations or domains. Rather than treating statistics as a memorization-heavy classroom subject, the book emphasizes intuition, practical examples, and Python-supported calculations so readers can focus on concepts, uncertainty, and real-world usefulness.
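
To make the idea of "using samples to infer truths about larger populations" concrete, here is a minimal sketch in Python with NumPy. The population, its mean and spread, and the sample size are all made-up values chosen for illustration; none of them come from the book.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical population: daily sales figures for 100,000 stores (synthetic data).
population = rng.normal(loc=200.0, scale=30.0, size=100_000)

# In practice we rarely observe the whole population, so we draw a sample.
sample = rng.choice(population, size=500, replace=False)

# Describe the sample.
sample_mean = sample.mean()
sample_std = sample.std(ddof=1)  # ddof=1 gives the sample standard deviation

# Infer: the standard error estimates how far the sample mean
# is likely to sit from the (normally unknown) population mean.
standard_error = sample_std / np.sqrt(len(sample))

print(f"sample mean:     {sample_mean:.2f}")
print(f"standard error:  {standard_error:.2f}")
print(f"population mean: {population.mean():.2f}")
```

Because the data is simulated, the true population mean is available for comparison here; with real data only the sample estimate and its standard error would be known.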

The chapter explains that modern life produces massive streams of data through digital systems, businesses, devices, and online interactions, but data has value only when people can interpret it well. Statistical thinking helps professionals make better decisions under uncertainty, such as forecasting inventory, evaluating marketing campaigns, measuring system reliability, assessing product tolerances, or validating machine learning models. It is especially useful for analysts, researchers, data scientists, engineers, software developers, consultants, and AI practitioners who need to reason from incomplete or noisy evidence.

The chapter also stresses that statistics must be used carefully and ethically. Studies, business claims, and model results can be distorted by bad incentives, biased sampling, ignored confounding variables, selective reporting, or pressure to support a preferred conclusion. A strong statistical mindset helps people scrutinize claims, understand uncertainty, test models against new data, and recognize when machine learning or statistical methods are appropriate. The overall mental model is to form or discover a hypothesis, gather data, fit a model, and evaluate whether it generalizes beyond the data used to build it.
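
The four-stage mental model can be sketched in a few lines of Python. This is only an illustration built around the chapter's temperature and sports drink sales example: the data is synthetic, and the slope, intercept, noise level, and 150/50 split are assumptions made for the sketch, not figures from the book.

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# 1. Hypothesize: warmer days lead to higher sports drink sales.

# 2. Gather data: a synthetic stand-in for real observations.
#    The slope (3.0), intercept (20.0), and noise level (8.0) are made-up values.
temperature = rng.uniform(10.0, 35.0, size=200)  # degrees Celsius
sales = 20.0 + 3.0 * temperature + rng.normal(0.0, 8.0, size=200)

# Hold out the last 50 observations so the model is judged on unseen data.
# The rows are already in random order, so a simple slice acts as a random split.
train_x, test_x = temperature[:150], temperature[150:]
train_y, test_y = sales[:150], sales[150:]

# 3. Fit a model: a least-squares line through the training data.
slope, intercept = np.polyfit(train_x, train_y, deg=1)

# 4. Evaluate: R^2 on the held-out data indicates how well the line generalizes.
predicted = slope * test_x + intercept
ss_residual = np.sum((test_y - predicted) ** 2)
ss_total = np.sum((test_y - test_y.mean()) ** 2)
r_squared = 1.0 - ss_residual / ss_total

print(f"fitted line: sales = {intercept:.1f} + {slope:.2f} * temperature")
print(f"R^2 on held-out data: {r_squared:.2f}")
```

Scoring the model on observations it never saw during fitting is what distinguishes "evaluate whether it generalizes" from simply measuring how well the line matches the data used to build it.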

Figure: Instead of the classroom approach of using lookup tables, we will use Python to simplify our statistical calculations.
Figure: Digital databases, the Internet, and portable electronic devices have enabled data gathering at a global scale.
Figure: An example of the four steps in statistics, studying whether temperature has an impact on sports drink sales.

Summary

  • Statistics is the practice of describing data and inferring truths from it, typically by analyzing a sample that represents a larger population or domain.
  • Statistics is relevant to any profession that involves data, from analysts to machine learning practitioners and software engineers.
  • Statistics and machine learning have a lot in common: they share many of the same techniques but bring different mindsets and approaches.
  • Python is a practical, job-relevant platform for practicing statistical concepts, with readily available, stable libraries for tasks such as plotting (matplotlib), data wrangling (pandas), and numerical computing (NumPy).
  • This book covers a mix of theory, hands-on practice, and “real-world” advice, so you never lose sight of the big picture but can still act on the implementation details.

FAQ

What is statistics?
Statistics is the practice of describing and inferring truths from data. It involves collecting, analyzing, interpreting, and quantifying numerical data, often by studying a sample that represents a larger population or domain.

Why should I care about learning statistics?
Statistics helps you extract insight from data, make better decisions under uncertainty, and identify patterns that may otherwise be missed. It can improve employability, increase the value of data, support decision-making, improve machine learning and AI work, and enable better sampling and experimentation.

How does statistics help with decision-making?
Statistics provides tools for reasoning about uncertainty. For example, instead of assuming last year’s sales will repeat exactly, statistical models can help estimate future inventory needs, analyze seasonal patterns, test whether promotions worked, and measure confidence in predictions.

Why is data considered so important today?
Modern digital life generates massive amounts of data through websites, mobile devices, apps, business systems, financial markets, and online services. Statistics is essential because raw data alone has limited value; statistical methods turn data into insights, predictions, alerts, and better decisions.

How can statistics help professionals interpret studies and media claims?
Statistics helps people scrutinize study methods, question assumptions, detect biased sampling, identify ignored confounding variables, and evaluate whether claims are supported by the data. This is especially useful when media headlines oversimplify or exaggerate research findings.

What ethical problems can arise in statistics?
Ethical problems often come from misaligned incentives. Organizations, researchers, or sponsors may pressure analysts to produce favorable results, bury inconvenient findings, cherry-pick studies, remove data too aggressively, or “torture” data until it supports a desired conclusion.

Who benefits from learning statistics?
Anyone who works with data can benefit, including analysts, researchers, data scientists, data engineers, software engineers, hardware engineers, machine learning and AI engineers, consultants, and anyone who works with spreadsheets, charts, or SQL queries.

Why should software engineers learn statistics?
Software engineers often work with uncertain real-world data, even if their code feels deterministic. Statistics can help them evaluate uptime, design A/B tests, understand sampled user behavior, measure data reliability, and handle noisy inputs such as analog signals in embedded systems.

How are statistics and machine learning related?
Both statistics and machine learning use data to find patterns and make predictions. Machine learning often emphasizes prediction and algorithm optimization, while statistics traditionally emphasizes understanding uncertainty, explaining models, and justifying decisions. The two fields overlap heavily, and machine learning methods are sometimes called statistical learning.

What is the basic mental model of statistics used in the book?
The book presents statistics as a four-stage process: hypothesize, gather data, fit a model, and test or evaluate the model. For example, you might hypothesize that temperature affects sports drink sales, collect relevant data, fit a linear regression model, and then test whether the model performs well on new data.
