AI Model Evaluation

Leemay Nassery
  • MEAP began August 2025
  • Last updated August 2025
  • Publication in Spring 2026 (estimated)
  • ISBN 9781633435674
  • 250 pages (estimated)
  • printed in black & white

De-risk AI models, validate real-world performance, and align output with product goals.

Before you trust critical business systems to an AI model, you need to answer a few questions. Will it be fast enough? Will the system satisfy user expectations? Is it safe? Can you trust the output? This book will help you answer these questions and more before you roll out an AI system—and make sure it runs smoothly after you deploy.

In AI Model Evaluation you’ll learn how to:

  • Build diagnostic offline evaluations that uncover model behavior
  • Use shadow traffic to simulate production conditions
  • Design A/B tests that validate model impact on key product metrics
  • Spot nuanced failures with human-in-the-loop feedback
  • Use LLMs as automated judges to scale your evaluation pipeline

In AI Model Evaluation author Leemay Nassery shares hard-won lessons from her experience specializing in experimentation and personalization at companies such as Spotify, Comcast, Dropbox, and Etsy. The book is packed with insights on what it really takes to get a model ready for production. You’ll go beyond basic performance evaluations to discover how to measure model effectiveness on the product, spot latency issues as you introduce the model into your end-to-end architecture, and understand the model’s real‑world impact.

about the book

AI Model Evaluation teaches you how to effectively evaluate and assess machine learning models for better scaling and integration into production systems. Each chapter tackles a different evaluation method. You'll start with offline evaluations, then move into live A/B tests, shadow traffic deployments, qualitative evaluations, and LLM-based feedback loops. You’ll learn how to evaluate both model behavior and engineering system performance, with a hands-on example grounded in a movie recommendation engine.
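To give a flavor of the offline-evaluation starting point, a recommender like the book's movie example is often scored with ranking metrics such as precision@k against held-out user interactions. A minimal sketch (the function and data here are illustrative, not taken from the book):

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that appear in the relevant set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# Hypothetical data: a ranked list from a movie recommender, and a
# held-out set of movies the user actually engaged with.
recommended = ["inception", "heat", "alien", "up", "jaws"]
relevant = {"alien", "up", "blade_runner"}

print(precision_at_k(recommended, relevant, 3))  # 1 of the top 3 is relevant
```

Metrics like this diagnose model behavior before any live traffic is involved; the later chapters on shadow traffic and A/B tests then validate the same model under production conditions.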

about the reader

For practitioners with experience in machine learning, data science, or software engineering. Familiarity with Python is recommended.

about the author

Leemay Nassery is an engineering leader specializing in experimentation and personalization. With a notable track record that includes evolving Spotify's A/B testing strategy for the Homepage, launching Comcast's For You page, and establishing data warehousing teams at Etsy, she firmly believes that the key to innovation at any company is the ability to experiment effectively.