Effective Data Science Infrastructure

How to make data scientists more productive
Ville Tuulos
  • MEAP began May 2021
  • Publication in Summer 2022 (estimated)
  • ISBN 9781617299193
  • 325 pages (estimated)
  • printed in black & white

Do not miss the opportunity to cover all key aspects of data science infrastructure on your next project.

Jesús A. Juárez Guerrero
Simplify data science infrastructure to give data scientists an efficient path from prototype to production.

In Effective Data Science Infrastructure you will learn how to:

  • Design data science infrastructure that boosts productivity
  • Handle compute and orchestration in the cloud
  • Deploy machine learning to production
  • Monitor and manage performance and results
  • Combine cloud-based tools into a cohesive data science environment
  • Develop reproducible data science projects using Metaflow, Conda, and Docker
  • Architect complex applications for multiple teams and large datasets
  • Customize and grow data science infrastructure

Effective Data Science Infrastructure: How to make data scientists more productive is a hands-on guide to assembling infrastructure for data science and machine learning applications. It reveals the processes used at Netflix and other data-driven companies to manage their cutting-edge data infrastructure. In it, you’ll master scalable techniques for data storage, computation, experiment tracking, and orchestration. You’ll also learn how to collaborate with data scientists to deliver exactly what they need to succeed.

The author is donating proceeds from this book to charities that support women and underrepresented groups in data science.

about the technology

Turning data science projects from small prototypes to sustainable business processes requires scalable and reliable infrastructure. This book lays out the workflows, components, and methods of the full infrastructure stack for data science, from data warehousing and scalable compute to modeling frameworks.

about the book

Effective Data Science Infrastructure: How to make data scientists more productive is a guide to building infrastructure that will supercharge data science projects and data scientists. Based on state-of-the-art practices that power the massive data operations of Netflix, this book offers techniques and patterns relevant to companies of all shapes and sizes. You’ll learn how you can make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python.

As you work through this easy-to-follow guide, you’ll set up end-to-end infrastructure from the ground up, with a fully customizable process you can easily adapt to your company. You’ll build a cloud-based development environment that covers local prototyping and deployment to production, set up infrastructure that supports a real-world machine learning application, and handle a large-scale application for processing hundreds of gigabytes of data. Throughout, you’ll follow a human-centric approach focused on user experience and meeting the unique needs of data scientists.

about the reader

For infrastructure engineers, DevOps engineers, and engineering-minded data scientists who are familiar with Python.

about the author

Ville Tuulos has been developing tools and infrastructure for data science and machine learning for over two decades. At Netflix, he designed and built Metaflow, a full-stack framework for data science. Currently, he is the CEO of a startup focusing on data science infrastructure.

Useful book that provides tactical guidance on how to use Metaflow to streamline data science workflows but also includes great frameworks and abstractions to consider when defining your data science infrastructure stack.

Sarah Catanzaro

This is the ultimate book to learn how to handle infrastructure in data science!

Ninoslav Cerkez

If you need a workflow management tool to glue your data code, look at Metaflow. It's simple yet efficient.

Mikael Dautrey