Azure Data Engineering
Vlad Riscutia
  • MEAP began October 2020
  • Publication in Fall 2021 (estimated)
  • ISBN 9781617298929
  • 375 pages (estimated)
  • printed in black & white

A book that sucks you in at the start and doesn't want you to put it down until it has imparted all the knowledge you need to understand the topic.

Richard Vaughan
Azure Data Engineering reveals the architectural, operational, and data management techniques that power cloud-based data infrastructure built on the Microsoft Azure platform. Author Vlad Riscutia, a data engineer at Microsoft, teaches you the patterns and techniques that support Microsoft’s own massive data infrastructure. You'll learn to bring engineering rigor to your data platform, ensuring that your theoretical data tools function just as well under the pressures of production. You'll implement common data modeling patterns, stand up cloud-native data platforms on Azure, get to grips with DevOps for both analytics and machine learning, and more.

About the Technology

There’s a big gap between running machine learning and data processes as prototypes and deploying them to a production cloud environment. Robust data engineering practices are essential to ensuring that your carefully designed data tools hold up in the real world. Encompassing everything from architecture and design to operations, monitoring, and scaling, a proper approach to data engineering keeps your systems reliable and flexible enough to handle the issues that messy data can throw at them.

About the book

Azure Data Engineering teaches you to build a scalable and robust data platform to industry-leading standards. All examples are based on the production big data platform that powers Microsoft's customer-growth operations. You'll learn techniques and best practices that author Vlad Riscutia and his team use on a daily basis, including automation and DevOps, running a reliable machine learning pipeline, and managing your data inventory. Examples are illustrated with Azure, but the patterns and techniques are transferable to other cloud platforms.
Table of Contents

1. Introduction

1.1 What is data engineering?
1.2 Who this book is for
1.3 In this book
1.3.1 Anatomy of a data platform
1.3.2 Infrastructure as code, codeless infrastructure
1.4 Building in the cloud
1.4.1 IaaS, PaaS, SaaS
1.4.2 Network, storage, compute
1.4.3 Getting started with Azure
1.4.4 Interacting with Azure
1.5 An Azure data platform
1.6 Summary

PART 1 Infrastructure

2. Storage

2.1 Storing data in a data platform
2.1.1 Storing data across multiple data fabrics
2.1.2 Having a single source of truth
2.2 Introducing Azure Data Explorer
2.2.1 Deploying an Azure Data Explorer cluster
2.2.2 Using Azure Data Explorer
2.2.3 Working around query limits
2.3 Introducing Azure Data Lake Storage
2.3.1 Creating an Azure Data Lake Storage account
2.3.2 Using Azure Data Lake Storage
2.3.3 Integrating with Azure Data Explorer
2.4 Ingesting data
2.4.1 Ingestion frequency
2.4.2 Load type
2.4.3 Restatements and reloads
2.5 Summary

3. DevOps

3.1 What is DevOps?
3.1.1 DevOps in Data Engineering
3.2 Introducing Azure DevOps
3.2.1 Using the az azure-devops extension
3.3 Deploying infrastructure
3.3.1 Exporting ARM templates
3.3.2 Creating Azure DevOps service connections
3.3.3 Deploying ARM templates
3.3.4 Understanding Azure Pipelines
3.4 Deploying analytics
3.4.1 Using Azure DevOps marketplace extensions
3.4.2 Everything in Git, everything deployed automatically
3.5 Summary

4. Orchestration

PART 2 Workloads

5. Modeling

6. Analytics

7. Machine learning

PART 3 Governance

8. Metadata

9. Data quality

10. Compliance

11. Distributing data


Appendix A: Azure services

Appendix B: KQL cheat sheet

What's inside

  • Pick the right Azure services for different data scenarios
  • Implement production-quality data modeling, analytics, and machine learning workloads
  • Handle data governance
  • Apply best practices for compliance and access control

About the reader

For data engineers familiar with cloud and DevOps.

About the author

Vlad Riscutia is a software architect and data engineer at Microsoft on the Customer Growth and Analytics team.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
print book: $24.99 (regular price $49.99), pBook + eBook + liveBook
Additional shipping charges may apply

eBook: $19.99 (regular price $39.99), 3 formats + liveBook
