Five-Project Series

Build a Small Dockerized Data Mesh

prerequisites
intermediate Python • intermediate JSON • basic Docker • basic Bash • basic familiarity with typical data systems
skills learned
build a self-serve data platform in Python • build and maintain data products • make changes to data products • build derived data products • manage data as a product • implement computational governance elements with Python and machine learning • analyze the federated aspects of governance for distributed data organizations
Sven Balnojan
5 weeks · 5-7 hours per week average · INTERMEDIATE

You’re a consultant working for Messflix, a movie and TV-show streaming platform. Despite having a goldmine of data, Messflix has been unsuccessful in creating a recommendation system. You’ve discovered the problem: the right data is not flowing to the right use cases. Messflix agrees with your suggestion of implementing a data mesh to decentralize data and treat it as a product instead of a byproduct.

You’ll build a Python prototype to explore a self-serve data platform and add functionality for publishing data products. You’ll create a derived recommendation data product that shows a list of recommended movies. Taking on a data product management perspective, you’ll learn to solve and prevent breaking changes. Last but not least, you’ll implement federated computational governance that balances the usefulness, interoperability, and security aspects of data products with the benefits of the data mesh. By the end of this series, you’ll have learned key principles of the data mesh and worked through all its major use cases.

These projects are designed for learning purposes and are not complete, production-ready applications or solutions.

This is an excellent liveProject to pair with the book. The content in the book was more theory and what is here was more hands on—so they complement each other.

Shiro Kulatilake, Growth Hacker, WSO2

here's what's included

Project 1 Build a Self-Serve Data Platform

You’re a consultant working for Messflix Inc., a movie and TV-show streaming platform. Your task is to set up a Python prototype of the data mesh that rolls out its technical components. Using Python and pandas, you’ll write a function that creates an empty CSV file with the predefined attributes of your data product, build a data catalog by creating functions that write to a CSV file, and set up standardized access to the CSV datasets. When you’re done, you’ll have hands-on experience building a minimal self-serve data platform using simple techniques.
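
The three pieces described above (an empty CSV with predefined attributes, a CSV-backed catalog, and a standard read path) could be sketched roughly as follows. This is only an illustration, not the project's solution: the function names, the `catalog.csv` file name, and the catalog columns are all hypothetical, and the sketch uses the standard-library `csv` module instead of pandas to stay dependency-free.

```python
import csv
from pathlib import Path

# Hypothetical catalog layout; the real project may record different fields.
CATALOG_COLUMNS = ["name", "owner", "path"]


def create_data_product(base_dir, name, owner, columns):
    """Create an empty CSV holding only a header row, then register it."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)

    # The data product itself: a CSV file with the predefined attributes.
    product_path = base / f"{name}.csv"
    with product_path.open("w", newline="") as f:
        csv.writer(f).writerow(columns)

    # The catalog: one row per registered data product.
    catalog_path = base / "catalog.csv"
    is_new = not catalog_path.exists()
    with catalog_path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=CATALOG_COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow({"name": name, "owner": owner, "path": str(product_path)})
    return product_path


def list_catalog(base_dir):
    """Return all catalog entries as a list of dicts."""
    with (Path(base_dir) / "catalog.csv").open(newline="") as f:
        return list(csv.DictReader(f))


def read_data_product(base_dir, name):
    """Standardized access: look the product up in the catalog, then read it."""
    entry = next(e for e in list_catalog(base_dir) if e["name"] == name)
    with open(entry["path"], newline="") as f:
        return list(csv.DictReader(f))
```

Routing every read through the catalog means consumers never need to know where a data product is stored, only its name.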

Project 2 Push Data into Data Products

Messflix Inc., a movie and TV-show streaming platform, is implementing a data mesh. So far, it has a self-serve data platform prototype where development teams can register their domain data products. As a consultant, your task is to build on that basic platform prototype with additional functionality: You’ll write a script in Python that will enable the development teams to push fresh data into their existing data products, write a function that adds support for versioned data, and implement a function that automatically calculates specific metadata (like row count and latest timestamp), then prints it to the screen. When you’re finished, you’ll have built a well-functioning, feature-rich, self-serve data platform, and be familiar with the requirements data-producing and data-consuming teams face daily and how to fulfill them.
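
A minimal sketch of the versioned push and the metadata calculation mentioned above might look like this. The file-naming scheme (`v0001.csv`, `v0002.csv`, ...), the function names, and the assumption that timestamps are ISO 8601 strings in one consistent format (so that string comparison matches chronological order) are all illustrative choices, not taken from the project.

```python
import csv
from pathlib import Path


def push_version(product_dir, rows, columns):
    """Write rows as the next numbered version file and return its path."""
    directory = Path(product_dir)
    directory.mkdir(parents=True, exist_ok=True)

    # Hypothetical scheme: each push creates a new immutable file.
    next_number = len(list(directory.glob("v*.csv"))) + 1
    path = directory / f"v{next_number:04d}.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
    return path


def compute_metadata(path, timestamp_column):
    """Return the row count and the latest timestamp of one version file."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Assumes uniformly formatted ISO 8601 strings, which sort chronologically.
    timestamps = [r[timestamp_column] for r in rows if r[timestamp_column]]
    return {
        "row_count": len(rows),
        "latest_timestamp": max(timestamps) if timestamps else None,
    }
```

Keeping old version files untouched lets consumers pin a version while producers continue to push fresh data.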

Project 3 Publish Data Products

Messflix Inc., a movie and TV-show streaming platform, wants to build a recommendation system for its movies and shows, but currently, its data landscape is too complex. As a consultant, your task is to implement a data mesh for an improved, accurate flow of its data. Using Python and JSON, you’ll help the data engineering teams sift more easily through Messflix’s data by creating separate, structured data products that can be pushed to the central data platform. From the organized data products, you’ll create a list of recommended movies, tailored to Messflix’s customers’ preferences.
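
To make the idea of a derived data product concrete, here is a deliberately simple sketch that combines two JSON inputs into a recommendation list. The recommendation rule (suggest unseen titles from the user's most-watched genre), the field names, and the sample titles are invented for illustration; the project may well use a different approach.

```python
import json
from collections import Counter


def recommend(movies_json, views_json, user_id, limit=3):
    """Recommend unseen titles from the user's most-watched genre."""
    movies = json.loads(movies_json)
    views = json.loads(views_json)

    genre_by_id = {m["movie_id"]: m["genre"] for m in movies}
    seen = {v["movie_id"] for v in views if v["user_id"] == user_id}

    # Count how many distinct movies the user watched per genre.
    genre_counts = Counter(genre_by_id[m] for m in seen if m in genre_by_id)
    if not genre_counts:
        return []  # No viewing history, so nothing to base a suggestion on.

    favorite = genre_counts.most_common(1)[0][0]
    candidates = [
        m["title"]
        for m in movies
        if m["genre"] == favorite and m["movie_id"] not in seen
    ]
    return sorted(candidates)[:limit]


# Made-up sample data standing in for two source data products.
movies_json = json.dumps([
    {"movie_id": 1, "title": "Space One", "genre": "scifi"},
    {"movie_id": 2, "title": "Space Two", "genre": "scifi"},
    {"movie_id": 3, "title": "Quiet Town", "genre": "drama"},
    {"movie_id": 4, "title": "Orbit", "genre": "scifi"},
    {"movie_id": 5, "title": "Far Signal", "genre": "scifi"},
])
views_json = json.dumps([
    {"user_id": "u1", "movie_id": 1},
    {"user_id": "u1", "movie_id": 2},
    {"user_id": "u1", "movie_id": 3},
])
recommendations = recommend(movies_json, views_json, "u1")
```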

Project 4 Manage Data Products

As a consultant for Messflix Inc., a movie and TV-show streaming platform, you’ll investigate and discover why Messflix’s recommender system breaks. You’ll brainstorm options for changes that don’t break the system, explore their pros and cons, and choose and implement one of your options. Then you’ll create an internal versioning strategy to support all the great product changes Messflix has planned for the future.
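
One possible way to reason about breaking changes is to compare the old and new schema of a data product and derive a version bump from the result. The sketch below assumes a schema is a plain dict mapping column names to type names and uses a two-part `major.minor` version; both the rule set and the function names are hypothetical, and the project may settle on a different strategy.

```python
def classify_change(old_schema, new_schema):
    """Compare two {column: type} schemas and classify the change."""
    removed = sorted(set(old_schema) - set(new_schema))
    added = sorted(set(new_schema) - set(old_schema))
    retyped = sorted(
        c for c in old_schema
        if c in new_schema and old_schema[c] != new_schema[c]
    )

    # Assumed rule: consumers break when a column disappears or changes type,
    # but can ignore columns that are merely added.
    if removed or retyped:
        kind = "breaking"
    elif added:
        kind = "compatible"
    else:
        kind = "none"
    return {"kind": kind, "removed": removed, "added": added, "retyped": retyped}


def next_version(version, kind):
    """Bump a 'major.minor' version string according to the change kind."""
    major, minor = (int(part) for part in version.split("."))
    if kind == "breaking":
        return f"{major + 1}.0"
    if kind == "compatible":
        return f"{major}.{minor + 1}"
    return version
```

For example, dropping a `genre` column from a movies product would be classified as breaking and move version `1.2` to `2.0`, while adding a `language` column would only move it to `1.3`.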

Project 5 Ensure Computational Governance

The development teams at Messflix, a movie and TV-show streaming platform, are pushing domain data products through their newly implemented data mesh. Now the CTO has tasked you, their consultant, with striking a balance between the benefits of the data mesh, the freedoms the data products have, their usefulness, their interoperability, and aspects of their security. You’ll use Python and pandas to write policies that check the registration and pushed data, helping users provide all the required registration information. You’ll create tooling to classify data into categories for improved data labeling, and protect sensitive data with pseudonymization functions. When you’re done, you’ll have learned skills for federated computational governance that balance the benefits of data products with the benefits of the data mesh.
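
Two of the governance elements named above, a registration policy and pseudonymization, can be sketched with the standard library alone. The required field list, the function names, and the choice of a keyed HMAC-SHA256 hash are assumptions made for this example; the project itself works with pandas and may pseudonymize differently.

```python
import hashlib
import hmac

# Hypothetical policy: fields every registration must provide.
REQUIRED_FIELDS = ("name", "owner", "description")


def check_registration(registration):
    """Return the required fields that are missing or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = registration.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def pseudonymize(value, secret_key):
    """Replace a value with a keyed hash so equal inputs stay joinable."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymize_rows(rows, sensitive_columns, secret_key):
    """Pseudonymize only the sensitive columns of each row."""
    return [
        {
            column: pseudonymize(value, secret_key)
            if column in sensitive_columns
            else value
            for column, value in row.items()
        }
        for row in rows
    ]
```

A keyed hash is used here rather than a plain hash so that someone without the key cannot recompute pseudonyms from guessed inputs; the same input still maps to the same pseudonym, which keeps data products joinable.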

book resources

When you start each of the projects in this series, you'll get full access to the following book for 90 days.

This is a fantastic project to help people learn data mesh.

Ramanan Natarajan, Principal Enterprise Architect, Lowe’s India

project author

Sven Balnojan

Sven Balnojan is a data technologist and product person focused on helping the world extract more value from the exponentially growing amount of data. He’s passionate about all things data, including machine learning, AI, and business intelligence. As a manager of internal data teams, he’s been integral in easing their transition from service-oriented to platform-oriented data teams. He enjoys getting his hands dirty as a data developer, whether in the field of machine learning, data engineering, or data DevOps.

For his PhD in mathematics, Sven wrote a thesis in the field of singularity theory. He’s the co-author of Manning’s Data Mesh in Action and author of the Three Data Point Thursday newsletter. He blogs on datacisions.com and makes appearances in various online talks.

Prerequisites

These liveProjects are for architects, developers, and data team members who want to understand the workings of a data mesh. To begin these liveProjects you’ll need to be familiar with the following:

TOOLS
  • Intermediate Python
  • Basic pandas
  • Basic JSON
  • Basic Docker
  • Basic Bash
TECHNIQUES
  • Understand key principles of the data mesh (including data products, data product thinking, federated computational governance, data catalogs, and distributed domain ownership)

you will learn

In this liveProject series, you’ll learn to build a Dockerized mini data mesh and play through all of its major use cases.

  • Progressively build a self-serve data platform
  • Analyze how platforms in general benefit data producers and data consumers
  • Make changes to data products as a data producer
  • Build a recommendation system using data products as a data consumer
  • Implement computational governance elements
  • Analyze the federated aspects of governance for distributed data organizations

features

Self-paced
You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get Help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.