You’re a consultant working for Messflix, a movie and TV-show streaming platform. Despite having a goldmine of data, Messflix has been unsuccessful in creating a recommendation system. You’ve discovered the problem: the right data is not flowing to the right use cases. Messflix agrees with your suggestion of implementing a data mesh to decentralize data and treat it as a product instead of a byproduct.
You’ll build a Python prototype to explore a self-serve data platform and add functionality for publishing data products. You’ll create a derived recommendation data product that shows a list of recommended movies. Taking on a data product management perspective, you’ll learn to solve and prevent breaking changes. Last but not least, you’ll implement federated computational governance that balances the usefulness, interoperability, and security aspects of data products with the benefits of the data mesh. By the end of this series, you’ll have learned key principles of the data mesh and worked through all its major use cases.
This is an excellent liveProject to pair with the book. The content in the book was more theory and what is here was more hands on—so they complement each other.
You’re a consultant working for Messflix Inc., a movie and TV-show streaming platform. Your task is to set up a Python prototype implementation of the data mesh to roll out the technical components of a data mesh. Using Python and pandas, you’ll write a Python function that creates an empty CSV file with the predefined attributes of your data product, builds a data catalog by creating Python functions that write to the CSV file, and sets up standardized access to the CSV datasets. When you’re done, you’ll have hands-on experience building a minimal self-serve data platform using simple techniques.
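To give a feel for the kind of work involved, here is a minimal sketch of registering a data product and reading it back through a catalog. It is not the project's actual solution: the column names, the catalog layout, and the function names (`create_data_product`, `register_in_catalog`, `read_data_product`) are all assumptions made for illustration, and it uses Python's standard `csv` module instead of pandas to stay dependency-free.

```python
import csv
from pathlib import Path

# Hypothetical attribute sets; the project defines its own.
PRODUCT_COLUMNS = ["movie_id", "title", "release_year"]
CATALOG_COLUMNS = ["name", "owner", "path"]


def create_data_product(base_dir: Path, name: str, columns: list[str]) -> Path:
    """Create an empty CSV file that contains only the header row."""
    path = base_dir / f"{name}.csv"
    with path.open("w", newline="") as f:
        csv.writer(f).writerow(columns)
    return path


def register_in_catalog(catalog_path: Path, name: str, owner: str,
                        product_path: Path) -> None:
    """Append one entry to the catalog CSV, writing the header on first use."""
    is_new = not catalog_path.exists()
    with catalog_path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=CATALOG_COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow({"name": name, "owner": owner, "path": str(product_path)})


def read_data_product(catalog_path: Path, name: str) -> list[dict[str, str]]:
    """Standardized access: look the product up in the catalog, then read its rows."""
    with catalog_path.open(newline="") as f:
        for entry in csv.DictReader(f):
            if entry["name"] == name:
                with open(entry["path"], newline="") as data:
                    return list(csv.DictReader(data))
    raise KeyError(f"data product not registered: {name}")
```

Consumers in this sketch never touch file paths directly; they ask the catalog for a product by name, which is the "standardized access" idea in miniature.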
Messflix Inc., a movie and TV-show streaming platform, is implementing a data mesh. So far, it has a self-serve data platform prototype where development teams can register their domain data products. As a consultant, your task is to build on that basic platform prototype with additional functionality: You’ll write a script in Python that will enable the development teams to push fresh data into their existing data products, write a function that adds support for versioned data, and implement a function that automatically calculates specific metadata (like row count and latest timestamp), then prints it to the screen. When you’re finished, you’ll have built a well-functioning, feature-rich, self-serve data platform, and be familiar with the requirements data-producing and data-consuming teams face daily and how to fulfill them.
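The three features described above (pushing data, versioning, and metadata) could look roughly like the sketch below. The file naming scheme (`v1.csv`, `v2.csv`, ...), the `updated_at` column, and both function names are assumptions for illustration only, and the standard `csv` module stands in for pandas.

```python
import csv
from datetime import datetime
from pathlib import Path


def push_data(product_dir: Path, rows: list[dict[str, str]],
              columns: list[str]) -> Path:
    """Write rows as a new version file (v1.csv, v2.csv, ...) and return its path."""
    product_dir.mkdir(parents=True, exist_ok=True)
    # Assumed scheme: the next version number is one more than the files present.
    version = len(list(product_dir.glob("v*.csv"))) + 1
    path = product_dir / f"v{version}.csv"
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
    return path


def compute_metadata(path: Path, timestamp_column: str = "updated_at") -> dict:
    """Return the row count and the latest ISO 8601 timestamp in the file."""
    with path.open(newline="") as f:
        rows = list(csv.DictReader(f))
    stamps = [datetime.fromisoformat(row[timestamp_column]) for row in rows]
    return {
        "row_count": len(rows),
        "latest_timestamp": max(stamps).isoformat() if stamps else None,
    }
```

Keeping every push as a separate file means older versions stay readable, at the cost of extra storage; a real platform would likely also need to handle deleted versions and concurrent pushes, which this sketch ignores.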
Messflix Inc., a movie and TV-show streaming platform, wants to build a recommendation system for its movies and shows, but currently, its data landscape is too complex. As a consultant, your task is to implement a data mesh for an improved, accurate flow of its data. Using Python and JSON, you’ll help the data engineering teams sift more easily through Messflix’s data by creating separate, structured data products that can be pushed to the central data platform. From the organized data products, you’ll create a list of recommended movies, tailored to Messflix’s customers’ preferences.
As a consultant for Messflix Inc., a movie and TV-show streaming platform, you’ll investigate and discover why Messflix’s recommender system breaks. You’ll brainstorm options for changes that don’t break the system, explore their pros and cons, and choose and implement one of your options. Then you’ll create an internal versioning strategy to support all the great product changes Messflix has planned for the future.
The development teams at Messflix, a movie and TV-show streaming platform, are pushing domain data products through their newly implemented data mesh. Now the CTO has tasked you, their consultant, with striking a balance between the benefits of the data mesh, the freedoms the data products have, their usefulness, their interoperability, and aspects of their security. You’ll use Python and pandas to write policies that check the registration and pushed data, helping users provide all the required registration information. You’ll create tooling to classify data into categories for improved data labeling, and protect sensitive data with pseudonymization functions. When you’re done, you’ll have learned skills for federated computational governance that balance the benefits of data products with the benefits of the data mesh.
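As a rough illustration of what such governance tooling might involve, the sketch below pairs a registration policy check with a keyed-hash pseudonymization function. The required fields, the function names, and the choice of HMAC-SHA256 are assumptions for this example, not the project's prescribed approach; it works on plain dictionaries instead of pandas DataFrames.

```python
import hashlib
import hmac

# Hypothetical policy: fields every registration must provide.
REQUIRED_FIELDS = ("name", "owner", "description")


def missing_registration_fields(registration: dict) -> list[str]:
    """Return the required fields that are absent or blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = registration.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace a value with a keyed hash: stable per key, not reversible without it."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_rows(rows: list[dict[str, str]], sensitive_columns: set[str],
                      secret_key: bytes) -> list[dict[str, str]]:
    """Return copies of the rows with the sensitive columns pseudonymized."""
    return [
        {
            column: pseudonymize(value, secret_key) if column in sensitive_columns else value
            for column, value in row.items()
        }
        for row in rows
    ]
```

Because the same input and key always produce the same pseudonym, datasets can still be joined on the pseudonymized column, which keeps data products interoperable while hiding the raw identifier.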
This is a fantastic project to help people learn data mesh.
These liveProjects are for architects, developers, and data team members who want to understand the workings of a data mesh. To begin these liveProjects you’ll need to be familiar with the following:
TOOLS
In this liveProject series, you’ll learn to build a Dockerized mini data mesh to play through all of its major use cases.