In this liveProject, you'll take on the role of a backend data engineer working for a rapidly growing startup. Your company installs battery packs for solar panels, as well as IoT devices that monitor energy usage. These devices can help users and utility companies better manage their energy, saving your customers money and providing clean power to the wider grid. You’ve been given a big task: build, from the ground up, the infrastructure that can handle the streaming events of thousands of these IoT devices. This infrastructure will be expected to scale as your company grows, so your team has chosen to work with Apache Kafka for stream processing of all the data. Your challenges will include efficiently ingesting IoT events, managing corner cases in IoT data processing, developing fleet-wide monitoring, and providing REST services to answer questions about battery energy capacity.
A percentage of every sale of this liveProject will be donated to the Rocky Mountain Institute to support their good work towards a zero-carbon future.
This project is designed for learning purposes and is not a complete, production-ready application or solution.
Prerequisites
For developers with several years of Java or Scala experience and an understanding of the core principles of Apache Kafka. Knowledge of web servers, databases, build systems, and Docker will be helpful, but not essential.
TOOLS
- Intermediate Java/Scala
- Basics of Maven, SBT, or other JVM build systems
- Basics of Docker
- Basics of Apache Kafka
TECHNIQUES
- Basics of partitioning and hashing data
- Basics of pub/sub system design
You will learn
In this liveProject, you’ll learn how to get started with stream processing and master advanced techniques for real-time data access and streaming fault management.
- Kafka Streams
- Fundamentals of IoT stream processing
- Advanced IoT stream processing techniques
- Error handling for IoT devices
- Schema management & evolution with Apache Avro
- Trading off latency and throughput in Kafka
- Design patterns for resilient stream processing
- Evolving real-time processing for real-world use cases
- Advanced partitioning schemes
- Distributed, low-latency streaming queries