Getting Started with Spark

Spark gives you the speed you need to run top-notch analytics on your distributed data sets. Master the basics, then get to grips with data partitioning, processing near-real-time streaming data, applying machine learning and more.

  • Spark in Action, Second Edition
  • Spark in Motion
$97.98 $39.99 Getting Started with Spark
Bundles are not eligible for additional discounts.

Spark in Action, Second Edition

Spark in Action, Second Edition is an entirely new book that teaches you everything you need to create end-to-end analytics pipelines in Spark. Rewritten from the ground up with lots of helpful graphics, the book teaches you the roles of DAGs and dataframes, the advantages of “lazy evaluation”, and ingestion from files, databases, and streams.

By working through carefully designed Java-based examples, you’ll delve into Spark SQL, interface with Python, and cache and checkpoint your data. Along the way, you’ll learn to interact with common enterprise data technologies like HDFS and file formats like Parquet, ORC, and Avro.

You’ll also discover interesting Spark use cases, like interactive reporting, machine learning pipelines, and monitoring players in online games. You’ll even get a quick look at machine learning techniques you can apply without a PhD in mathematics! All examples are available on GitHub for you to explore and adapt as you learn. Demand for Spark-savvy developers is so high that they’re among the highest paid in the industry today!

Spark in Motion

See it. Do it. Learn it! Spark in Motion teaches you to use Spark for big data analytics through high-quality video-based lessons and built-in exercises, so you can put what you learn into practice.

Spark in Motion teaches you how to use Spark for batch and streaming data analytics. In nearly 3 hours of hands-on video lessons, you'll get up and running with Spark, starting with the basic architecture of a Spark application. You'll explore data partitioning and accessing common application state, and then you'll deep-dive into using Spark SQL and dataframes for structured analytics. Finally, you'll use Spark Streaming to handle and process real-time data flowing into your application.

Some bundled books and liveVideos are part of the Manning Early Access Program. You'll get all the available content now, new content as it's created, and the final product when it's ready.