Spark in Action, Second Edition

Covers Apache Spark 3 with Examples in Java, Python, and Scala
Jean-Georges Perrin
Foreword by Rob Thomas
  • May 2020
  • ISBN 9781617295522
  • 576 pages
  • printed in black & white

eBook: $33.59 (list price $47.99; you save $14, 30%). This book is part of the Getting Started with Spark bundle. Our eBooks come in Kindle, ePub, and DRM-free PDF formats, plus liveBook, our enhanced eBook format accessible from any web browser.

Print book: $41.99 (list price $59.99; you save $18, 30%). This book is part of the Getting Started with Spark bundle. Receive a print copy shipped to your door, plus the eBook in Kindle, ePub, and PDF formats and liveBook, our enhanced eBook format accessible from any web browser.
FREE domestic shipping on orders of three or more print books

Free previous edition eBook included! An eBook copy of the previous edition of this book is included at no additional cost. It will be automatically added to your Manning Bookshelf within 24 hours of purchase.

This book reveals the tools and secrets you need to drive innovation in your company or community.

Rob Thomas, IBM
The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop.
This book is one of two products included in the Getting Started with Spark bundle. Get the entire bundle for only $39.99.

about the technology

Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this variety and volume like a champ, delivering speeds up to 100 times faster than Hadoop MapReduce on in-memory workloads. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem.

about the book

Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms.

what's inside

  • Writing Spark applications in Java
  • Spark application architecture
  • Ingestion through files, databases, streaming, and Elasticsearch
  • Querying distributed datasets with Spark SQL

about the reader

This book does not assume previous experience with Spark, Scala, or Hadoop.

about the author

Jean-Georges Perrin is an experienced data and software architect. He is France’s first IBM Champion and has been honored for 12 consecutive years.

An indispensable, well-paced, and in-depth guide. A must-have for anyone into big data and real-time stream processing.

Anupam Sengupta, GuardHat Inc.

This book will help spark a love affair with distributed processing.

Conor Redmond, InComm Product Control

Currently the best book on the subject!

Markus Breuer, Materna IPS