Reactive Data Handling
With chapters selected by Manuel Bernhardt
  • June 2016
  • ISBN 9781617294198
  • 139 pages

We depend on web applications to be highly available and to provide us with up-to-the-second data. This shift toward real-time data processing is also a key aspect of the Internet of Things, which the Gartner Group predicts will include 26 billion actively connected physical devices sending, receiving, and processing streams by 2020. That's a lot of data. The reactive application architecture is an answer to the requirements of high availability and resource efficiency.

Reactive Data Handling is a collection of five hand-picked chapters that introduce you to building reactive applications capable of real-time processing under large data loads. Manuel Bernhardt, author of Reactive Web Applications, selected these chapters to show you how reactive application architecture solves real-time data demands. You'll start with the high-level architecture of reactive applications and then look at low-level practical aspects. After you read these chapters, you'll understand the benefits of using the reactive application architecture to manage and process vast quantities of data at a fast pace. Along the way, you'll get a sample of Manning books you may want to add to your library.

Table of Contents


1. Analyzing streaming data

1.1. Understanding in-flight data analysis

1.2. Distributed stream processing architecture

1.3. Key features of stream-processing frameworks

1.3.1. Message delivery semantics

1.4. Summary

1.4.1. What's inside

2. Fault tolerance and recovery patterns

2.1. The Simple Component Pattern

2.1.1. The Problem Setting

2.1.2. Applying the Pattern

2.1.3. The Pattern Revisited

2.1.4. Applicability

2.2. The Error Kernel Pattern

2.2.1. The Problem Setting

2.2.2. Applying the Pattern

2.2.3. The Pattern Revisited

2.2.4. Applicability

2.3. The Let-It-Crash Pattern

2.3.1. The Problem Setting

2.3.2. Applying the Pattern

2.3.3. The Pattern Revisited

2.3.4. Implementation Considerations

2.3.5. Corollary: The Heartbeat Pattern

2.3.6. Corollary: The Proactive Failure Signal Pattern

2.4. The Circuit Breaker Pattern

2.4.1. The Problem Setting

2.4.2. Applying the Pattern

2.4.3. The Pattern Revisited

2.4.4. Applicability

2.5. Summary

2.5.1. What's inside

3. Your first reactive web application

3.1. Creating and running a new project

3.2. Connecting to Twitter's streaming API

3.2.1. Getting the connection credentials to the Twitter API

3.2.2. Working around a bug with OAuth authentication

3.2.3. Streaming data from the Twitter API

3.2.4. Asynchronously transforming the Twitter stream

3.3. Streaming tweets to clients using a WebSocket

3.3.1. Creating an actor

3.3.2. Setting up the WebSocket connection and interacting with it

3.3.3. Sending tweets to the WebSocket

3.4. Making the application resilient and scaling out

3.4.1. Making the client resilient

3.4.2. Scaling out

3.5. Summary

3.5.1. What's inside

4. Getting smart with MLlib

4.1. Introduction to machine learning

4.1.1. Definition of machine learning

4.1.2. Classification of machine learning algorithms

4.1.3. Machine learning with Spark

4.2. Linear algebra in Spark

4.2.1. Local vector and matrix implementations

4.2.2. Distributed matrices

4.3. Linear regression

4.3.1. About linear regression

4.3.2. Simple linear regression

4.3.3. Expanding the model to multiple linear regression

4.4. Analyzing and preparing the data

4.4.1. Analyzing data distribution

4.4.2. Analyzing column cosine similarities

4.4.3. Computing the covariance matrix

4.4.4. Transforming to labeled points

4.4.5. Splitting the data

4.4.6. Feature scaling and mean normalization

4.5. Fitting and using a linear regression model

4.5.1. Predicting the target values

4.5.2. Evaluating the model's performance

4.5.3. Interpreting the model parameters

4.5.4. Loading and saving the model

4.6. Tweaking the algorithm

4.6.1. Finding the right step size and number of iterations

4.6.2. Adding higher-order polynomials

4.6.3. Bias-variance tradeoff and model complexity

4.6.4. Plotting residual plots

4.6.5. Avoiding overfitting by using regularization

4.6.6. K-fold cross-validation

4.7. Optimizing linear regression

4.7.1. Mini-batch stochastic gradient descent

4.7.2. LBFGS optimizer

4.8. Summary

4.8.1. What's inside

5. Managing datacenter resources with Mesos

5.1. A brief introduction to Spark

5.1.1. Spark on a standalone cluster

5.1.2. Spark on Mesos

5.2. Running a Spark job on Mesos

5.2.1. Finding prime numbers in a set

5.2.2. Getting and packaging up the code

5.2.3. Submitting the job

5.2.4. Observing the output

5.3. Exploring further

5.3.1. Mesos UI

5.3.2. Spark UI

5.4. Summary

5.4.1. What's inside



About the author

Manuel Bernhardt is a software engineer who specializes in reactive web applications using Play, Scala, and Akka. He's been using the Play framework since its introduction.

