Data Wrangling with JavaScript
Ashley Davis
  • MEAP began January 2018
  • Publication in Fall 2018 (estimated)
  • ISBN 9781617294846
  • 375 pages (estimated)
  • printed in black & white

It's as if the author was really showing us every step. Reading seems like doing.

David Krief
If you're a JavaScript developer, you already know that working with data is a big deal. Why let the Python and R coders get all the glory? JavaScript isn't just good at data visualization: you can move your entire data wrangling pipeline to JavaScript and work more effectively. Data Wrangling with JavaScript teaches you core data munging techniques in JavaScript, along with many libraries and tools that will make your data tasks even easier.
Table of Contents

1. Getting started: establishing your data pipeline

1.1. Why data wrangling?

1.2. What is data wrangling?

1.3. Why a book on JavaScript data wrangling?

1.4. What will you get out of this book?

1.5. Why use JavaScript for data wrangling?

1.6. Is JavaScript appropriate for data analysis?

1.7. Navigating the JavaScript ecosystem

1.8. Assembling your toolbox

1.9. Establishing your data pipeline

1.9.1. Setting the stage

1.9.2. The data wrangling process

1.9.3. Planning

1.9.4. Acquisition, storage and retrieval

1.9.5. Exploratory coding

1.9.6. Clean and prepare

1.9.7. Analysis

1.9.8. Visualization

1.9.9. Getting to production

1.10. Summary

2. Getting started with Node.js

2.1. Starting your toolkit

2.2. Building a simple reporting system

2.3. Getting the code and data

2.4. Installing Node.js

2.5. Working with Node.js

2.5.1. Creating a Node.js project

2.5.2. Creating a command line application

2.5.3. Creating a code library

2.5.4. Creating a simple web server

2.6. Asynchronous coding primer

2.6.1. Loading a single file

2.6.2. Loading multiple files

2.6.3. Error handling

2.6.4. Asynchronous coding with promises

2.6.5. Wrapping asynchronous operations in promises

2.6.6. Available now in the latest version of Node.js: "async" and "await"

2.6.7. Coming soon to a JavaScript near you: "async" and "await"

2.7. Summary

3. Acquisition, storage and retrieval

3.1. Building out your toolkit

3.2. Getting the code and data

3.3. The core data representation

3.3.1. The earthquakes web site

3.3.2. Data formats covered

3.3.3. Power and flexibility

3.4. Importing data

3.4.1. Loading data from text files

3.4.2. Loading data from a REST API

3.4.3. Parsing JSON text data

3.4.4. Parsing CSV text data

3.4.5. Importing data from databases

3.4.6. Importing data from MongoDB

3.4.7. Importing data from MySQL

3.5. Exporting data

3.5.1. We need some data to export!

3.5.2. Exporting data to text files

3.5.3. Exporting data to JSON text files

3.5.4. Exporting data to CSV text files

3.5.5. Exporting data to a database

3.5.6. Exporting data to MongoDB

3.5.7. Exporting data to MySQL

3.6. Building complete data conversions

3.7. Expanding the process

3.8. Summary

4. Working with unusual data

4.1. Getting the code and data

4.2. Importing custom data from text files

4.3. Importing data by scraping web pages

4.3.1. Identifying the data to scrape

4.4. Working with binary data

4.4.1. Unpacking a custom binary file

4.4.2. Packing a custom binary file

4.4.3. Replacing JSON with BSON

4.5. Summary

5. Exploratory coding

5.1. Expanding your toolkit

5.2. Analyzing car accidents

5.3. Getting the code and data

5.4. Iteration and your feedback loop

5.5. A first pass at understanding our data

5.6. Working with a reduced data sample

5.7. Prototyping with Excel

5.8. Exploratory coding with Node.js

5.9. Exploratory coding in the browser

5.10. Putting it all together

5.11. Summary

6. Clean and prepare

6.1. Expanding our toolbox

6.2. Preparing the reef data

6.3. Getting the code

6.4. The need for data clean-up and preparation

6.5. Where does broken data come from?

6.6. How does data clean-up plug into your pipeline?

6.7. Identifying bad data

6.8. Kinds of problems

6.9. Responses to bad data

6.10. Techniques for fixing bad data

6.11. Cleaning our data set

6.11.1. Rewriting bad rows

6.11.2. Filtering rows of data

6.11.3. Filtering columns of data

6.12. Preparing our data for effective use

6.12.1. Aggregating rows of data

6.12.2. Combining data from different files using Globby

6.12.3. Splitting data into separate files

6.13. Building a data processing pipeline with Data-Forge

6.14. Summary

7. Dealing with huge data files

7.1. Expanding our toolbox

7.2. Fixing the temperature data

7.3. Getting the code and data

7.4. When conventional data processing breaks down

7.5. The limits of Node.js

7.6. Incremental data processing

7.6.1. Incremental core data representation

7.6.2. Node.js file streams basics primer

7.6.3. Transforming huge CSV files

7.6.4. Transforming huge JSON files

7.6.5. Mix and match

7.6.6. Incremental file transformation with Data-Forge

7.7. Summary

8. Working with a mountain of data

8.1. Expanding our toolbox

8.2. Dealing with a mountain of data

8.3. Getting the code and data

8.4. Techniques for working with big data

8.5. New limitations

8.6. Divide and conquer

8.7. Working with large databases

8.7.1. Database setup

8.7.2. Opening a connection to the database

8.7.3. Moving large files to your database

8.7.4. Incremental processing with a database cursor

8.7.5. Incremental processing with data windows

8.7.6. Creating an index

8.7.7. Filtering using queries

8.7.8. Discarding data with projection

8.7.9. Sorting large data sets

8.8. Achieving better data throughput

8.9. Optimize your code

8.9.1. Optimize your algorithm

8.9.2. Processing data in parallel

8.10. Summary

9. Practical data analysis

9.1. Toolbox upgrades

9.2. Analyzing the weather data

9.3. Getting the code and data

9.4. Basic data summarization

9.4.1. Sum

9.4.2. Average

9.4.3. Standard deviation

9.5. Group and summarize

9.6. The frequency distribution of temperatures

9.7. Time series

9.7.1. Yearly average temperature

9.7.2. Rolling average

9.7.3. Rolling standard deviation

9.7.4. Linear regression

9.7.5. Comparing time series

9.7.6. Stacking time series operations

9.8. Understanding relationships

9.9. Summary

10. Browser-based interactive visualization (with C3)

11. Server-side static visualization (server-side chart rendering)

12. Live data (dealing with incoming data in your production system)

13. Advanced interactive visualization (with D3)

14. Getting to production


Appendix A: JavaScript cheat sheet

Appendix B: Data-Forge cheat sheet

Appendix C: Data wrangling toolset

Appendix D: Getting Started with Vagrant

About the Technology

JavaScript is capable of handling most common data collection, cleaning, analysis and presentation tasks just as easily as R or Python. With a growing ecosystem of tools and libraries available, and the flexibility to run on many platforms (web, desktop and mobile), JavaScript is a terrific all-round environment for all your data wrangling needs!

About the book

Data Wrangling with JavaScript teaches you the art of collecting, managing, cleaning, and analyzing data with JavaScript. In this practical book written with existing JavaScript developers in mind, you'll start by setting up your JavaScript and Node.js-based data wrangling pipeline. Then, you'll systematically work through core techniques for acquiring, storing, and retrieving data of all sorts, ranging from text and CSV files to databases and REST APIs. You'll explore JavaScript-based data tools like Globby and Data-Forge, manipulate huge datasets with Node.js, and deal with unusual data sources, including scraped web pages and custom binary files. Master data wrangler Ashley Davis guides you through the most important data analysis skills and teaches you how to explore, understand, and visualize your data. Because you'll be using real-world data at each step of the process, you'll be confident that you can apply your new skills immediately.

What's inside

  • Establishing a data pipeline
  • Acquisition, storage, and retrieval
  • How to handle unusual data sets
  • Cleaning and preparing raw data
  • Visualizing your results

About the reader

Written for developers with experience using JavaScript. No prior knowledge of data analytics is needed.

About the author

Ashley Davis is a software developer, entrepreneur, writer, and stock trader. He is the creator of Data-Forge, a data transformation and analysis toolkit for JavaScript inspired by Pandas and Microsoft LINQ.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.

MEAP combo $49.99 pBook + eBook + liveBook
MEAP eBook $39.99 pdf + ePub + kindle + liveBook

FREE domestic shipping on three or more pBooks

Not only did I learn the data pipeline around data analysis, but also details on concepts and tools I've either heard of or thought I knew. This book is not only insightful but also invaluable to developers who need to analyze data, especially in the JavaScript domain!

James Wang

Whoever said Python is the data language was deeply wrong! JavaScript's capabilities and tools for processing, visualizing and analyzing data are unrivaled.

Pablo Farias Navarro, Founder of Zenva

If you are after building production-ready, data-powered apps and dashboards, Ashley's book will provide you with battle-tested techniques, frameworks and good practices for developing professional server- and client-side data projects from the ground up.

Pablo Farias Navarro, Founder of Zenva