Exploring Data with R
Richard Iannone
  • MEAP began October 2019
  • Publication in Spring 2021 (estimated)
  • ISBN 9781617295904
  • 475 pages (estimated)
  • printed in black & white

An excellent and eminently workable introduction to R that rivals the best books out there.

Erik Sapper
Explore data like a pro using R! R is a powerful, open-source programming language built specifically for data analysis. It includes a huge array of standard functions for common data science tasks, along with an incredible ecosystem of free tools and packages for organizing, interpreting, and presenting your data. Exploring Data with R is an easy-to-use book that teaches you how to explore business and research data using R without unnecessarily complicated programming or specialized mathematics. Filled with clear explanations of key data analysis tasks, reusable code snippets, and hands-on exercises, it’s the perfect way to upgrade your data wrangling skills with R!

About the Technology

R is a data analysis platform designed to make statistical computing more practical and accessible. The R language and its extensive ecosystem include a massive toolbox of libraries and packages for almost any challenge you’ll need to crack! While it’s powerful enough for complex, large-scale data science projects, you can get excellent results with minimal programming by building on experience you already have with tools like Excel. R scripts are easy to reuse by yourself and others, and nearly every component of R is free and open source.

About the book

In Exploring Data with R, veteran data enthusiast Richard Iannone teaches you to explore the most common types of data using the powerful combination of R and the RStudio IDE. Focusing on the skills you’ll need for data analysis, you’ll start by using the RStudio environment to write your first R scripts. From there, you’ll learn how to load and use R Tidyverse packages such as dplyr, ggplot2, and tidyr to transform and clean up messy data, create beautiful charts, generate reports, and develop components you can reuse in later projects. You’ll even get to grips with some advanced topics such as automation of reporting, data validation, and creating your own R packages! Filled with fun exercises to test your knowledge and hands-on projects to work through in each chapter, this book is the perfect place to start your journey into data science!
Table of Contents

Part 1: First Steps

1 Starting with R

1.1 Understanding the RStudio Desktop Environment

1.2 Using R Markdown for Reproducible Reporting

1.3 Our Learning Path

1.4 Where to go for Help

1.5 Summary

2 Introductory Data Transformation with dplyr

2.1 Programming Basics and Package Installation

2.1.1 The Basics of Assignment

2.1.2 Installing and Loading Packages in R

2.2 Using dplyr to Transform Data

2.2.1 The Main dplyr Functions

2.2.2 The sw Dataset and the Elements of a Table

2.2.3 filter: Picking Observations by their Values

2.2.4 arrange: Reordering Rows

2.2.5 select: Picking Variables by their Names

2.2.6 mutate: Creating New Variables with Expressions

2.2.7 summarize: Collapsing Many Values Down to a Single Summary

2.2.8 Bringing All of This Together with the Pipe (%>%)

2.3 Creating Our Own Tabular Data

2.3.1 Creating Tibbles with the tibble() Function

2.3.2 Creating Tibbles a Different Way with the tribble() Function

2.4 Exercises

2.5 Answers to Exercises

2.6 Summary

3 Introductory Data Visualization with ggplot

3.1 Using ggplot to Create Plots

3.1.1 Making Simple ggplot Scatterplots

3.1.2 Facets and the act of faceting in ggplot

3.1.3 Working with labels and titles

3.1.4 Modifying the location of legends

3.1.5 Modifying your dataset, plotting… and modifying again

3.2 Exercises

3.3 Answers to Exercises

3.4 Summary

Part 2: The Core

4 Tidying data for analysis

4.1 What Is Tidy Data?

4.2 Using tidyr to Tidy Our Tables

4.2.1 Identifying Untidiness and Proposing Some Solutions to Tidy Up

4.2.2 Addressing Untidiness by Using the pivot_longer() Function

4.2.3 Using the separate() Function to Split a Column into Several

4.2.4 Inspecting Our Tidied Data by Plotting with ggplot

4.2.5 Replacing Missing Values with Actual NAs

4.3 Exercises

4.4 Answers to Exercises

4.5 Summary

5 Importing Data from Common Formats

5.1 Managing Projects and Files

5.1.1 Creating an RStudio Project to Better Manage our Working Files

5.1.2 Writing a sample CSV file to the project directory

5.2 Using readr to Import Data from CSV Files

5.2.1 Reading in our sample CSV file with the read_csv() function

5.2.2 Additional read_csv() options for non-standard CSVs

5.3 Using readxl to import Excel data

5.3.1 Writing an example Excel file to the project directory

5.3.2 Reading in our sample Excel file with the read_excel() function

5.4 Exercises

5.5 Answers to Exercises

5.6 Summary

6 Transformations Involving Dates and Times

6.1 Dates, Date-Times, and ISO 8601

6.2 Using lubridate to Parse Date and Date-Time Strings

6.2.1 Parsing Dates and Getting R Date Values

6.2.2 Parsing Date-Times and Getting POSIXct Values

6.3 Transforming Dates and Times in Tabular Data

6.3.1 Creating a Reasonable Time Series Plot with ggplot

6.3.2 Further Transforming the Time Series Data to Make a New Plot

6.4 Summary

7 A Closer Look at R Programming

7.1 Vectors in R

7.1.1 Creating Vectors with the c() Function

7.1.2 Doing Math with Vectors and Understanding Recycling

7.1.3 Subsetting Vectors

7.2 Data Frames and Tibbles

7.2.1 Extracting Vectors from Tibbles

7.2.2 Using the summary() Function with Tibbles and Vectors

7.3 Useful Base R Functions

7.3.1 Assignment and Operators

7.3.2 Logical Operators and Set Functions

7.3.3 Comparisons

7.3.4 Math

7.3.5 Vectors, Data Frames, and Lists

7.3.6 Control Flow

7.3.7 Creating Functions

7.3.8 Working with Character Strings

7.4 Writing Our Own Functions

7.4.1 Creating our First R Functions

7.4.2 A More Realistic Scenario for Writing a Function

7.4.3 Coding Defensively and Checking Inputs within the Function Body

7.5 Summary

8 String Transformations

8.1 How Strings and Character Vectors Work in R

8.1.1 Making Simple Strings and Character Vectors

8.1.2 Strings in Data Frames and Tibbles

8.2 Different Ways to Format Text

8.2.1 Formatting Numbers to Strings with formatC()

8.2.2 Simple String Transformations with base R Functions

8.3 Using Regular Expressions to Work with Text

8.3.1 Regex Basics: Matching Characters and Using Escapes

8.4 Character Sets and Character Classes

8.5 Repetition, Laziness, and Greediness

8.6 Anchors

8.7 Grouping, Capturing, and Backreferences

8.8 Using Regular Expressions in stringr Functions

8.9 Summary

9 R Lists and Factors

9.1 Lists in R

9.1.1 Making Named Lists and Accessing Their Elements

9.1.2 Working with Unnamed Lists

9.1.3 Modifying Elements of Lists

9.1.4 Transforming Lists

9.1.5 Creating Functions that Involve Lists

9.2 All About Factors

9.2.1 Factor Basics

9.2.2 Plotting data with factor variables

9.2.3 Plotting data with more advanced treatments of factors

9.3 Summary

Part 3: Digging Deeper

10 Advanced Plotting with ggplot – I

10.1 Making Line Graphs

10.1.1 Using geom_line()

10.1.2 Using geom_line() and geom_point()

10.1.3 Using geom_area() to make a line graph

10.2 Working with Bar Plots

10.2.1 Vertical Bar Plots

10.2.2 Clustered Bar Plots

10.2.3 Stacked Bar Plots

10.3 Summary

11 Advanced Plotting with ggplot – II

11.1 Lollipop Plots and Cleveland Dot Plots

11.2 Creating Effective Scatter Plots

11.3 Plotting Distributions

11.3.1 Histograms

11.3.2 Box Plots

11.3.3 Violin Plots

11.3.4 Density Plots

11.3.5 Ridgeline Plots

11.4 Summary

12 Making it Look Good

13 Revisiting Data Transformation with dplyr

14 Creating an R Package

What's inside

  • Navigating the RStudio interface like a pro
  • Loading and using R Tidyverse packages to do useful data analysis work
  • Communicating and visualizing data using R Markdown and presentation software
  • Full examples using custom datasets and functions
  • Reference materials you can use again and again
  • Recommendations on where to go next

About the reader

For anyone interested in working with data. No programming experience required.

About the author

Richard Iannone is a software engineer at RStudio and he’s been using R for almost 10 years. He has developed numerous R packages focusing on data visualization, data validation, and data transformation.

Manning Early Access Program (MEAP): Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it’s in bookstores.