Graph Databases in Action
Examples in Gremlin
Dave Bechberger, Josh Perryman
  • October 2020
  • ISBN 9781617296376
  • 336 pages
  • printed in black & white
pBook available Nov 2, 2020
ePub + Kindle available Nov 16, 2020

A comprehensive overview of graph databases and how to build them using Apache tools.

Richard Vaughan, Purple Monkey Collective
Relationships in data often look far more like a web than an orderly set of rows and columns. Graph databases shine when it comes to revealing valuable insights within complex, interconnected data such as demographics, financial records, or computer networks. In Graph Databases in Action, experts Dave Bechberger and Josh Perryman illuminate the design and implementation of graph databases in real-world applications. You'll learn how to choose the right database solutions for your tasks, and how to use your new knowledge to build agile, flexible, and high-performing graph-powered applications!

About the Technology

Isolated data is a thing of the past! Now, data is connected, and graph databases—like Amazon Neptune, Microsoft Cosmos DB, and Neo4j—are the essential tools of this new reality. Graph databases represent relationships naturally, speeding the discovery of insights and driving business value.

About the book

Graph Databases in Action introduces you to graph database concepts by comparing them with relational database constructs. You'll learn just enough theory to get started, then progress to hands-on development. Discover use cases involving social networking, recommendation engines, and personalization.
Table of Contents

Part 1: Getting Started with Graph Databases

1 Introduction to Graphs

1.1 What is a graph?

1.1.1 What is a graph database?

1.1.2 Comparison with other types of databases

1.1.3 Why Can’t I Use SQL?

1.2 Is my problem a graph problem?

1.2.1 Explore the questions

1.2.2 I’m still confused… Is this a graph problem?

1.4 Summary

2 Graph Data Modeling

2.1 The Data Modeling Process

2.1.1 Data Modeling Terms

2.1.2 Four Step Process for Data Modeling

2.2 Understand the problem

2.3 Developing the whiteboard model

2.3.1 Identifying and grouping entities

2.3.2 Identifying relationships between entities

2.4 Constructing the logical data model

2.4.1 Translate entities to vertices

2.4.2 Translate relationships to edges

2.4.3 Find and assign properties

2.5 Check our model

2.6 Summary

3 Running Basic and Recursive Traversals

3.1 Setting up your Environment

3.2 Traversing a graph

3.2.1 Fundamental Concepts of Traversing a Graph

3.2.2 Writing traversals in Gremlin

3.3 Recursive Traversals

3.3.1 Writing Recursive Traversals in Gremlin

3.4 Summary

4 Pathfinding Traversals and Mutating a Graph

4.1 Mutating a Graph

4.1.1 Creating Vertices and Edges

4.1.2 Removing Data From our Graph

4.1.3 Updating a Graph

4.1.4 Extending our Graph

4.2 Paths

4.2.1 Cycles in Graphs

4.2.2 Finding the Simple Path

4.3 Traversing and Filtering Edges

4.3.1 Introduction of “E” and “V” steps for Traversing Edges

4.3.2 Filtering with Edge Properties

4.3.3 Include Edges in Path Results

4.3.4 Performant Edge Counts and Denormalization

4.4 Summary

5 Formatting Results

5.1 Review of Values Steps

5.2 Constructing our Result Payload

5.2.1 Applying Aliases in Gremlin

5.3 Organizing our Results

5.3.1 Ordering results returned from a graph traversal

5.3.2 Grouping results returned from a graph traversal

5.3.3 Limiting Results

5.4 Combining steps into complex traversals

5.5 Summary

6 Developing an Application

6.1 Starting the project

6.1.1 Selecting our Tools

6.1.2 Software Project Setup

6.1.3 Obtaining a Driver: Apache TinkerPop’s Gremlin Driver

6.1.4 Prepare Database Server Instance

6.2 Connecting to our database

6.2.1 Build the cluster configuration

6.2.2 Set up the GraphTraversalSource

6.3 Retrieving Data

6.3.1 Retrieving a Vertex

6.4 Adding/Modifying/Deleting data

6.4.1 Adding Vertices

6.4.2 Adding Edges

6.4.3 Updating Properties

6.4.4 Deleting Elements

6.5 Translating our List and Path Traversals

6.5.1 Lists of Results

6.5.2 Implement recursive traversals

6.5.3 Implementing Paths

6.6 Summary

Part 2: Building on Graph Databases

7 Advanced Data Modeling Techniques

7.1 Reviewing our Current Data Models

7.2 Extending our Logical Data Model

7.3 Translate Entities to Vertices

7.3.1 Generic Labels

7.3.2 Data Denormalization

7.3.3 Translate Relationships to Edges

7.3.4 Find and Assign Properties

7.3.5 Moving Properties to Edges

7.3.6 Check our Model

7.4 Extending our Data Model for Personalization

7.5 Comparing the Results

7.6 Summary

8 Building Traversals Using Known Walks

8.1 Preparing to develop our traversals

8.1.1 Identifying the required elements

8.1.2 Selecting a starting place

8.2 Setting Up Test Data

8.3 Writing Our First Traversal

8.3.1 Designing Our Traversal

8.3.2 Developing the Traversal Code

8.4 Pagination and graph databases

8.5 Recommending the Highest Rated Restaurants

8.5.1 Designing Our Traversal

8.5.2 Developing our Traversal Code

8.6 Writing the Last Recommendation Engine Traversal

8.7 Summary

9 Working with Subgraphs

9.1 Working with Subgraphs

9.1.1 Extracting a Subgraph

9.1.2 Traversing a Subgraph

9.2 Building a subgraph for personalization

9.3 Building the traversal

9.3.1 Evaluating the individualized results of the subgraph

9.4 Implementing a subgraph() with a remote connection

9.5 Summary

Part 3: Moving Beyond the Basics

10 Performance, Pitfalls and Anti-patterns

10.1 Slow performing traversal

10.1.1 Explaining our traversal

10.1.2 Profiling our traversal

10.1.3 Indexes

10.2 Dealing with supernodes

10.2.1 What makes a supernode?

10.2.2 Monitoring for supernodes

10.2.3 What to do if you have a supernode

10.3 Application anti-patterns

10.3.1 Using graphs for non-graph use cases

10.3.2 “Dirty” Data

10.3.3 Lack of adequate testing

10.4 Traversal anti-patterns

10.4.1 Not using parameterized traversals

10.4.2 Using unlabeled filtering steps

10.5 Summary

11 What’s Next: Graph Analytics, Machine Learning, Resources

11.1 Graph Analytics

11.1.1 Path Finding

11.1.2 Centrality

11.1.3 Community Detection

11.1.4 Graphs and Machine Learning

11.1.5 Additional Resources

11.2 Final Thoughts

11.3 Summary


Appendix A: Apache TinkerPop Installation and Overview

A.1 Overview

A.1.1 Gremlin Traversal Language

A.1.2 TinkerGraph

A.1.3 Gremlin Console

A.1.4 Gremlin Language Variants

A.1.5 Gremlin Server

A.1.6 Documentation

A.2 Installation

A.2.1 Install and Verify the Java Runtime

A.2.2 Install Gremlin Console

A.2.3 Install Gremlin Server

A.2.4 Configure Gremlin Console to Connect to Gremlin Server

A.2.5 Using the Gremlin Console

What's inside

  • Graph databases vs. relational databases
  • Systematic graph data modeling
  • Querying and navigating a graph
  • Graph patterns
  • Pitfalls and antipatterns

About the reader

For software developers. No experience with graph databases required.

About the authors

Dave Bechberger and Josh Perryman have decades of experience building complex data-driven systems and have worked with graph databases since 2014.

print book $49.99 pBook + eBook + liveBook
Additional shipping charges may apply

eBook $39.99 3 formats + liveBook