Three-Project Series

ML for Knowledge Graphs with Neo4j

prerequisites
intermediate Python • intermediate data mining • basic graph theory • intermediate machine learning
skills learned
experience loading and querying graphs on Neo4j via the Cypher language • construct a knowledge graph • run node2vec and weakly connected components algorithms using Neo4j • encode a graph structure into an embedding space for machine learning • build transductive models with pyKEEN • build inductive models with Stellargraph
John Maiden
3 weeks · 6-8 hours per week average · INTERMEDIATE


Take a bite out of the “Big Apple.” New York City real estate is a trillion-dollar market, and you’ve got your eye on off-the-market properties that are likely to double in value in the next couple of years. Being the skilled salesperson you are, you’re confident that you’ll convince the owners to sell to you at a fair price. To determine who the owners are, you’ll leverage Neo4j and other graph libraries to build a knowledge graph using real-world public government data and identify patterns. You’ll use simple graph and NLP techniques to prepare the data for machine learning models, and you’ll create a powerful recommendation engine by using the k-nearest neighbor (kNN) algorithm to identify similar properties. When you’re finished with these intuitive liveProjects, you’ll have firsthand experience applying machine learning to knowledge graphs using Neo4j.

These projects are designed for learning purposes and are not complete, production-ready applications or solutions.

This was a really interesting project that presented a real-world question/problem and applied different technologies to tackle it. The course presents a clear learning path with specific steps, and I can definitely see how I could transfer the skills and knowledge here to my own projects.

Samantha Berk, Software Development Engineer, AdaptX

here's what's included

Project 1 Graphs from Real-World Data

New York City real estate is a trillion-dollar market, and you’ve got your eye on a collection of prime NYC properties whose value will likely double in a couple of years. They’re not currently for sale, but with your excellent sales skills, you’re confident you can get the owners to sell at a fair price, if you can only determine who the owners are. To obtain the owners’ contact data, you’ll construct a knowledge graph from publicly available data that contains tax records, property deeds, and permits. You’ll scan the data for the target owners, analyze the datasets for possible relationships, develop a knowledge schema that can extract insights into your use case, and load the data into Neo4j to query and visualize the connections. When you’re done, you’ll have practical experience applying widely used graph tools to real-world data, and you’ll understand how different choices for your graph schema can lead to different insights.
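As a rough illustration of the schema step described above, the sketch below turns a few tabular rows into (subject, relation, object) triples and looks up the entities attached to a property. The rows, the field names, the relation names, and the `Entity`/`Property` labels are all hypothetical and are not taken from the project's datasets or its solution. The generated Cypher string is only built here, not executed against a database.

```python
# Hypothetical rows standing in for the public tax, deed, and permit datasets.
tax_records = [{"property_id": "P1", "owner": "ACME HOLDINGS LLC"}]
deeds = [
    {"property_id": "P1", "party": "JANE DOE"},
    {"property_id": "P2", "party": "JANE DOE"},
]
permits = [{"property_id": "P2", "applicant": "ACME HOLDINGS LLC"}]


def build_triples(tax_records, deeds, permits):
    """Return (entity, relation, property) triples under one possible schema."""
    triples = []
    for row in tax_records:
        triples.append((row["owner"], "PAYS_TAX_ON", row["property_id"]))
    for row in deeds:
        triples.append((row["party"], "PARTY_TO_DEED", row["property_id"]))
    for row in permits:
        triples.append((row["applicant"], "FILED_PERMIT", row["property_id"]))
    return triples


def entities_for(triples, property_id):
    """List every entity linked to a property by any relation."""
    return sorted({s for s, _, o in triples if o == property_id})


def merge_statement(relation):
    """Build a parameterized Cypher MERGE for one relation type (assumed labels)."""
    return (
        "MERGE (e:Entity {name: $name}) "
        "MERGE (p:Property {id: $property_id}) "
        f"MERGE (e)-[:{relation}]->(p)"
    )


triples = build_triples(tax_records, deeds, permits)
print(entities_for(triples, "P1"))
print(merge_statement("PAYS_TAX_ON"))
```

A different schema choice, such as one relation type per dataset versus a single generic relation, changes which patterns are easy to query later.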

Project 2 Find Hidden Connections with Graphs

You’ve set your sights on a collection of prime off-the-market New York City properties whose value is likely to double in the next couple of years. With your excellent sales skills, you’re confident you can get the owners to sell at a fair price, if you can determine who the owners are. You have a knowledge graph—built from tax records, property deeds, and permits—that identifies entities associated with the properties. These entities might not be the true owners, but identifying them could help you determine who the true owners are.

Your task is to transform the knowledge graph into a representation that can be processed by a machine learning model later. Using simple graph and NLP techniques, including node2vec (one of the most influential algorithms in the graph community), you’ll improve the quality of the data by removing the noise from the knowledge graph. You’ll convert the nodes in your graph into embeddings, analyze how well your embeddings represent the underlying knowledge graph, and develop insights on tuning the node2vec hyperparameters. When you’re done, you’ll have learned techniques for analyzing and visualizing embeddings and associating them with the original graph, helping you determine who the “hidden” owners are.
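To give a feel for the node2vec hyperparameters mentioned above, here is a minimal pure-Python sketch of the biased random walk at the core of node2vec, for an unweighted, undirected graph. It is not the Neo4j implementation the project uses, and the toy graph and node names are made up. The return parameter `p` and the in-out parameter `q` come from the node2vec algorithm itself: a large `p` discourages stepping back to the previous node, and a small `q` encourages moving further away from it.

```python
import random


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased walk of up to `length` nodes."""
    rng = rng or random.Random(0)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])
        if not neighbors:
            break  # dead end
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay one hop from the previous node
            else:
                weights.append(1.0 / q)  # move outward, two hops from previous
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical property/entity graph.
edges = [("P1", "OwnerA"), ("P2", "OwnerA"), ("P2", "OwnerB"),
         ("P3", "OwnerB"), ("P1", "P2")]
adj = build_adjacency(edges)
print(node2vec_walk(adj, "P1", 6, p=2.0, q=0.5, rng=random.Random(42)))
```

In the full algorithm, many such walks are fed to a skip-gram model to produce the node embeddings; that training step is omitted here.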

Project 3 Leverage Embeddings

Create a powerful recommendation engine built from an ensemble of graph-based models that will help you tap into New York City’s real estate market by identifying groups of similar properties. You’ll start by working with transductive graph models (TransE and TransR) that are created specifically for knowledge graphs. Transductive learning takes observations from a specific set of training data and applies them to a specific set of test data. Next you’ll build an inductive model (GraphSAGE), which allows for generalized learning on new data (i.e., predictive modeling on previously unseen properties). Lastly, you’ll build the recommender system by using the k-nearest neighbor (kNN) algorithm to identify similar properties. When you’re done, you’ll have hands-on experience applying machine learning techniques to real-world knowledge graphs… and possibly a lucrative side hustle.
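The two ideas at either end of this project can be sketched in a few lines of plain Python. The first function is the TransE plausibility score, which treats a relation as a translation so that head + relation should land near tail (the L2 norm is used here; L1 is also common). The second is a k-nearest-neighbor lookup by cosine similarity over an embedding table. The vectors and property names are invented for illustration, and this is not the pyKEEN or Stellargraph API used in the project.

```python
import math


def transe_score(h, r, t):
    """Negative L2 distance of h + r from t; closer to zero is more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_nearest(embeddings, query, k):
    """Return the k names most similar to `query`, excluding the query itself."""
    scored = [
        (cosine(embeddings[query], vec), name)
        for name, vec in embeddings.items()
        if name != query
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))  # most similar first, ties by name
    return [name for _, name in scored[:k]]


# Hypothetical 2-D property embeddings.
embeddings = {
    "P1": [1.0, 0.0],
    "P2": [0.9, 0.1],
    "P3": [0.0, 1.0],
    "P4": [-1.0, 0.0],
}
print(k_nearest(embeddings, "P1", 2))
print(transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]))
```

A recommender built this way simply returns the nearest neighbors of a target property in whichever embedding space the graph model produced.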

book resources

When you start each of the projects in this series, you'll get full access to the following book for 90 days.

It is a quite comprehensive project series starting from basics of working with a graph db and ending with several techniques of machine learning on graphs.

Maxim Volgin, Quantitative Marketing Manager, KLM

The problem was set out well, and you understood what the project was about and what you would achieve during it.

Richard Vaughan, CTO, Purple Monkey Collective

project author

John Maiden

John Maiden is a software engineer with a focus on building recommendation systems in the social media space. He’s given presentations about his work at Data Council and ML Conf, and he’s talked about building knowledge graphs on the Data Engineering Podcast. John has built knowledge graphs for real estate at a startup and has worked at JP Morgan Chase, where he led a team that produced personalized insights that were delivered to millions of Chase customers. He has a BA from Hamilton College and a PhD in Physics from University of Wisconsin–Madison.

Prerequisites

These liveProjects are for data scientists who have a background in graph theory and machine learning and are interested in applying these techniques to knowledge graphs. To begin these liveProjects, you will need to be familiar with the following:

TOOLS
  • Intermediate Python (min. version 3.8), particularly the pandas and scikit-learn libraries
  • Intermediate experience with Jupyter Notebook
TECHNIQUES
  • Intermediate data mining
  • Basic graph theory
  • Intermediate machine learning (familiarity with the concept of embeddings and the k-nearest neighbor algorithm)

you will learn

In this liveProject series, you’ll learn to apply machine learning methods, tools, and techniques to knowledge graphs using Neo4j.

  • Use pandas to manipulate datasets
  • Query a Neo4j database using Cypher
  • Analyze graph data using Neo4j’s APOC and Graph Data Science (GDS) libraries
  • Leverage a graph method (weakly connected components) and NLP to clean a noisy graph
  • Apply the node2vec model to a graph
  • Analyze a graph embedding space using the k-nearest neighbors algorithm
  • View graph embeddings in a 2D space using the t-SNE method
  • Run transductive models (TransE, TransR) on a graph using pyKEEN
  • Run an inductive model (GraphSAGE) on a graph using Stellargraph
  • Create a recommendation system by combining a graph model with k-nearest neighbors
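For readers new to the weakly connected components method named in the list above, the following is a minimal union-find sketch in plain Python. It ignores edge direction, which is what "weakly" connected means, and groups nodes into components. The project itself runs this through Neo4j's GDS library; the node names here are hypothetical, and using component size to spot noise is only one possible reading of the cleaning step.

```python
def weakly_connected_components(nodes, edges):
    """Group nodes into components, treating directed edges as undirected."""
    parent = {n: n for n in nodes}
    for src, dst in edges:
        # Make sure edge endpoints are tracked even if not listed in `nodes`.
        parent.setdefault(src, src)
        parent.setdefault(dst, dst)

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in edges:
        root_a, root_b = find(src), find(dst)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    components = {}
    for n in parent:
        components.setdefault(find(n), set()).add(n)
    # Largest components first, then alphabetical for a stable order.
    return sorted((sorted(c) for c in components.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical directed edges between entities and properties.
nodes = ["A", "B", "C", "D", "E", "F"]
edges = [("A", "B"), ("C", "B"), ("D", "E")]
for component in weakly_connected_components(nodes, edges):
    print(component)
```

Very small or isolated components, like the single node `F` here, could then be reviewed as candidates for noise.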

features

Self-paced
You choose the schedule and decide how much time to invest as you build your project.
Project roadmap
Each project is divided into several achievable steps.
Get Help
While within the liveProject platform, get help from other participants and our expert mentors.
Compare with others
For each step, compare your deliverable to the solutions by the author and other participants.
book resources
Get full access to select books for 90 days. Permanent access to excerpts from Manning products is also included, as well as references to other resources.