Welcome to Manning India!

We are pleased to be able to offer regional eBook pricing for Indian residents.
All eBook prices are discounted 40% or more!
Algorithms and Data Structures in Action
Marcello La Rocca
  • MEAP began July 2019
  • Publication in Fall 2020 (estimated)
  • ISBN 9781617295485
  • 325 pages (estimated)
  • printed in black & white

A different and a great way to introduce algorithms and data structures that can be used at work.

Ursula Cervantes

As a software engineer, you’ll encounter countless programming challenges that initially seem confusing, difficult, or even impossible. Don’t despair! Many of these “new” problems already have well-established solutions. Algorithms and Data Structures in Action teaches you powerful approaches to a wide range of tricky coding challenges that you can adapt and apply to your own applications. Providing a balanced blend of classic, advanced, and new algorithms, this practical guide upgrades your programming toolbox with new perspectives and hands-on techniques.

Table of Contents detailed table of contents

1 Introducing Data Structures

1.1 Welcome to Algorithms and Data Structures in Action

1.1.1 Why should you learn about Data Structures?

1.1.2 Algorithms and Data Structures: is there a difference?

1.1.3 Why should you read this book?

1.1.4 What you need to get started

1.2 Data Structures

1.2.1 Defining a Data Structure

1.2.2 Describing a Data Structure

1.3 Setting Goals: Your Expectations After Reading this Book

1.4 Packing your Knapsack: Data Structures Meet the Real World

1.4.1 Abstracting the Problem Away

1.4.2 Looking for Solutions

1.4.3 Algorithms to the Rescue

1.4.4 Thinking (literally) out of the Box

1.4.5 Happy Ending

1.5 Structure of This Book

1.6 Summary

Part 1: Improve Over Basic Data Structures

2 Improve priority queues: d-way heaps

2.1 Structure of this chapter

2.2 The problem: handling priority

2.2.1 Priority in practice: bug tracking

2.3 Solutions at hand: keeping a sorted list

2.3.1 From sorted lists to priority queues

2.4 Describing the data structure API: Priority queues

2.4.1 Priority Queue at work

2.4.2 Priority Matters: Generalize FIFO

2.5 Concrete Data Structures

2.5.1 Comparing performance

2.5.2 What’s the right concrete data structure?

2.5.3 Heap

2.5.4 Priority, min-heap and max-heap

2.5.5 Advanced variant: d-ary heap

2.6 How to implement a heap

2.6.1 BubbleUp

2.6.2 PushDown

2.6.3 Insert

2.6.4 Top

2.6.5 Update

2.6.6 Dealing with Duplicates

2.6.7 Heapify

2.6.8 Beyond API methods: contains

2.6.9 Performance recap

2.6.10 From Pseudo-code to implementation

2.7 Use case: find the k largest elements

2.7.1 The right data structure…​

2.7.2 …​ and the right use

2.7.3 Coding it up

2.8 More use-cases

2.8.1 Minimum distance in graphs: Dijkstra

2.8.2 Beyond Dijkstra: BFS, A*

2.8.3 More graphs: Prim’s algorithm

2.8.4 Data Compression: Huffman Codes

2.9 Analysis of branching factor[27]

2.9.1 Do we need D-ary heaps?

2.9.2 Running time

2.9.3 Finding the optimal branching factor

2.9.4 Branching factor vs memory

2.10 Performance Analysis: Finding the best Branching Factor

2.10.1 Please Welcome Profiling

2.10.2 Interpreting Results

2.10.3 The Mystery with Heapify

2.10.4 Choosing the Best Branching Factor

2.11 Summary

3 Treaps Using Randomization to Balance Binary Search Trees

3.1 Problem: Multi-Indexing

3.1.1 The Gist of the Solution

3.2 Solution: Description and API

3.3 Treap

3.3.1 Rotations

3.3.2 A Few Design Questions

3.3.4 Insert

3.3.5 Delete

3.3.6 Top, Peak, Update

3.3.7 Min, Max

3.3.8 Performance Recap

3.4 Applications: Randomized Treaps

3.4.1 Balanced Trees

3.4.2 Introducing Randomization

3.4.3 Applications of Randomized Treaps

3.5 Performance Analysis and Profiling

3.5.1 Theory: Expected Height

3.5.2 Profiling Height

3.5.3 Profiling Running Time

3.5.4 Profiling Memory Usage

3.5.5 Conclusions

3.6 Summary

4 Bloom Filters: Reduce the memory needed to keep track of content

4.1 The Dictionary Problem: Keeping Track of Things

4.2 Alternatives to Implement a Dictionary

4.3 Describing the Data Structure API: Associative Array

4.4 Concrete Data Structures

4.4.3 Hash Table: Constant-Time on Average, Unless You Need Ordering

4.4.4 Binary Search Tree: Every Operation is Logarithmic

4.4.5 Bloom Filter: as Fast as Hash Tables, But Saving Memory (with a catch)

4.5 Under the Hood - How Do Bloom Filters Work

4.6 Implementation

4.6.1 Using a Bloom Filter

4.6.2 Read and Write Bits

4.6.3 Find Where a Key is Stored

4.6.4 Generate Hash Functions

4.6.5 Constructor

4.6.6 Checking a Key

4.6.7 Storing a Key

4.6.8 Estimating Accuracy

4.7 Applications

4.7.1 Cache

4.7.2 Router

4.7.3 Crawler

4.7.4 IO Fetcher

4.7.5 Spell checker

4.7.6 Distributed Databases and File Systems

4.8 Why Bloom Filters Work

4.8.1 Why There Are No False Negatives…​

4.8.2 …​But there are false positives

4.8.3 Bloom Filters as Randomized Algorithms

4.9 Performance Analysis

4.9.1 Running Time

4.9.2 Constructor

4.9.3 Storing an Element

4.9.4 Looking Up an Element

4.10 Estimating Bloom Filter Precision

4.10.1 Explanation of the False-Positive Ratio Formula

4.11 Improved Variants

4.11.1 Bloomier Filter

4.11.2 Combining Bloom Filters

4.11.3 Layered Bloom Filter

4.11.4 Compressed Bloom Filter

4.11.5 Scalable Bloom Filter

4.12 Summary

5 Disjoint Set: Sub-linear-time processing of disjoint sets

6 Trie, Radix Trie: Efficient Strings Search

7 Use case: LRU Cache

Part 2: Multi-dimensional queries

8 Nearest Neighbors search

8.1 The Nearest Neighbors Search Problem

8.2 Solutions

8.2.1 First Attempts

8.2.2 Sometimes Caching is NOT the Answer

8.2.3 Simplify Things to Get a Hint

8.2.4 Carefully Choose a Data Structure

8.3 Description and API

8.4 Moving to k-dimensional Spaces

8.4.2 Moving to Higher Dimensions

8.4.3 Modeling 2D Partitions with a Data Structure

8.5 Summary

9 K-d Trees: Multi-dimensional Data Indexing

9.1 Right Where We Left

9.2 Moving to k-D Spaces: Cycle Through Dimensions

9.2.1 Constructing the BST

9.2.2 Invariants

9.2.3 The Importance of Being Balanced

9.3 Methods

9.3.2 Insert

9.3.3 Balanced Tree

9.3.4 Remove

9.3.5 Nearest Neighbor

9.3.7 A Recap of all Methods

9.4 Limits and Possible Improvements

9.5 Summary

10 Similarity Search Trees: Approximate Nearest Neighbors Search for Image Retrieval

10.1 Right Where We Left

10.1.1 A New (More Complex) Example

10.1.2 Overcoming k-d trees Flaws

10.2 R-tree

10.2.1 A step back: Introducing B-trees

10.2.2 From B-Tree to R-tree

10.2.3 Inserting Points in an R-tree

10.3 Similarity Search Tree

10.3.2 Insert

10.3.3 Insertion: Variance, Means, and Projections

10.3.4 Insertion: Split Nodes

10.3.5 Delete

10.5 Ss+-Tree

10.5.1 Are Ss-trees Better?

10.5.2 Mitigating Hyper-Spheres Limitations

10.5.3 Improved Split Heuristic

10.5.4 Reducing Overlap

10.6 Summary

11 Applications of Nearest Neighbors Search

11.1 An Application: Find Nearest Hub

11.1.1 Sketching a Solution

11.1.2 Trouble in Paradise

11.2 Centralized Application

11.2.1 Filtering Points

11.2.2 Complex Decisions

11.3 Moving to a Distributed Application

11.3.1 Issues Handling HTTP Communication

11.3.2 Keeping the Inventory in Sync

11.3.3 Lessons Learned

11.4 Other Applications

11.4.1 Color reduction

11.4.2 Particle Interaction

11.4.3 Multidimensional DB queries optimization

11.4.4 Clustering

11.5 Summary

12 Clustering

12.1 Intro to Clustering

12.1.1 Types of Learning

12.1.2 Types of Clustering

12.2 K-means

12.2.1 Issues With K-means

12.2.2 The Curse of Dimensionality Strikes Again

12.2.3 K-means Performance Analysis

12.2.4 Boosting k-means with k-d Trees

12.2.5 Final Remarks on k-means


12.3.1 Directly vs Density-Reachable

12.3.2 From Definitions to an Algorithm

12.3.3 And Finally, an Implementation

12.3.4 PROs and CONs With DBSCAN


12.4.1 Definitions

12.4.2 OPTICS Algorithm

12.4.3 From Reachability Distance to Clustering

12.4.4 Hierarchical clustering

12.4.5 Performance Analysis and Final Considerations

12.5 Evaluating Clustering Results: Evaluation Metrics

12.5.1 Interpreting the Results

12.6 Summary

13 Parallel Clustering: Map-Reduce and Canopy Clustering

13.1 Parallelization

13.1.1 Parallel vs Distributed

13.1.2 Parallelizing k-means

13.1.3 Canopy Clustering

13.1.4 Applying Canopy Clustering

13.2 MapReduce

13.2.1 Imagine You Are Donald Duck…

13.2.2 First Map, Then Reduce

13.2.3 There is More, Under the Hood

13.3 MapReduce k-means

13.3.1 Parallelizing Canopy Clustering

13.3.2 Centroid Initialization with Canopy Clustering

13.3.3 MapReduce Canopy Clustering

13.4 MapReduce DBSCAN

13.5 Summary

Part 3: Planar Graphs and Minimum Crossing Number

14 An Introduction to Graphs: Finding Paths of Minimum Distance

15 Graph Embeddings and Planarity: Draw Graphs with Minimal Edges Intersections

16 Gradient Descent: Finding Minimum Crossing Number

17 Simulated Annealing: Extending Gradient Descent to Avoid Local Minimums

18 Genetic Algorithms: Converging Faster Towards Global Minimum


Appendix A: A Quick Guide to Pseudo-Code

A.1 Variables and Basics

A.2 Arrays

A.3 Conditional Instructions

A.3.1 Else-if

A.3.2 Switch

A.4 Loops

A.4.1 For Loop

A.4.2 While Loop

A.4.3 Break and Continue

A.5 Blocks and Indent

A.6 Functions

A.6.1 Overloading and default arguments

A.6.2 Tuples

A.6.3 Tuples and destructuring objects

Appendix B: Big-O Notation

B.1 Algorithms and Performance

B.2 The RAM Model

B.3 Order of Magnitude

B.4 Notation

B.5 Examples

Appendix C: Core Data Structures

C.1 Core Data Structures

C.1.1 Array

C.1.2 Linked List

C.1.3 Tree

C.1.4 Starting drag operations

C.2 Hash Table

C.2.1 Storing key-value pairs

C.2.2 Hashing

C.2.3 Conflicts Resolution in Hashing

C.2.4 Performance

C.3 Comparative Analysis of Core Data Structures

Appendix D: Containers as priority queues

D.1 Bag

D.2 Stack

D.3 Queue

D.4 A Comparative Analysis for Containers

Appendix E: Recursion

E.1 Simple Recursion

E.1.1 Pitfalls

E.1.2 Good Recursion

E.1.3 Tail Recursion

E.1.4 Mutual Recursion

Appendix F: Classification problems Randomized algorithms Metrics

F.1 Decision Problems

F.2 Las Vegas Algorithms

F.3 Monte Carlo Algorithms

F.4 Classification Metrics

F.4.1 Accuracy

F.4.2 Precision and Recall

F.4.3 Other Metrics + Recap

About the Technology

Data structures and algorithms are the foundations for how programs store and process information. Choosing the optimal algorithms ensures that your programs are fast, efficient, and reliable.

About the book

Algorithms and Data Structures in Action introduces you to a diverse range of algorithms you’ll use in web applications, systems programming, and data manipulation. Chapter-by-chapter, the book expands on the basic algorithms you’ll already know to give you a better selection of solutions to different programming problems. In it, you’ll discover techniques for improving priority queues, efficient caching, clustering data, and more. Each example is fully illustrated with graphics, language-agnostic pseudo-code, and code samples in various languages. When you’re done, you will be able to implement advanced and little-known algorithms to deliver better performance from your code. You’ll even have the skills to design your own data structures to solve those cases when a custom solution is required.

What's inside

  • Improving on basic data structures
  • Efficient caching
  • Nearest neighbour search, including k-d trees and S-trees
  • Full ‘pseudo-code’ and samples in multiple languages

About the reader

For programmers with basic or intermediate skills. Written in a language-agnostic manner, no specific language knowledge is required.

About the author

Marcello La Rocca is a research scientist and a full-stack engineer focused on optimization algorithms, genetic algorithms, machine learning and quantum computing. He has contributed to large-scale web applications at companies like Twitter and Microsoft, has undertaken applied research in both academia and industry, and authored the Neatsort adaptive sorting algorithm.

Manning Early Access Program (MEAP) Read chapters as they are written, get the finished eBook as soon as it’s ready, and receive the pBook long before it's in bookstores.
MEAP combo $59.99 pBook + eBook + liveBook
MEAP eBook $23.99 $47.99 pdf + ePub + kindle + liveBook
Prices displayed in rupees will be charged in USD when you check out.

placing your order...

Don't refresh or navigate away from the page.

FREE domestic shipping on three or more pBooks