HBase in Action
Nicholas Dimiduk and Amandeep Khurana
Foreword by Michael Stack
  • November 2012
  • ISBN 9781617290527
  • 360 pages
  • printed in black & white

Timely, practical ... explains in plain language how to use HBase.

From the Foreword by Michael Stack, Chair of the Apache HBase Project Management Committee

HBase in Action has all the knowledge you need to design, build, and run applications using HBase. First, it introduces you to the fundamentals of distributed systems and large-scale data handling. Then, you'll explore real-world applications and code samples with just enough theory to understand the practical techniques. You'll see how to build applications with HBase and take advantage of the MapReduce processing framework. And along the way, you'll learn patterns and best practices.

Table of Contents

foreword

letter to the HBase community

preface

acknowledgments

about this book

about the authors

about the cover illustration

Part 1 HBase fundamentals

1. Introducing HBase

1.1. Data-management systems: a crash course

1.2. HBase use cases and success stories

1.3. Hello HBase

1.4. Summary

2. Getting started

2.1. Starting from scratch

2.2. Data manipulation

2.3. Data coordinates

2.4. Putting it all together

2.5. Data models

2.6. Table scans

2.7. Atomic operations

2.8. ACID semantics

2.9. Summary

3. Distributed HBase, HDFS, and MapReduce

3.1. A case for MapReduce

3.2. An overview of Hadoop MapReduce

3.3. HBase in distributed mode

3.4. HBase and MapReduce

3.5. Putting it all together

3.6. Availability and reliability at scale

3.7. Summary

Part 2 Advanced concepts

4. HBase table design

4.1. How to approach schema design

4.2. De-normalization is the word in HBase land

4.3. Heterogeneous data in the same table

4.4. Rowkey design strategies

4.5. I/O considerations

4.6. From relational to non-relational

4.7. Advanced column family configurations

4.8. Filtering data

4.9. Summary

5. Extending HBase with coprocessors

5.1. The two kinds of coprocessors

5.2. Implementing an observer

5.3. Implementing an endpoint

5.4. Summary

6. Alternative HBase clients

6.1. Scripting the HBase shell from UNIX

6.2. Programming the HBase shell using JRuby

6.3. HBase over REST

6.4. Using the HBase Thrift gateway from Python

6.5. Asynchbase: an alternative Java HBase client

6.6. Summary

Part 3 Example applications

7. HBase by example: OpenTSDB

7.1. An overview of OpenTSDB

7.2. Designing an HBase application

7.3. Implementing an HBase application

7.4. Summary

8. Scaling GIS on HBase

8.1. Working with geographic data

8.2. Designing a spatial index

8.3. Implementing the nearest-neighbors query

8.4. Pushing work server-side

8.5. Summary

Part 4 Operationalizing HBase

9. Deploying HBase

9.1. Planning your cluster

9.2. Deploying software

9.3. Distributions

9.4. Configuration

9.5. Managing the daemons

9.6. Summary

10. Operations

10.1. Monitoring your cluster

10.2. Performance of your HBase cluster

10.3. Cluster management

10.4. Backup and replication

10.5. Summary

Appendix A: Exploring the HBase system

Appendix B: More about the workings of HDFS

index

© 2014 Manning Publications Co.

About the Technology

HBase is a NoSQL storage system designed for fast, random access to large volumes of data. It runs on commodity hardware and scales smoothly from modest datasets to billions of rows and millions of columns.

About the book

HBase in Action is an experience-driven guide that shows you how to design, build, and run applications using HBase. First, it introduces you to the fundamentals of handling big data. Then, you'll explore HBase with the help of real applications and code samples and with just enough theory to back up the practical techniques. You'll take advantage of the MapReduce processing framework and benefit from seeing HBase best practices in action.

What's inside

  • When and how to use HBase
  • Practical examples
  • Design patterns for scalable data systems
  • Deployment, integration, and design

About the reader

Written for developers and architects familiar with data storage and processing. No prior knowledge of HBase, Hadoop, or MapReduce is required.

About the authors

Nick Dimiduk is a Data Architect with experience in social media analytics, digital marketing, and GIS. Amandeep Khurana is a Solutions Architect focused on building HBase-driven solutions.


combo $39.99 pBook + eBook
eBook $31.99 pdf + ePub + kindle

A difficult topic lucidly explained.

John Griffin, coauthor of "Hibernate Search in Action"

Amusing tongue-in-cheek style that doesn’t detract from the substance.

Charles Pyle, APS Healthcare

Learn how to think the HBase way.

Gianluca Righetto, Menttis