In this article, we will briefly look at the capabilities of HBase, compare it with technologies we are already familiar with, and look at its underlying architecture. In the upcoming parts, we will explore the core data model and the features that enable it to store and manage semi-structured data.
HBase is a column-oriented database and an open-source implementation of Google’s Bigtable storage architecture. It can manage structured and semi-structured data and has built-in features such as scalability, versioning, compression and garbage collection. Because it uses write-ahead logging and distributed configuration, it can provide fault tolerance and quick recovery from individual server failures. HBase is built on top of Hadoop/HDFS, and the data stored in HBase can be manipulated using Hadoop’s MapReduce capabilities.
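To make versioning and write-ahead logging a little more concrete, here is a minimal in-memory sketch in Python. It is a toy model and not the HBase API: the `ToyStore` class, its methods, the `max_versions` parameter and the sample row and column names are all hypothetical, chosen only for illustration. The idea it shows is that every write is appended to a log before it is applied, each cell keeps a bounded number of timestamped versions, and the state can be rebuilt by replaying the log.

```python
from collections import defaultdict
import time


class ToyStore:
    """Toy model of versioned cells plus a write-ahead log (not the HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # Append-only list standing in for a durable log file (assumption).
        self.wal = []
        # (row, column) -> list of (timestamp, value), newest first.
        self.cells = defaultdict(list)

    def put(self, row, column, value, timestamp=None):
        ts = timestamp if timestamp is not None else time.time_ns()
        # Record the write in the log first, then apply it to the in-memory state.
        self.wal.append((row, column, ts, value))
        versions = self.cells[(row, column)]
        versions.append((ts, value))
        versions.sort(key=lambda v: v[0], reverse=True)
        # Keep only the newest max_versions entries for this cell.
        del versions[self.max_versions:]

    def get(self, row, column, versions=1):
        # Return up to `versions` entries, newest first.
        return self.cells.get((row, column), [])[:versions]

    @classmethod
    def recover(cls, wal, max_versions=3):
        # Rebuild the state by replaying every logged write in order.
        store = cls(max_versions)
        for row, column, ts, value in wal:
            store.put(row, column, value, timestamp=ts)
        return store


store = ToyStore(max_versions=2)
store.put("user1", "info:email", "a@example.com", timestamp=1)
store.put("user1", "info:email", "b@example.com", timestamp=2)
store.put("user1", "info:email", "c@example.com", timestamp=3)

# Simulate losing the in-memory state and rebuilding it from the log.
recovered = ToyStore.recover(store.wal, max_versions=2)
print(recovered.get("user1", "info:email", versions=2))
```

A real system persists the log to durable storage and spreads data across many servers. The sketch only mirrors the ordering (log first, then apply) that makes recovery by replay possible.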