Prerequisites -
Introduction to Hadoop, Apache HBase
HBase architecture has 3 main components: HMaster, Region Server, Zookeeper.

Figure - Architecture of HBase
All the 3 components are described below:
- HMaster -
HMaster is the implementation of the Master Server in HBase. It assigns regions to Region Servers and handles DDL operations (creating and deleting tables). It monitors all Region Server instances in the cluster. In a distributed environment, the Master runs several background threads. HMaster is also responsible for tasks such as load balancing and failover.
- Region Server -
HBase tables are divided horizontally by row key range into Regions. Regions are the basic building blocks of an HBase cluster: they hold the distributed portions of tables and are made up of column families. A Region Server typically runs on an HDFS DataNode in the Hadoop cluster. Each Region Server is responsible for handling, managing, and executing read and write operations on its set of regions. The default maximum region size was 256 MB in early HBase releases; recent versions default to 10 GB (the hbase.hregion.max.filesize setting).
- Zookeeper -
Zookeeper acts as a coordinator in HBase. It provides services such as maintaining configuration information, naming, distributed synchronization, and server failure notification. Clients typically use Zookeeper to find the region server that hosts the catalog (meta) table, and then communicate with region servers directly.
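The way row key ranges map to regions, and regions to region servers, can be illustrated with a small sketch. This is a toy model in plain Python, not the HBase client API: the class `ToyCluster`, its methods, the server names, and the round-robin assignment are all assumptions made for illustration, and the real HMaster balancer is considerably more involved.

```python
import bisect


class ToyCluster:
    """Toy model: one table split into regions by row key range."""

    def __init__(self, split_keys, servers):
        # Each region is identified by its start key. The first region
        # implicitly starts at the empty key "".
        self.start_keys = [""] + sorted(split_keys)
        # Master-like step (assumed policy): assign regions round-robin.
        self.assignment = {
            start: servers[i % len(servers)]
            for i, start in enumerate(self.start_keys)
        }

    def locate(self, row_key):
        """Return (region_start_key, server) for the region holding row_key."""
        # The owning region has the greatest start key <= row_key.
        idx = bisect.bisect_right(self.start_keys, row_key) - 1
        start = self.start_keys[idx]
        return start, self.assignment[start]

    def fail_over(self, dead_server):
        """Reassign the regions of a failed server to the surviving servers."""
        alive = sorted(set(self.assignment.values()) - {dead_server})
        if not alive:
            raise RuntimeError("no region servers left")
        moved = [s for s, srv in self.assignment.items() if srv == dead_server]
        for i, start in enumerate(moved):
            self.assignment[start] = alive[i % len(alive)]
        return moved


cluster = ToyCluster(split_keys=["g", "n", "t"], servers=["rs1", "rs2"])
print(cluster.locate("apple"))   # ('', 'rs1')
print(cluster.locate("monkey"))  # ('g', 'rs2')
print(cluster.fail_over("rs2"))  # ['g', 't']
print(cluster.locate("monkey"))  # ('g', 'rs1')
```

The lookup is a binary search over sorted region start keys, which is why a client only needs the region boundaries to route a request to the right server.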
Advantages of HBase -
- Can store large data sets
- Database can be shared
- Cost-effective from gigabytes to petabytes
- High availability through failover and replication
Disadvantages of HBase -
- No built-in SQL support
- No multi-row transaction support (operations are atomic only within a single row)
- Data is sorted only by row key
- Memory issues on the cluster
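Because rows are ordered only by row key, and row keys are compared as raw bytes, key design determines scan order. The sketch below uses plain Python byte strings to mimic that ordering; the `padded_key` helper and the `user` prefix are hypothetical names for illustration, and zero-padding is just one common way to make numeric suffixes sort as expected.

```python
# Row keys compare byte by byte, so "user10" sorts before "user2".
row_keys = [b"user2", b"user10", b"user1"]
print(sorted(row_keys))  # [b'user1', b'user10', b'user2']


def padded_key(prefix, n, width=10):
    """Build a row key whose numeric part is zero-padded to a fixed width."""
    return f"{prefix}{n:0{width}d}".encode()


# With fixed-width padding, byte order matches numeric order.
padded = sorted(padded_key("user", n) for n in (2, 10, 1))
print(padded)
# [b'user0000000001', b'user0000000002', b'user0000000010']
```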
Comparison between HBase and HDFS:
- HBase provides low latency access while HDFS provides high latency operations.
- HBase supports random reads and writes, while HDFS follows a write-once, read-many model.
- HBase is accessed through shell commands, the Java API, REST, Avro, or the Thrift API, while HDFS is accessed through its file system shell and API and is typically processed with batch jobs such as MapReduce.
Features of HBase architecture:
Distributed and Scalable: HBase is designed to be distributed and scalable, which means it can handle large datasets and can scale out horizontally by adding more nodes to the cluster.
Column-oriented Storage: HBase stores data in a column-oriented manner: columns are grouped into column families, and the data of each family is stored together. This allows a query to read only the column families it needs, which helps retrieval and aggregation.
Hadoop Integration: HBase is built on top of Hadoop, which means it can leverage Hadoop's distributed file system (HDFS) for storage and MapReduce for data processing.
Consistency and Replication: HBase provides strong consistency guarantees for read and write operations, and supports replication of data across multiple nodes for fault tolerance.
Built-in Caching: HBase has a built-in caching mechanism that can cache frequently accessed data in memory, which can improve query performance.
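The idea behind caching frequently accessed data can be sketched with a least-recently-used (LRU) cache. This is a minimal toy in plain Python, not HBase's actual block cache: the class `ToyBlockCache`, the `load_from_disk` callback, and the block ids are assumptions made for illustration.

```python
from collections import OrderedDict


class ToyBlockCache:
    """Toy LRU cache: keeps at most `capacity` blocks in memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, block_id, load_from_disk):
        if block_id in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            self.hits += 1
            return self.blocks[block_id]
        # Cache miss: load the block, then evict the oldest if over capacity.
        self.misses += 1
        data = load_from_disk(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return data


cache = ToyBlockCache(capacity=2)
for block in ["a", "b", "a", "c", "b"]:
    cache.get(block, lambda b: f"data-{b}")
print(cache.hits, cache.misses)  # 1 4
```

Only the second read of block "a" is served from memory here; with a capacity of two, reading "c" evicts "b", so the final read of "b" goes back to disk.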
Compression: HBase supports compression of data, which can reduce storage requirements and improve query performance.
Flexible Schema: HBase supports flexible schemas, which means the schema can be updated on the fly without requiring a database schema migration.
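The column-family layout and the flexible schema can be pictured as a nested map from row key to column family to column qualifier to timestamped values. The sketch below is a toy model in plain Python, not the HBase API: `ToyTable`, its `put` and `get` methods, and the sample data are assumptions for illustration. It assumes column families are declared up front while column qualifiers can be introduced on any write.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: row key -> family -> qualifier -> {timestamp: value}."""

    def __init__(self, families):
        # Column families are declared when the table is created.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    def put(self, row, family, qualifier, value, ts):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Qualifiers need no prior declaration; each write adds a version.
        self.rows[row][family][qualifier][ts] = value

    def get(self, row, family, qualifier):
        """Return the newest version of a cell, or None if it does not exist."""
        versions = self.rows.get(row, {}).get(family, {}).get(qualifier, {})
        if not versions:
            return None
        return versions[max(versions)]


table = ToyTable(families=["info"])
table.put("row1", "info", "name", "Ann", ts=1)
table.put("row1", "info", "name", "Anna", ts=2)   # newer version of same cell
table.put("row2", "info", "email", "b@example.com", ts=1)  # new qualifier

print(table.get("row1", "info", "name"))   # Anna
print(table.get("row2", "info", "name"))   # None
```

Rows in the same table can carry different sets of columns, which is the sense in which no schema migration is needed to add a column.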
Note - HBase is extensively used for online, real-time operations. For example, a banking application can use HBase to handle real-time data updates from ATMs.