HBase - Architecture
In HBase, tables are split into regions, which are served by the region servers. Regions are divided vertically by column families into “Stores”. Stores are saved as files in HDFS. Shown below is the architecture of HBase.
Note: The term ‘store’ is used for regions to explain the storage structure.
HBase has three major components: the client library, a master server, and region servers. Region servers can be added or removed as required.
Master Server
The master server -
Assigns regions to the region servers, with the help of Apache ZooKeeper.
Handles load balancing of the regions across region servers. It unloads busy servers and shifts their regions to less occupied servers.
Maintains the state of the cluster by coordinating this load balancing.
Is responsible for schema changes and other metadata operations, such as the creation of tables and column families.
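The load-balancing idea described above can be illustrated with a small toy model. This is only a sketch and not HBase's actual balancer: the function name `rebalance`, the server names, and the rule "move one region at a time from the busiest to the least busy server" are all assumptions made for illustration.

```python
def rebalance(assignments):
    """Toy balancer (hypothetical, not HBase's real algorithm).

    `assignments` maps a region server name to the list of regions it hosts.
    Regions are moved from the busiest server to the least busy one until
    the region counts differ by at most one. Returns the moves performed
    as (region, source_server, target_server) tuples.
    """
    moves = []
    while True:
        busiest = max(assignments, key=lambda s: len(assignments[s]))
        idlest = min(assignments, key=lambda s: len(assignments[s]))
        if len(assignments[busiest]) - len(assignments[idlest]) <= 1:
            return moves
        region = assignments[busiest].pop()   # unload the busy server
        assignments[idlest].append(region)    # shift to a less occupied one
        moves.append((region, busiest, idlest))


# Hypothetical cluster: one overloaded server and two nearly idle ones.
cluster = {
    "rs1": ["r1", "r2", "r3", "r4", "r5"],
    "rs2": ["r6"],
    "rs3": [],
}
moves = rebalance(cluster)
```

After the call, every server in this example hosts two regions, and `moves` records which regions were shifted and where.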
Regions
Regions are the portions into which tables are split; they are spread across the region servers.
Region server
The region servers host regions and -
Communicate with the client and handle data-related operations.
Handle read and write requests for all the regions under them.
Decide the size of each region by following the region size thresholds.
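The effect of a region size threshold can be sketched with a toy split function. This is an illustration under stated assumptions, not HBase's real split policy: the name `split_if_too_large`, the representation of a region as a sorted list of (row key, size) pairs, and the choice of the middle row as the split point are all hypothetical.

```python
def split_if_too_large(rows, max_bytes):
    """Toy region split (hypothetical, not HBase's real split policy).

    `rows` is a list of (row_key, size_in_bytes) pairs sorted by row key.
    If the total size exceeds `max_bytes`, the region is cut at its middle
    row into two daughter regions; otherwise it is returned unchanged.
    """
    total = sum(size for _, size in rows)
    if total <= max_bytes or len(rows) < 2:
        return [rows]
    mid = len(rows) // 2
    return [rows[:mid], rows[mid:]]


# Hypothetical region holding 140 bytes, with a 100-byte threshold.
region = [("row-a", 40), ("row-b", 30), ("row-c", 50), ("row-d", 20)]
daughters = split_if_too_large(region, max_bytes=100)
```

Here the region exceeds the threshold, so it is cut into two daughter regions that could then be served by different region servers.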
A closer look at a region server shows that it contains regions and stores, as shown below:
The store contains a memory store (MemStore) and HFiles. The MemStore works like a cache: anything that is written to HBase is stored here initially. Later, the MemStore is flushed and its data is saved in HFiles as blocks.
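The write path described above (write to the MemStore first, flush to HFiles later) can be modelled in a few lines. This is a simplified sketch, not HBase code: the class `Store`, its methods, and the flush threshold counted in entries rather than bytes are assumptions made for illustration.

```python
class Store:
    """Toy model of a store: one in-memory MemStore plus flushed HFiles."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold  # entries, not bytes (assumption)
        self.memstore = {}                      # newest writes live here
        self.hfiles = []                        # each flush appends one sorted "file"

    def put(self, row_key, value):
        # Every write lands in the MemStore first.
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Save the MemStore contents as a new sorted HFile, then empty it.
        if self.memstore:
            self.hfiles.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, row_key):
        # Check the MemStore first, then HFiles from newest to oldest.
        if row_key in self.memstore:
            return self.memstore[row_key]
        for hfile in reversed(self.hfiles):
            for key, value in hfile:
                if key == row_key:
                    return value
        return None


store = Store(flush_threshold=2)
store.put("row1", "a")
store.put("row2", "b")  # threshold reached: MemStore is flushed to an HFile
store.put("row1", "c")  # newer value for row1 stays in the MemStore
```

A read of `row1` finds the newer value in the MemStore, while `row2` is served from the flushed HFile.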
ZooKeeper
ZooKeeper is an open-source project that provides services like maintaining configuration information, naming, providing distributed synchronization, etc.
ZooKeeper has ephemeral nodes representing different region servers. Master servers use these nodes to discover available servers.
In addition to availability, the nodes are also used to track server failures or network partitions.
Clients use ZooKeeper to locate region servers and then communicate with them.
In pseudo-distributed and standalone modes, HBase itself takes care of ZooKeeper.
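The role of ephemeral nodes in discovering live region servers and detecting failures can be sketched with a small in-memory model. This is not the ZooKeeper client API: the class `ToyZooKeeper`, its methods, the session ids, and the node paths are hypothetical and chosen only for illustration.

```python
class ToyZooKeeper:
    """Toy registry in which a node disappears when its session ends."""

    def __init__(self):
        self._ephemeral = {}  # node path -> id of the session that owns it

    def create_ephemeral(self, path, session_id):
        # A region server registers itself under its own session.
        self._ephemeral[path] = session_id

    def close_session(self, session_id):
        # When a session ends (crash or network partition), its nodes vanish.
        self._ephemeral = {
            path: owner
            for path, owner in self._ephemeral.items()
            if owner != session_id
        }

    def children(self, prefix):
        # The master lists the nodes to see which servers are available.
        return sorted(p for p in self._ephemeral if p.startswith(prefix + "/"))


zk = ToyZooKeeper()
zk.create_ephemeral("/hbase/rs/server-a", session_id=1)
zk.create_ephemeral("/hbase/rs/server-b", session_id=2)
live_before = zk.children("/hbase/rs")
zk.close_session(2)  # server-b fails; its ephemeral node is removed
live_after = zk.children("/hbase/rs")
```

After the session for `server-b` ends, only `server-a` remains in the listing, which is how a master in this model would notice the failure.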