Integration with HBase
  • Date: 2024-09-17

Apache Tajo - Integration with HBase



Apache Tajo supports HBase integration, which lets you access HBase tables from Tajo. HBase is a distributed, column-oriented database built on top of the Hadoop file system. It is part of the Hadoop ecosystem and provides random, real-time read/write access to data stored in the Hadoop file system. The following steps are required to configure HBase integration.

Set Environment Variable

Add the following setting to the “conf/tajo-env.sh” file.

$ vi conf/tajo-env.sh
# HBase home directory. It is optional in general, but it must be set to use HBase.
export HBASE_HOME=path/to/HBase

After you set the HBase path, Tajo adds the HBase library files to the classpath.
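Before starting Tajo, it can help to confirm that the directory you plan to use as HBASE_HOME actually contains library files. The following is a minimal sketch, not part of Tajo: the function name check_hbase_home is hypothetical, and it assumes the usual HBase binary layout in which the jar files sit under a “lib” directory.

```shell
#!/bin/sh
# Sketch: sanity-check an HBase home directory before starting Tajo.
# Assumption: HBase jar files live under "$dir/lib" (typical binary layout).

check_hbase_home() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "HBASE_HOME is not set"
        return 1
    fi
    if [ ! -d "$dir/lib" ]; then
        echo "no lib directory under $dir"
        return 1
    fi
    # Count jar files; the glob stays literal when nothing matches,
    # so each candidate is checked for existence.
    count=0
    for jar in "$dir"/lib/*.jar; do
        [ -e "$jar" ] && count=$((count + 1))
    done
    echo "found $count jar(s) under $dir/lib"
    [ "$count" -gt 0 ]
}

# Usage: report on the current environment without aborting the shell.
check_hbase_home "${HBASE_HOME:-}" || true
```

If the check reports no jar files, the path given in “conf/tajo-env.sh” most likely points at the wrong directory.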

Create an External Table

Create an external table using the following syntax −

CREATE [EXTERNAL] TABLE [IF NOT EXISTS] <table_name> [(<column_name> <data_type>, ... )] 
USING hbase WITH ('table' = '<hbase_table_name>' 
, 'columns' = ':key,<column_family_name>:<qualifier_name>, ...' 
, 'hbase.zookeeper.quorum' = '<zookeeper_address>' 
, 'hbase.zookeeper.property.clientPort' = '<zookeeper_client_port>') 
[LOCATION 'hbase:zk://<hostname>:<port>/'] ;

To access HBase tables, you must configure the tablespace location.

Here,

    table − Sets the name of the source HBase table. To create an external table, the table must already exist in HBase.

    columns − Maps the Tajo columns to HBase. The “:key” entry refers to the HBase row key. The number of entries must equal the number of Tajo table columns.

    hbase.zookeeper.quorum − Sets the ZooKeeper quorum address.

    hbase.zookeeper.property.clientPort − Sets the ZooKeeper client port.
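The rule that the “columns” mapping must have one entry per Tajo column is easy to check before the statement is run. The sketch below is only an illustration: check_mapping is a hypothetical helper, not a Tajo command, and it simply counts the comma-separated entries in the mapping string.

```shell
#!/bin/sh
# Sketch: compare the number of entries in a 'columns' mapping with the
# number of columns declared for the Tajo table.
# check_mapping is a hypothetical helper, not part of Tajo.

check_mapping() {
    mapping=$1       # e.g. ':key,info:id,content:name'
    tajo_columns=$2  # number of columns in the Tajo table definition
    # awk splits the line on commas; NF is the number of fields.
    entries=$(printf '%s\n' "$mapping" | awk -F, '{ print NF }')
    if [ "$entries" -eq "$tajo_columns" ]; then
        echo "ok: $entries mapping entries for $tajo_columns columns"
    else
        echo "mismatch: $entries mapping entries for $tajo_columns columns"
        return 1
    fi
}

# Usage: three mapping entries for a three-column table.
check_mapping ':key,info:id,content:name' 3
```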

Query

CREATE EXTERNAL TABLE students (rowkey text, id int, name text) 
USING hbase WITH ('table' = 'students', 'columns' = ':key,info:id,content:name') 
LOCATION 'hbase:zk://<hostname>:<port>/';

Here, the LOCATION path specifies the ZooKeeper host name and client port. If you do not set the port, Tajo uses the value from the hbase-site.xml file.
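To make the role of the host name and port concrete, the following sketch prints the statement with the placeholders filled in. The function students_ddl and the host zk1.example.com are hypothetical; 2181 is the usual ZooKeeper client port. The form of the URI without a port is an assumption based on the note above, so verify it against your Tajo version.

```shell
#!/bin/sh
# Sketch: print a CREATE EXTERNAL TABLE statement for the students example.
# $1 = ZooKeeper host, $2 = ZooKeeper client port (optional).
# students_ddl and the example host name are hypothetical.

students_ddl() {
    host=$1
    port=${2:-}
    if [ -n "$port" ]; then
        location="hbase:zk://$host:$port/"
    else
        # Assumed form when the port is left to hbase-site.xml.
        location="hbase:zk://$host/"
    fi
    cat <<EOF
CREATE EXTERNAL TABLE students (rowkey text, id int, name text)
USING hbase WITH ('table' = 'students', 'columns' = ':key,info:id,content:name')
LOCATION '$location';
EOF
}

# Usage: print the statement for a hypothetical ZooKeeper host.
students_ddl zk1.example.com 2181
```

The printed statement can then be pasted into the Tajo shell.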

Create Table in HBase

You can start the HBase interactive shell using the “hbase shell” command, as shown below.

Query

$ bin/hbase shell

Result

The above command displays the HBase shell prompt.

hbase(main):001:0>

Steps to Query HBase

To query HBase, you should complete the following steps −

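Step 1 − Use the following HBase shell commands to create the “students” table and insert three rows. You can type them at the HBase shell prompt, or save them (without the prompt) in a file and pipe that file to the shell as shown in Step 2.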

Query

hbase(main):001:0> create 'students', {NAME => 'info'}, {NAME => 'content'} 
put 'students', 'row-01', 'content:name', 'Adam' 
put 'students', 'row-01', 'info:id', '001' 
put 'students', 'row-02', 'content:name', 'Amit' 
put 'students', 'row-02', 'info:id', '002' 
put 'students', 'row-03', 'content:name', 'Bob' 
put 'students', 'row-03', 'info:id', '003' 

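Step 2 − If you saved the commands in a file, run the following command from the operating system shell (not from inside the HBase shell) to pipe the file into the HBase shell and load the data into the table.

$ cat ../hbase/hbase-students.txt | bin/hbase shell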
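The contents of the piped file are not shown in this tutorial. One plausible way to produce it is sketched below, assuming the file simply holds the create and put commands from Step 1; the function name write_students_script is hypothetical.

```shell
#!/bin/sh
# Sketch: write the HBase shell commands from Step 1 into a script file.
# Assumption: the piped file contains exactly these commands.
# write_students_script is a hypothetical helper name.

write_students_script() {
    out=$1
    # The quoted 'EOF' keeps the text literal (no variable expansion).
    cat > "$out" <<'EOF'
create 'students', {NAME => 'info'}, {NAME => 'content'}
put 'students', 'row-01', 'content:name', 'Adam'
put 'students', 'row-01', 'info:id', '001'
put 'students', 'row-02', 'content:name', 'Amit'
put 'students', 'row-02', 'info:id', '002'
put 'students', 'row-03', 'content:name', 'Bob'
put 'students', 'row-03', 'info:id', '003'
EOF
}

# Usage (not executed here):
#   write_students_script ../hbase/hbase-students.txt
#   cat ../hbase/hbase-students.txt | bin/hbase shell
```

Keeping the commands in a file makes the load repeatable, although the create command fails if the table already exists.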

Step 3 − Now, return to the Tajo shell and execute the following command to view the metadata of the table −

default> \d students;  

table name: default.students 
table path: 
store type: HBASE 
number of rows: unknown 
volume: 0 B 
Options: 
    'columns' = ':key,info:id,content:name' 
    'table' = 'students' 

schema: 
rowkey  TEXT 
id  INT4 
name TEXT

Step 4 − To fetch the results from the table, use the following query −

Query

default> select * from students;

Result

The above query will fetch the following result −

rowkey,  id,  name 
------------------------------- 
row-01,  001,  Adam 
row-02,  002,  Amit 
row-03,  003,  Bob 