Apache HBase NoSQL on Hadoop
Apache HBase NoSQL on Hadoop
Project Report, Advanced Databases
Muhammed Mohiyudheen Ziyad ( ), Sharma Akshi ( )
IT4BI Masters, 12/2015
Table of Contents
- Abstract
- Apache HBase Introduction: History, The Need for HBase, When to Use HBase?
- Quick Start Guide: Run Modes, Accessing HBase, HBase Configuration Properties, Web Consoles
- Architecture: Master Server, Region Server, Region, HBase Data Architecture
- Data Model: Namespace, Table, Row, Column, Column Family, Column Qualifier, Cell, Timestamp
- Project Environment: HBase Setup, Data Set: Northwind
- Data Definition Language: Create a Namespace, Create a Table, Drop Table/Namespace, Truncate Table, Alter Table
- Data Model Operations: Loading Data (Using the Java API, Using importtsv for HDFS Files, BulkLoad for huge data sets), Data Retrieval (Get, Scans), Delete, Aggregations, Joins
- Features of HBase: Automatic Versioning, Dynamic Schema, Pre Region Splits, Compression, Encoding, Time to Live (TTL), HBase and MapReduce, Client Filters (Partial Key Scan, SingleColumnValueFilter, RegexStringComparator, FamilyFilter, QualifierFilter, RowFilter), ACID in HBase, High Availability
- HBase as an Object Store: NorthwindCustomer Class, Data Loading, Data Retrieval
- Scalability and Performance: Data Load Operation, Update and Reads, Read Only Workload
- Example for HBase Schema Design: Core Design Concepts, Designed Schema (Customers, Employees, Products, Orders)
- Phoenix
- References
Abstract
With data volumes and varieties constantly increasing, especially from social media and the Internet of Things (IoT), the ability to store and process huge amounts of any kind of data has become critical. Apache Hadoop is a framework supporting an ecosystem of tools used to store and manage such huge volumes of data. Apache HBase, the NoSQL database on HDFS (the Hadoop Distributed File System), provides real-time read/write access to the large datasets managed with Hadoop. In this project, we try to understand the architecture of HBase and explore the different features offered, with suitable applications. This project also aims to identify how to use HBase effectively for various use cases, and where it should be avoided.
Apache HBase Introduction
Apache HBase provides random, real-time access to your data in Hadoop. It was created for hosting very large tables, making it a great choice for storing multi-structured or sparse data.
History
The HBase story begins in 2006, when the San Francisco-based startup Powerset was trying to build a natural language search engine for the Web and was looking for a storage alternative that could scale. The Google BigTable paper had just been published. Building an open source system to run on top of Hadoop's Distributed Filesystem (HDFS), in much the same way that BigTable ran on top of the Google File System, seemed like a good approach because:
1. it was a proven, scalable architecture
2. it could leverage existing work on Hadoop's HDFS
3. it could both contribute to and get additional leverage from the growing Hadoop ecosystem.
The need for HBase
Out of the box, Hadoop can handle a high volume of multi-structured data. But it cannot handle a high velocity of random reads and writes, and it is unable to change a file without completely rewriting it. Fast random reads require the data to be stored structured (ordered). The only way to modify a file stored on HDFS without rewriting it is appending, and fast random writes into sorted files by appending alone is impossible. The solution to this problem is the Log-Structured Merge Tree (LSM Tree). Designed on top of Hadoop as a NoSQL database, HBase deals with these drawbacks of HDFS; the HBase data structure is based on LSM Trees.
When to use HBase?
Of course, the first prerequisite for using HBase is an existing Hadoop infrastructure. After that, the following points need to be considered in order to use HBase for your use case:
1. Huge amount of data (hundreds of millions or billions of rows)
2. Fast random reads and/or writes
3. Well-known access patterns
HBase might not be a good fit if your use case falls into any of the categories below:
1. New data only needs to be appended
2. Batch processing instead of random reads
3. Complicated access patterns, and data de-normalization is not an option
4. Full ANSI SQL support required
5. A single node can deal with the volume and the velocity of the complete data set
Quick start guide
In this section we discuss the things you should know in order to get started with HBase.
Run Modes
1. Stand-alone mode: In standalone mode, HBase does not use HDFS; it uses the local filesystem instead, and it runs all HBase daemons and a local ZooKeeper in the same JVM.
2. Pseudo-distributed mode: Pseudo-distributed mode means that HBase still runs completely on a single host, but each HBase daemon (HMaster, HRegionServer, and ZooKeeper) runs as a separate process, and the data can be stored in HDFS configured on the single node. Use this configuration for testing and prototyping on HBase.
3. Fully distributed: In a distributed configuration, the cluster contains multiple nodes, each of which runs one or more HBase daemons. These include primary and backup Master instances, multiple ZooKeeper nodes, and multiple RegionServer nodes.
Accessing HBase
HBase data can be accessed using:
1. HBase shell (command line access): The Apache HBase Shell is (J)Ruby's IRB with some HBase-specific commands added. Anything you can do in IRB, you should be able to do in the HBase Shell.
2. Client API: The native API provided by HBase is in Java, and HBase supports many other APIs as well.
HBase configuration properties
Apache HBase uses the same configuration system as Apache Hadoop. All configuration files are located in the conf/ directory, which needs to be kept in sync on each node of your cluster. hbase-site.xml is the main configuration file.
Web Consoles
The details of your HBase cluster can be easily accessed through the web consoles provided by HBase. By default, the Master console runs on port 16010, and the RegionServer console runs on its own port.
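As a minimal sketch of programmatic access (assuming the HBase 1.x client libraries are on the classpath and hbase-site.xml is available; the table name northwind:customer and the row key are the ones used later in this report), a Java client can connect and issue a simple request like this:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseConnectionCheck {
    public static void main(String[] args) throws IOException {
        // Picks up hbase-site.xml (ZooKeeper quorum, ports) from the classpath
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("northwind:customer"))) {
            // Issue a single Get as a connectivity check
            Get get = new Get(Bytes.toBytes("ALFKI"));
            Result result = table.get(get);
            System.out.println("Row exists: " + !result.isEmpty());
        }
    }
}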
Architecture
HBase has three major components: the client library, a master server, and region servers.
Master server
The HMaster is responsible for assigning the regions to the HRegionServers when HBase starts, and it uses Apache ZooKeeper, a reliable, highly available, persistent and distributed coordination service, to facilitate that task. The master server is also responsible for handling load balancing of regions across region servers, to unload busy servers and move regions to less occupied ones. The master is not part of the actual data storage or retrieval path. It negotiates load balancing and maintains the state of the cluster, but never provides any data services to either the region servers or the clients, and is therefore lightly loaded in practice. In addition, it takes care of schema changes and other metadata operations, such as creation of tables and column families.
Region Server
Region servers are responsible for all read and write requests for all regions they serve, and they also split regions that have exceeded the configured region size thresholds. Clients communicate directly with them to handle all data-related operations.
Region
In HBase, rows of data are stored in tables. Tables are split into chunks of rows called regions. The regions are distributed across the cluster, hosted and made available to client processes by the RegionServer process. All the rows in the table that sort between a region's start key and end key are stored in the same region. Regions are non-overlapping, i.e. a single row key belongs to exactly one region at any point in time. A region is only served by a single region server at any point in time, which is how HBase guarantees strong consistency within a single row.
HBase Data Architecture
A Region, in turn, consists of many Stores, which correspond to column families. Each Store instance can in turn have one or more StoreFile instances, which are lightweight wrappers around the actual storage file called HFile. A Store also has one MemStore and zero or more StoreFiles. The data for each column family is stored and accessed separately. HBase handles basically two kinds of file types: one is used for the write-ahead log (WAL) and the other for the actual data storage. The files are primarily handled by the HRegionServer.
8 Apache HBase NoSQL on Hadoop 7 Data Model In HBase, data is stored in tables, which have rows and columns, in the form of a multidimensional map (note that it is not similar to a table in a relational database) Namespace A namespace is a logical grouping of tables analogous to a database in relation database systems. A namespace can be created, removed or altered. If namespace is not specified while creating a table, the default namespace will be used. Table An HBase table consists of multiple rows. Row A row in HBase consists of a row key and one or more columns with values associated with them. Rows are sorted lexicographically by the row key. For this reason, the design of the row key is very important. The goal is to store data in such a way that related rows are near each other. A common row key pattern is a website domain. If your row keys are domains, you should probably store them in reverse (org.apache.www, org.apache.mail, org.apache.jira). This way, all of the Apache domains are near each other in the table, rather than being spread out based on the first letter of the subdomain. Please note that rowkeys cannot be changed. The only way they can be "changed" in a table is if the row is deleted and then re-inserted.
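To illustrate the reversed-domain pattern, here is a small hypothetical sketch (the helper method and the page-visit column are made up for illustration and are not part of the project code) that builds such row keys before writing:

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class ReversedDomainKey {
    // Turns "www.apache.org" into "org.apache.www" so related domains sort together
    static String reverseDomain(String domain) {
        String[] parts = domain.split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = parts.length - 1; i >= 0; i--) {
            sb.append(parts[i]);
            if (i > 0) sb.append('.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical page-visit row keyed by the reversed domain;
        // the Put would then be written with table.put(put)
        Put put = new Put(Bytes.toBytes(reverseDomain("www.apache.org")));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("visits"), Bytes.toBytes("42"));
        System.out.println("Row key: " + reverseDomain("www.apache.org")); // org.apache.www
    }
}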
Column
A column in HBase consists of a column family and a column qualifier, which are delimited by a : (colon) character.
Column Family
Column families physically collocate a set of columns and their values, often for performance reasons. Each column family has a set of storage properties, such as whether its values should be cached in memory, how its data is compressed or its row keys are encoded, and others. Each row in a table has the same column families, though a given row might not store anything in a given column family. One could argue that a column family is to HBase what a table is to a relational DB.
Column Qualifier
A column qualifier is added to a column family to provide the index for a given piece of data. Given a column family content, a column qualifier might be content:html, and another might be content:pdf. Though column families are fixed at table creation, column qualifiers are mutable and may differ greatly between rows.
Cell
A cell is a combination of row, column family, and column qualifier, and contains a value and a timestamp, which represents the value's version.
Timestamp
A timestamp is written alongside each value, and is the identifier for a given version of a value. By default, the timestamp represents the time on the RegionServer when the data was written, but you can specify a different timestamp value when you put data into the cell.
Project Environment
The project is executed using two instances of HBase.
HBase setup
1. Stand-alone HBase installation on OS X
2. Pseudo-distributed HBase on Hortonworks Sandbox (HDP 2.3.2)
In both instances the latest stable release of HBase (version 1.1.2) is used. The Java version used for developing the client code is 1.7.
Note: Code snippets are shown as separate blocks.
Data Set: Northwind
The Northwind database captures all the sales transactions that occur between an imaginary company, Northwind Traders, and its customers, as well as the purchase transactions between Northwind and its suppliers. In this project, we are using this dataset for showcasing various data operations on HBase. The primary data used is the customer table. This table is exported as a flat file from MS SQL Server and made available for the HBase data load.
(Sample of the Northwind customer data.)
Data Definition Language
In this section we will discuss the data definition language of HBase, along with best practices on schema design. HBase schemas can be created or updated using the Apache HBase Shell or by using Admin in the Java API.
Create a namespace
From the HBase shell, execute the command:
create_namespace 'northwind'
Create a table
Create a table in the namespace with one column family c, able to maintain at most three versions:
create 'northwind:customer', {NAME => 'c', VERSIONS => 3}
Note that:
1. HBase currently does not do well with anything above two or three column families, so keep the number of column families in your schema low.
2. Try to make do with one column family if you can in your schemas. Only introduce a second and third column family in the case where data access is usually column scoped; i.e. you query one column family or the other, but usually not both at the same time.
3. Use the smallest possible names for column families (in this example, c), as the column family name is stored in each cell of a row.
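The same DDL can also be issued programmatically. The following is a minimal sketch using the Admin interface of the HBase 1.x Java client (the namespace and table names match the ones above; error handling is kept short, and createNamespace throws if the namespace already exists):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class NorthwindDDL {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            // Create the namespace 'northwind'
            admin.createNamespace(NamespaceDescriptor.create("northwind").build());
            // Describe the table: one column family 'c' keeping up to 3 versions
            HTableDescriptor table = new HTableDescriptor(TableName.valueOf("northwind:customer"));
            HColumnDescriptor family = new HColumnDescriptor("c");
            family.setMaxVersions(3);
            table.addFamily(family);
            admin.createTable(table);
        }
    }
}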
Drop table/namespace
In order to drop any table, it first has to be disabled:
disable 'northwind:customer'
drop 'northwind:customer'
drop_namespace 'northwind'
Truncate table
Truncate will disable the table first, then drop it and recreate it using the same schema.
Alter table
Alter the table to add an additional column family d. The VERSIONS field indicates how many versions should be kept per row for this column family:
alter 'northwind:customer', {NAME => 'd', VERSIONS => 5}
Note: All the above operations can also be performed through the native Java API.
Data Model Operations
The four primary data model operations are Get, Put, Scan, and Delete.
Loading Data
The different ways of loading data into an HBase table are:
1. Using the Java API
It supports real-time updates using the Put class. Put either adds new rows to a table (if the key is new) or updates existing rows (if the key already exists). Example:

public class HBaseDataLoadUsingPut {
    public static void main(String[] args) throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        // Define the configuration to connect to the HBase table
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "northwind:customer");
        BufferedReader reader = null;
        String[] record;
        String customerId;
        String companyName;
        String contactName;
        String contactTitle;
        String address;
        String city;
        String region;
        String postalCode;
        String country;
        String phone;
        String fax;
        try {
            // Update the file path to point to the Northwind customer file
            File file = new File("/data/northiwind_customer.txt");
            reader = new BufferedReader(new FileReader(file));
            String line;
            while ((line = reader.readLine()) != null) {
                record = line.split("\t");
                customerId = record[0];
                companyName = record[1];
                contactName = record[2];
                contactTitle = record[3];
                address = record[4];
                city = record[5];
                region = record[6];
                postalCode = record[7];
                country = record[8];
                phone = record[9];
                fax = record[10];
                // Create a Put object to store the record in the HBase table
                Put put = new Put(Bytes.toBytes(customerId)); // row key
                put.add(Bytes.toBytes("c"), Bytes.toBytes("companyName"), Bytes.toBytes(companyName));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("contactName"), Bytes.toBytes(contactName));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("contactTitle"), Bytes.toBytes(contactTitle));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("address"), Bytes.toBytes(address));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("city"), Bytes.toBytes(city));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("region"), Bytes.toBytes(region));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("postalCode"), Bytes.toBytes(postalCode));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("country"), Bytes.toBytes(country));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("phone"), Bytes.toBytes(phone));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("fax"), Bytes.toBytes(fax));
                table.put(put);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
Output (sample):
2. Using importtsv for HDFS Files
ImportTsv is a utility that loads data in TSV format from HDFS into HBase. The example below shows how it is used for loading data via Put. First, our data set is loaded to HDFS, and then ImportTsv is called from the command line to load this data into the pre-created HBase table.
(Screenshot of the HDFS directory, taken from the Ambari web console.)
Usage:
$ hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c <tablename> <hdfs-inputdir>
Example:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,c:CompanyName,c:ContactName,c:ContactTitle,c:Address,c:City,c:Region,c:PostalCode,c:Country,c:Phone,c:Fax northwind:customer /tmp/northiwind_customer.txt
Output (sample):
3. BulkLoad (for huge data sets)
The bulk load feature uses a MapReduce job to output table data in HBase's internal data format, and then directly loads the generated StoreFiles into a running cluster. Using bulk load uses less CPU and network resources than simply using the HBase API.
Step 1: The data import is prepared, either by using the ImportTsv tool with the importtsv.bulk.output option or by some other MapReduce job using HFileOutputFormat.
$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir <tablename> <hdfs-data-inputdir>
Example:
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,c:CompanyName,c:ContactName,c:ContactTitle,c:Address,c:City,c:Region,c:PostalCode,c:Country,c:Phone,c:Fax -Dimporttsv.bulk.output=hdfs:///tmp/northiwind_customer_load northwind:customer /tmp/northiwind_customer.txt
Output: HBase data files (StoreFiles) are generated on HDFS, as shown below:
/tmp/northiwind_customer_load/c/c4d1e85e045a4b2d8f4645c5be9cd97f
Step 2: The completebulkload tool is used to import the prepared data into the running cluster. This command-line tool iterates through the prepared data files and, for each one, determines the region the file belongs to.
$ hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <hdfs://storefileoutput> <tablename>
-OR-
$ hadoop jar hbase-server-VERSION.jar completebulkload [-c /path/to/hbase/config/hbase-site.xml] /user/user_name/myoutput mytable
Example:
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/northiwind_customer_load northwind:customer
Output (sample):
Note that, in this case, all the cells are stored with the same timestamp.
Data Retrieval
1. Get
Get returns the attributes for a specified row. Example:
get 'northwind:customer', 'WILMK'
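The same Get can be issued from Java. A minimal sketch (using the same HTable-style API as the other examples in this report, and assuming the data was loaded with the Java Put example above, where the qualifier is companyName):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseGetExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "northwind:customer");
        // Fetch a single row by its row key and read one column from family 'c'
        Get get = new Get(Bytes.toBytes("WILMK"));
        Result result = table.get(get);
        byte[] company = result.getValue(Bytes.toBytes("c"), Bytes.toBytes("companyName"));
        System.out.println("companyName = " + (company == null ? "<not found>" : Bytes.toString(company)));
        table.close();
    }
}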
2. Scans
Scan allows iteration over multiple rows for specified attributes:
scan 'northwind:customer'
The same operations (Get and Scan) can be done using the Java API as well. This allows the user to use the HBase data effectively in client applications. An example of an HBase Scan is given below:

public class HBaseTableScan {
    public static void main(String[] args) {
        // Create a configuration to connect to HBase
        Configuration hConf = HBaseConfiguration.create();
        /* Uncomment the code below to use a remote client.
         * The example uses the properties for the HDP sandbox.
         * hConf.set(Constants.HBASE_CONFIGURATION_ZOOKEEPER_QUORUM, "sandbox.hortonworks.com");
         * hConf.setInt(Constants.HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT, 2181);
         */
        try {
            HTable hTable = new HTable(hConf, "northwind:customer");
            byte[] family = Bytes.toBytes("c");
            // Create a Scan object. Optionally you can include start and stop row keys.
            Scan scan = new Scan();
            scan.addFamily(family);
            // Get the result scanner and iterate through the results
            ResultScanner rs = hTable.getScanner(scan);
            for (Result r = rs.next(); r != null; r = rs.next()) {
                // Use the result object
            }
            rs.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Delete
Delete removes a row from a table. HBase does not modify data in place, so deletes are handled by creating new markers called tombstones. These tombstones, along with the dead values, are cleaned up on major compactions.
Example: In this example, we delete the Fax column for a particular customer. After the delete operation, the Fax column is no longer displayed.
Aggregations
HBase does not have its own built-in aggregation operations, but aggregations can be handled using the two methods below:
1. You can write your own MapReduce job working directly on the HBase data sitting in HFiles in HDFS. This is the most efficient way, but not simple, and the data you process would be somewhat stale. It is most efficient because the data is not transferred via the HBase API; instead it is accessed right from HDFS in a sequential manner.
2. Register the HBase table as a table in Hive and do the aggregations there. Data is accessed via the HBase API, which is not as efficient, but it is the most powerful way to group HBase data. It does imply running MapReduce jobs, but launched by Hive, not by HBase.
Usage:
CREATE TABLE hive_managed_table (key string, value1 string, value2 int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,columnfamilyname:val1,columnfamilyname:val2")
TBLPROPERTIES ("hbase.table.name" = "namespace:table_name");
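For small result sets, a simple alternative is to aggregate on the client side while scanning. The sketch below is an illustration only, not a recommended pattern for large tables: it counts customers per country by scanning the whole northwind:customer table, which is reasonable for the small Northwind data set but not for billions of rows.

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomersPerCountry {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "northwind:customer");
        // Only fetch the column we aggregate on
        Scan scan = new Scan();
        scan.addColumn(Bytes.toBytes("c"), Bytes.toBytes("country"));
        Map<String, Integer> counts = new HashMap<String, Integer>();
        ResultScanner scanner = table.getScanner(scan);
        for (Result r = scanner.next(); r != null; r = scanner.next()) {
            byte[] raw = r.getValue(Bytes.toBytes("c"), Bytes.toBytes("country"));
            if (raw == null) continue; // skip rows without a country value
            String country = Bytes.toString(raw);
            Integer current = counts.get(country);
            counts.put(country, current == null ? 1 : current + 1);
        }
        scanner.close();
        table.close();
        System.out.println(counts);
    }
}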
Joins
Whether HBase supports joins is a common question, and there is a simple answer: it doesn't, at least not in the way that RDBMSs support them (e.g., with equi-joins or outer joins in SQL). The read data model operations in HBase are Get and Scan. However, that doesn't mean that equivalent join functionality can't be supported in your application, but you have to do it yourself. The two primary strategies are either de-normalizing the data when writing to HBase, or having lookup tables and doing the join between HBase tables in your application or MapReduce code. So which is the best approach? It depends on what you are trying to do, and as such there isn't a single answer that works for every use case.
Features of HBase
In this section, we will explore the striking features of HBase, which make it a strong choice as the NoSQL database on Hadoop. The notable features of HBase include:
1. Strongly consistent reads/writes: HBase is not an "eventually consistent" data store. This makes it very suitable for tasks such as high-speed counter aggregation.
2. Automatic sharding: HBase tables are distributed on the cluster via regions, and regions are automatically split and re-distributed as your data grows.
3. Automatic RegionServer failover
4. Hadoop/HDFS integration: HBase supports HDFS out of the box as its distributed file system.
5. MapReduce: HBase supports massively parallelized processing via MapReduce, using HBase as both source and sink.
6. Java Client API: HBase provides an easy-to-use Java API for programmatic access.
7. Thrift/REST API: HBase also supports Thrift and REST for non-Java front-ends.
8. Block Cache and Bloom Filters: HBase supports a block cache and Bloom filters for high-volume query optimization.
9. Operational management: HBase provides built-in web pages for operational insight, as well as JMX metrics.
Automatic Versioning
The maximum number of versions to store for a given column is part of the column schema and is specified at table creation, or via an alter command; the default is HColumnDescriptor.DEFAULT_VERSIONS.
Example: create a table named emp with a single column family f that keeps a maximum of 5 versions of all columns in the column family, and then update the salary of the employee several times.
create 'emp', {NAME => 'f', VERSIONS => 5}
put 'emp', '10000', 'f:salary', '65000'
put 'emp', '10000', 'f:salary', '70000'
put 'emp', '10000', 'f:salary', '75000'
put 'emp', '10000', 'f:salary', '80000'
HBase allows scanning the latest n versions, where n varies from 1 to the maximum value, in this case 5. For example:
scan 'emp', {VERSIONS => 3}
ROW      COLUMN+CELL
 10000   column=f:salary, timestamp= , value=80000
 10000   column=f:salary, timestamp= , value=75000
 10000   column=f:salary, timestamp= , value=70000
1 row(s)
Note that, on addition of the 6th version of the data, the oldest version will be lost, as the maximum number of versions is set to 5.
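Versions can also be read from the Java API. A small sketch (assuming the emp table above) that retrieves the last three versions of the salary column:

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class ReadSalaryVersions {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "emp");
        Get get = new Get(Bytes.toBytes("10000"));
        // Ask the server to return up to three versions of each cell
        get.setMaxVersions(3);
        Result result = table.get(get);
        List<Cell> cells = result.getColumnCells(Bytes.toBytes("f"), Bytes.toBytes("salary"));
        for (Cell cell : cells) {
            System.out.println(cell.getTimestamp() + " -> " + Bytes.toString(CellUtil.cloneValue(cell)));
        }
        table.close();
    }
}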
Dynamic Schema
Column qualifiers in HBase are mutable and may differ greatly between rows. One row in a table can have 1 column, whereas the next row in the same table can have 1 million columns. This dynamic schema is useful in many applications, as we don't have to specify the number of columns at table creation time. Let's take the example of the Northwind customer data; the association between customer and country is as below:

CustomerID  Country
ALFKI       Germany
ANATR       Mexico
ANTON       Mexico
AROUT       UK
BERGS       Sweden
BLAUS       Germany
BLONP       France
BOLID       Spain
BONAP       France
BOTTM       Canada
BSBEV       UK
CACTU       Argentina
CENTC       Mexico
CHOPS       Switzerland
COMMI       Brazil
CONSH       UK
DRACD       Germany
DUMON       France

This association can be stored in an HBase table with country as the row key. The table has one column family named c, and the customers associated with a country are stored as column qualifiers. If some extra information needs to be added to the country-customer pair, we can simply add it to the value, e.g. if the complete address of the person needs to be stored. Below is a snapshot of the northwind country data. As you can see, there are multiple column qualifiers associated with a row key. Whenever a new customer is added, a new column qualifier is added to the country (row) he belongs to. In real-world scenarios, one country will have thousands of customers, and all of them can be stored in just a single row in HBase. The advantage is that if your application needs to retrieve all the customers belonging to a particular category (in this case, country), a single Get operation is enough, provided the row key of the table is the category (in this case, country).
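A sketch of how such a row grows (the table name northwind:country_customer and the stored value are assumptions for illustration): every new customer simply becomes a new qualifier on the country row.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class AddCustomerToCountryRow {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "northwind:country_customer");
        // Row key = country, column qualifier = customer id, value = any extra detail
        Put put = new Put(Bytes.toBytes("Germany"));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("ALFKI"), Bytes.toBytes("full customer address"));
        table.put(put);
        table.close();
    }
}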
Pre region splits
With a process called pre-splitting, you can create a table with many regions by supplying the split points at table creation time. Since pre-splitting ensures that the initial load is more evenly distributed throughout the cluster, you should always consider using it if you know your key distribution beforehand. However, pre-splitting also has the risk of creating regions that do not truly distribute the load evenly, because of data skew or in the presence of very hot or large rows. If the initial set of region split points is chosen poorly, you may end up with a heterogeneous load distribution, which will in turn limit your cluster's performance. There is no short answer for the optimal number of regions for a given load, but you can start with a lower multiple of the number of region servers as the number of splits, and then let automated splitting take care of the rest. We will go through an example to understand how pre-region splitting helps in distributing the data.
Without pre-region split
First, let's create the northwind customer table without any pre-region splits. If you go to the HBase web console, you can see that there is only one region for this table.
Applying pre-region split
Here, we apply the pre-region split while creating the customer table (the row key is CustomerID). CustomerID in this dataset is a varchar starting with a letter from A to Z. With the splits defined below, the data will be distributed amongst 27 different regions depending on the starting letter of the row key. For example, all CustomerIDs starting with A will go to the A region. (Here we assume that the first character of CustomerID is uniformly distributed from A to Z.)
create 'northwind:customer', {NAME => 'c', VERSIONS => 3}, {SPLITS => ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z']}
The web console now shows that this table has 27 online regions. After loading the data, you can see that different regions have received the data based on the value of CustomerID.
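The same pre-split table can be created from the Java Admin API by passing explicit split keys. A minimal sketch (HBase 1.x client assumed, and assuming the table does not already exist):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.util.Bytes;

public class CreatePreSplitTable {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("northwind:customer"));
            desc.addFamily(new HColumnDescriptor("c").setMaxVersions(3));
            // One split key per letter A..Z => 27 regions (including one region before 'A')
            byte[][] splits = new byte[26][];
            for (char ch = 'A'; ch <= 'Z'; ch++) {
                splits[ch - 'A'] = Bytes.toBytes(String.valueOf(ch));
            }
            admin.createTable(desc, splits);
        }
    }
}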
Compression
HBase comes with support for a number of compression algorithms that can be enabled at the column family level, or specifically for compaction output. The available algorithms are NONE (no compression), GZ, LZO, LZ4 and SNAPPY (which generally performs better than LZO). Enabling compression requires installation of the corresponding native libraries (unless you only want to use the Java-based GZIP compression) and specifying the chosen algorithm in the column family schema.
Column family compression design considerations:
1. Already compressed data (such as JPEG) should be in an uncompressed column family.
2. Small but very frequently used families should not be compressed.
The different cases where compression can be applied are explained below.
a) To enable compression during table creation:
create 'northwind:products', {NAME => 'colfam1', COMPRESSION => 'LZ4'}
b) To enable/change/disable compression algorithms for existing tables, use the alter command:
create 'northwind:customer', 'c'
disable 'northwind:customer'
alter 'northwind:customer', {NAME => 'c', COMPRESSION => 'LZ4'}
enable 'northwind:customer'
Note that only store files newly flushed after the change will use the new compression format.
c) To force all existing HFiles to be rewritten with the newly selected compression format, issue a major_compact '<tablename>' in the shell to start a major compaction process in the background. It will rewrite all files and therefore use the new settings.
Recommendations
1. If the values are large (and not pre-compressed, such as images), use a data block compressor.
2. Use GZIP for cold data, which is accessed infrequently. GZIP compression uses more CPU resources than Snappy or LZO, but provides a higher compression ratio.
3. Use Snappy or LZO for hot data, which is accessed frequently. Snappy and LZO use fewer CPU resources than GZIP, but do not provide as high a compression ratio.
4. In most cases, enabling Snappy or LZO by default is a good choice, because they have a low performance overhead and provide space savings.
5. Before Google released Snappy in 2011, LZO was the default choice. Snappy has similar qualities to LZO but has been shown to perform better.
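Compression can also be set from the Java API when describing a column family. A sketch (assuming the Snappy native libraries are installed on the RegionServers and the table does not already exist):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.io.compress.Compression;

public class EnableSnappyCompression {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            HColumnDescriptor family = new HColumnDescriptor("p");
            // Compress newly written store files of this family with Snappy
            family.setCompressionType(Compression.Algorithm.SNAPPY);
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("northwind:products"));
            desc.addFamily(family);
            admin.createTable(desc);
        }
    }
}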
Encoding
Data block encoding attempts to limit the duplication of information in keys, taking advantage of some of the fundamental designs and patterns of HBase, such as sorted row keys and the schema of a given table. The encoding methods include Prefix, Diff, Fast Diff, Prefix Tree, etc. We will explore prefix encoding as an example.
Prefix encoding
Often, keys are very similar. Specifically, keys often share a common prefix and only differ near the end. For instance, one key might be RowKey:Family:Qualifier0 and the next key might be RowKey:Family:Qualifier1. In prefix encoding, an extra column is added which holds the length of the prefix shared between the current key and the previous key. Assuming the first key here is totally different from the key before it, its prefix length is 0. The second key's prefix length is 23, since the keys have the first 23 characters in common. Obviously, if the keys tend to have nothing in common, prefix encoding will not provide much benefit.
(The first image shows a hypothetical column family with no data block encoding; the second shows the same column family with prefix encoding.)
Time to Live (TTL)
Column families can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row, even the current one. The TTL time encoded in HBase for the row is specified in UTC. Store files which contain only expired rows are deleted on minor compaction. Setting hbase.store.delete.expired.storefile to false disables this feature. Setting the minimum number of versions to a value other than 0 also disables it.
Recent versions of HBase also support setting a time to live on a per-cell basis. Cell TTLs are submitted as an attribute on mutation requests (Appends, Increments, Puts, etc.) using Mutation#setTTL. If the TTL attribute is set, it will be applied to all cells updated on the server by the operation. There are two notable differences between cell TTL handling and column family TTLs: cell TTLs are expressed in units of milliseconds instead of seconds, and a cell TTL cannot extend the effective lifetime of a cell beyond a column family level TTL setting.
To specify a TTL for a table using the shell (TTL value in seconds):
create 'northwind:products', {NAME => 'p', VERSIONS => 1, TTL => <seconds>}
To specify a TTL for a table using the HBase API, HColumnDescriptor provides the following getter and setter to read and write the TTL:
int getTimeToLive();
void setTimeToLive(int timeToLive);
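A short sketch of the per-cell TTL described above (Put#setTTL takes milliseconds; the table, row key, qualifier and one-hour value here are assumptions for illustration):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutWithCellTTL {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "northwind:products");
        Put put = new Put(Bytes.toBytes("PROD-1"));
        put.add(Bytes.toBytes("p"), Bytes.toBytes("stockLevel"), Bytes.toBytes("17"));
        // Cell-level TTL in milliseconds: this cell expires one hour after being written
        put.setTTL(60 * 60 * 1000L);
        table.put(put);
        table.close();
    }
}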
HBase and MapReduce
Apache MapReduce is a software framework used to analyze large amounts of data, and it is the framework used most often with Apache Hadoop. HBase is tightly integrated with MapReduce and allows its tables to be accessed from MapReduce jobs for both reads and writes. The different types of MapReduce jobs for HBase include:
1. MapReduce job writing a summary to an HBase table
2. Reading from an HBase table in a MapReduce job and writing a summary to an HDFS file
3. HBase MapReduce summary to an RDBMS table
The code below shows how a MapReduce job writes its output to an HBase table. The data set is read from HDFS and processed in parallel using the MapReduce framework, and the output is loaded into the HBase table northwind:customer.
Driver class:

public class HBaseMapreduceWrite {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration config = HBaseConfiguration.create();
        Job job = new Job(config, "HBaseMapreduceWrite");
        job.setJarByClass(HBaseMapreduceWrite.class); // class that contains the mapper
        FileInputFormat.addInputPath(job, new Path(args[0])); // set the input path
        job.setMapperClass(HBaseWriteMapper.class);
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(Put.class);
        TableMapReduceUtil.initTableReducerJob(
                "northwind:customer", // output table
                null,                 // reducer class
                job);
        job.setNumReduceTasks(0); // to create a map-only job
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Mapper class:

public class HBaseWriteMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    public void map(LongWritable row, Text value, Context context) throws IOException, InterruptedException {
        String[] record;
        String customerId;
        String companyName;
        String contactName;
        String contactTitle;
        String address;
        String city;
        String region;
        String postalCode;
        String country;
        String phone;
        String fax;
        // Get the attributes of the customer record from the mapper input value
        record = value.toString().split("\t");
        customerId = record[0];
        companyName = record[1];
        contactName = record[2];
        contactTitle = record[3];
        address = record[4];
        city = record[5];
        region = record[6];
        postalCode = record[7];
        country = record[8];
        phone = record[9];
        fax = record[10];
        // Create the Put object using the customer attributes to be stored in HBase
        Put put = new Put(Bytes.toBytes(customerId));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("companyName"), Bytes.toBytes(companyName));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("contactName"), Bytes.toBytes(contactName));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("contactTitle"), Bytes.toBytes(contactTitle));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("address"), Bytes.toBytes(address));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("city"), Bytes.toBytes(city));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("region"), Bytes.toBytes(region));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("postalCode"), Bytes.toBytes(postalCode));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("country"), Bytes.toBytes(country));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("phone"), Bytes.toBytes(phone));
        put.add(Bytes.toBytes("c"), Bytes.toBytes("fax"), Bytes.toBytes(fax));
        // Write the Put object to the context, using customerId as the key
        context.write(new ImmutableBytesWritable(customerId.getBytes()), put);
    }
}
Client Filters
Get and Scan instances can optionally be configured with filters, which are applied on the RegionServer. HBase supports many types of filters to retrieve the required data at the row, family and column level. In this section, we will explore the major filters and comparators offered by HBase.
1. Partial Key Scan
The most efficient way to retrieve the required rows is to use a start and stop row in the scan operation, provided the rows you are looking for are contiguous. In most cases the row key is designed so that rows are clubbed together based on the most common access pattern. Also remember that HBase stores the data sorted by row key. The following example shows how to use this feature to retrieve all the customers whose ID starts with the letters WAR. The stop row is given as WAS, which means: do not retrieve the customers that come after WAR. Note that the stop row is not included in the range.
scan 'northwind:customer', {STARTROW => 'WAR', STOPROW => 'WAS'}
ROW    COLUMN+CELL
WARTH  column=c:address, timestamp= , value=Torikatu 38
WARTH  column=c:city, timestamp= , value=Oulu
WARTH  column=c:companyName, timestamp= , value=Wartian Herkku
WARTH  column=c:contactName, timestamp= , value=Pirkko Koskitalo
WARTH  column=c:contactTitle, timestamp= , value=Accounting Manager
WARTH  column=c:country, timestamp= , value=Finland
WARTH  column=c:fax, timestamp= , value=
WARTH  column=c:phone, timestamp= , value=
WARTH  column=c:postalCode, timestamp= , value=90110
WARTH  column=c:region, timestamp= , value=NULL
1 row(s)
2. SingleColumnValueFilter
SingleColumnValueFilter can be used to test column values for equivalence (CompareOp.EQUAL), inequality (CompareOp.NOT_EQUAL), or ranges (e.g., CompareOp.GREATER). The following example tests the equivalence of a column to the String "my value":
SingleColumnValueFilter filter = new SingleColumnValueFilter(cf, column, CompareOp.EQUAL, Bytes.toBytes("my value"));
scan.setFilter(filter);
3. RegexStringComparator
RegexStringComparator supports regular expressions for value comparisons:
RegexStringComparator comp = new RegexStringComparator("my."); // any value that starts with 'my'
SingleColumnValueFilter filter = new SingleColumnValueFilter(cf, column, CompareOp.EQUAL, comp);
scan.setFilter(filter);
4. FamilyFilter
FamilyFilter can be used to filter on the column family. It is generally a better idea to select column families in the Scan than to do it with a filter:
Scan scan = new Scan();
scan.addFamily(family); // (optional) limit to one family
5. QualifierFilter
QualifierFilter can be used to filter based on the column (a.k.a. qualifier) name. Examples are ColumnPrefixFilter and ColumnRangeFilter:
Scan scan = new Scan(row, row); // (optional) limit to one row
scan.addFamily(family); // (optional) limit to one family
Filter f = new ColumnPrefixFilter(prefix);
scan.setFilter(f);
ColumnRangeFilter is used for slicing the data:
Scan scan = new Scan(row, row); // (optional) limit to one row
scan.addFamily(family); // (optional) limit to one family
Filter f = new ColumnRangeFilter(startColumn, true, endColumn, true);
scan.setFilter(f);
6. RowFilter
As discussed earlier, it is a better idea to use the startRow/stopRow methods on Scan for row selection; however, RowFilter can also be used:
RowFilter filter = new RowFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes(myRowId.toString())));
ACID in HBase
Apache HBase is not an ACID-compliant database. However, it does guarantee certain specific properties. HBase employs a kind of MVCC (multi-version concurrency control), and HBase has no mixed read/write transactions. In a nutshell, each RegionServer maintains what we call "strictly monotonically increasing transaction numbers". When a write transaction (a set of puts or deletes) starts, it retrieves the next highest transaction number; in HBase this is called a WriteNumber. When a read transaction (a Scan or Get) starts, it retrieves the transaction number of the last committed transaction; HBase calls this the ReadPoint. Each created KeyValue is tagged with its transaction's WriteNumber (this tag is called the memstore timestamp in HBase; note that it is separate from the application-visible timestamp).
The high-level flow of a write transaction in HBase looks like this:
1. Lock the row(s), to guard against concurrent writes to the same row(s).
2. Retrieve the current WriteNumber.
3. Apply the changes to the WAL (write-ahead log).
4. Apply the changes to the MemStore (using the acquired WriteNumber to tag the KeyValues).
5. Commit the transaction, i.e. attempt to roll the ReadPoint forward to the acquired WriteNumber.
6. Unlock the row(s).
The high-level flow of a read transaction looks like this:
1. Open the scanner.
2. Get the current ReadPoint.
3. Filter out all scanned KeyValues with a memstore timestamp greater than the ReadPoint.
4. Close the scanner (this is initiated by the client).
In reality it is more complicated, but this explanation illustrates it at a high level. Note that a reader acquires no locks at all, but we still get all of ACID. It is important to realize that this only works if transactions are committed strictly serially; otherwise an earlier uncommitted transaction could become visible when one that started later commits first. In HBase transactions are typically short, so this is not a problem, and HBase does exactly that: all transactions are committed serially. Committing a transaction in HBase means setting the current ReadPoint to the transaction's WriteNumber, and hence making its changes visible to all new Scans. HBase keeps a list of all unfinished transactions, and a transaction's commit is delayed until all prior transactions have committed. Note that HBase can still make all changes immediately and concurrently; only the commits are serial. Also note that a scan will always reflect a view of the data at least as new as the beginning of the scan.
High Availability
To achieve high availability for reads, HBase provides a feature called region replication. In this model, for each region of a table, there will be multiple replicas that are opened in different RegionServers. By default, the region replication is set to 1, so only a single region replica is deployed and there are no changes from the original model. If region replication is set to 2 or more, then the master will assign replicas of the regions of the table. The load balancer ensures that the region replicas are not co-hosted in the same region servers, and also not in the same rack (if possible).
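Region replication is configured per table. As a minimal sketch (HBase 1.1+ client assumed; the table name, family and replication factor of 2 are illustrative assumptions):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CreateTableWithReadReplicas {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Admin admin = connection.getAdmin()) {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("northwind:orders"));
            desc.addFamily(new HColumnDescriptor("o"));
            // Two replicas per region: one primary plus one read-only secondary
            desc.setRegionReplication(2);
            admin.createTable(desc);
        }
    }
}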
HBase as an Object Store
Instead of storing different attributes as column qualifiers, HBase can also store the data as a serialized object. This is a very useful feature when:
1. You have thousands of fields for an entity, and whenever a single field changes you need to create a new version of the entity. In the conventional design you would have to update all the column qualifiers; if the entity is stored as a single object (in one column qualifier), you can modify the object and put it back as a new version.
2. Your application demands that the data be preserved as the objects used in the application. Any change in the parameters of the objects can then be handled at the application level, and not at the database.
We will explain this concept using the same northwind:customer example. Instead of storing the different attributes of the customer as different columns, we create a customer object and store it in the HBase table. We also show how it can be retrieved.
NorthwindCustomer Class

public class NorthwindCustomer implements Writable {
    public Map<String, String> customerMap;

    public NorthwindCustomer() {
        customerMap = new HashMap<String, String>();
    }

    public NorthwindCustomer(Map<String, String> customerMap) {
        super();
        this.customerMap = customerMap;
    }

    public void setCustomerDetails(Map<String, String> customerMap) {
        this.customerMap = customerMap;
    }

    public Map<String, String> getCustomerDetails() {
        return customerMap;
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        customerMap.clear();
        int entries = in.readInt();
        String key;
        String value;
        customerMap = new HashMap<String, String>();
        for (int i = 0; i < entries; i++) {
            key = in.readUTF();
            value = in.readUTF();
            customerMap.put(key, value);
        }
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(customerMap.size());
        for (String key : customerMap.keySet()) {
            out.writeUTF(key);
            out.writeUTF(customerMap.get(key));
        }
    }
}
Data Loading

public class HBaseLoadDataAsObject {
    public static void main(String[] args) throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "northwind:customer");
        BufferedReader reader = null;
        String[] record;
        Map<String, String> customerMap = new HashMap<String, String>();
        NorthwindCustomer northwindCustomer;
        try {
            File file = new File("/data/northiwind_customer.txt");
            reader = new BufferedReader(new FileReader(file));
            String line;
            while ((line = reader.readLine()) != null) {
                record = line.split("\t");
                customerMap.clear();
                customerMap.put("customerId", record[0]);
                customerMap.put("companyName", record[1]);
                customerMap.put("contactName", record[2]);
                customerMap.put("contactTitle", record[3]);
                customerMap.put("address", record[4]);
                customerMap.put("city", record[5]);
                customerMap.put("region", record[6]);
                customerMap.put("postalCode", record[7]);
                customerMap.put("country", record[8]);
                customerMap.put("phone", record[9]);
                customerMap.put("fax", record[10]);
                northwindCustomer = new NorthwindCustomer(customerMap);
                Put put = new Put(Bytes.toBytes(record[0]));
                put.add(Bytes.toBytes("c"), Bytes.toBytes("profile"), serialize(northwindCustomer));
                table.put(put);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    // Serialize the Writable object into a byte array
    public static byte[] serialize(Writable writable) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream dataOut = new DataOutputStream(out);
        writable.write(dataOut);
        dataOut.close();
        return out.toByteArray();
    }
}
Data Retrieval
(Screenshot of the HBase object store.)

public class HBaseGetObjectData {
    public static void main(String[] args) {
        Configuration hConf = HBaseConfiguration.create();
        /* Uncomment the code below to use a remote client (HDP sandbox):
         * hConf.set(Constants.HBASE_CONFIGURATION_ZOOKEEPER_QUORUM, "sandbox.hortonworks.com");
         * hConf.setInt(Constants.HBASE_CONFIGURATION_ZOOKEEPER_CLIENTPORT, 2181);
         */
        Result r;
        NorthwindCustomer northwindCustomer;
        Map<String, String> customerMap;
        try {
            HTable hTable = new HTable(hConf, "northwind:customer");
            byte[] family = Bytes.toBytes("c");
            byte[] qualifier = Bytes.toBytes("profile");
            Get get = new Get(Bytes.toBytes("WOLZA"));
            r = hTable.get(get);
            byte[] value = r.getValue(family, qualifier);
            northwindCustomer = new NorthwindCustomer();
            deserialize(northwindCustomer, value);
            customerMap = northwindCustomer.getCustomerDetails();
            for (String property : customerMap.keySet()) {
                System.out.println(property + " == " + customerMap.get(property));
            }
            hTable.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Deserialize the byte array back into the Writable object
    public static void deserialize(Writable writable, byte[] bytes) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        DataInputStream dataIn = new DataInputStream(in);
        writable.readFields(dataIn);
        dataIn.close();
    }
}
The result of the above operation is given below:
country == Poland
contactTitle == Owner
address == ul. Filtrowa 68
city == Warszawa
phone == (26)
contactName == Zbyszek Piestrzeniewicz
companyName == Wolski Zajazd
postalCode ==
customerId == WOLZA
region == NULL
fax == (26)
Scalability and Performance
The real performance of HBase becomes visible when the amount of data is huge and the number of nodes in the cluster is adequate. In our project setup we are using only a pseudo-distributed cluster, which is not enough to test the scalability and performance of HBase. In order to get an idea of the performance of HBase, we refer to a vendor-independent case study of NoSQL databases done by Altoros Systems Inc., in which the databases were tested under the same conditions, regardless of their specifics, using the Yahoo! Cloud Serving Benchmark (YCSB). The complete case study can be found in the references section of this report. Some of the interesting results are given below.
Data load operation
100 million records, each containing 10 fields of 100 randomly generated bytes, were imported to a four-node cluster. HBase demonstrated by far the best write speed: with pre-created regions and deferred log flush enabled, it reached 40K ops/sec. Cassandra also showed great performance during the loading phase, with around 15K ops/sec.
Update and reads
Next, an update-heavy scenario was run that simulates the database work during which typical actions of an e-commerce solution user are recorded. As you can see, during updates HBase and Cassandra went far ahead of the main group, with the average response latency not exceeding two milliseconds; HBase was even faster. The HBase client was configured with AutoFlush turned off; the updates were aggregated in the client buffer and pending writes were flushed asynchronously, as soon as the buffer became full. To accelerate update processing on the server, deferred log flush was enabled and WAL edits were kept in memory during the flush period. During reads, per-column-family compression provides HBase and Cassandra with faster data access. HBase was configured with the native LZO codec and Cassandra with Google's Snappy compression codec. Although the computation ran longer, the compression reduces the number of bytes read from the disk.
Read-only workload
This read-only workload simulated a data caching system. The data was stored outside the system, while the application was only reading it. Thanks to B-tree indexes, sharded MySQL became the winner in this competition.
Example for HBase schema design
In this section, we will design an HBase data schema for the Northwind database (assuming that its data size grows huge and it is therefore being ported to HBase), whose relational schema is given below. Please note that the ideal design of any HBase database will depend heavily on the real access patterns to the data. In this example, we will try to explain the principles of HBase schema design, and how to gain more efficiency based on the access patterns.
(Relational schema diagram of the Northwind database.)
Core Design Concepts
The core concepts we should consider in the initial design are:
1. There is no referential integrity offered by HBase, so it is up to the application designer/developer to take care of referential integrity in the database.
2. Since storage is not a problem in HDFS, de-normalization should be applied wherever applicable. It also facilitates fast data retrieval with a single row scan, compared to multiple rows (potentially across multiple tables) in a normalized design.
3. Since the ROW_KEY is the single identifier for a row, it should be unique and should contain a range of values in order to get the data distributed equally across the cluster.
4. Keep the ROW_KEY length as short as is reasonable such that it is still useful for the required data access (e.g. Get vs. Scan). A short key that is useless for data access is not better than a longer key with better get/scan properties. Expect tradeoffs when designing row keys.
5. Heavy joins should be avoided by using composite row keys, based on the access patterns.
6. Try to make do with one column family if you can in your schemas. Only introduce a second and third column family in the case where data access is usually column scoped; i.e. you query one column family or the other, but usually not both at the same time.
7. Try to keep the column family names as small as possible, preferably one character (e.g. "d" for data/default).
8. Also note that row keys cannot be changed. The only way they can be "changed" in a table is if the row is deleted and then re-inserted.
9. Finally, if there is a requirement for secondary indexes, we can create separate index tables and update them periodically.
Designed Schema
Considering these factors, the designed HBase schema is given below.
1. Customers
Three tables in the Northwind relational database (Customers, CustomerCustomerDemo, CustomerDemographics) are combined to form a single table. In the demographics column family, a dynamic column qualifier is used instead of a static column qualifier. This allows us to handle the n-to-n relationship between customers and customer demographics.
Note: Throughout the design, unless mentioned otherwise, the column values are the values from the relational DB for the corresponding column qualifiers.
2. Employees
Four tables in the Northwind relational database (Employees, EmployeeTerritories, Region, Territories) are combined to form a single table. In order to handle the multiple-territory information for a particular employee, the territory column family (t) uses a composite row key of EmployeeID and TerritoryID, separated by a delimiter. (The hyphen character is a good choice of delimiter, since it sorts before alphanumeric characters in ASCII.)
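A small sketch of building and splitting such a composite key (the helper methods and the example IDs are illustrative, not part of the project code):

import org.apache.hadoop.hbase.util.Bytes;

public class CompositeKeys {
    private static final String DELIMITER = "-";

    // EmployeeID + "-" + TerritoryID, e.g. "5-02116"
    static byte[] territoryRowKey(String employeeId, String territoryId) {
        return Bytes.toBytes(employeeId + DELIMITER + territoryId);
    }

    // Split a composite key back into its two parts
    static String[] parse(byte[] rowKey) {
        return Bytes.toString(rowKey).split(DELIMITER, 2);
    }

    public static void main(String[] args) {
        byte[] key = territoryRowKey("5", "02116");
        String[] parts = parse(key);
        System.out.println("employee=" + parts[0] + ", territory=" + parts[1]);
    }
}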
3. Products
Three tables in the Northwind relational database (Products, Categories, Suppliers) are combined to form a single table. Note that the CategoryID column is not really necessary, as all the category information is stored in the same row for each product. It is kept in the design in order to keep a reference to the RDBMS system.
4. Orders
Three tables in the Northwind relational database (Orders, OrderDetails, Shippers) are combined to form a single table. In order to retain the order details of each product in an order, the order details column family uses a composite row key of OrderID and ProductID. Note that the order of the keys in this design allows us to get all the products of a particular OrderID, but not vice versa. If we want to see all the orders of a particular product, we can create an index table with ProductID as the row key and OrderIDs as column qualifiers. This table needs to be updated whenever a new order entry is inserted in the Orders table. As we discussed earlier, the access patterns matter a lot in HBase schema design!
Phoenix
Apache Phoenix is a relational database layer over HBase, delivered as a client-embedded JDBC driver targeting low-latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.
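Since Phoenix is exposed as a JDBC driver, using it from Java is plain JDBC. A minimal sketch (all of the following are assumptions for illustration: the Phoenix client jar is on the classpath, the ZooKeeper quorum is localhost, and a Phoenix table named CUSTOMER with a COUNTRY column exists):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixQueryExample {
    public static void main(String[] args) throws Exception {
        // The JDBC URL points at the ZooKeeper quorum used by HBase
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNTRY, COUNT(*) FROM CUSTOMER GROUP BY COUNTRY")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " -> " + rs.getLong(2));
            }
        }
    }
}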
References
1. HBase: The Definitive Guide, Lars George, O'Reilly Media.
2. Apache HBase Reference Guide, The Apache Software Foundation.
International Journal of Scientific and Research Publications, Volume 5, Issue 7, July 2015 1 Internals of Hadoop Application Framework and Distributed File System Saminath.V, Sangeetha.M.S Abstract- Hadoop
Scaling Up 2 CSE 6242 / CX 4242. Duen Horng (Polo) Chau Georgia Tech. HBase, Hive
CSE 6242 / CX 4242 Scaling Up 2 HBase, Hive Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos, Le
Data storing and data access
Data storing and data access Plan Basic Java API for HBase demo Bulk data loading Hands-on Distributed storage for user files SQL on nosql Summary Basic Java API for HBase import org.apache.hadoop.hbase.*
Introduction to Hadoop. New York Oracle User Group Vikas Sawhney
Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop
Big Data and Scripting Systems build on top of Hadoop
Big Data and Scripting Systems build on top of Hadoop 1, 2, Pig/Latin high-level map reduce programming platform interactive execution of map reduce jobs Pig is the name of the system Pig Latin is the
Hypertable Architecture Overview
WHITE PAPER - MARCH 2012 Hypertable Architecture Overview Hypertable is an open source, scalable NoSQL database modeled after Bigtable, Google s proprietary scalable database. It is written in C++ for
COURSE CONTENT Big Data and Hadoop Training
COURSE CONTENT Big Data and Hadoop Training 1. Meet Hadoop Data! Data Storage and Analysis Comparison with Other Systems RDBMS Grid Computing Volunteer Computing A Brief History of Hadoop Apache Hadoop
Introduction to Hbase Gkavresis Giorgos 1470
Introduction to Hbase Gkavresis Giorgos 1470 Agenda What is Hbase Installation About RDBMS Overview of Hbase Why Hbase instead of RDBMS Architecture of Hbase Hbase interface Summarise What is Hbase Hbase
International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 ISSN 2278-7763
International Journal of Advancements in Research & Technology, Volume 3, Issue 2, February-2014 10 A Discussion on Testing Hadoop Applications Sevuga Perumal Chidambaram ABSTRACT The purpose of analysing
Hadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
Peers Techno log ies Pv t. L td. HADOOP
Page 1 Peers Techno log ies Pv t. L td. Course Brochure Overview Hadoop is a Open Source from Apache, which provides reliable storage and faster process by using the Hadoop distibution file system and
Storage of Structured Data: BigTable and HBase. New Trends In Distributed Systems MSc Software and Systems
Storage of Structured Data: BigTable and HBase 1 HBase and BigTable HBase is Hadoop's counterpart of Google's BigTable BigTable meets the need for a highly scalable storage system for structured data Provides
Scaling Up HBase, Hive, Pegasus
CSE 6242 A / CS 4803 DVA Mar 7, 2013 Scaling Up HBase, Hive, Pegasus Duen Horng (Polo) Chau Georgia Tech Some lectures are partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko,
Integrating VoltDB with Hadoop
The NewSQL database you ll never outgrow Integrating with Hadoop Hadoop is an open source framework for managing and manipulating massive volumes of data. is an database for handling high velocity data.
Cloud Computing at Google. Architecture
Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale
Open source large scale distributed data management with Google s MapReduce and Bigtable
Open source large scale distributed data management with Google s MapReduce and Bigtable Ioannis Konstantinou Email: [email protected] Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory
ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
HADOOP MOCK TEST HADOOP MOCK TEST II
http://www.tutorialspoint.com HADOOP MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to Hadoop Framework. You can download these sample mock tests at
Big Data Development CASSANDRA NoSQL Training - Workshop. March 13 to 17-2016 9 am to 5 pm HOTEL DUBAI GRAND DUBAI
Big Data Development CASSANDRA NoSQL Training - Workshop March 13 to 17-2016 9 am to 5 pm HOTEL DUBAI GRAND DUBAI ISIDUS TECH TEAM FZE PO Box 121109 Dubai UAE, email training-coordinator@isidusnet M: +97150
Getting to know Apache Hadoop
Getting to know Apache Hadoop Oana Denisa Balalau Télécom ParisTech October 13, 2015 1 / 32 Table of Contents 1 Apache Hadoop 2 The Hadoop Distributed File System(HDFS) 3 Application management in the
Practical Cassandra. Vitalii Tymchyshyn [email protected] @tivv00
Practical Cassandra NoSQL key-value vs RDBMS why and when Cassandra architecture Cassandra data model Life without joins or HDD space is cheap today Hardware requirements & deployment hints Vitalii Tymchyshyn
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН
Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН Zettabytes Petabytes ABC Sharding A B C Id Fn Ln Addr 1 Fred Jones Liberty, NY 2 John Smith?????? 122+ NoSQL Database
Integration of Apache Hive and HBase
Integration of Apache Hive and HBase Enis Soztutar enis [at] apache [dot] org @enissoz Page 1 About Me User and committer of Hadoop since 2007 Contributor to Apache Hadoop, HBase, Hive and Gora Joined
Non-Stop for Apache HBase: Active-active region server clusters TECHNICAL BRIEF
Non-Stop for Apache HBase: -active region server clusters TECHNICAL BRIEF Technical Brief: -active region server clusters -active region server clusters HBase is a non-relational database that provides
Big Data Primer. 1 Why Big Data? Alex Sverdlov [email protected]
Big Data Primer Alex Sverdlov [email protected] 1 Why Big Data? Data has value. This immediately leads to: more data has more value, naturally causing datasets to grow rather large, even at small companies.
Hadoop IST 734 SS CHUNG
Hadoop IST 734 SS CHUNG Introduction What is Big Data?? Bulk Amount Unstructured Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per day) If a regular machine need to
Hadoop and ecosystem * 本 文 中 的 言 论 仅 代 表 作 者 个 人 观 点 * 本 文 中 的 一 些 图 例 来 自 于 互 联 网. Information Management. Information Management IBM CDL Lab
IBM CDL Lab Hadoop and ecosystem * 本 文 中 的 言 论 仅 代 表 作 者 个 人 观 点 * 本 文 中 的 一 些 图 例 来 自 于 互 联 网 Information Management 2012 IBM Corporation Agenda Hadoop 技 术 Hadoop 概 述 Hadoop 1.x Hadoop 2.x Hadoop 生 态
Workshop on Hadoop with Big Data
Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly
Big Data With Hadoop
With Saurabh Singh [email protected] The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
Distributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! [email protected]
Big Data Processing, 2014/15 Lecture 5: GFS & HDFS!! Claudia Hauff (Web Information Systems)! [email protected] 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind
A Scalable Data Transformation Framework using the Hadoop Ecosystem
A Scalable Data Transformation Framework using the Hadoop Ecosystem Raj Nair Director Data Platform Kiru Pakkirisamy CTO AGENDA About Penton and Serendio Inc Data Processing at Penton PoC Use Case Functional
THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES
THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES Vincent Garonne, Mario Lassnig, Martin Barisits, Thomas Beermann, Ralph Vigne, Cedric Serfon [email protected] [email protected] XLDB
Hadoop and Map-Reduce. Swati Gore
Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data
Realtime Apache Hadoop at Facebook. Jonathan Gray & Dhruba Borthakur June 14, 2011 at SIGMOD, Athens
Realtime Apache Hadoop at Facebook Jonathan Gray & Dhruba Borthakur June 14, 2011 at SIGMOD, Athens Agenda 1 Why Apache Hadoop and HBase? 2 Quick Introduction to Apache HBase 3 Applications of HBase at
NoSQL and Hadoop Technologies On Oracle Cloud
NoSQL and Hadoop Technologies On Oracle Cloud Vatika Sharma 1, Meenu Dave 2 1 M.Tech. Scholar, Department of CSE, Jagan Nath University, Jaipur, India 2 Assistant Professor, Department of CSE, Jagan Nath
HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM
HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM 1. Introduction 1.1 Big Data Introduction What is Big Data Data Analytics Bigdata Challenges Technologies supported by big data 1.2 Hadoop Introduction
Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
Using distributed technologies to analyze Big Data
Using distributed technologies to analyze Big Data Abhijit Sharma Innovation Lab BMC Software 1 Data Explosion in Data Center Performance / Time Series Data Incoming data rates ~Millions of data points/
HADOOP MOCK TEST HADOOP MOCK TEST I
http://www.tutorialspoint.com HADOOP MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to Hadoop Framework. You can download these sample mock tests at
Data processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
Pro Apache Hadoop. Second Edition. Sameer Wadkar. Madhu Siddalingaiah
Pro Apache Hadoop Second Edition Sameer Wadkar Madhu Siddalingaiah Contents J About the Authors About the Technical Reviewer Acknowledgments Introduction xix xxi xxiii xxv Chapter 1: Motivation for Big
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications
Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications White Paper Table of Contents Overview...3 Replication Types Supported...3 Set-up &
Apache Hadoop FileSystem and its Usage in Facebook
Apache Hadoop FileSystem and its Usage in Facebook Dhruba Borthakur Project Lead, Apache Hadoop Distributed File System [email protected] Presented at Indian Institute of Technology November, 2010 http://www.facebook.com/hadoopfs
Hypertable Goes Realtime at Baidu. Yang Dong [email protected] Sherlock Yang(http://weibo.com/u/2624357843)
Hypertable Goes Realtime at Baidu Yang Dong [email protected] Sherlock Yang(http://weibo.com/u/2624357843) Agenda Motivation Related Work Model Design Evaluation Conclusion 2 Agenda Motivation Related
Integrating Big Data into the Computing Curricula
Integrating Big Data into the Computing Curricula Yasin Silva, Suzanne Dietrich, Jason Reed, Lisa Tsosie Arizona State University http://www.public.asu.edu/~ynsilva/ibigdata/ 1 Overview Motivation Big
BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic
BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop
Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: [email protected] Website: www.qburst.com
Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...
Ankush Cluster Manager - Hadoop2 Technology User Guide
Ankush Cluster Manager - Hadoop2 Technology User Guide Ankush User Manual 1.5 Ankush User s Guide for Hadoop2, Version 1.5 This manual, and the accompanying software and other documentation, is protected
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe
Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh
1 Hadoop: A Framework for Data- Intensive Distributed Computing CS561-Spring 2012 WPI, Mohamed Y. Eltabakh 2 What is Hadoop? Hadoop is a software framework for distributed processing of large datasets
Accelerating Hadoop MapReduce Using an In-Memory Data Grid
Accelerating Hadoop MapReduce Using an In-Memory Data Grid By David L. Brinker and William L. Bain, ScaleOut Software, Inc. 2013 ScaleOut Software, Inc. 12/27/2012 H adoop has been widely embraced for
In Memory Accelerator for MongoDB
In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000
THE HADOOP DISTRIBUTED FILE SYSTEM
THE HADOOP DISTRIBUTED FILE SYSTEM Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler Presented by Alexander Pokluda October 7, 2013 Outline Motivation and Overview of Hadoop Architecture,
ITG Software Engineering
Introduction to Apache Hadoop Course ID: Page 1 Last Updated 12/15/2014 Introduction to Apache Hadoop Course Overview: This 5 day course introduces the student to the Hadoop architecture, file system,
Similarity Search in a Very Large Scale Using Hadoop and HBase
Similarity Search in a Very Large Scale Using Hadoop and HBase Stanislav Barton, Vlastislav Dohnal, Philippe Rigaux LAMSADE - Universite Paris Dauphine, France Internet Memory Foundation, Paris, France
Bigtable is a proven design Underpins 100+ Google services:
Mastering Massive Data Volumes with Hypertable Doug Judd Talk Outline Overview Architecture Performance Evaluation Case Studies Hypertable Overview Massively Scalable Database Modeled after Google s Bigtable
Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013
Petabyte Scale Data at Facebook Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013 Agenda 1 Types of Data 2 Data Model and API for Facebook Graph Data 3 SLTP (Semi-OLTP) and Analytics
Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.
Big Data Hadoop Administration and Developer Course This course is designed to understand and implement the concepts of Big data and Hadoop. This will cover right from setting up Hadoop environment in
Bigdata High Availability (HA) Architecture
Bigdata High Availability (HA) Architecture Introduction This whitepaper describes an HA architecture based on a shared nothing design. Each node uses commodity hardware and has its own local resources
Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
Hadoop Distributed File System. Dhruba Borthakur Apache Hadoop Project Management Committee [email protected] [email protected]
Hadoop Distributed File System Dhruba Borthakur Apache Hadoop Project Management Committee [email protected] [email protected] Hadoop, Why? Need to process huge datasets on large clusters of computers
MongoDB Developer and Administrator Certification Course Agenda
MongoDB Developer and Administrator Certification Course Agenda Lesson 1: NoSQL Database Introduction What is NoSQL? Why NoSQL? Difference Between RDBMS and NoSQL Databases Benefits of NoSQL Types of NoSQL
Benchmarking Cassandra on Violin
Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
Real-Time Handling of Network Monitoring Data Using a Data-Intensive Framework
Real-Time Handling of Network Monitoring Data Using a Data-Intensive Framework Aryan TaheriMonfared Tomasz Wiktor Wlodarczyk Chunming Rong Department of Electrical Engineering and Computer Science University
HareDB HBase Client Web Version USER MANUAL HAREDB TEAM
2013 HareDB HBase Client Web Version USER MANUAL HAREDB TEAM Connect to HBase... 2 Connection... 3 Connection Manager... 3 Add a new Connection... 4 Alter Connection... 6 Delete Connection... 6 Clone Connection...
Hadoop & its Usage at Facebook
Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System [email protected] Presented at the The Israeli Association of Grid Technologies July 15, 2009 Outline Architecture
low-level storage structures e.g. partitions underpinning the warehouse logical table structures
DATA WAREHOUSE PHYSICAL DESIGN The physical design of a data warehouse specifies the: low-level storage structures e.g. partitions underpinning the warehouse logical table structures low-level structures
Working With Hadoop. Important Terminology. Important Terminology. Anatomy of MapReduce Job Run. Important Terminology
Working With Hadoop Now that we covered the basics of MapReduce, let s look at some Hadoop specifics. Mostly based on Tom White s book Hadoop: The Definitive Guide, 3 rd edition Note: We will use the new
Hadoop & its Usage at Facebook
Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System [email protected] Presented at the Storage Developer Conference, Santa Clara September 15, 2009 Outline Introduction
Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming. by Dibyendu Bhattacharya
Near Real Time Indexing Kafka Message to Apache Blur using Spark Streaming by Dibyendu Bhattacharya Pearson : What We Do? We are building a scalable, reliable cloud-based learning platform providing services
Using RDBMS, NoSQL or Hadoop?
Using RDBMS, NoSQL or Hadoop? DOAG Conference 2015 Jean- Pierre Dijcks Big Data Product Management Server Technologies Copyright 2014 Oracle and/or its affiliates. All rights reserved. Data Ingest 2 Ingest
A very short Intro to Hadoop
4 Overview A very short Intro to Hadoop photo by: exfordy, flickr 5 How to Crunch a Petabyte? Lots of disks, spinning all the time Redundancy, since disks die Lots of CPU cores, working all the time Retry,
Agenda. ! Strengths of PostgreSQL. ! Strengths of Hadoop. ! Hadoop Community. ! Use Cases
Postgres & Hadoop Agenda! Strengths of PostgreSQL! Strengths of Hadoop! Hadoop Community! Use Cases Best of Both World Postgres Hadoop World s most advanced open source database solution Enterprise class
CSE-E5430 Scalable Cloud Computing Lecture 2
CSE-E5430 Scalable Cloud Computing Lecture 2 Keijo Heljanko Department of Computer Science School of Science Aalto University [email protected] 14.9-2015 1/36 Google MapReduce A scalable batch processing
Big Data and Scripting map/reduce in Hadoop
Big Data and Scripting map/reduce in Hadoop 1, 2, parts of a Hadoop map/reduce implementation core framework provides customization via indivudual map and reduce functions e.g. implementation in mongodb
Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
Hadoop Project for IDEAL in CS5604
Hadoop Project for IDEAL in CS5604 by Jose Cadena Mengsu Chen Chengyuan Wen {jcadena,mschen,[email protected] Completed as part of the course CS5604: Information storage and retrieval offered by Dr. Edward
Prepared By : Manoj Kumar Joshi & Vikas Sawhney
Prepared By : Manoj Kumar Joshi & Vikas Sawhney General Agenda Introduction to Hadoop Architecture Acknowledgement Thanks to all the authors who left their selfexplanatory images on the internet. Thanks
The Hadoop Distributed File System
The Hadoop Distributed File System The Hadoop Distributed File System, Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler, Yahoo, 2010 Agenda Topic 1: Introduction Topic 2: Architecture
Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk
Benchmarking Couchbase Server for Interactive Applications By Alexey Diomin and Kirill Grigorchuk Contents 1. Introduction... 3 2. A brief overview of Cassandra, MongoDB, and Couchbase... 3 3. Key criteria
