HAWQ Architecture. Alexey Grishchenko

Transcription

1 HAWQ Architecture Alexey Grishchenko

2 Who I am Enterprise Pivotal 7 years in data processing 5 years of experience with MPP 4 years with Hadoop Using HAWQ since the first internal Beta Responsible for designing most of the EMEA HAWQ and Greenplum implementations Spark contributor

3 Agenda What is HAWQ

4 Agenda What is HAWQ Why you need it

5 Agenda What is HAWQ Why you need it HAWQ Components

6 Agenda What is HAWQ Why you need it HAWQ Components HAWQ Design

7 Agenda What is HAWQ Why you need it HAWQ Components HAWQ Design Query execution example

8 Agenda What is HAWQ Why you need it HAWQ Components HAWQ Design Query execution example Competitive solutions

9 What is Analytical SQL-on-Hadoop engine

10 What is Analytical SQL-on-Hadoop engine HAdoop With Queries

11 What is Analytical SQL-on-Hadoop engine HAdoop With Queries Postgres Greenplum HAWQ 2005 Fork Postgres 8.0.2

12 What is Analytical SQL-on-Hadoop engine HAdoop With Queries Postgres Greenplum HAWQ Fork Postgres Rebase Postgres

13 What is Analytical SQL-on-Hadoop engine HAdoop With Queries Postgres Greenplum HAWQ Fork Postgres Rebase Postgres Fork GPDB

14 What is Analytical SQL-on-Hadoop engine HAdoop With Queries Postgres Greenplum HAWQ Fork Postgres Rebase Postgres Fork GPDB HAWQ

15 What is Analytical SQL-on-Hadoop engine HAdoop With Queries Postgres Greenplum HAWQ Fork Postgres Rebase Postgres Fork GPDB HAWQ HAWQ Open Source

16 HAWQ is C and C++ lines of code

17 HAWQ is C and C++ lines of code of them in headers only

18 HAWQ is C and C++ lines of code of them in headers only Python LOC

19 HAWQ is C and C++ lines of code of them in headers only Python LOC Java LOC

20 HAWQ is C and C++ lines of code of them in headers only Python LOC Java LOC Makefile LOC

21 HAWQ is C and C++ lines of code of them in headers only Python LOC Java LOC Makefile LOC Shell scripts LOC

22 HAWQ is C and C++ lines of code of them in headers only Python LOC Java LOC Makefile LOC Shell scripts LOC More than 50 enterprise customers

23 HAWQ is C and C++ lines of code of them in headers only Python LOC Java LOC Makefile LOC Shell scripts LOC More than 50 enterprise customers More than 10 of them in EMEA

24 Apache HAWQ Apache HAWQ (incubating) from What s in Open Source Sources of HAWQ 2.0 alpha HAWQ 2.0 beta is planned for 2015 Q4 HAWQ 2.0 GA is planned for 2016 Q1 Community is yet young come and join!

25 Why do we need it?

26 Why do we need it? SQL-interface for BI solutions to the Hadoop data complaint with ANSI SQL-92, -99, -2003

27 Why do we need it? SQL-interface for BI solutions to the Hadoop data complaint with ANSI SQL-92, -99, Example line query with a number of window function generated by Cognos

28 Why do we need it? SQL-interface for BI solutions to the Hadoop data complaint with ANSI SQL-92, -99, Example line query with a number of window function generated by Cognos Universal tool for ad hoc analytics on top of Hadoop data

29 Why do we need it? SQL-interface for BI solutions to the Hadoop data complaint with ANSI SQL-92, -99, Example line query with a number of window function generated by Cognos Universal tool for ad hoc analytics on top of Hadoop data Example - parse URL to extract protocol, host name, port, GET parameters

30 Why do we need it? SQL-interface for BI solutions to the Hadoop data complaint with ANSI SQL-92, -99, Example line query with a number of window function generated by Cognos Universal tool for ad hoc analytics on top of Hadoop data Example - parse URL to extract protocol, host name, port, GET parameters Good performance

31 Why do we need it? SQL-interface for BI solutions to the Hadoop data complaint with ANSI SQL-92, -99, Example line query with a number of window function generated by Cognos Universal tool for ad hoc analytics on top of Hadoop data Example - parse URL to extract protocol, host name, port, GET parameters Good performance How many times the data would hit the HDD during a single Hive query?

32 HAWQ Cluster Server 1 Server 2 Server 3 Server 4 ZK JM ZK JM ZK JM NameNode SNameNode interconnect Server 5 Server 6 Server N Datanode Datanode Datanode

33 HAWQ Cluster Server 1 Server 2 Server 3 Server 4 YARN RM YARN App Timeline ZK JM ZK JM ZK JM NameNode SNameNode interconnect Server 5 Server 6 Server N YARN NM YARN NM YARN NM Datanode Datanode Datanode

34 HAWQ Cluster Server 1 Server 2 Server 3 Server 4 YARN RM YARN App Timeline ZK JM ZK JM ZK JM HAWQ Master HAWQ Standby NameNode SNameNode interconnect Server 5 Server 6 Server N YARN NM YARN NM YARN NM Datanode Datanode Datanode

35 Master Servers Server 1 Server 2 Server 3 Server 4 YARN RM YARN App Timeline ZK JM ZK JM ZK JM HAWQ Master HAWQ Standby NameNode SNameNode interconnect Server 5 Server 6 Server N YARN NM YARN NM YARN NM Datanode Datanode Datanode

36 Master Servers HAWQ Master HAWQ Standby Master Query Parser Distributed TransacJons Manager Query Parser Distributed Transac,ons Manager Query OpJmizer Metadata Catalog WAL repl. Query Op,mizer Metadata Catalog Global Resource Manager Query Dispatch Global Resource Manager Query Dispatch

37 Segments Server 1 Server 2 Server 3 Server 4 YARN RM YARN App Timeline ZK JM ZK JM ZK JM HAWQ Master HAWQ Standby NameNode SNameNode interconnect Server 5 Server 6 Server N YARN NM YARN NM YARN NM Datanode Datanode Datanode

38 Segments Query Executor libhdfs3 PXF YARN Node Manager Local Filesystem Temporary Data Directory Logs

39 Metadata HAWQ metadata structure is similar to Postgres catalog structure

40 Metadata HAWQ metadata structure is similar to Postgres catalog structure Statistics Number of rows and pages in the table

41 Metadata HAWQ metadata structure is similar to Postgres catalog structure Statistics Number of rows and pages in the table Most common values for each field

42 Metadata HAWQ metadata structure is similar to Postgres catalog structure Statistics Number of rows and pages in the table Most common values for each field Histogram of values distribution for each field

43 Metadata HAWQ metadata structure is similar to Postgres catalog structure Statistics Number of rows and pages in the table Most common values for each field Histogram of values distribution for each field Number of unique values in the field

44 Metadata HAWQ metadata structure is similar to Postgres catalog structure Statistics Number of rows and pages in the table Most common values for each field Histogram of values distribution for each field Number of unique values in the field Number of null values in the field

45 Metadata HAWQ metadata structure is similar to Postgres catalog structure Statistics Number of rows and pages in the table Most common values for each field Histogram of values distribution for each field Number of unique values in the field Number of null values in the field Average width of the field in bytes

46 Statistics No Statistics How many rows would produce the join of two tables?

47 Statistics No Statistics How many rows would produce the join of two tables? ü From 0 to infinity

48 Statistics No Statistics How many rows would produce the join of two tables? ü From 0 to infinity Row Count How many rows would produce the join of two row tables?

49 Statistics No Statistics How many rows would produce the join of two tables? ü From 0 to infinity Row Count How many rows would produce the join of two row tables? ü From 0 to

50 Statistics No Statistics How many rows would produce the join of two tables? ü From 0 to infinity Row Count How many rows would produce the join of two row tables? ü From 0 to Histograms and MCV How many rows would produce the join of two row tables, with known field cardinality, values distribution diagram, number of nulls, most common values?

51 Statistics No Statistics How many rows would produce the join of two tables? ü From 0 to infinity Row Count How many rows would produce the join of two row tables? ü From 0 to Histograms and MCV How many rows would produce the join of two row tables, with known field cardinality, values distribution diagram, number of nulls, most common values? ü ~ From 500 to 1 500

52 Metadata Table structure information ID Name Num Price 1 Яблоко Груша Банан Апельсин Киви Арбуз Дыня Ананас 35 90

53 Metadata Table structure information Distribution fields ID Name Num Price 1 Яблоко Груша Банан Апельсин Киви Арбуз Дыня Ананас hash(id)

54 Metadata Table structure information Distribution fields Number of hash buckets ID Name Num Price 1 Яблоко Груша Банан Апельсин Киви Арбуз Дыня Ананас hash(id) ID Name Num Price 1 Яблоко Груша Банан Апельсин Киви Арбуз Дыня Ананас 35 90

55 Metadata Table structure information Distribution fields Number of hash buckets Partitioning (hash, list, range) ID Name Num Price 1 Яблоко Груша Банан Апельсин Киви Арбуз Дыня Ананас hash(id) ID Name Num Price 1 Яблоко Груша Банан Апельсин Киви Арбуз Дыня Ананас 35 90

56 Metadata Table structure information Distribution fields Number of hash buckets Partitioning (hash, list, range) General metadata Users and groups

57 Metadata Table structure information Distribution fields Number of hash buckets Partitioning (hash, list, range) General metadata Users and groups Access privileges

58 Metadata Table structure information Distribution fields Number of hash buckets Partitioning (hash, list, range) General metadata Users and groups Access privileges Stored procedures PL/pgSQL, PL/Java, PL/Python, PL/Perl, PL/R

59 Query Optimizer HAWQ uses cost-based query optimizers

60 Query Optimizer HAWQ uses cost-based query optimizers You have two options Planner evolved from the Postgres query optimizer ORCA (Pivotal Query Optimizer) developed specifically for HAWQ

61 Query Optimizer HAWQ uses cost-based query optimizers You have two options Planner evolved from the Postgres query optimizer ORCA (Pivotal Query Optimizer) developed specifically for HAWQ Optimizer hints work just like in Postgres Enable/disable specific operation Change the cost estimations for basic actions

62 Storage Formats Which storage format is the most optimal?

63 Storage Formats Which storage format is the most optimal? ü It depends on what you mean by optimal

64 Storage Formats Which storage format is the most optimal? ü It depends on what you mean by optimal Minimal CPU usage for reading and writing the data

65 Storage Formats Which storage format is the most optimal? ü It depends on what you mean by optimal Minimal CPU usage for reading and writing the data Minimal disk space usage

66 Storage Formats Which storage format is the most optimal? ü It depends on what you mean by optimal Minimal CPU usage for reading and writing the data Minimal disk space usage Minimal time to retrieve record by key

67 Storage Formats Which storage format is the most optimal? ü It depends on what you mean by optimal Minimal CPU usage for reading and writing the data Minimal disk space usage Minimal time to retrieve record by key Minimal time to retrieve subset of columns etc.

68 Storage Formats Row-based storage format Similar to Postgres heap storage No toast No ctid, xmin, xmax, cmin, cmax

69 Storage Formats Row-based storage format Similar to Postgres heap storage No toast No ctid, xmin, xmax, cmin, cmax Compression No compression Quicklz Zlib levels 1-9

70 Storage Formats Apache Parquet Mixed row-columnar table store, the data is split into row groups stored in columnar format

71 Storage Formats Apache Parquet Mixed row-columnar table store, the data is split into row groups stored in columnar format Compression No compression Snappy Gzip levels 1 9

72 Storage Formats Apache Parquet Mixed row-columnar table store, the data is split into row groups stored in columnar format Compression No compression Snappy Gzip levels 1 9 The size of row group and page size can be set for each table separately

73 Resource Management Two main options Static resource split HAWQ and YARN does not know about each other

74 Resource Management Two main options Static resource split HAWQ and YARN does not know about each other YARN HAWQ asks YARN Resource Manager for query execution resources

75 Resource Management Two main options Static resource split HAWQ and YARN does not know about each other YARN HAWQ asks YARN Resource Manager for query execution resources Flexible cluster utilization Query might run on a subset of nodes if it is small

76 Resource Management Two main options Static resource split HAWQ and YARN does not know about each other YARN HAWQ asks YARN Resource Manager for query execution resources Flexible cluster utilization Query might run on a subset of nodes if it is small Query might have many executors on each cluster node to make it run faster

77 Resource Management Two main options Static resource split HAWQ and YARN does not know about each other YARN HAWQ asks YARN Resource Manager for query execution resources Flexible cluster utilization Query might run on a subset of nodes if it is small Query might have many executors on each cluster node to make it run faster You can control the parallelism of each query

78 Resource Management Resource Queue can be set with Maximum number of parallel queries

79 Resource Management Resource Queue can be set with Maximum number of parallel queries CPU usage priority

80 Resource Management Resource Queue can be set with Maximum number of parallel queries CPU usage priority Memory usage limits

81 Resource Management Resource Queue can be set with Maximum number of parallel queries CPU usage priority Memory usage limits CPU cores usage limit

82 Resource Management Resource Queue can be set with Maximum number of parallel queries CPU usage priority Memory usage limits CPU cores usage limit MIN/MAX number of executors across the system

83 Resource Management Resource Queue can be set with Maximum number of parallel queries CPU usage priority Memory usage limits CPU cores usage limit MIN/MAX number of executors across the system MIN/MAX number of executors on each node

84 Resource Management Resource Queue can be set with Maximum number of parallel queries CPU usage priority Memory usage limits CPU cores usage limit MIN/MAX number of executors across the system MIN/MAX number of executors on each node Can be set up for user or group

85 External Data PXF Framework for external data access Easy to extend, many public plugins available Official plugins: CSV, SequenceFile, Avro, Hive, HBase Open Source plugins: JSON, Accumulo, Cassandra, JDBC, Redis, Pipe

86 External Data PXF Framework for external data access Easy to extend, many public plugins available Official plugins: CSV, SequenceFile, Avro, Hive, HBase Open Source plugins: JSON, Accumulo, Cassandra, JDBC, Redis, Pipe HCatalog HAWQ can query tables from HCatalog the same way as HAWQ native tables

87 Query Example HAWQ Master Query Parser Query OpJmizer YARN RM Metadata TransacJon Mgr. Resource Mgr. Query Dispatch NameNode Server 1 Server 2 Server N Plan Resource Prepare Execute Result Cleanup

88 Query Example HAWQ Master Query Parser Query OpJmizer YARN RM Metadata TransacJon Mgr. QE Resource Mgr. Query Dispatch NameNode Server 1 Server 2 Server N Plan Resource Prepare Execute Result Cleanup

92 Query Example HAWQ Master Query Parser Query OpJmizer Motion Gather Project s.beer, s.price YARN RM Metadata Resource Mgr. HashJoin b.name = s.bar TransacJon Mgr. QE Query Dispatch s Scan Sells Motion Redist(b.name) Filter b.city = 'San Francisco' b Scan Bars NameNode Server 1 Server 2 Server N Plan Resource Prepare Execute Result Cleanup

93 Query Example HAWQ Master Query Parser Query OpJmizer YARN RM Metadata Resource Mgr. TransacJon Mgr. QE Query Dispatch NameNode Server 1 Server 2 Server N Plan Resource Prepare Execute Result Cleanup

94 Query Example HAWQ Master Query Parser Query OpJmizer YARN RM Metadata TransacJon Mgr. QE Resource Mgr. Query Dispatch I need 5 containers Each with 1 CPU core and 256 MB RAM NameNode Server 1 Server 2 Server N Plan Resource Prepare Execute Result Cleanup

95 Query Example HAWQ Master Query Parser Query OpJmizer Server 1: 2 containers Server 2: 1 container Server N: 2 containers YARN RM Metadata TransacJon Mgr. QE Resource Mgr. Query Dispatch I need 5 containers Each with 1 CPU core and 256 MB RAM NameNode Server 1 Server 2 Server N Plan Resource Prepare Execute Result Cleanup

98 Query Example HAWQ Master Query Parser Query OpJmizer Server 1: 2 containers Server 2: 1 container Server N: 2 containers YARN RM Metadata TransacJon Mgr. QE Resource Mgr. Query Dispatch I need 5 containers Each with 1 CPU core and 256 MB RAM NameNode Server 1 Server 2 Server N QE QE QE QE QE Plan Resource Prepare Execute Result Cleanup

99 Query Example HAWQ Master Query Parser Query OpJmizer YARN RM Metadata Resource Mgr. TransacJon Mgr. QE Query Dispatch NameNode Server 1 Server 2 Server N QE QE QE QE QE Plan Resource Prepare Execute Result Cleanup

100 Query Example HAWQ Master Query Parser Query OpJmizer Motion Gather Project s.beer, s.price YARN RM Metadata Resource Mgr. HashJoin b.name = s.bar TransacJon Mgr. QE Query Dispatch s Scan Sells Motion Redist(b.name) Filter b.city = 'San Francisco' b Scan Bars NameNode Server 1 Server 2 Server N QE QE QE QE QE Plan Resource Prepare Execute Result Cleanup

109 Query Example HAWQ Master Query Parser Query OpJmizer YARN RM Metadata TransacJon Mgr. QE Resource Mgr. Query Dispatch Free query resources Server 1: 2 containers Server 2: 1 container Server N: 2 containers NameNode Server 1 Server 2 Server N QE QE QE QE QE Plan Resource Prepare Execute Result Cleanup

110 Query Example HAWQ Master Query Parser Query OpJmizer OK YARN RM Metadata TransacJon Mgr. QE Resource Mgr. Query Dispatch Free query resources Server 1: 2 containers Server 2: 1 container Server N: 2 containers NameNode Server 1 Server 2 Server N QE QE QE QE QE Plan Resource Prepare Execute Result Cleanup

114 Query Example HAWQ Master Query Parser Query OpJmizer YARN RM Metadata TransacJon Mgr. Resource Mgr. Query Dispatch NameNode Server 1 Server 2 Server N Plan Resource Prepare Execute Result Cleanup

115 Query Performance Data does not hit the disk unless this cannot be avoided

116 Query Performance Data does not hit the disk unless this cannot be avoided Data is not buffered on the segments unless this cannot be avoided

117 Query Performance Data does not hit the disk unless this cannot be avoided Data is not buffered on the segments unless this cannot be avoided Data is transferred between the nodes by UDP

118 Query Performance Data does not hit the disk unless this cannot be avoided Data is not buffered on the segments unless this cannot be avoided Data is transferred between the nodes by UDP HAWQ has a good cost-based query optimizer

119 Query Performance Data does not hit the disk unless this cannot be avoided Data is not buffered on the segments unless this cannot be avoided Data is transferred between the nodes by UDP HAWQ has a good cost-based query optimizer C/C++ implementation is more efficient than Java implementation of competitive solutions

120 Query Performance Data does not hit the disk unless this cannot be avoided Data is not buffered on the segments unless this cannot be avoided Data is transferred between the nodes by UDP HAWQ has a good cost-based query optimizer C/C++ implementation is more efficient than Java implementation of competitive solutions Query parallelism can be easily tuned

121 Competitive Solutions Query OpJmizer Hive SparkSQL Impala HAWQ

122 Competitive Solutions Hive SparkSQL Impala HAWQ Query OpJmizer ANSI SQL

123 Competitive Solutions Hive SparkSQL Impala HAWQ Query OpJmizer ANSI SQL Built-in Languages

124 Competitive Solutions Hive SparkSQL Impala HAWQ Query OpJmizer ANSI SQL Built-in Languages Disk IO

125 Competitive Solutions Hive SparkSQL Impala HAWQ Query OpJmizer ANSI SQL Built-in Languages Disk IO Parallelism

126 Competitive Solutions Hive SparkSQL Impala HAWQ Query OpJmizer ANSI SQL Built-in Languages Disk IO Parallelism DistribuJons

127 Competitive Solutions Hive SparkSQL Impala HAWQ Query OpJmizer ANSI SQL Built-in Languages Disk IO Parallelism DistribuJons Stability

128 Competitive Solutions Hive SparkSQL Impala HAWQ Query OpJmizer ANSI SQL Built-in Languages Disk IO Parallelism DistribuJons Stability Community

129 Roadmap AWS and S3 integration

130 Roadmap AWS and S3 integration Mesos integration

131 Roadmap AWS and S3 integration Mesos integration Better Ambari integration

132 Roadmap AWS and S3 integration Mesos integration Better Ambari integration Cloudera, MapR and IBM Hadoop distributions native support

133 Roadmap AWS and S3 integration Mesos integration Better Ambari integration Cloudera, MapR and IBM Hadoop distributions native support Make the SQL-on-Hadoop engine ever!

134 Summary Modern SQL-on-Hadoop engine For structured data processing and analysis Combines the best techniques of competitive solutions Just released to the open source Community is very young Join our community and contribute!

135 Questions Apache HAWQ Reach me on