adaptive loading adaptive indexing dbtouch 3 Ideas for Big Data Exploration

Size: px
Start display at page:

Download "adaptive loading adaptive indexing dbtouch 3 Ideas for Big Data Exploration"

Transcription

1 adaptive loading adaptive indexing dbtouch Ideas for Big Data Exploration Stratos Idreos CWI, INS-, Amsterdam

2

3

4

5 data is everywhere

6 daily data years

7 daily data years Eric Schmidt: Every two days we create as much data as much we did from dawn of humanity to 00

8 - exabytes daily data years 0 Eric Schmidt: Every two days we create as much data as much we did from dawn of humanity to 00

9

10 not always sure what we are looking for (until we find it)

11 not always sure what we are looking for (until we find it) data exploration

12 database systems great... declarative processing, back-end to numerous apps

13 database systems great... declarative processing, back-end to numerous apps but databases are too heavy and blind!

14 database systems great... declarative processing, back-end to numerous apps but databases are too heavy and blind! timeline

15 database systems great... declarative processing, back-end to numerous apps but databases are too heavy and blind! load timeline

16 database systems great... declarative processing, back-end to numerous apps but databases are too heavy and blind! load index timeline

17 database systems great... declarative processing, back-end to numerous apps but databases are too heavy and blind! load index query timeline

18 expert users - idle time - workload knowledge database systems great... declarative processing, back-end to numerous apps but databases are too heavy and blind! load index query timeline

19

20 systems tailored for data exploration

21 no workload knowledge systems tailored for data exploration

22 no workload knowledge no installation steps systems tailored for data exploration

23 no workload knowledge no installation steps quick response times systems tailored for data exploration

24 no workload knowledge no installation steps quick response times minimize data-to-query time systems tailored for data exploration

25 ideas adaptive indexing ACM SIGMOD Jim Gray Diss. Award - Cor Baayen Award years, papers

26 ideas adaptive indexing ACM SIGMOD Jim Gray Diss. Award - Cor Baayen Award years, papers adaptive loading In use in Hadapt and Tableau years, papers

27 ideas adaptive indexing ACM SIGMOD Jim Gray Diss. Award - Cor Baayen Award years, papers adaptive loading In use in Hadapt and Tableau years, papers dbtouch VENI 0 year, paper

28 ideas adaptive indexing ACM SIGMOD Jim Gray Diss. Award - Cor Baayen Award years, papers adaptive loading In use in Hadapt and Tableau years, papers dbtouch VENI 0 year, paper Martin Kersten, Stefan Manegold, Felix Halim, Panagiotis Karras, Roland Yap, Goetz Graefe, Harumi Kuno, Eleni Petraki, Anastasia Ailamaki, Ioannis Alagiannis, Renata Borovica, Miguel Branco, Ryan Johnson, Erietta Liarou

29 indexing load index query tune= create proper indexes offline performance 0-00X

30 indexing load index query tune= create proper indexes offline performance 0-00X but it depends on workload! which indices to build? on which data parts? and when to build them?

31 load index query

32 load index query timeline

33 load index query sample workload timeline

34 load index query sample workload analyze timeline

35 load index query sample workload analyze create indices timeline

36 load index query sample workload analyze create indices query timeline

37 load index query sample workload analyze create indices query timeline complex and time consuming process

38 load index query human administrators + auto-tuning tools sample workload analyze create indices query timeline complex and time consuming process

39 what can go wrong?

40 what can go wrong? not enough idle time to finish proper tuning

41 what can go wrong? not enough idle time to finish proper tuning by the time we finish tuning, the workload changes

42 database cracking

43 database cracking remove all tuning steps and need for human input incremental, adaptive, partial indexing

44 initialization indexing database cracking remove all tuning steps and need for human input incremental, adaptive, partial indexing

45 initialization query indexing database cracking remove all tuning steps and need for human input incremental, adaptive, partial indexing

46 initialization query indexing database cracking remove all tuning steps and need for human input incremental, adaptive, partial indexing

47 initialization query indexing database cracking remove all tuning steps and need for human input incremental, adaptive, partial indexing every query is treated as an advice on how data should be stored

48 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

49 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

50 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

51 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

52 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

53 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

54 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

55 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

56 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

57 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Result tuples Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A

58 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Result tuples Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

59 Database Cracking CIDR 00 cracking example gain knowledge on how data is organized Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Result tuples Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

60 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

61 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

62 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

63 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

64 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

65 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

66 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

67 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

68 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Dynamically/on-the-fly within the select-operator

69 Database Cracking CIDR 00 cracking example Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Result tuples Dynamically/on-the-fly within the select-operator

70 Database Cracking CIDR 00 cracking example the more we crack, the more we learn Column A Cracker column of A Cracker column of A Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Q (copy) Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Result tuples Dynamically/on-the-fly within the select-operator

71 Database Cracking CIDR 00 cracking example the more we crack, the more we learn Q: select * from R where R.A > 0 and R.A < Q: select * from R where R.A > and R.A <= Column A Q (copy) Cracker column of A Piece : A <= 0 Q (in place) Piece : 0 < A < Piece : <= A Cracker column of A every query is treated as an advice on how data should be stored Piece : A <= Piece : < A <= 0 Piece : 0 < A < Piece : <= A <= Piece 5: < A Result tuples Dynamically/on-the-fly within the select-operator

72 Sideways Cracking, SIGMOD 0 TPC-H 0000 MonetDB Presorted Sel. Crack Sid. Crack MySQL Presorted Response time (milli secs) TPC-H Query Query sequence

73 Sideways Cracking, SIGMOD 0 TPC-H 0000 MonetDB Presorted Sel. Crack Sid. Crack MySQL Presorted Response time (milli secs) TPC-H Query 5 Normal MonetDB selection cracking Query sequence

74 Sideways Cracking, SIGMOD 0 TPC-H 0000 MonetDB Presorted Sel. Crack Sid. Crack MySQL Presorted Response time (milli secs) TPC-H Query 5 Normal MonetDB selection cracking Presorted MonetDB Preparation cost - minutes Query sequence

75 Sideways Cracking, SIGMOD 0 TPC-H 0000 MonetDB Presorted Sel. Crack Sid. Crack MySQL Presorted Response time (milli secs) Presorted MonetDB Preparation cost - minutes TPC-H Query Query sequence Normal MonetDB selection cracking MonetDB with sideways cracking

76 Sideways Cracking, SIGMOD 0 TPC-H 0000 MonetDB Presorted Sel. Crack Sid. Crack MySQL Presorted Response time (milli secs) Presorted MonetDB Preparation cost - minutes TPC-H Query Query sequence Normal MonetDB selection cracking MonetDB with sideways cracking

77 Sideways Cracking, SIGMOD 0 TPC-H 0000 MonetDB Presorted Sel. Crack Sid. Crack MySQL Presorted Response time (milli secs) Presorted MonetDB Preparation cost - minutes TPC-H Query Query sequence Normal MonetDB selection cracking MonetDB with sideways cracking

78 Sideways Cracking, SIGMOD 0 TPC-H 0000 MonetDB Presorted Sel. Crack Sid. Crack MySQL Presorted Response time (milli secs) Presorted MonetDB Preparation cost - minutes TPC-H Query Query sequence Normal MonetDB selection cracking MonetDB with sideways cracking

79 Stochastic Cracking, PVLDB skyserver experiments cracking answers queries while full indexing is still half way creating the index

80 cracking databases updates (SIGMOD0) multiple columns (SIGMOD0) storage restrictions (SIGMOD0) benchmarking (TPCTC0) algorithms (PVLDB) concurrency control (PVLDB) robustness (PVLDB)

81 cracking databases updates (SIGMOD0) multiple columns (SIGMOD0) storage restrictions (SIGMOD0) benchmarking (TPCTC0) joins disk based multi-dimensional holistic indexing optimization algorithms (PVLDB) row-stores multi-cores concurrency control (PVLDB) robustness (PVLDB) aggregations vectorwised

82 loading load index query

83 loading load index query copy data inside the database database now has full control

84 loading load index query copy data inside the database database now has full control slow process...not all data might be needed all the time

85 database vs. unix tools file, attributes, billion tuples DB Awk single query cost (secs) 0 550,00,50,00

86 database vs. unix tools file, attributes, billion tuples DB Awk single query cost (secs) 0 550,00,50,00 break down db cost Loading Query Processing % %

87 database vs. unix tools file, attributes, billion tuples DB Awk single query cost (secs) 0 550,00,50,00 break down db cost Loading Query Processing % loading is a major bottleneck %

88 database vs. unix tools file, attributes, billion tuples DB Awk single query cost (secs) 0 550,00,50,00 break down db cost Loading Query Processing % loading is a major bottleneck %... but writing/maintaining scripts is hard too

89 adaptive loading load/touch only what is needed and only when it is needed

90 but raw data access is expensive tokenizing - parsing - no indexing - no statistics

91 but raw data access is expensive tokenizing - parsing - no indexing - no statistics challenge: fast raw data access

92 query plan

93 query plan scan

94 query plan scan db

95 query plan scan db

96 query plan scan zero loading cost

97 query plan scan files access raw data adaptively on-the-fly zero loading cost

98 query plan scan files cache access raw data adaptively on-the-fly zero loading cost

99 selective parsing - file indexing file splitting - online statistics query plan scan files cache access raw data adaptively on-the-fly zero loading cost

100 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C 0

101 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C 0

102 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C 0

103 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C 0

104 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C 0

105 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C 0

106 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C 0

107 NoDB SIGMOD 0 Execution Time (sec) ~500 ~500 Q0 Q Q Q Q Q5 Q Q Q Q Q0 Q Q Q Q Q5 Q Q Q Q Load 00 0 MySQL CSV Engine MySQL DBMS X DBMS X w/ external files PostgreSQL PostgresRaw PM + C reducing data-to-query time 0

108 querying load index query

109 querying load index query declarative SQL interface

110 querying load index query declarative SQL interface correct and complete answers

111 querying load index query declarative SQL interface correct and complete answers

112 querying complex and slow - not fit for exploration load index query declarative SQL interface correct and complete answers

113 dbtouch dbtouch CIDR 0

114 dbtouch dbtouch CIDR 0 data

115 dbtouch dbtouch CIDR 0 data query

116 dbtouch dbtouch CIDR 0 no expert knowledge - no set-up costs interactive mode fit for exploration data query

117 dbtouch CIDR 0 challenge design database systems tailored for touch exploration

118 dbtouch CIDR 0 challenge design database systems tailored for touch exploration user drives query processing actions

119 adaptive loading adaptive indexing dbtouch Ideas for Big Data Exploration Stratos Idreos CWI, INS-, Amsterdam

120 adaptive loading adaptive indexing dbtouch Ideas for Big Data Exploration Stratos Idreos CWI, INS-, Amsterdam systems tailored for exploration

121 more in INS- data vaults sciborg cyclotron sciql holistic indexing datacell charles hardware-conscious

122 more in INS- data vaults sciborg cyclotron sciql holistic indexing Thank you! datacell charles hardware-conscious

Big Data Exploration By Stratos Idreos CWI, Amsterdam, The Netherlands. Introduction. In Need for Big Data Query Processing

Big Data Exploration By Stratos Idreos CWI, Amsterdam, The Netherlands. Introduction. In Need for Big Data Query Processing Big Data Exploration By Stratos Idreos CWI, Amsterdam, The Netherlands Introduction The Big Data Era. We are now entering the era of data deluge, where the amount of data outgrows the capabilities of query

More information

NoDB: Efficient Query Execution on Raw Data Files

NoDB: Efficient Query Execution on Raw Data Files NoDB: Efficient Query Execution on Raw Data Files Ioannis Alagiannis Renata Borovica Miguel Branco Stratos Idreos Anastasia Ailamaki EPFL, Switzerland {ioannis.alagiannis, renata.borovica, miguel.branco,

More information

class 1 welcome to CS265! BIG DATA SYSTEMS prof. Stratos Idreos

class 1 welcome to CS265! BIG DATA SYSTEMS prof. Stratos Idreos class 1 welcome to CS265 BIG DATA SYSTEMS prof. plan for today big data + systems class structure + logistics who is who project ideas 2 data vs knowledge 3 big data 4 daily data 2.5 exabytes years 2012

More information

Overview of Data Exploration Techniques

Overview of Data Exploration Techniques Overview of Data Exploration Techniques Stratos Idreos Harvard University stratos@seas.harvard.edu Olga Papaemmanouil Brandeis University olga@cs.brandeis.edu Surajit Chaudhuri Microsoft Research surajitc@microsoft.com

More information

PostgreSQL Business Intelligence & Performance Simon Riggs CTO, 2ndQuadrant PostgreSQL Major Contributor

PostgreSQL Business Intelligence & Performance Simon Riggs CTO, 2ndQuadrant PostgreSQL Major Contributor PostgreSQL Business Intelligence & Performance Simon Riggs CTO, 2ndQuadrant PostgreSQL Major Contributor The research leading to these results has received funding from the European Union's Seventh Framework

More information

dbtouch: Analytics at your Fingertips

dbtouch: Analytics at your Fingertips dbtouch: Analytics at your Fingertips Stratos Idreos Erietta Liarou CWI, Amsterdam EPFL, Lausanne stratos.idreos@cwi.nl erietta.liarou@epfl.ch ABSTRACT As we enter the era of data deluge, turning data

More information

SWISSBOX REVISITING THE DATA PROCESSING SOFTWARE STACK

SWISSBOX REVISITING THE DATA PROCESSING SOFTWARE STACK 3/2/2011 SWISSBOX REVISITING THE DATA PROCESSING SOFTWARE STACK Systems Group Dept. of Computer Science ETH Zürich, Switzerland SwissBox Humboldt University Dec. 2010 Systems Group = www.systems.ethz.ch

More information

Report Data Management in the Cloud: Limitations and Opportunities

Report Data Management in the Cloud: Limitations and Opportunities Report Data Management in the Cloud: Limitations and Opportunities Article by Daniel J. Abadi [1] Report by Lukas Probst January 4, 2013 In this report I want to summarize Daniel J. Abadi's article [1]

More information

DiNoDB: Efficient Large-Scale Raw Data Analytics

DiNoDB: Efficient Large-Scale Raw Data Analytics DiNoDB: Efficient Large-Scale Raw Data Analytics Yongchao Tian Eurecom yongchao.tian@eurecom.fr Anastasia Ailamaki EPFL anastasia.ailamaki@epfl.ch Ioannis Alagiannis EPFL ioannis.alagiannis@epfl.ch Pietro

More information

iservdb The database closest to you IDEAS Institute

iservdb The database closest to you IDEAS Institute iservdb The database closest to you IDEAS Institute 1 Overview 2 Long-term Anticipation iservdb is a relational database SQL compliance and a general purpose database Data is reliable and consistency iservdb

More information

NUMA obliviousness through memory mapping

NUMA obliviousness through memory mapping NUMA obliviousness through memory mapping Mrunal Gawade CWI, Amsterdam mrunal.gawade@cwi.nl Martin Kersten CWI, Amsterdam martin.kersten@cwi.nl ABSTRACT With the rise of multi-socket multi-core CPUs a

More information

Telemetry Database Query Performance Review

Telemetry Database Query Performance Review Telemetry Database Query Performance Review SophosLabs Network Security Group Michael Shannon Vulnerability Research Manager, SophosLabs michael.shannon@sophos.com Christopher Benninger Linux Deep Packet

More information

Arrays in database systems, the next frontier?

Arrays in database systems, the next frontier? Arrays in database systems, the next frontier? What is an array? An array is a systematic arrangement of objects, usually in rows and columns Get(A, X, Y) => Value Set(A, X, Y)

More information

Hadoop s Entry into the Traditional Analytical DBMS Market. Daniel Abadi Yale University August 3 rd, 2010

Hadoop s Entry into the Traditional Analytical DBMS Market. Daniel Abadi Yale University August 3 rd, 2010 Hadoop s Entry into the Traditional Analytical DBMS Market Daniel Abadi Yale University August 3 rd, 2010 Data, Data, Everywhere Data explosion Web 2.0 more user data More devices that sense data More

More information

Integrating Apache Spark with an Enterprise Data Warehouse

Integrating Apache Spark with an Enterprise Data Warehouse Integrating Apache Spark with an Enterprise Warehouse Dr. Michael Wurst, IBM Corporation Architect Spark/R/Python base Integration, In-base Analytics Dr. Toni Bollinger, IBM Corporation Senior Software

More information

Master s Thesis Nr. 14

Master s Thesis Nr. 14 Master s Thesis Nr. 14 Systems Group, Department of Computer Science, ETH Zurich Performance Benchmarking of Stream Processing Architectures with Persistence and Enrichment Support by Ilkay Ozan Kaya Supervised

More information

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013

Petabyte Scale Data at Facebook. Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013 Petabyte Scale Data at Facebook Dhruba Borthakur, Engineer at Facebook, SIGMOD, New York, June 2013 Agenda 1 Types of Data 2 Data Model and API for Facebook Graph Data 3 SLTP (Semi-OLTP) and Analytics

More information

SQL Server Performance Tuning and Optimization

SQL Server Performance Tuning and Optimization 3 Riverchase Office Plaza Hoover, Alabama 35244 Phone: 205.989.4944 Fax: 855.317.2187 E-Mail: rwhitney@discoveritt.com Web: www.discoveritt.com SQL Server Performance Tuning and Optimization Course: MS10980A

More information

Reversing Statistics for Scalable Test Databases Generation

Reversing Statistics for Scalable Test Databases Generation Reversing Statistics for Scalable Test Databases Generation Entong Shen Lyublena Antova Pivotal (formerly Greenplum) DBTest 2013, New York, June 24 1 Motivation Data generators: functional and performance

More information

Toward Efficient Variant Calling inside Main-Memory Database Systems

Toward Efficient Variant Calling inside Main-Memory Database Systems Toward Efficient Variant Calling inside Main-Memory Database Systems Sebastian Dorok Bayer Pharma AG and sebastian.dorok@ovgu.de Sebastian Breß sebastian.bress@ovgu.de Gunter Saake gunter.saake@ovgu.de

More information

Rackscale- the things that matter GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH

Rackscale- the things that matter GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH Rackscale- the things that matter GUSTAVO ALONSO SYSTEMS GROUP DEPT. OF COMPUTER SCIENCE ETH ZURICH HTDC 2014 Systems Group = www.systems.ethz.ch Enterprise Computing Center = www.ecc.ethz.ch On the way

More information

Net Developer Role Description Responsibilities Qualifications

Net Developer Role Description Responsibilities Qualifications Net Developer We are seeking a skilled ASP.NET/VB.NET developer with a background in building scalable, predictable, high-quality and high-performance web applications on the Microsoft technology stack.

More information

Overview of Scalable Distributed Database System SD-SQL Server

Overview of Scalable Distributed Database System SD-SQL Server Overview of Scalable Distributed Database System Server Witold Litwin 1, Soror Sahri 2, Thomas Schwarz 3 CERIA, Paris-Dauphine University 75016 Paris, France Abstract. We present a scalable distributed

More information

Business Analytics in (a) Blink

Business Analytics in (a) Blink Business Analytics in (a) Blink Ronald Barber Peter Bendel Marco Czech Oliver Draese Frederick Ho # Namik Hrle Stratos Idreos Min-Soo Kim Oliver Koeth Jae-Gil Lee Tianchao Tim Li Guy Lohman Konstantinos

More information

Microsoft SQL Server: MS-10980 Performance Tuning and Optimization Digital

Microsoft SQL Server: MS-10980 Performance Tuning and Optimization Digital coursemonster.com/us Microsoft SQL Server: MS-10980 Performance Tuning and Optimization Digital View training dates» Overview This course is designed to give the right amount of Internals knowledge and

More information

Performance Tuning and Optimizing SQL Databases 2016

Performance Tuning and Optimizing SQL Databases 2016 Performance Tuning and Optimizing SQL Databases 2016 http://www.homnick.com marketing@homnick.com +1.561.988.0567 Boca Raton, Fl USA About this course This four-day instructor-led course provides students

More information

Data Mining and Database Systems: Where is the Intersection?

Data Mining and Database Systems: Where is the Intersection? Data Mining and Database Systems: Where is the Intersection? Surajit Chaudhuri Microsoft Research Email: surajitc@microsoft.com 1 Introduction The promise of decision support systems is to exploit enterprise

More information

SQL, NoSQL, and Next Generation DBMSs. Shahram Ghandeharizadeh Director of the USC Database Lab

SQL, NoSQL, and Next Generation DBMSs. Shahram Ghandeharizadeh Director of the USC Database Lab SQL, NoSQL, and Next Generation DBMSs Shahram Ghandeharizadeh Director of the USC Database Lab Outline A brief history of DBMSs. OSs SQL NoSQL 1960/70 1980+ 2000+ Before Computers Database DBMS/Data Store

More information

Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager

Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager Kai Yu, Orlando Gallegos Dell Oracle Solutions Engineering Oracle OpenWorld 2010, Session S316263 3:00-4:00pm, Thursday 23-Sep-2010

More information

Technical Challenges for Big Health Care Data. Donald Kossmann Systems Group Department of Computer Science ETH Zurich

Technical Challenges for Big Health Care Data. Donald Kossmann Systems Group Department of Computer Science ETH Zurich Technical Challenges for Big Health Care Data Donald Kossmann Systems Group Department of Computer Science ETH Zurich What is Big Data? technologies to automate experience Purpose answer difficult questions

More information

MySQL Enterprise Backup

MySQL Enterprise Backup MySQL Enterprise Backup Fast, Consistent, Online Backups A MySQL White Paper February, 2011 2011, Oracle Corporation and/or its affiliates Table of Contents Introduction... 3! Database Backup Terms...

More information

Database Virtualization: A New Frontier for Database Tuning and Physical Design

Database Virtualization: A New Frontier for Database Tuning and Physical Design Database Virtualization: A New Frontier for Database Tuning and Physical Design Ahmed A. Soror Ashraf Aboulnaga Kenneth Salem David R. Cheriton School of Computer Science University of Waterloo {aakssoro,

More information

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001

ICOM 6005 Database Management Systems Design. Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001 ICOM 6005 Database Management Systems Design Dr. Manuel Rodríguez Martínez Electrical and Computer Engineering Department Lecture 2 August 23, 2001 Readings Read Chapter 1 of text book ICOM 6005 Dr. Manuel

More information

In-Memory Data Management for Enterprise Applications

In-Memory Data Management for Enterprise Applications In-Memory Data Management for Enterprise Applications Jens Krueger Senior Researcher and Chair Representative Research Group of Prof. Hasso Plattner Hasso Plattner Institute for Software Engineering University

More information

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller

In-Memory Databases Algorithms and Data Structures on Modern Hardware. Martin Faust David Schwalb Jens Krüger Jürgen Müller In-Memory Databases Algorithms and Data Structures on Modern Hardware Martin Faust David Schwalb Jens Krüger Jürgen Müller The Free Lunch Is Over 2 Number of transistors per CPU increases Clock frequency

More information

An Architecture for Using Tertiary Storage in a Data Warehouse

An Architecture for Using Tertiary Storage in a Data Warehouse An Architecture for Using Tertiary Storage in a Data Warehouse Theodore Johnson Database Research Dept. AT&T Labs - Research johnsont@research.att.com Motivation AT&T has huge data warehouses. Data from

More information

MAGENTO HOSTING Progressive Server Performance Improvements

MAGENTO HOSTING Progressive Server Performance Improvements MAGENTO HOSTING Progressive Server Performance Improvements Simple Helix, LLC 4092 Memorial Parkway Ste 202 Huntsville, AL 35802 sales@simplehelix.com 1.866.963.0424 www.simplehelix.com 2 Table of Contents

More information

Middle-R: A Middleware for Scalable Database Replication. Ricardo Jiménez-Peris

Middle-R: A Middleware for Scalable Database Replication. Ricardo Jiménez-Peris Middle-R: A Middleware for Scalable Database Replication Ricardo Jiménez-Peris Distributed Systems Laboratory Universidad Politécnica de Madrid (UPM) Joint work with Bettina Kemme, Marta Patiño, Yi Lin

More information

Introduction. Part I: Finding Bottlenecks when Something s Wrong. Chapter 1: Performance Tuning 3

Introduction. Part I: Finding Bottlenecks when Something s Wrong. Chapter 1: Performance Tuning 3 Wort ftoc.tex V3-12/17/2007 2:00pm Page ix Introduction xix Part I: Finding Bottlenecks when Something s Wrong Chapter 1: Performance Tuning 3 Art or Science? 3 The Science of Performance Tuning 4 The

More information

Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager. Kai Yu, Orlando Gallegos Dell Oracle Solutions Engineering

Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager. Kai Yu, Orlando Gallegos Dell Oracle Solutions Engineering Monitoring and Diagnosing Oracle RAC Performance with Oracle Enterprise Manager Kai Yu, Orlando Gallegos Dell Oracle Solutions Engineering About Author Kai Yu Senior System Engineer, Dell Oracle Solutions

More information

Bellwether metrics for diagnosing

Bellwether metrics for diagnosing Bellwether metrics for diagnosing performance bottlenecks Dan Downing Principal Consultant MENTORA GROUP www.mentora.com Objectives Help you identify key resource metrics that help diagnose performance

More information

Overview. Introduction to Database Systems. Motivation... Motivation: how do we store lots of data?

Overview. Introduction to Database Systems. Motivation... Motivation: how do we store lots of data? Introduction to Database Systems UVic C SC 370 Overview What is a DBMS? what is a relational DBMS? Why do we need them? How do we represent and store data in a DBMS? How does it support concurrent access

More information

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next

More information

High-Volume Data Warehousing in Centerprise. Product Datasheet

High-Volume Data Warehousing in Centerprise. Product Datasheet High-Volume Data Warehousing in Centerprise Product Datasheet Table of Contents Overview 3 Data Complexity 3 Data Quality 3 Speed and Scalability 3 Centerprise Data Warehouse Features 4 ETL in a Unified

More information

Index Selection Techniques in Data Warehouse Systems

Index Selection Techniques in Data Warehouse Systems Index Selection Techniques in Data Warehouse Systems Aliaksei Holubeu as a part of a Seminar Databases and Data Warehouses. Implementation and usage. Konstanz, June 3, 2005 2 Contents 1 DATA WAREHOUSES

More information

Big Data & Cloud Computing. Faysal Shaarani

Big Data & Cloud Computing. Faysal Shaarani Big Data & Cloud Computing Faysal Shaarani Agenda Business Trends in Data What is Big Data? Traditional Computing Vs. Cloud Computing Snowflake Architecture for the Cloud Business Trends in Data Critical

More information

Using Toaster in a Production Environment

Using Toaster in a Production Environment Using Toaster in a Production Environment Alexandru Damian, David Reyna, Belén Barros Pena Yocto Project Developer Day ELCE 17 Oct 2014 Introduction Agenda: What is Toaster Toaster out of the box Toaster

More information

PERFORMANCE EVALUATION OF DATABASE MANAGEMENT SYSTEMS BY THE ANALYSIS OF DBMS TIME AND CAPACITY

PERFORMANCE EVALUATION OF DATABASE MANAGEMENT SYSTEMS BY THE ANALYSIS OF DBMS TIME AND CAPACITY Vol.2, Issue.2, Mar-Apr 2012 pp-067-072 ISSN: 2249-6645 PERFORMANCE EVALUATION OF DATABASE MANAGEMENT SYSTEMS BY THE ANALYSIS OF DBMS TIME AND CAPACITY Aparna Kaladi 1 and Priya Ponnusamy 2 1 M.E Computer

More information

Big Data Analytics Platform @ Nokia

Big Data Analytics Platform @ Nokia Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform

More information

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo

High Availability for Database Systems in Cloud Computing Environments. Ashraf Aboulnaga University of Waterloo High Availability for Database Systems in Cloud Computing Environments Ashraf Aboulnaga University of Waterloo Acknowledgments University of Waterloo Prof. Kenneth Salem Umar Farooq Minhas Rui Liu (post-doctoral

More information

There and Back Again As Quick As a Flash

There and Back Again As Quick As a Flash There and Back Again As Quick As a Flash Stefan Winkler Independent Software Developer and IT Consultant CDO Committer stefan@winklerweb.net @xpomul on Twitter CDO in a nutshell CDO in a nutshell CDO in

More information

PRODUCT OVERVIEW SUITE DEALS. Combine our award-winning products for complete performance monitoring and optimization, and cost effective solutions.

PRODUCT OVERVIEW SUITE DEALS. Combine our award-winning products for complete performance monitoring and optimization, and cost effective solutions. Creating innovative software to optimize computing performance PRODUCT OVERVIEW Performance Monitoring and Tuning Server Job Schedule and Alert Management SQL Query Optimization Made Easy SQL Server Index

More information

Database Internals (Overview)

Database Internals (Overview) Database Internals (Overview) Eduardo Cunha de Almeida eduardo@inf.ufpr.br Outline of the course Introduction Database Systems (E. Almeida) Distributed Hash Tables and P2P (C. Cassagne) NewSQL (D. Kim

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska (magda) Fall 2007 Lecture 1 - Class Introduction

CSE 544 Principles of Database Management Systems. Magdalena Balazinska (magda) Fall 2007 Lecture 1 - Class Introduction CSE 544 Principles of Database Management Systems Magdalena Balazinska (magda) Fall 2007 Lecture 1 - Class Introduction Outline Introductions Class overview What is the point of a db management system

More information

IN-MEMORY DATABASE SYSTEMS. Prof. Dr. Uta Störl Big Data Technologies: In-Memory DBMS - SoSe 2015 1

IN-MEMORY DATABASE SYSTEMS. Prof. Dr. Uta Störl Big Data Technologies: In-Memory DBMS - SoSe 2015 1 IN-MEMORY DATABASE SYSTEMS Prof. Dr. Uta Störl Big Data Technologies: In-Memory DBMS - SoSe 2015 1 Analytical Processing Today Separation of OLTP and OLAP Motivation Online Transaction Processing (OLTP)

More information

Case Study. Case Study. Performance Testing For Student Application. US-based For-profit University (Higher Education) 1 2014 Compunnel Software Group

Case Study. Case Study. Performance Testing For Student Application. US-based For-profit University (Higher Education) 1 2014 Compunnel Software Group Performance Testing For Student Application US-based For-profit University (Higher Education) 1 2014 Compunnel Software Group Compunnel s Performance Testing Solution Delivers Impressive Student Experience

More information

OVERVIEW OF DATA EXPLORATION TECHNIQUES. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri SIGMOD 2015, Melbourne

OVERVIEW OF DATA EXPLORATION TECHNIQUES. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri SIGMOD 2015, Melbourne OVERVIEW OF DATA EXPLORATION TECHNIQUES Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri SIGMOD 2015, Melbourne USER INTERACTION express interests query/results recommendasons annotate collaborate

More information

An Approach to Implement Map Reduce with NoSQL Databases

An Approach to Implement Map Reduce with NoSQL Databases www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh

More information

Load Distribution in Large Scale Network Monitoring Infrastructures

Load Distribution in Large Scale Network Monitoring Infrastructures Load Distribution in Large Scale Network Monitoring Infrastructures Josep Sanjuàs-Cuxart, Pere Barlet-Ros, Gianluca Iannaccone, and Josep Solé-Pareta Universitat Politècnica de Catalunya (UPC) {jsanjuas,pbarlet,pareta}@ac.upc.edu

More information

Portable Scale-Out Benchmarks for MySQL. MySQL User Conference 2008 Robert Hodges CTO Continuent, Inc.

Portable Scale-Out Benchmarks for MySQL. MySQL User Conference 2008 Robert Hodges CTO Continuent, Inc. Portable Scale-Out Benchmarks for MySQL MySQL User Conference 2008 Robert Hodges CTO Continuent, Inc. Continuent 2008 Agenda / Introductions / Scale-Out Review / Bristlecone Performance Testing Tools /

More information

SQL Databases Course. by Applied Technology Research Center. This course provides training for MySQL, Oracle, SQL Server and PostgreSQL databases.

SQL Databases Course. by Applied Technology Research Center. This course provides training for MySQL, Oracle, SQL Server and PostgreSQL databases. SQL Databases Course by Applied Technology Research Center. 23 September 2015 This course provides training for MySQL, Oracle, SQL Server and PostgreSQL databases. Oracle Topics This Oracle Database: SQL

More information

DataCell: Building a Data Stream Engine on top of a Relational Database Kernel

DataCell: Building a Data Stream Engine on top of a Relational Database Kernel DataCell: Building a Data Stream Engine on top of a Relational Database Kernel Erietta Liarou (supervised by Martin Kersten) CWI Amsterdam, The Netherlands erietta@cwi.nl ABSTRACT Stream applications gained

More information

FROM DB TO DB. Manual. Page 1 of 7. Manual. Tel & Fax: +39 0984 494277 E-mail: info@altiliagroup.com Web: www.altilagroup.com

FROM DB TO DB. Manual. Page 1 of 7. Manual. Tel & Fax: +39 0984 494277 E-mail: info@altiliagroup.com Web: www.altilagroup.com Page 1 of 7 FROM DB TO DB Sede opertiva: Piazza Vermicelli 87036 Rende (CS), Italy Page 2 of 7 TABLE OF CONTENTS 1 APP documentation... 3 1.1 HOW IT WORKS... 3 1.2 Input data... 4 1.3 Output data... 5

More information

Application Tool for Experiments on SQL Server 2005 Transactions

Application Tool for Experiments on SQL Server 2005 Transactions Proceedings of the 5th WSEAS Int. Conf. on DATA NETWORKS, COMMUNICATIONS & COMPUTERS, Bucharest, Romania, October 16-17, 2006 30 Application Tool for Experiments on SQL Server 2005 Transactions ŞERBAN

More information

In-Memory Performance for Big Data

In-Memory Performance for Big Data In-Memory Performance for Big Data Goetz Graefe, Haris Volos, Hideaki Kimura, Harumi Kuno, Joseph Tucek, Mark Lillibridge, Alistair Veitch VLDB 2014, presented by Nick R. Katsipoulakis A Preliminary Experiment

More information

PUBLIC Performance Optimization Guide

PUBLIC Performance Optimization Guide SAP Data Services Document Version: 4.2 Support Package 6 (14.2.6.0) 2015-11-20 PUBLIC Content 1 Welcome to SAP Data Services....6 1.1 Welcome.... 6 1.2 Documentation set for SAP Data Services....6 1.3

More information

Vertica Live Aggregate Projections

Vertica Live Aggregate Projections Vertica Live Aggregate Projections Modern Materialized Views for Big Data Nga Tran - HPE Vertica - Nga.Tran@hpe.com November 2015 Outline What is Big Data? How Vertica provides Big Data Solutions? What

More information

QuickDB Yet YetAnother Database Management System?

QuickDB Yet YetAnother Database Management System? QuickDB Yet YetAnother Database Management System? Radim Bača, Peter Chovanec, Michal Krátký, and Petr Lukáš Radim Bača, Peter Chovanec, Michal Krátký, and Petr Lukáš Department of Computer Science, FEECS,

More information

The Uncracked Pieces in Database Cracking

The Uncracked Pieces in Database Cracking The Uncracked Pieces in Database Cracking Felix Martin Schuhknecht Alekh Jindal Jens Dittrich Information Systems Group, Saarland University CSAIL, MIT http://infosys.cs.uni-saarland.de people.csail.mit.edu/alekh

More information

Minimize cost and risk for data warehousing

Minimize cost and risk for data warehousing SYSTEM X SERVERS SOLUTION BRIEF Minimize cost and risk for data warehousing Microsoft Data Warehouse Fast Track for SQL Server 2014 on System x3850 X6 (55TB) Highlights Improve time to value for your data

More information

Designing Access Methods: The RUM Conjecture

Designing Access Methods: The RUM Conjecture Visionary Paper Designing Access Methods: The RUM Conjecture Manos Athanassoulis Michael S. Kester Lukas M. Maas Radu Stoica Stratos Idreos Anastasia Ailamaki Mark Callaghan Harvard University IBM Research,

More information

Compared to MySQL database, Oracle has the following advantages:

Compared to MySQL database, Oracle has the following advantages: To: John, Jerry Date: 2-23-07 From: Joshua Li Subj: Migration of NEESit Databases from MySQL to Oracle I. Why do we need database migration? Compared to MySQL database, Oracle has the following advantages:

More information

Scaling Your Data to the Cloud

Scaling Your Data to the Cloud ZBDB Scaling Your Data to the Cloud Technical Overview White Paper POWERED BY Overview ZBDB Zettabyte Database is a new, fully managed data warehouse on the cloud, from SQream Technologies. By building

More information

A Generic Auto-Provisioning Framework for Cloud Databases

A Generic Auto-Provisioning Framework for Cloud Databases A Generic Auto-Provisioning Framework for Cloud Databases Jennie Rogers 1, Olga Papaemmanouil 2 and Ugur Cetintemel 1 1 Brown University, 2 Brandeis University Instance Type Introduction Use Infrastructure-as-a-Service

More information

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013

SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013 SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase

More information

There and Back Again As Quick As a Flash: Tuning Performance in CDO

There and Back Again As Quick As a Flash: Tuning Performance in CDO There and Back Again As Quick As a Flash: Tuning Performance in Stefan Winkler Independent Software Developer and IT Consultant Committer stefan@winklerweb.net @xpomul on Twitter Note: Extended version

More information

Simple Solutions for Compressed Execution in Vectorized Database System

Simple Solutions for Compressed Execution in Vectorized Database System University of Warsaw Faculty of Mathematics, Computer Science and Mechanics Vrije Universiteit Amsterdam Faculty of Sciences Alicja Luszczak Student no. 248265(UW), 2128020(VU) Simple Solutions for Compressed

More information

Case Study - I. Industry: Social Networking Website Technology : J2EE AJAX, Spring, MySQL, Weblogic, Windows Server 2008.

Case Study - I. Industry: Social Networking Website Technology : J2EE AJAX, Spring, MySQL, Weblogic, Windows Server 2008. Case Study - I Industry: Social Networking Website Technology : J2EE AJAX, Spring, MySQL, Weblogic, Windows Server 2008 Challenges The scalability of the database servers to execute batch processes under

More information

Benchmarking filesystems and PostgreSQL shared buffers

Benchmarking filesystems and PostgreSQL shared buffers Benchmarking filesystems and PostgreSQL shared mode 1 35 1 no 1 no 35 no 1 277.3 197.3 156.1 262.9 171.7 164.8 2.4 157.7 12.1 238.5 151.1 115.2 269. 138.3 133.8 239.3 131.8 126.8 286.9 148.5 124. 237.3

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct

More information

Performance And Scalability In Oracle9i And SQL Server 2000

Performance And Scalability In Oracle9i And SQL Server 2000 Performance And Scalability In Oracle9i And SQL Server 2000 Presented By : Phathisile Sibanda Supervisor : John Ebden 1 Presentation Overview Project Objectives Motivation -Why performance & Scalability

More information

Objectives. Distributed Databases and Client/Server Architecture. Distributed Database. Data Fragmentation

Objectives. Distributed Databases and Client/Server Architecture. Distributed Database. Data Fragmentation Objectives Distributed Databases and Client/Server Architecture IT354 @ Peter Lo 2005 1 Understand the advantages and disadvantages of distributed databases Know the design issues involved in distributed

More information

Data Management Using MapReduce

Data Management Using MapReduce Data Management Using MapReduce M. Tamer Özsu University of Waterloo CS742-Distributed & Parallel DBMS M. Tamer Özsu 1 / 24 Basics For data analysis of very large data sets Highly dynamic, irregular, schemaless,

More information

Business Intelligence in SharePoint 2013

Business Intelligence in SharePoint 2013 Business Intelligence in SharePoint 2013 Empowering users to change their world Jason Himmelstein, MVP Senior Technical Director, SharePoint @sharepointlhorn http://www.sharepointlonghorn.com Gold Sponsor

More information

Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com development@reliablesoftware.com

Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com development@reliablesoftware.com Do Relational Databases Belong in the Cloud? Michael Stiefel www.reliablesoftware.com development@reliablesoftware.com How do you model data in the cloud? Relational Model A query operation on a relation

More information

Lecture Data Warehouse Systems

Lecture Data Warehouse Systems Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches Column-Stores Horizontal/Vertical Partitioning Horizontal Partitions Master Table Vertical Partitions Primary Key 3 Motivation

More information

In Memory Accelerator for MongoDB

In Memory Accelerator for MongoDB In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000

More information

The Cubetree Storage Organization

The Cubetree Storage Organization The Cubetree Storage Organization Nick Roussopoulos & Yannis Kotidis Advanced Communication Technology, Inc. Silver Spring, MD 20905 Tel: 301-384-3759 Fax: 301-384-3679 {nick,kotidis}@act-us.com 1. Introduction

More information

What Is Specific in Load Testing?

What Is Specific in Load Testing? What Is Specific in Load Testing? Testing of multi-user applications under realistic and stress loads is really the only way to ensure appropriate performance and reliability in production. Load testing

More information

CLOUD COMPUTING Y SU IMPACTO EN LA INFORMATICA

CLOUD COMPUTING Y SU IMPACTO EN LA INFORMATICA CLOUD COMPUTING Y SU IMPACTO EN LA INFORMATICA Gustavo Alonso Systems Group Department of Computer Science ETH Zurich, Switzerland www.systems.ethz.ch JISBD - 2010 1 Background ETH Zürich Systems Group

More information

In-Memory Columnar Databases HyPer. Arto Kärki University of Helsinki 30.11.2012

In-Memory Columnar Databases HyPer. Arto Kärki University of Helsinki 30.11.2012 In-Memory Columnar Databases HyPer Arto Kärki University of Helsinki 30.11.2012 1 Introduction Columnar Databases Design Choices Data Clustering and Compression Conclusion 2 Introduction The relational

More information

Apache Kylin Introduction Dec 8, 2014 @ApacheKylin

Apache Kylin Introduction Dec 8, 2014 @ApacheKylin Apache Kylin Introduction Dec 8, 2014 @ApacheKylin Luke Han Sr. Product Manager lukhan@ebay.com @lukehq Yang Li Architect & Tech Leader yangli9@ebay.com Agenda What s Apache Kylin? Tech Highlights Performance

More information

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays Database Solutions Engineering By Murali Krishnan.K Dell Product Group October 2009

More information

MapReduce With Columnar Storage

MapReduce With Columnar Storage SEMINAR: COLUMNAR DATABASES 1 MapReduce With Columnar Storage Peitsa Lähteenmäki Abstract The MapReduce programming paradigm has achieved more popularity over the last few years as an option to distributed

More information

Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved. Preview of Oracle Database 12c In-Memory Option 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any

More information

SLIDE 1 www.bitmicro.com. Previous Next Exit

SLIDE 1 www.bitmicro.com. Previous Next Exit SLIDE 1 MAXio All Flash Storage Array Popular Applications MAXio N1A6 SLIDE 2 MAXio All Flash Storage Array Use Cases High speed centralized storage for IO intensive applications email, OLTP, databases

More information

Future Prospects of Scalable Cloud Computing

Future Prospects of Scalable Cloud Computing Future Prospects of Scalable Cloud Computing Keijo Heljanko Department of Information and Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 7.3-2012 1/17 Future Cloud Topics Beyond

More information

DBMS Performance Monitoring

DBMS Performance Monitoring DBMS Performance Monitoring Performance Monitoring Goals Monitoring should check that the performanceinfluencing database parameters are correctly set and if they are not, it should point to where the

More information

A very short talk about Apache Kylin Business Intelligence meets Big Data. Fabian Wilckens EMEA Solutions Architect

A very short talk about Apache Kylin Business Intelligence meets Big Data. Fabian Wilckens EMEA Solutions Architect A very short talk about Apache Kylin Business Intelligence meets Big Data Fabian Wilckens EMEA Solutions Architect 1 The challenge today 2 Very quickly: OLAP Online Analytical Processing How many beers

More information

Architecture Sensitive Database Design: Examples from the Columbia Group

Architecture Sensitive Database Design: Examples from the Columbia Group Architecture Sensitive Database Design: Examples from the Columbia Group Kenneth A. Ross Columbia University John Cieslewicz Columbia University Jun Rao IBM Research Jingren Zhou Microsoft Research In

More information