OAK Database optimizations and architectures for complex large data Ioana MANOLESCU-GOUJOT
Transcription
1 OAK Database optimizations and architectures for complex large data Ioana MANOLESCU-GOUJOT INRIA Saclay Île-de-France Université Paris Sud LRI UMR CNRS 8623
2 Plan 1. The team 2. Oak research at a glance 3. Zoom: adaptive heterogeneous stores for Big Data Analytics 4. Wrap-up
3 1 The team
4 OAK project-team Joint between INRIA and U. Paris Sud INRIA: Ioana Manolescu (DR) U. Paris Sud faculty: Nicole Bidoit (Pr) Bogdan Cautis (Pr) Benoit Groz (MdC) External faculty: Dario Colazzo (Pr, U. Dauphine) François Goasdoué (Pr, U. Rennes 1) 2 post-docs 2 engineers 6 PhD students 2 M2 Interns
5 2 OAK research at a glance
6 Database optimizations and architectures Database processing: queries transform the data through declarative languages. Users specify what to do; the system figures out how to do it. 1. Formal models for describing the data and the processing: careful compromise of expressivity versus efficiency. 2. Logical optimization: inferring whether a computation is equivalent to / contained in another; enumerating alternative methods of evaluating a given computation; query optimization for novel data models and languages. 3. Physical optimization: automated storage tuning (selecting materialized views, indices); physical operators.
7 Database optimizations and architectures Long-term goal: efficient tools for declarative management of complex data. Impact: industrialize the construction of innovative data-centric applications.
8 OAK research at a glance Document data (JSON, XML, ...) Static analysis and query optimization Storage optimization through views and indices Massively parallel processing in the cloud Semantic data (RDF, OWL, ...) Other complex data (XR, social, ...)
9 3 Zoom: Self-tuning heterogeneous stores
10 The problem Glut of varied data management systems (DMS; this includes DBMS), among them NoSQL DMSs and Cloud DMSs - Different data models: relational, nested relational, tree, k-v, graphs, ... - Different data access capabilities (from simple API to various query languages) - Different architectures: disk- vs. memory-based, centralized vs. distributed etc. - Different performance - Different levels of transaction support
11 The problem How do we get performance for a variety of datasets on a variety of DMSs?
12 The problem Focus not on beating the most specialized optimizations of the most specialized engine for a given model/application.
13 The problem Focus on robust performance for varied data models across a changing set of heterogeneous DMSs.
14 The problem, qualified How do we get performance for a variety of datasets on a variety of DMSs? With correctness guarantees; with no hassle for the application layer; automatically; resilient to changes.
15 Sample application: Big Data Analytics in Datalyse Investissement d'Avenir Cloud & Big Data, led by Business et Decision, with INRIA Lille, LIG, LIRMM Goal: build cloud-based Big Data Analytics tools for heterogeneous data [Figure: data providers; three components labelled OAK]
16 Invisible glue for heterogeneous stores Data models: as the data is (side by side) Systems: those available (side by side) Store each data set as a set of fragments (potentially indexed), or splits / shards / partitions / indexes / materialized views Each fragment resides in a DMS
17 Dataset fragmentations [Figure: a table with columns A, B, C, D shown under several alternative fragmentations]
18 Dataset fragmentations Example: relational dataset R
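The fragmentation idea for a relational dataset R can be sketched in a few lines of Python. This is only an illustration: the rows of R, the choice of A as key, and the helper names `horizontal`, `vertical`, `union` and `join_on_key` are assumptions made for the example, not taken from the slides.

```python
# Toy relation R with columns A, B, C, D; A is assumed to be the key.
R = [
    {"A": 1, "B": 10, "C": 100, "D": 1000},
    {"A": 2, "B": 20, "C": 200, "D": 2000},
    {"A": 3, "B": 30, "C": 300, "D": 3000},
    {"A": 4, "B": 40, "C": 400, "D": 4000},
]


def horizontal(rows, predicate):
    """Horizontal fragment: the rows satisfying a predicate, all columns kept."""
    return [dict(r) for r in rows if predicate(r)]


def vertical(rows, columns):
    """Vertical fragment: a projection that always keeps the key column A."""
    keep = ["A"] + [c for c in columns if c != "A"]
    return [{c: r[c] for c in keep} for r in rows]


def union(*fragments):
    """Rebuild a relation from horizontal fragments."""
    return sorted((r for f in fragments for r in f), key=lambda r: r["A"])


def join_on_key(*fragments):
    """Rebuild a relation from vertical fragments by joining on the key A."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row["A"], {}).update(row)
    return sorted(merged.values(), key=lambda r: r["A"])


# Two horizontal fragments and three vertical fragments (A B, A C, A D).
h1 = horizontal(R, lambda r: r["A"] <= 2)
h2 = horizontal(R, lambda r: r["A"] > 2)
v_ab, v_ac, v_ad = vertical(R, ["B"]), vertical(R, ["C"]), vertical(R, ["D"])

print(union(h1, h2) == R)
print(join_on_key(v_ab, v_ac, v_ad) == R)
```

Both fragmentations are lossless here because the horizontal predicates partition the rows and every vertical fragment keeps the key.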
25 Fragmentations made of views The content of each fragment is described declaratively Fragment = (materialized) view [+ parameters] Examples: «The names and addresses of all clients»; «The sales partitioned by zipcode» Also indexes: «The names and addresses of all clients, by their age and zipcode» Also: navigation in trees or graphs, key-value stores Fragment = materialized view [+ parameters] [+ input pattern]
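A minimal sketch of "fragment = materialized view [+ parameters] [+ input pattern]", written in Python for illustration. The `Fragment` class, its field names and the sample clients and sales are all hypothetical; a real system would describe the view in a declarative language rather than as a Python function.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Fragment:
    name: str
    view: Callable[[list], list]        # stands in for the declarative view definition
    partition_by: Optional[str] = None  # optional parameter, e.g. "zipcode"
    input_pattern: tuple = ()           # attributes that must be bound to access the fragment

    def materialize(self, dataset):
        """Evaluate the view; group the result if a parameter or input pattern is set."""
        rows = self.view(dataset)
        keys = self.input_pattern or ((self.partition_by,) if self.partition_by else ())
        if not keys:
            return rows
        store = {}
        for row in rows:
            store.setdefault(tuple(row[k] for k in keys), []).append(row)
        return store


clients = [
    {"name": "Ann", "address": "1 Main St", "age": 30, "zipcode": "75001"},
    {"name": "Bob", "address": "2 High St", "age": 41, "zipcode": "91400"},
]
sales = [
    {"item": "book", "amount": 12, "zipcode": "75001"},
    {"item": "pen", "amount": 2, "zipcode": "91400"},
    {"item": "lamp", "amount": 30, "zipcode": "75001"},
]

# «The names and addresses of all clients»
names_addresses = Fragment(
    "clients_names_addresses",
    view=lambda ds: [{"name": c["name"], "address": c["address"]} for c in ds],
)
# «The sales partitioned by zipcode»
sales_by_zip = Fragment("sales_by_zipcode", view=lambda ds: list(ds), partition_by="zipcode")
# «The names and addresses of all clients, by their age and zipcode» (index-like access)
clients_by_age_zip = Fragment(
    "clients_by_age_zipcode", view=lambda ds: list(ds), input_pattern=("age", "zipcode")
)

all_clients = names_addresses.materialize(clients)
partitions = sales_by_zip.materialize(sales)
index = clients_by_age_zip.materialize(clients)
```

With an input pattern, the fragment can only be read by supplying values for the bound attributes, for instance `index[(30, "75001")]`, which mirrors how a key-value store is accessed.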
26-30 Fragments distribution across stores RDF DMS; K-v store; JSON DMS; Rel DBMS
31 Fragments distribution across stores RDF DMS; K-v store; JSON DMS; Rel DBMS; Pig store on top of DFS Data model translation applied at loading The extraction logic is in the view
32 Fragments distribution across stores Applications query the data in native format
33 Fragments distribution across stores Fragment description by views guarantees properties such as: completeness, equivalence
34 Query answering = view-based rewriting (VBR) VBR known for dramatic performance improvements No limit (e.g. view = query) Comparison with «Local As Views» (LAV) mediation: data models LAV mediation: common data model for (V1, ..., Vn, Q); query Q over the mediator schema; source schemas V1 (DMS1), ..., Vn (DMSn) vs. here: query Q in the native dataset model over the dataset schema; source schemas V1 (DMS1), ..., Vn (DMSn)
35 Query answering = view-based rewriting Comparison with «Local As Views» mediation: data models Side-by-side data models at the top: query Q in the native model of dataset 1 over the dataset 1 schema, ..., query Q in the native model of dataset k over the dataset k schema Source schemas: V^1_1 (DMS1), ..., V^1_n1 (DMSn) for dataset 1; V^k_1 (DMSk1), ..., V^k_nk (DMSknk) for dataset k → Common benefit with LAV: applications unaware of the fragmentation! → Novel benefit: fragments can migrate to different systems and data models
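To make the idea of view-based rewriting concrete, here is a deliberately simplified Python sketch. It assumes a single dataset whose fragments are projections that all keep the key `id`, so that joining fragments on `id` is lossless; the view names, stores and attributes are invented for the example, and the enumeration below is a toy stand-in, not the rewriting algorithm used by the project.

```python
from itertools import combinations

# Hypothetical fragments of one "clients" dataset, each stored in a different DMS.
VIEWS = {
    "V1": {"store": "key-value store", "attrs": {"id", "name", "address"}},
    "V2": {"store": "relational DBMS", "attrs": {"id", "age", "zipcode"}},
    "V3": {"store": "JSON DMS", "attrs": {"id", "name", "zipcode"}},
}


def rewritings(query_attrs, views=VIEWS):
    """Return every minimal set of views whose join on id covers the query attributes."""
    needed = set(query_attrs)
    found = []
    # Enumerate by increasing size so that supersets of a known rewriting are skipped.
    for size in range(1, len(views) + 1):
        for combo in combinations(sorted(views), size):
            covered = set().union(*(views[v]["attrs"] for v in combo))
            is_superset = any(set(f) <= set(combo) for f in found)
            if needed <= covered and not is_superset:
                found.append(combo)
    return found


# The application asks for names and zipcodes without knowing the fragmentation.
plans = rewritings({"name", "zipcode"})
for plan in plans:
    print(" JOIN ".join(f"{v}@{VIEWS[v]['store']}" for v in plan))
```

The query can be answered either from V3 alone or by joining V1 and V2; the application sees only the dataset, and the choice between the two rewritings is left to the system.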
36 Architecture [Figure: a data-centric application issues "store dataset" and "query dataset" requests for datasets 1..n. Components: Storage Advisor, Query Evaluator, Storage Descriptors Manager, Query Execution Plan, Estocada Runtime Execution Engine. Fragments of datasets D1 and D2 (D1/F1, D1/F2, D1/F3, D1/F4, D2/F1, D2/F2, D2/F3) are spread over a NoSQL system, a key-value store, a document store, a nested relations store and a relational store.]
37 Core modules View-based rewriting (VBR): outputs queries to DMSs (in their native language) + remaining integration operations; DMS capability descriptions are exploited here. Runtime: performs the integration operations; for this, a single runtime (for the most expressive model, e.g. nested relations) should do; we may borrow one of the DMSs' runtimes.
38 What about performance? Select the rewriting likely to lead to the best query evaluation performance Cross-system cost model - Based on cost model calibration - Modest extension for binding patterns View recommendation: «Cross-model, cross-system data storage advisor» Great progress in recent years on single-model storage (view, index etc.) recommendation Combinatorial problem (select a subset of the possible views minimizing the estimated cost)
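The combinatorial flavour of view recommendation can be illustrated with a brute-force Python sketch. Everything in it is assumed for the example: the candidate views, their sizes, the per-query cost estimates and the storage budget (the slide only mentions minimizing the estimated cost). A real advisor would use a calibrated cross-system cost model and a smarter search than exhaustive enumeration.

```python
from itertools import combinations

# Hypothetical candidate views: storage size and estimated cost of each query using the view.
CANDIDATES = {
    "V1": {"size": 4, "cost": {"q1": 1.0, "q2": 9.0}},
    "V2": {"size": 3, "cost": {"q1": 8.0, "q2": 2.0}},
    "V3": {"size": 6, "cost": {"q1": 2.0, "q2": 3.0}},
}
# Estimated cost of each query when no view is available.
BASE_COST = {"q1": 10.0, "q2": 10.0}


def workload_cost(selected):
    """Each query uses its cheapest option among the selected views and the base data."""
    return sum(
        min([BASE_COST[q]] + [CANDIDATES[v]["cost"][q] for v in selected])
        for q in BASE_COST
    )


def recommend(budget):
    """Exhaustively pick the subset of views within the budget with the lowest estimated cost."""
    best, best_cost = (), workload_cost(())
    names = sorted(CANDIDATES)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if sum(CANDIDATES[v]["size"] for v in combo) > budget:
                continue
            cost = workload_cost(combo)
            if cost < best_cost:
                best, best_cost = combo, cost
    return best, best_cost


print(recommend(budget=7))
print(recommend(budget=6))
```

Exhaustive search is exponential in the number of candidate views, which is why the problem is described as combinatorial; the sketch only works because there are three candidates.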
39 4 Advancement and potential perspectives
40 Estocada: advancement and perspectives Current status: 3 senior researchers (IM, FG, Alin Deutsch from UCSD), 2 post-docs, 1 PhD student, 1 to start in 2015 Core code modules ready (VBR) Roadmap for deploying adaptors and cost model for a few popular systems: Pig, MongoDB, Hadoop-based RDF store Would like to have: more real use-case scenarios; an engineer (preferred) and/or another PhD student
41 Thank you / questions?
More informationMicrosoft Big Data Solutions. Anar Taghiyev P-TSP E-mail: b-anarta@microsoft.com;
Microsoft Big Data Solutions Anar Taghiyev P-TSP E-mail: b-anarta@microsoft.com; Why/What is Big Data and Why Microsoft? Options of storage and big data processing in Microsoft Azure. Real Impact of Big
More informationSession 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges
Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges James Campbell Corporate Systems Engineer HP Vertica jcampbell@vertica.com Big
More informationOracle Spatial and Graph. Jayant Sharma Director, Product Management
Oracle Spatial and Graph Jayant Sharma Director, Product Management Agenda Oracle Spatial and Graph Graph Capabilities Q&A 2 Oracle Spatial and Graph Complete Open Integrated Most Widely Used 3 Open and
More informationextensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010
System/ Scale to Primary Secondary Joins/ Integrity Language/ Data Year Paper 1000s Index Indexes Transactions Analytics Constraints Views Algebra model my label 1971 RDBMS O tables sql-like 2003 memcached
More informationThe evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect
The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect IT Insight podcast This podcast belongs to the IT Insight series You can subscribe to the podcast through
More informationThe Ontological Approach for SIEM Data Repository
The Ontological Approach for SIEM Data Repository Igor Kotenko, Olga Polubelova, and Igor Saenko Laboratory of Computer Science Problems, Saint-Petersburg Institute for Information and Automation of Russian
More informationYou Have Your Data, Now What?
You Have Your Data, Now What? Kevin Shelly, GVP, Global Public Sector Data is a Resource SLIDE: 2 Time to Value SLIDE: 3 Big Data: Volume, VARIETY, and Velocity Simple Structured Complex Structured Textual/Unstructured
More informationDATA ANALYTICS Unlocking knowledge and value from data
DATA ANALYTICS Unlocking knowledge and value from data November 2014 Summary Inria Industry Meetings p 3 Your contacts at the Inria Saclay - Île-de-France research center p 4 Technologies Bertifier Sparklificator
More informationBenchmarking and Analysis of NoSQL Technologies
Benchmarking and Analysis of NoSQL Technologies Suman Kashyap 1, Shruti Zamwar 2, Tanvi Bhavsar 3, Snigdha Singh 4 1,2,3,4 Cummins College of Engineering for Women, Karvenagar, Pune 411052 Abstract The
More informationReference Architecture, Requirements, Gaps, Roles
Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture
More informationRecent and Future Activities in HPC and Scientific Data Management Siegfried Benkner
Recent and Future Activities in HPC and Scientific Data Management Siegfried Benkner Research Group Scientific Computing Faculty of Computer Science University of Vienna AUSTRIA http://www.par.univie.ac.at
More informationIntroduction to Big Data Training
Introduction to Big Data Training The quickest way to be introduce with NOSQL/BIG DATA offerings Learn and experience Big Data Solutions including Hadoop HDFS, Map Reduce, NoSQL DBs: Document Based DB
More informationOpen Source Technologies on Microsoft Azure
Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions
More informationLogistics. Database Management Systems. Chapter 1. Project. Goals for This Course. Any Questions So Far? What This Course Cannot Do.
Database Management Systems Chapter 1 Mirek Riedewald Many slides based on textbook slides by Ramakrishnan and Gehrke 1 Logistics Go to http://www.ccs.neu.edu/~mirek/classes/2010-f- CS3200 for all course-related
More informationProfessional Organization Checklist for the Computer Information Systems Curriculum
Professional Organization Checklist f the Computer Infmation Systems Curriculum Association of Computing Machinery and Association of Infmation Systems IS 2002 Model Curriculum and Guidelines f Undergraduate
More information