Hybrid Solutions Combining In-Memory & SSD
|
|
- Rosalind Reed
- 8 years ago
- Views:
Transcription
1 Hybrid Solutions Combining In-Memory & SSD Author:
2 Agenda Overview of the big data technology landscape Building a high-speed SSD-backed data store Complex & compound queries Orchestration in Big Data 2
3 Making Sense of the Exploding Big Data World SQL NoSQL SSD IMDG Key / Value Stream Processing 3
4 Let s look at some tools & technologies
5 SQL Technologies Query: ANSI 92 Semantics: CRUD Aggregation Projection Partial update Performance: 100 s/sec Consistency: Transactional Scaling: Mostly Scale-UP Availability: Disk Based 5
6 NoSQL Technologies Query: Proprietary but rich Semantics: CRUD Map/Reduce No Projection No Partial update Performance: 1000 s/sec Consistency: Eventual Scaling: Mostly Scale-Out Availability: Replication based 6
7 IMDG Technologies Query: Proprietary but rich Semantics: CRUD Aggregation API + Map/Reduce Projection Partial Update Performance: 100k/sec Consistency: Transactional Scaling: Mostly Scale-Out Availability: Replication based 7
8 Key/Value Technologies Query: Key, Value Semantics: Mostly Read No Aggregation No Projection No Partial update Performance: 1M s/sec Consistency: Atomic Scaling: Mostly Scale-Out Availability: Limited 8
9 Stream Processing Technologies Semantics: Event-driven data processing Performance: 10M s/sec Machine learning Real-time analytics Depend on external persistency for maintaining state 9
10 Stream Processing Integration - Streaming tuples Tuple stream (FIFO) Stream Producer IMDG Storm 10
11 Stream Processing Integration - Keeping state Only non-transient data in SSD State IMDG State Result Maintain state and SSD-backed Grid write results in IMDG Stream Producer Tuple stream Storm 11
12 SSD Technology is quickly shaping the Big Data Landscape Great for heavy-reads initial-loads SSD-powered in-memory products can provide great performance with increased server density Big data but also fast data Bigger capacity, better performance, cheaper than RAM 12
13 Summary Many API s - Same Data Use-case requirements across tools SSD enables greater scale at cost Need for big & fast data 13
14 Building an SSD-backed API Data-Store
15 Using IMDG & SSD to build a hybrid big-data solution Create a common API High-speed Data Store Store indexes in RAM payload in SSD Use native low API 15
16 IMDG & SSD integration Hold Indexes & cached objects in RAM Transient Data Old Data Initial Load Client New Data Client writes new data to the space Proc. Data Transient Data Proc. Data Hold Payload in SSD 16
17 Why an SSD-backed IMDG Higher server density and capacity Added resilience and increased availability Faster grid warmup, less down-time Rich query capabilities with persistency 17
18 A typical Big Data App logical architecture can look like this Front End Front-end users accessing a distributed multi-facet service Application / Service RT Analytics Service Storage Batch Processing back-end users accessing a business insights and system maintenance metrics Back End 18
19 We need a High-Speed Data Store Front End Back End Key / Value Document Graph Map / Reduce Transactional Stream based High-speed Data Store Common Data Store serving Multiple Semantics/API Disk becomes the new tape 19
20 We can use IMDG technologies to integrate all our Data Sources RT Transactional Front End Back End Data Access High-speed Data Bus (IMDG) RT Streaming Storm Speed Layer Hadoop Sync Direct Access Mongo Sync RDBMS Sync Data bus: Resilient, FT & HA Hadoop Transactional High-throughput MongoDB MySQL Batch Layer Web Storage Layer 20
21 What really goes on in the grid Proc. Data Client New Data Client New Data New data automatically synced to ES Client writes new data to the space Polling container writes to MongoDB Polling Container Notify Container Proc. Data Application Storage MongoDB Mirror DB Space mirroring Service 21
22 The familiar Lambda architecture Front End Cache Layer RT Stream Processing Back End Front-end users accessing a distributed multi-facet service Storage Batch Processing back-end users accessing a business insights and system maintenance metrics 22
23 Introduce a intermediate near real-time layer Front End Cache Layer RT Stream Processing Back End Front-end users accessing a distributed multi-facet service Near Real-time Processing back-end users accessing a business insights and system maintenance metrics Storage Batch Processing 23
24 Hybrid Architecture Hold Indexes & cached objects in RAM Transient Data Old Data Initial Load Client New Data Client writes new data to the space Proc. Data Transient Data Proc. Data Hold Payload in SSD Application Storage MongoDB Mirror DB Data Mirroring 24
25 Telco customer real-world use-case Make payment (Transactional) Fast history search High-speed Data Bus (IMDG) RT Streaming State persistency Direct Access Storm Real-time business analytics on user activity (if needed) Sync new data available to end user Hadoop Sync business analytics Downstream System Hadoop Long-term analytics and storage 25
26 What we achieved 10-fold Grid growth - 20GB to 200GB Improved initial loading times - less downtime Customer history up to 15 years - previously 3 Longer visibility on intermediate customer analytics 26
27 Summary Using an SSD-backed IMDG as data hub Creating a hybrid solution architecture Using SSD to add a near real-time layer 27
28 Using rich query semantics with data grids
29 Nested Queries & Projections Query for a Person who lives in New York: A = new SQLQuery<Person>(Person.class, address.city = New York ); Query for a Dealer which sells a Honda: B = new SQLQuery<Dealer>(Dealer.class, cars[*] = Honda ); Query for a Person with projections on first and last names: C IdsQuery<Person> idsquery = new IdsQuery<Person> (Person.class, new Long[]{id1, id2}).setprojections( firstname, lastname ); Person result[] = space.readbyids(idsquery).getresultsarray(); 29
30 Basic Aggregations Create a query that yields a results set: A SQLQuery<Employee> = new SQLQuery<Employee> (Employee.class, country=? OR country=? ); query.setparameter(1, UK ); query.setparameter(2, USA ); Perform aggregations for that result set: B Integer maxageinspace = max(space, query, age ); Integer minageinspace = min(space, query, age ); Integer combinedageinspace = sum(space, query, age ); Double averageage = average(space, query, age ); Person oldestpersoninspace = maxentry(space, query, age ); Person youngestpersoninspace = minentry(space, query, age ); 30
31 Complex & Compound Aggregations Create a query that yields a results set: A SQLQuery<Employee> = new SQLQuery<Employee> (Employee.class, country=? OR country=? ); query.setparameter(1, UK ); query.setparameter(1, USA ); B Perform group and filtering aggregations for that result set: = groupby(space, query, new GroupByAggregator().select(average("salary"), min("salary"), max( salary")).groupby("department", gender )).having(new GroupByFilter() { public boolean process(groupbyvalue group) { return group.getdouble("avg(salary)") > 18000;}})); 31
32 Fast Update & Change API Performing changes at the data-store level: A IdsQuery<Person> idsquery = new IdsQuery<Person> (Person.class, id, routing) space.change(idsquery, new ChangeSet().increment( balance.euro, 5.2D)); Performing a series changes at the data-store level: B IdsQuery<Person> idsquery = new IdsQuery<Person> (Person.class, id, routing) space.change(idsquery, new ChangeSet().increment( someintproperty, 1).set( somestringproperty, newvalue ).putinmap( somenestedproperty.somemapproperty, mykey, 2); 32
33 Orchestration in Big Data
34 The Application Deployment Lifecycle Manage Provision Monitor Install Deploy Configure 34
35 Create a Standardised Blueprint of the Application Topology Node Type: Container Node Type: Container Node Node Type: Server Type: DB Node Type: App Connected To relationship Contained In relationship 35
36 Using TOSCA & YAML to Describe the Application Topology... host: type: cloudify.nodes.libcloud.compute... ################################################################################## # Tomcat server ################################################################################## tomcat_server: type: cloudify.nodes.tomcatserver relationships: - type: cloudify.relationships.contained_in target: host ################################################################################## # MongoDB node as a backend data-store for the example Tomcat application ################################################################################## mongodb: type: cloudify.nodes.mongodb relationships: - type: cloudify.relationships.contained_in target: host... 36
37 Post-deployment Management & Monitoring 37
38 Thanks For Attending Author:
GigaSpaces Real-Time Analytics for Big Data
GigaSpaces Real-Time Analytics for Big Data GigaSpaces makes it easy to build and deploy large-scale real-time analytics systems Rapidly increasing use of large-scale and location-aware social media and
More informationPulsar Realtime Analytics At Scale. Tony Ng April 14, 2015
Pulsar Realtime Analytics At Scale Tony Ng April 14, 2015 Big Data Trends Bigger data volumes More data sources DBs, logs, behavioral & business event streams, sensors Faster analysis Next day to hours
More informationReal-Time Analytics for Big Market Data with XAP In-Memory Computing
Real-Time Analytics for Big Market Data with XAP In-Memory Computing March 2015 Real Time Analytics for Big Market Data Table of Contents Introduction 03 Main Industry Challenges....04 Achieving Real-Time
More informationORACLE COHERENCE 12CR2
ORACLE COHERENCE 12CR2 KEY FEATURES AND BENEFITS ORACLE COHERENCE IS THE #1 IN-MEMORY DATA GRID. KEY FEATURES Fault-tolerant in-memory distributed data caching and processing Persistence for fast recovery
More informationHow To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
More informationArchitectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
More informationSentimental Analysis using Hadoop Phase 2: Week 2
Sentimental Analysis using Hadoop Phase 2: Week 2 MARKET / INDUSTRY, FUTURE SCOPE BY ANKUR UPRIT The key value type basically, uses a hash table in which there exists a unique key and a pointer to a particular
More informationLambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com
Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...
More informationBIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
More informationMeeting Real Time Risk Management Challenge XAP In-Memory Computing
Meeting Real Time Risk Management Challenge XAP In-Memory Computing March 2015 Meeting Real-Time Risk Management Challenges Table of Contents Introduction 03 Main Industry Challenges....04 Meeting Real-Time
More informationX4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released
General announcements In-Memory is available next month http://www.oracle.com/us/corporate/events/dbim/index.html X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released
More informationReal-time Big Data Analytics with Storm
Ron Bodkin Founder & CEO, Think Big June 2013 Real-time Big Data Analytics with Storm Leading Provider of Data Science and Engineering Services Accelerating Your Time to Value IMAGINE Strategy and Roadmap
More informationAn Approach to Implement Map Reduce with NoSQL Databases
www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh
More informationApache Ignite TM (Incubating) - In- Memory Data Fabric Fast Data Meets Open Source
Apache Ignite TM (Incubating) - In- Memory Data Fabric Fast Data Meets Open Source DMITRIY SETRAKYAN Founder, PPMC http://www.ignite.incubator.apache.org @apacheignite @dsetrakyan Agenda About In- Memory
More informationAnalytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world
Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3
More informationHybrid Software Architectures for Big Data. Laurence.Hubert@hurence.com @hurence http://www.hurence.com
Hybrid Software Architectures for Big Data Laurence.Hubert@hurence.com @hurence http://www.hurence.com Headquarters : Grenoble Pure player Expert level consulting Training R&D Big Data X-data hot-line
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationAccelerating Hadoop MapReduce Using an In-Memory Data Grid
Accelerating Hadoop MapReduce Using an In-Memory Data Grid By David L. Brinker and William L. Bain, ScaleOut Software, Inc. 2013 ScaleOut Software, Inc. 12/27/2012 H adoop has been widely embraced for
More informationHadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
More information[Hadoop, Storm and Couchbase: Faster Big Data]
[Hadoop, Storm and Couchbase: Faster Big Data] With over 8,500 clients, LivePerson is the global leader in intelligent online customer engagement. With an increasing amount of agent/customer engagements,
More informationGetting Real Real Time Data Integration Patterns and Architectures
Getting Real Real Time Data Integration Patterns and Architectures Nelson Petracek Senior Director, Enterprise Technology Architecture Informatica Digital Government Institute s Enterprise Architecture
More informationWell packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances
INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA
More informationIntroduction to Polyglot Persistence. Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace
Introduction to Polyglot Persistence Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace FOSSCOMM 2016 Background - 14 years in databases and system engineering - NoSQL DBA @ ObjectRocket
More informationTungsten Replicator, more open than ever!
Tungsten Replicator, more open than ever! MC Brown, Senior Product Line Manager September, 2015 2014 VMware Inc. All rights reserved. We Face An Age Old Problem BRS/Search 2 It s Gotten Worse 3 Much Worse
More informationOn- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform
On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform Page 1 of 16 Table of Contents Table of Contents... 2 Introduction... 3 NoSQL Databases... 3 CumuLogic NoSQL Database Service...
More informationIn Memory Accelerator for MongoDB
In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000
More informationBig Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016
Big Data Approaches Making Sense of Big Data Ian Crosland Jan 2016 Accelerate Big Data ROI Even firms that are investing in Big Data are still struggling to get the most from it. Make Big Data Accessible
More informationReal Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
More informationBigMemory: Providing competitive advantage through in-memory data management
BUSINESS WHITE PAPER : Providing competitive advantage through in-memory data management : Ultra-fast RAM + big data = business power TABLE OF CONTENTS 1 Introduction 2 : two ways to drive real-time big
More informationRemoving Failure Points and Increasing Scalability for the Engine that Drives webmd.com
Removing Failure Points and Increasing Scalability for the Engine that Drives webmd.com Matt Wilson Director, Consumer Web Operations, WebMD @mattwilsoninc 9/12/2013 About this talk Go over original site
More informationIntroduction to Hadoop. New York Oracle User Group Vikas Sawhney
Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop
More informationHow To Use Big Data For Telco (For A Telco)
ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call
More informationUsing an In-Memory Data Grid for Near Real-Time Data Analysis
SCALEOUT SOFTWARE Using an In-Memory Data Grid for Near Real-Time Data Analysis by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 IN today s competitive world, businesses
More informationMongoDB Developer and Administrator Certification Course Agenda
MongoDB Developer and Administrator Certification Course Agenda Lesson 1: NoSQL Database Introduction What is NoSQL? Why NoSQL? Difference Between RDBMS and NoSQL Databases Benefits of NoSQL Types of NoSQL
More informationTalend Real-Time Big Data Sandbox. Big Data Insights Cookbook
Talend Real-Time Big Data Talend Real-Time Big Data Overview of Real-time Big Data Pre-requisites to run Setup & Talend License Talend Real-Time Big Data Big Data Setup & About this cookbook What is the
More informationWhy NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
More informationEnabling Database-as-a-Service (DBaaS) within Enterprises or Cloud Offerings
Solution Brief Enabling Database-as-a-Service (DBaaS) within Enterprises or Cloud Offerings Introduction Accelerating time to market, increasing IT agility to enable business strategies, and improving
More informationHGST Virident Solutions 2.0
Brochure HGST Virident Solutions 2.0 Software Modules HGST Virident Share: Shared access from multiple servers HGST Virident HA: Synchronous replication between servers HGST Virident ClusterCache: Clustered
More informationBig Data JAMES WARREN. Principles and best practices of NATHAN MARZ MANNING. scalable real-time data systems. Shelter Island
Big Data Principles and best practices of scalable real-time data systems NATHAN MARZ JAMES WARREN II MANNING Shelter Island contents preface xiii acknowledgments xv about this book xviii ~1 Anew paradigm
More informationBig Data Use Case. How Rackspace is using Private Cloud for Big Data. Bryan Thompson. May 8th, 2013
Big Data Use Case How Rackspace is using Private Cloud for Big Data Bryan Thompson May 8th, 2013 Our Big Data Problem Consolidate all monitoring data for reporting and analytical purposes. Every device
More informationSocial Networks and the Richness of Data
Social Networks and the Richness of Data Getting distributed Webservices Done with NoSQL Fabrizio Schmidt, Lars George VZnet Netzwerke Ltd. Content Unique Challenges System Evolution Architecture Activity
More informationReal Time Fraud Detection With Sequence Mining on Big Data Platform. Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May 6 2014 Santa Clara, CA
Real Time Fraud Detection With Sequence Mining on Big Data Platform Pranab Ghosh Big Data Consultant IEEE CNSV meeting, May 6 2014 Santa Clara, CA Open Source Big Data Eco System Query (NOSQL) : Cassandra,
More informationBig Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
More informationInformation Architecture
The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to
More informationIntroduction to Big Data Training
Introduction to Big Data Training The quickest way to be introduce with NOSQL/BIG DATA offerings Learn and experience Big Data Solutions including Hadoop HDFS, Map Reduce, NoSQL DBs: Document Based DB
More informationOracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
More informationEnterprise Operational SQL on Hadoop Trafodion Overview
Enterprise Operational SQL on Hadoop Trafodion Overview Rohit Jain Distinguished & Chief Technologist Strategic & Emerging Technologies Enterprise Database Solutions Copyright 2012 Hewlett-Packard Development
More informationAssignment # 1 (Cloud Computing Security)
Assignment # 1 (Cloud Computing Security) Group Members: Abdullah Abid Zeeshan Qaiser M. Umar Hayat Table of Contents Windows Azure Introduction... 4 Windows Azure Services... 4 1. Compute... 4 a) Virtual
More informationHow In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
More informationBig Data & Data Science Course Example using MapReduce. Presented by Juan C. Vega
Big Data & Data Science Course Example using MapReduce Presented by What is Mongo? Why Mongo? Mongo Model Mongo Deployment Mongo Query Language Built-In MapReduce Demo Q & A Agenda Founders Max Schireson
More informationNoSQL for SQL Professionals William McKnight
NoSQL for SQL Professionals William McKnight Session Code BD03 About your Speaker, William McKnight President, McKnight Consulting Group Frequent keynote speaker and trainer internationally Consulted to
More informationINTRODUCING APACHE IGNITE An Apache Incubator Project
WHITE PAPER BY GRIDGAIN SYSTEMS FEBRUARY 2015 INTRODUCING APACHE IGNITE An Apache Incubator Project COPYRIGHT AND TRADEMARK INFORMATION 2015 GridGain Systems. All rights reserved. This document is provided
More informationLuncheon Webinar Series May 13, 2013
Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration
More informationEvaluator s Guide. McKnight. Consulting Group. McKnight Consulting Group
NoSQL Evaluator s Guide McKnight Consulting Group William McKnight is the former IT VP of a Fortune 50 company and the author of Information Management: Strategies for Gaining a Competitive Advantage with
More informationIn-Memory Computing Principles and Technology Overview
In-Memory Computing Principles and Technology Overview Agenda > Overview / Why In Memory" > Use Cases" > Concepts & Approaches" > In-Memory Computing Platform 6.1 In-Memory Data Grid In-Memory HPC In-Memory
More informationUnderstanding Neo4j Scalability
Understanding Neo4j Scalability David Montag January 2013 Understanding Neo4j Scalability Scalability means different things to different people. Common traits associated include: 1. Redundancy in the
More informationCSE-E5430 Scalable Cloud Computing Lecture 11
CSE-E5430 Scalable Cloud Computing Lecture 11 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 30.11-2015 1/24 Distributed Coordination Systems Consensus
More informationSPARK USE CASE IN TELCO. Apache Spark Night 9-2-2014! Chance Coble!
SPARK USE CASE IN TELCO Apache Spark Night 9-2-2014! Chance Coble! Use Case Profile Telecommunications company Shared business problems/pain Scalable analytics infrastructure is a problem Pushing infrastructure
More informationHow To Create A Data Visualization With Apache Spark And Zeppelin 2.5.3.5
Big Data Visualization using Apache Spark and Zeppelin Prajod Vettiyattil, Software Architect, Wipro Agenda Big Data and Ecosystem tools Apache Spark Apache Zeppelin Data Visualization Combining Spark
More informationUsing In-Memory Computing to Simplify Big Data Analytics
SCALEOUT SOFTWARE Using In-Memory Computing to Simplify Big Data Analytics by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T he big data revolution is upon us, fed
More informationChallenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2
More informationMongoDB and Couchbase
Benchmarking MongoDB and Couchbase No-SQL Databases Alex Voss Chris Choi University of St Andrews TOP 2 Questions Should a social scientist buy MORE or UPGRADE computers? Which DATABASE(s)? Document Oriented
More informationNoSQL Databases. Polyglot Persistence
The future is: NoSQL Databases Polyglot Persistence a note on the future of data storage in the enterprise, written primarily for those involved in the management of application development. Martin Fowler
More informationData Virtualization for Agile Business Intelligence Systems and Virtual MDM. To View This Presentation as a Video Click Here
Data Virtualization for Agile Business Intelligence Systems and Virtual MDM To View This Presentation as a Video Click Here Agenda Data Virtualization New Capabilities New Challenges in Data Integration
More informationTurn Big Data to Small Data
Turn Big Data to Small Data Use Qlik to Utilize Distributed Systems and Document Databases October, 2014 Stig Magne Henriksen Image: kdnuggets.com From Big Data to Small Data Agenda When do we have a Big
More informationLinux A first-class citizen in Windows Azure. Bruno Terkaly bterkaly@microsoft.com Principal Software Engineer Mobile/Cloud/Startup/Enterprise
Linux A first-class citizen in Windows Azure Bruno Terkaly bterkaly@microsoft.com Principal Software Engineer Mobile/Cloud/Startup/Enterprise 1 First, I am software developer (C/C++, ASM, C#, Java, Node.js,
More informationMakeMyTrip CUSTOMER SUCCESS STORY
MakeMyTrip CUSTOMER SUCCESS STORY MakeMyTrip is the leading travel site in India that is running two ClustrixDB clusters as multi-master in two regions. It removed single point of failure. MakeMyTrip frequently
More informationHow Companies are! Using Spark
How Companies are! Using Spark And where the Edge in Big Data will be Matei Zaharia History Decreasing storage costs have led to an explosion of big data Commodity cluster software, like Hadoop, has made
More informationManaging Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
More informationOracle NoSQL Database Product Overview. Robert Greene Product Manager / Strategist Nov 19, 2014
Oracle NoSQL Database Product Overview Robert Greene Product Manager / Strategist Nov 19, 2014 Oracle Confidential Internal/Restricted/Highly Restricted Safe Harbor Statement The following is intended
More informationReal Time Data Processing using Spark Streaming
Real Time Data Processing using Spark Streaming Hari Shreedharan, Software Engineer @ Cloudera Committer/PMC Member, Apache Flume Committer, Apache Sqoop Contributor, Apache Spark Author, Using Flume (O
More informationApache HBase. Crazy dances on the elephant back
Apache HBase Crazy dances on the elephant back Roman Nikitchenko, 16.10.2014 YARN 2 FIRST EVER DATA OS 10.000 nodes computer Recent technology changes are focused on higher scale. Better resource usage
More informationextensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010
System/ Scale to Primary Secondary Joins/ Integrity Language/ Data Year Paper 1000s Index Indexes Transactions Analytics Constraints Views Algebra model my label 1971 RDBMS O tables sql-like 2003 memcached
More informationPredictive Analytics with Storm, Hadoop, R on AWS
Douglas Moore Principal Consultant & Architect February 2013 Predictive Analytics with Storm, Hadoop, R on AWS Leading Provider Data Science and Engineering Services Accelerating Your Time to Value using
More informationBIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &
BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & Innovation 04-08-2011 to the EC 8 th February, Luxembourg Your Atos business Research technologists. and Innovation
More informationUnified Batch & Stream Processing Platform
Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built
More informationBigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic
BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop
More informationThe evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect
The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect IT Insight podcast This podcast belongs to the IT Insight series You can subscribe to the podcast through
More informationCan the Elephants Handle the NoSQL Onslaught?
Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented
More informationTowards Smart and Intelligent SDN Controller
Towards Smart and Intelligent SDN Controller - Through the Generic, Extensible, and Elastic Time Series Data Repository (TSDR) YuLing Chen, Dell Inc. Rajesh Narayanan, Dell Inc. Sharon Aicler, Cisco Systems
More informationDeveloping Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University
More informationEvolution from Big Data to Smart Data
Evolution from Big Data to Smart Data Information is Exploding 120 HOURS VIDEO UPLOADED TO YOUTUBE 50,000 APPS DOWNLOADED 204 MILLION E-MAILS EVERY MINUTE EVERY DAY Intel Corporation 2015 The Data is Changing
More informationIndependent process platform
Independent process platform Megatrend in infrastructure software Dr. Wolfram Jost CTO February 22, 2012 2 Agenda Positioning BPE Strategy Cloud Strategy Data Management Strategy ETS goes Mobile Each layer
More informationChukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
More informationGridGain In- Memory Data Fabric: UlCmate Speed and Scale for TransacCons and AnalyCcs
GridGain In- Memory Data Fabric: UlCmate Speed and Scale for TransacCons and AnalyCcs DMITRIY SETRAKYAN Founder & EVP Engineering @dsetrakyan www.gridgain.com #gridgain Agenda EvoluCon of In- Memory CompuCng
More informationPutting Apache Kafka to Use!
Putting Apache Kafka to Use! Building a Real-time Data Platform for Event Streams! JAY KREPS, CONFLUENT! A Couple of Themes! Theme 1: Rise of Events! Theme 2: Immutability Everywhere! Level! Example! Immutable
More informationThe Flink Big Data Analytics Platform. Marton Balassi, Gyula Fora" {mbalassi, gyfora}@apache.org
The Flink Big Data Analytics Platform Marton Balassi, Gyula Fora" {mbalassi, gyfora}@apache.org What is Apache Flink? Open Source Started in 2009 by the Berlin-based database research groups In the Apache
More informationUsing MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A
More informationThree Open Blueprints For Big Data Success
White Paper: Three Open Blueprints For Big Data Success Featuring Pentaho s Open Data Integration Platform Inside: Leverage open framework and open source Kickstart your efforts with repeatable blueprints
More informationTopology Aware Analytics for Elastic Cloud Services
Topology Aware Analytics for Elastic Cloud Services athafoud@cs.ucy.ac.cy Master Thesis Presentation May 28 th 2015, Department of Computer Science, University of Cyprus In Brief.. a Tool providing Performance
More informationAccelerating Real Time Big Data Applications. PRESENTATION TITLE GOES HERE Bob Hansen
Accelerating Real Time Big Data Applications PRESENTATION TITLE GOES HERE Bob Hansen Apeiron Data Systems Apeiron is developing a VERY high performance Flash storage system that alters the economics of
More informationFast Innovation requires Fast IT
Fast Innovation requires Fast IT 2014 Cisco and/or its affiliates. All rights reserved. 2 2014 Cisco and/or its affiliates. All rights reserved. 3 IoT World Forum Architecture Committee 2013 Cisco and/or
More informationScalable Architecture on Amazon AWS Cloud
Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect
More informationA Comparison of Clouds: Amazon Web Services, Windows Azure, Google Cloud Platform, VMWare and Others (Fall 2012)
1. Computation Amazon Web Services Amazon Elastic Compute Cloud (Amazon EC2) provides basic computation service in AWS. It presents a virtual computing environment and enables resizable compute capacity.
More informationNoSQL in der Cloud Why? Andreas Hartmann
NoSQL in der Cloud Why? Andreas Hartmann 17.04.2013 17.04.2013 2 NoSQL in der Cloud Why? Quelle: http://res.sys-con.com/story/mar12/2188748/cloudbigdata_0_0.jpg Why Cloud??? 17.04.2013 3 NoSQL in der Cloud
More informationSAP HANA PLATFORM Top Ten Questions for Choosing In-Memory Databases. Start Here
PLATFORM Top Ten Questions for Choosing In-Memory Databases Start Here PLATFORM Top Ten Questions for Choosing In-Memory Databases. Are my applications accelerated without manual intervention and tuning?.
More informationOpen Source Technologies on Microsoft Azure
Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions
More informationReference Model for Cloud Applications CONSIDERATIONS FOR SW VENDORS BUILDING A SAAS SOLUTION
October 2013 Daitan White Paper Reference Model for Cloud Applications CONSIDERATIONS FOR SW VENDORS BUILDING A SAAS SOLUTION Highly Reliable Software Development Services http://www.daitangroup.com Cloud
More informationMySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!)
MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!) Erdélyi Ernő, Component Soft Kft. erno@component.hu www.component.hu 2013 (c) Component Soft Ltd Leading Hadoop Vendor Copyright 2013,
More informationNoSQL Data Base Basics
NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS
More information