Big Data for the Mass. Dr Franclin Foping Senior Software Engineer NitroSell.com

1 Big Data for the Mass Dr Franclin Foping franclin.foping@nitrosell.com Senior Software Engineer NitroSell.com

2 To SQL or NOT to SQL? Relational DB: built in the 70s to solve the problems of the 70s. Still useful, but...

3 New Trends: Big Data; concurrency; data replication; and yes, scalability

4

5 Benefits of NoSQL: large volumes of structured, semi-structured, and unstructured data; agile sprints, quick iteration, and frequent code pushes; object-oriented programming that is easy to use and flexible; efficient, scale-out architecture instead of expensive, monolithic architecture

6 More Goodness: dynamic schemas; auto-sharding; auto-replication; integrated caching

7 CASE STUDY: key-value storage; real-time data analytics (INSIGHT)

8 NitroSell platform: allows retailers to easily sell their products online. E-commerce is on the up! Caching is a BIG issue out there.

9 NitroSell needed a place to store and retrieve data quickly. RDBMS too slow, and what about the data structure? What sort of data do we store? Session data, the navigation panel, and so forth. REDIS

10 Redis allows fast insertion and retrieval. Excellent key-value storage. Schema-less: a plus!
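The caching pattern described on slides 9 and 10 can be sketched as a cache-aside lookup: try the key-value store first, and render the navigation panel only on a miss. This is a minimal illustration, not NitroSell's implementation. `TinyKeyValueStore` is a hypothetical in-memory stand-in for a store such as Redis, and the `navpanel:<store_id>` key format, the `render` callback, and the five-minute expiry are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TinyKeyValueStore:
    """Hypothetical in-memory stand-in for a key-value store (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        # Store the value together with the time at which it expires.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on the next read.
            del self._data[key]
            return None
        return value


def nav_panel_html(
    store: TinyKeyValueStore,
    store_id: str,
    render: Callable[[str], str],
    ttl_seconds: float = 300.0,
) -> str:
    """Return the cached panel HTML, rendering and caching it on a miss."""
    key = f"navpanel:{store_id}"  # assumed key format
    cached = store.get(key)
    if cached is not None:
        return cached
    html = render(store_id)  # the slow path, e.g. a database-backed render
    store.set(key, html, ttl_seconds)
    return html
```

Because the stored value is just a string, the same lookup works for session data or any other pre-rendered fragment without a schema change.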

11 Sample Data NavPanel
<br/><!--<div id="relatedproductspanel"> -->
<div id="relatedproducts">
  <h2 id="relateditemheading" class="red">stuff you may also like</h2>
  <table id="relateditemtable"><tr><td id="relateditembody">
    <table class="relateditembody">
      <tr>
        <td align="center" class="smalltext" width="25%">
          <input type="hidden" name="relitem" value="1600">
          <a href="http://www.partyworld.ie/white-snazaroo-face-paint/snaz01/"><img class="image-thumb" alt="white SnazarooFace Paint" title="white SnazarooFace Paint" src="http://images.nitrosell.com/product_images/8/1790/thumb-facepaint-white.jpg" border="0"/></a><br />
          <a href="http://www.partyworld.ie/white-snazaroo-face-paint/snaz01/">white SnazarooFace Paint</a><br/>
          <span class="text-pricestrike"> 5.00</span> <span class="text-pricespecial"> 4.00</span>
        </td>
        <td align="center" class="smalltext" width="25%">
          <input type="hidden" name="relitem" value="7108">
          <a href="http://www.partyworld.ie/long-witch-wig/50702/"><img class="image-thumb" alt="long Witch Wig" title="long Witch Wig" src="http://images.nitrosell.com/product_images/8/1790/thumb jpg" border="0"/></a><br/>
          <a href="http://www.partyworld.ie/long-witch-wig/50702/">long Witch Wig</a><br/>
          <span class="text-price"> 6.00</span>
        </td>
        <td align="center" class="smalltext" width="25%">
          <input type="hidden" name="relitem" value="1834">
          <a href="http://www.partyworld.ie/wig-cap/20136/"><img class="image-thumb" alt="wig Cap" title="wig Cap" src="http://images.nitrosell.com/product_images/8/1790/thumb-wig-cap.jpg" border="0"/></a><br/>
          <a href="http://www.partyworld.ie/wig-cap/20136/">wig Cap</a><br/>
          <span class="textpricestrike"> 3.00</span> <span class="text-pricespecial"> 2.40</span>
        </td>
        <td align="center" class="smalltext" width="25%">
          <input type="hidden" name="relitem" value="13532">
          <a href="http://www.partyworld.ie/the-corpse-bride-costume/3053-md/"><img class="image-thumb" alt="the Corpse Bride Costume" title="the Corpse Bride Costume" src="http://images.nitrosell.com/product_images/8/1790/thumb-3053_jpg.gif" border="0"/></a><br />
          <a href="http://www.partyworld.ie/the-corpse-bride-costume/3053-md/">the Corpse Bride Costume</a><br />
          <span class="text-pricestrike"> 37.00</span> <span class="text-pricespecial"> 31.45</span>
        </td>
      </tr>
    </table>
  </td></tr></table>
</div>
<!--</div> -->

12 CASE STUDY 2: Real-time Web Analytics. Joint project undertaken by UCC and NitroSell. Provide a smarter related-item panel to customers. Recommender systems are all over e-commerce and the Web in general!

13 Google Analytics: deployed on 2 of our Irish stores (partyworld.ie and wreckless.ie). Each generates about 1000 lines in the logs a day. 4 types of events: pageview, addtobasket, removedfrombasket, and ordercomplete

14 Data modeling: each event has its own data types. ordertotal appears only in ordercomplete, which would mean NULLs in an RDBMS. Need for a schema-less database. Data aggregation played a huge part in the analysis.
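The NULL problem on slide 14 can be made concrete with a small sketch. The field names `hittype`, `productid`, `ordertotal`, and `usergroup` come from the sample documents in this deck; the contents of the `addtobasket` event and the `FIXED_COLUMNS` table layout are assumptions made for illustration, not the schema actually used.

```python
from decimal import Decimal
from typing import Any, Dict, List, Optional

Event = Dict[str, Any]

# Schema-less events: each one carries only the fields that apply to it.
# The addtobasket fields are assumed; the deck does not show that event.
EVENTS: List[Event] = [
    {"hittype": "pageview", "productid": "14696", "usergroup": "B"},
    {"hittype": "addtobasket", "productid": "14696", "usergroup": "B"},
    {"hittype": "ordercomplete", "ordertotal": "27.49", "usergroup": "B"},
]

# A single relational table would need every column for every row (assumed layout).
FIXED_COLUMNS = ["hittype", "productid", "ordertotal", "usergroup"]


def to_fixed_row(event: Event) -> List[Optional[str]]:
    """Project an event onto the fixed columns; missing fields become None (NULL)."""
    return [event.get(column) for column in FIXED_COLUMNS]


def count_nulls(events: List[Event]) -> int:
    """Count the NULL cells a one-table relational layout would contain."""
    return sum(cell is None for event in events for cell in to_fixed_row(event))


def order_revenue(events: List[Event]) -> Decimal:
    """Aggregate ordertotal over ordercomplete events only."""
    return sum(
        (Decimal(e["ordertotal"]) for e in events if e.get("hittype") == "ordercomplete"),
        Decimal("0"),
    )
```

With these three events, `count_nulls(EVENTS)` is 3 and `order_revenue(EVENTS)` is `Decimal("27.49")`: the document form stores nothing for absent fields, while the aggregation still reads only the events that carry `ordertotal`.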

15 MongoDB did the trick: excellent Map-Reduce support; data sharding, concurrency and scalability taken care of
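The Map-Reduce style of aggregation mentioned on slide 15 can be sketched in plain Python. This is not MongoDB's API and not the project's actual job: `map_reduce`, `map_pageview`, and `reduce_counts` are hypothetical helpers that only show the pattern, here counting page views per `productid` as one way to measure product popularity.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, Any]


def map_pageview(event: Event) -> Iterator[Tuple[str, int]]:
    """Map step: emit (productid, 1) for every pageview that names a product."""
    if event.get("hittype") == "pageview" and "productid" in event:
        yield event["productid"], 1


def reduce_counts(key: str, values: List[int]) -> int:
    """Reduce step: collapse all values emitted for one key into a total."""
    return sum(values)


def map_reduce(
    documents: Iterable[Event],
    mapper: Callable[[Event], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Group the mapper's emitted values by key, then reduce each group."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for document in documents:
        for key, value in mapper(document):
            grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}
```

Calling `map_reduce(events, map_pageview, reduce_counts)` on a list of event documents returns a dictionary from product ID to view count; events of other types, such as `ordercomplete`, emit nothing and are ignored.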

16 Sample Data: experiment started a month ago. Over 34k log entries for over 7k visitors. Worth more than 8k euros.

17 ordercomplete event
{
  "_id" : ObjectId("54f4834ba4bc3eb710b8e0b1"),
  "ip_address" : " ",
  "ordertotal" : "27.49",
  "basket" : "3827:1",
  "cookieid" : "695e69j7bftohc1r5qiuak1bfhq9jji6",
  "abexperiment" : "1",
  "UserAgent" : "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/ (KHTML, l
  "hittype" : "ordercomplete",
  "usergroup" : "B",
  "abexpuserid" : "66f1a31034c84179-B",
  "username" : "695e69j7bftohc1r5qiuak1bfhq9jji6",
  "uid3" : "66f1a31034c84179-B",
  "ordernumber" : " ",
  "timestamp" : ISODate(" T15:35:39Z")
}

18 pageview event
{
  "_id" : ObjectId("54f5f1b8a4bc3e4b08295c49"),
  "ip_address" : " ",
  "countrycode" : "IE",
  "productid" : "14696",
  "productcode" : "41945",
  "productname" : "Farm Party Balloons",
  "cookieid" : "3hafagjtsna915ha5668hddkouhdl637",
  "username" : "3hafagjtsna915ha5668hddkouhdl637",
  "abexpuserid" : "a8621f90a7934da4-b",
  "hittype" : "pageview",
  "ClickedLink" : "http://www.partyworld.ie/farm-party-balloons/41945/",
  "pagetitle" : "Farm Party Balloons- PartyWorld Costume Shop",
  "timestamp" : ISODate(" T17:39:04Z"),
  "referrer" : "http://www.partyworld.ie/511/kids-partyware/farm-party-sup
  "domain" : "www.partyworld.ie",
  "UserAgent" : "Mozilla/5.0 (Linux; U; Android 4.4.2; en-ie; GT-P5210 Bui Safari/534.30",
  "abexperiment" : "true",
  "relateditems" : "14715,14698,14695,14763",
  "usergroup" : "B"
}

19 Lessons learnt: product popularity. Who is viewing our stores? When? How? IE popularity

20 Summary: RDBMS, glory days over. Designed in the 70s to solve problems of the 70s. Scalability

22 QUESTIONS?


More information

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

More information

MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!)

MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!) MySQL és Hadoop mint Big Data platform (SQL + NoSQL = MySQL Cluster?!) Erdélyi Ernő, Component Soft Kft. erno@component.hu www.component.hu 2013 (c) Component Soft Ltd Leading Hadoop Vendor Copyright 2013,

More information

Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance

Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

More information

Getting Started with MongoDB

Getting Started with MongoDB Getting Started with MongoDB TCF IT Professional Conference March 14, 2014 Michael P. Redlich @mpredli about.me/mpredli/ 1 1 Who s Mike? BS in CS from Petrochemical Research Organization Ai-Logix, Inc.

More information

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

More information

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster.

Scaling up = getting a better machine. Scaling out = use another server and add it to your cluster. MongoDB 1. Introduction MongoDB is a document-oriented database, not a relation one. It replaces the concept of a row with a document. This makes it possible to represent complex hierarchical relationships

More information

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

About Terrace. Company History. 1-888-269-6200 P.O. Box 190367 San Francisco, Ca. 94119

About Terrace. Company History. 1-888-269-6200 P.O. Box 190367 San Francisco, Ca. 94119 About Terrace Business works with Terrace. Terrace designs & develops innovative technology solutions for the connected workplace - cloud, mobile, on premises and desktop. Our talented teams understand

More information

Lecture 21: NoSQL III. Monday, April 20, 2015

Lecture 21: NoSQL III. Monday, April 20, 2015 Lecture 21: NoSQL III Monday, April 20, 2015 Announcements Issues/questions with Quiz 6 or HW4? This week: MongoDB Next class: Quiz 7 Make-up quiz: 04/29 at 6pm (or after class) Reminders: HW 4 and Project

More information

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and

More information

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Big Data Analytics DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Tom Haughey InfoModel, LLC 868 Woodfield Road Franklin Lakes, NJ 07417 201 755 3350 tom.haughey@infomodelusa.com

More information

Social Networks and the Richness of Data

Social Networks and the Richness of Data Social Networks and the Richness of Data Getting distributed Webservices Done with NoSQL Fabrizio Schmidt, Lars George VZnet Netzwerke Ltd. Content Unique Challenges System Evolution Architecture Activity

More information

Using In-Memory Computing to Simplify Big Data Analytics

Using In-Memory Computing to Simplify Big Data Analytics SCALEOUT SOFTWARE Using In-Memory Computing to Simplify Big Data Analytics by Dr. William Bain, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T he big data revolution is upon us, fed

More information

What is Big Data? BCS Aberdeen Branch 6 November 2014

What is Big Data? BCS Aberdeen Branch 6 November 2014 What is Big Data? BCS Aberdeen Branch 6 November 2014 Keith Gordon Soldier Teacher Data Manager Engineer Information Systems Professional Standards Expert Big Data Sceptic What they say The overeager adoption

More information

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island

Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY ANN KELLY II MANNING Shelter Island contents foreword preface xvii xix acknowledgments xxi about this book xxii Part 1 Introduction

More information

Large Data Methods: Keeping it Simple on the Path to Big Data

Large Data Methods: Keeping it Simple on the Path to Big Data Large Data Methods: Keeping it Simple on the Path to Big Data Big Data Business Forum, San Francisco, November 13, 2012 Jim Porzak, Sr. Dir. Business Intelligence, Minted 1 What we will cover: 1. Who is

More information

An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture

An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP ESG Data Systems Architecture Big Data & Analytics as a Service Components Unstructured Data / Sparse Data of Value

More information

Time-Series Databases and Machine Learning

Time-Series Databases and Machine Learning Time-Series Databases and Machine Learning Jimmy Bates November 2017 1 Top-Ranked Hadoop 1 3 5 7 Read Write File System World Record Performance High Availability Enterprise-grade Security Distribution

More information

Applying Intelligence. Intelligent Solutions. a Smarter estore

Applying Intelligence. Intelligent Solutions. a Smarter estore Applying Intelligence Intelligent Solutions Smart Data for a Smarter estore Product Data Services ecommerce Data Optimization ALTIUS suite of services help you design a great data structure, build the

More information

You need to be assigned and logged in to the system by the Records Management Service in order to use it.

You need to be assigned and logged in to the system by the Records Management Service in order to use it. Guidance for using the Records Management Service software The software can be used to undertake the following tasks:- 1. Sending information about the boxes to be transferred to the Records Centre. 2.

More information