Hadoop and NoSQL Basics: Big Data Demystified. NYS Innovation Summit, 12/17/2013. Matt
- Toby Hall
Transcription
1 Hadoop and NoSQL Basics: Big Data Demystified NYS Innovation Summit, 12/17/2013 Matt
2 When I want people to think I'm smart, I just say HADOOP really loud.
3
4 Hadoop! There it is. Big Data! Data Science! Algorithms!
5
6
7 ... why are we thinking about this at all?
8 ALL the data created until the year 2003 = ALL the data created every two days
9
10 Writes > 12 terabytes of data per day.
11 *the 451 group
12 ... how did we get here?
13 HIERARCHICAL DATABASE MODEL RELATIONAL DATABASE MODEL DOCUMENT DATABASE MODEL
14 HIERARCHICAL DATABASE MODEL
Fruit
    Orange
    Apple
        Granny Smith
        Honeycrisp
        Red Delicious
    Grape
- Used in early mainframe computing
- Stores data in one-to-many trees
- Not very flexible
15 RELATIONAL DATABASE MODEL
Fruit_Variety | Fruit
Granny Smith | Apple
Honeycrisp | Apple
Red Delicious | Apple
Navel | Orange
- Invented in 1970 by Edgar F. Codd at IBM
- Stores data in tuples, which resemble rows of a table
- Still the most widely used database model
16 RELATIONAL DATABASE MODEL
Fruit_ID | Fruit_Name
1 | Orange
2 | Apple
3 | Grape

Variety_ID | Variety_Name | Fruit_ID
1 | Granny Smith | 2
2 | Honeycrisp | 2
3 | Red Delicious | 2
4 | Navel | 1
... can also store hierarchical data!
17 RELATIONAL DATABASE MODEL (same two tables) Has a rigid structure, or schema.
18 RELATIONAL DATABASE MODEL (same two tables) Uses unique keys for consistency across tables.
19 DOCUMENT DATABASE MODEL
Red Delicious Apple
Honeycrisp Apple
Navel Orange
Granny Smith Apple
- Doesn't have a single structure or schema that each entry must follow
- Developed in 1995 for use with Lotus Notes
- SO TRENDY
20 DOCUMENT DATABASE MODEL
{
    "Fruits": [
        { "Type": "Apple", "Variety": "Red Delicious" },
        { "Name": "Granny Smith Apple" },
        "Navel Orange"
    ]
}
CAN have structured elements, but structure doesn't need to be consistent across entries
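As a rough illustration of what "structure doesn't need to be consistent" means for the code that reads such data, the sketch below loads the slide's document as JSON and handles each entry according to whatever shape it happens to have. The helper name `describe` and its fallback rules are assumptions made for this example, not something from the talk.

```python
import json

# The document from the slide, with its quotation marks restored.
raw = """
{
  "Fruits": [
    {"Type": "Apple", "Variety": "Red Delicious"},
    {"Name": "Granny Smith Apple"},
    "Navel Orange"
  ]
}
"""


def describe(entry):
    """Return a display name for an entry, whatever shape it has (hypothetical helper)."""
    if isinstance(entry, str):       # a bare string such as "Navel Orange"
        return entry
    if "Name" in entry:              # an object that only carries a name
        return entry["Name"]
    return f"{entry['Variety']} {entry['Type']}"  # an object with Type and Variety


doc = json.loads(raw)
names = [describe(entry) for entry in doc["Fruits"]]
print(names)
```

The flexibility has a cost: every reader of the data needs branching like this, because nothing guarantees which fields an entry has.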
21 From RIGID to FLEXIBLE: HIERARCHICAL DATABASE MODEL, RELATIONAL DATABASE MODEL, DOCUMENT DATABASE MODEL
23 Relational Database is to Document Database as Excel Spreadsheet is to Word Document
24 Relational Database is to Document Database as Excel Spreadsheet is to Word Document ... as SQL is to NoSQL
25 Relational Database is to Document Database as Excel Spreadsheet is to Word Document ... as SQL is to NoSQL*
* ... mostly / sorta. Stay tuned!
26 SQL, or Structured Query Language, is a language for getting data into and out of a relational database.
SELECT Variety_Name FROM fruits WHERE fruit_id = 2
Variety_Name
------------
Granny Smith
Honeycrisp
Red Delicious
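To try a query like this, here is a minimal sketch using Python's built-in `sqlite3` module with the fruit tables from the earlier slides. The table names `fruit` and `variety` are assumptions (the slide's query calls its table `fruits`), and the rejected insert at the end is an added illustration of how keys keep tables consistent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# The schema is fixed up front: every row must fit these columns.
conn.executescript("""
    CREATE TABLE fruit (
        fruit_id   INTEGER PRIMARY KEY,
        fruit_name TEXT NOT NULL
    );
    CREATE TABLE variety (
        variety_id   INTEGER PRIMARY KEY,
        variety_name TEXT NOT NULL,
        fruit_id     INTEGER NOT NULL REFERENCES fruit(fruit_id)
    );
""")
conn.executemany("INSERT INTO fruit VALUES (?, ?)",
                 [(1, "Orange"), (2, "Apple"), (3, "Grape")])
conn.executemany("INSERT INTO variety VALUES (?, ?, ?)",
                 [(1, "Granny Smith", 2), (2, "Honeycrisp", 2),
                  (3, "Red Delicious", 2), (4, "Navel", 1)])

# The slide's query, pointed at the variety table.
rows = conn.execute(
    "SELECT variety_name FROM variety WHERE fruit_id = ? ORDER BY variety_id", (2,)
).fetchall()
apple_varieties = [row[0] for row in rows]
print(apple_varieties)

# A variety that points at a fruit_id that does not exist is refused.
try:
    conn.execute("INSERT INTO variety VALUES (5, 'Mystery', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("bad row rejected:", rejected)
```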
27 Depending on who you ask, NoSQL means NOT SQL or NOT ONLY SQL.
28 (in fact, some characterize NoSQL as a movement, not a particular technology or set of technologies.)
29 SQL Databases are highly standardized. NoSQL Databases are highly fragmented.
30 SQL Databases are highly standardized. NoSQL Databases are highly fragmented. Some are document model databases, some use a variation of a key-value store. [Diagram label: Document Databases]
31 So, what are the characteristics of NoSQL databases* that make them so trendy and exciting? * Generally
32 Relational databases have strict schemas dictating the structure of data. NoSQL databases are generally schemaless, even when they use key-value stores.
33 NoSQL databases are generally schemaless, even when they use key-value stores.
- More flexible
- Can start entering data before deciding on how that data will be formatted
- Less structured, consistent
35 Relational databases can scale up (on one computer) but not easily out (across many computers). NoSQL databases are designed to scale out across many computers.
36 NoSQL databases are designed to scale out across many computers.
- Lots of machines == BIG data
- Can scale quickly if needed
- No single point of failure
- More complicated to set up
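One common way to spread data across machines is to hash each key and let the hash pick the machine. The sketch below simulates that with plain dictionaries standing in for machines. The node names and the `put`/`get` helpers are hypothetical, and real systems typically add replication and smarter rebalancing, both of which this toy version leaves out.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical machine names


def node_for(key, nodes=NODES):
    """Pick a node for a key by hashing it, so every client agrees on the location."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


# Each "machine" is just a dictionary in this simulation.
cluster = {node: {} for node in NODES}


def put(key, value):
    cluster[node_for(key)][key] = value


def get(key):
    return cluster[node_for(key)].get(key)


for variety, fruit in [("Granny Smith", "Apple"), ("Honeycrisp", "Apple"),
                       ("Red Delicious", "Apple"), ("Navel", "Orange")]:
    put(variety, fruit)

for node, shard in cluster.items():
    print(node, sorted(shard))
```

With only one copy of each key, losing a machine loses its data, which is part of why a real setup is "more complicated to set up".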
37 Relational databases read and write information directly to a disk drive. NoSQL databases store information in memory, and/or include robust built-in caching in memory.
38 NoSQL databases store information in memory, and/or include robust built-in caching in memory.
- Faster
- Memory more expensive than disk
- Potential reliability issues
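The speed-versus-reliability trade-off can be seen in a small read-through cache. In this sketch `SlowStore` stands in for a disk-backed store (its delay is artificial) and `CachedStore` keeps recently read values in memory. Both class names are made up for the example.

```python
import time


class SlowStore:
    """Stands in for a disk-backed store; the delay is artificial."""

    def __init__(self, data, delay=0.01):
        self._data = dict(data)
        self._delay = delay
        self.reads = 0  # how many times the "disk" was actually touched

    def read(self, key):
        self.reads += 1
        time.sleep(self._delay)
        return self._data[key]


class CachedStore:
    """Read-through cache: the first read hits the slow store, repeats come from memory."""

    def __init__(self, backing):
        self._backing = backing
        self._cache = {}

    def read(self, key):
        if key not in self._cache:
            self._cache[key] = self._backing.read(key)
        return self._cache[key]


disk = SlowStore({"Honeycrisp": "Apple", "Navel": "Orange"})
store = CachedStore(disk)
for _ in range(100):
    store.read("Honeycrisp")
print("slow reads performed:", disk.reads)
```

The cached copy disappears if the process dies and can go stale if the underlying data changes, which is one form the "potential reliability issues" can take.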
39 Relational databases follow the ACID model (Atomicity, Consistency, Isolation, Durability). NoSQL databases do not follow the ACID model.
40 NoSQL databases do not follow the ACID model.
- More freedom to handle requests in a way that honors the uniqueness of things.
- Much greater room for (potentially serious) errors.
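To make the "A" in ACID concrete, here is a small sketch of atomicity using Python's built-in `sqlite3` module. The `basket` table and the `transfer` function are invented for the example. The point is that when the second step of a two-step change fails, the first step is undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE basket ("
    " owner TEXT PRIMARY KEY,"
    " apples INTEGER NOT NULL CHECK (apples >= 0))"
)
conn.executemany("INSERT INTO basket VALUES (?, ?)", [("alice", 3), ("bob", 0)])
conn.commit()


def transfer(conn, giver, taker, n):
    """Move n apples from giver to taker: both updates happen, or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute("UPDATE basket SET apples = apples + ? WHERE owner = ?", (n, taker))
            conn.execute("UPDATE basket SET apples = apples - ? WHERE owner = ?", (n, giver))
        return True
    except sqlite3.IntegrityError:
        return False


# Alice only has 3 apples, so the second UPDATE breaks the CHECK constraint
# and the first UPDATE (already applied to Bob) is rolled back with it.
print(transfer(conn, "alice", "bob", 5))
print(dict(conn.execute("SELECT owner, apples FROM basket")))
```

A store without transactions would leave Bob with five extra apples that came from nowhere, which is the kind of error the slide warns about.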
41 Relational databases represent data as rows and columns. NoSQL databases often represent data in formats such as JSON, which are native to many programming languages.
42 NoSQL databases often represent data in formats such as JSON, which are native to many programming languages.
- Easier, faster for programmers
- Harder for non-programmers
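A short sketch of the difference: the same fruit data as relational-style rows, then folded into nested documents that convert to and from JSON in a single call. The function name `rows_to_documents` and the field names `name` and `varieties` are assumptions chosen for the example.

```python
import json

# Rows as they would come back from the two relational tables.
fruit_rows = [(1, "Orange"), (2, "Apple"), (3, "Grape")]
variety_rows = [(1, "Granny Smith", 2), (2, "Honeycrisp", 2),
                (3, "Red Delicious", 2), (4, "Navel", 1)]


def rows_to_documents(fruits, varieties):
    """Fold the two tables into one nested document per fruit."""
    documents = []
    for fruit_id, fruit_name in fruits:
        documents.append({
            "name": fruit_name,
            "varieties": [name for _, name, fid in varieties if fid == fruit_id],
        })
    return documents


documents = rows_to_documents(fruit_rows, variety_rows)

# JSON maps directly onto Python dicts and lists, so the round trip is one call each way.
text = json.dumps(documents, indent=2)
restored = json.loads(text)
print(text)
```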
43 SO WAIT, THOUGH, how the f*** do you find anything in a NoSQL database????
44
45 HADOOP is an open source framework for doing MapReduce. MapReduce is one way to make sense of a document database. (That's how GOOGLE does it.)
46 MapReduce has two core steps: Map and Reduce. ... both are pretty much what they sound like.
47 This is what it actually looks like:
function map(String name, String document):
    // name: document name
    // document: document contents
    for each word w in document:
        emit (w, 1)

function reduce(String word, Iterator partialCounts):
    // word: a word
    // partialCounts: a list of aggregated partial counts
    sum = 0
    for each pc in partialCounts:
        sum += ParseInt(pc)
    emit (word, sum)
48 MAP: For a given document, map each word, phrase, or item to the number of times that word, phrase, or item appears.
function map(String name, String document):
    // name: document name
    // document: document contents
    for each word w in document:
        emit (w, 1)
49 REDUCE: Now take all of those maps from every document, and reduce them to a single list of items and counts.
function reduce(String word, Iterator partialCounts):
    // word: a word
    // partialCounts: a list of aggregated partial counts
    sum = 0
    for each pc in partialCounts:
        sum += ParseInt(pc)
    emit (word, sum)
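The pseudocode above can be turned into a runnable single-machine sketch in Python. The function names and the in-memory grouping step are choices made for this example. In a real framework such as Hadoop, grouping the emitted pairs by key happens between the map and reduce phases and is handled by the framework, not by the user's code.

```python
from collections import defaultdict


def map_words(name, document):
    """Map step: emit (word, 1) for every word in one document."""
    for word in document.split():
        yield (word, 1)


def reduce_counts(word, partial_counts):
    """Reduce step: add up all the partial counts emitted for one word."""
    return (word, sum(int(pc) for pc in partial_counts))


def run_word_count(documents):
    # Group the emitted pairs by word (the step a framework does for you).
    grouped = defaultdict(list)
    for name, text in documents.items():
        for word, count in map_words(name, text):
            grouped[word].append(count)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


documents = {
    "doc1": "Honeycrisp Apple",
    "doc2": "Granny Smith Apple",
    "doc3": "Red Delicious Apple",
    "doc4": "Navel Orange",
}
counts = run_word_count(documents)
print(counts)
```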
50 Documents:
Honeycrisp Apple
Granny Smith Apple
Red Delicious Apple
Navel Orange
51 MAP (same four documents):
(Red, 1) (Delicious, 1) (Apple, 1)
(Honeycrisp, 1) (Apple, 1)
(Navel, 1) (Orange, 1)
(Granny, 1) (Smith, 1) (Apple, 1)
52 REDUCE (same map output):
(Red, 1) (Delicious, 1) (Apple, 3) (Honeycrisp, 1) (Navel, 1) (Orange, 1) (Granny, 1) (Smith, 1)
54 The hard work is distributed
55 The hard work is distributed. The easy work is centralized.
56 ... but what if we've got our documents stored on multiple machines?
COMP 1: Honeycrisp Apple, Red Delicious Apple
COMP 2: Granny Smith Apple, Navel Orange
57 MAP, run on each machine:
COMP 1: (Red, 1) (Delicious, 1) (Apple, 1) (Honeycrisp, 1) (Apple, 1)
COMP 2: (Navel, 1) (Orange, 1) (Granny, 1) (Smith, 1) (Apple, 1)
58 REDUCE, run on each machine:
COMP 1: (Red, 1) (Delicious, 1) (Apple, 2) (Honeycrisp, 1)
COMP 2: (Navel, 1) (Orange, 1) (Granny, 1) (Smith, 1) (Apple, 1)
59 Final REDUCE over both partial results:
(Red, 1) (Delicious, 1) (Apple, 3) (Honeycrisp, 1) (Navel, 1) (Orange, 1) (Granny, 1) (Smith, 1)
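The two-machine walkthrough can be simulated in a few lines of Python. Here the "machines" are just dictionary entries, and the helper names (`map_document`, `reduce_pairs`, `local_job`) are invented for the sketch. Each machine maps and reduces its own documents, and only the small partial results travel to the final reduce.

```python
from collections import Counter

# Which documents live on which machine, as in the slides.
machines = {
    "COMP 1": ["Honeycrisp Apple", "Red Delicious Apple"],
    "COMP 2": ["Granny Smith Apple", "Navel Orange"],
}


def map_document(document):
    """Map step: one (word, 1) pair per word."""
    return [(word, 1) for word in document.split()]


def reduce_pairs(pairs):
    """Reduce step: total the counts for each word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def local_job(documents):
    """What one machine does by itself: map its documents, then reduce locally."""
    pairs = [pair for doc in documents for pair in map_document(doc)]
    return reduce_pairs(pairs)


# The hard work is distributed: each machine handles only its own documents.
partials = {name: local_job(docs) for name, docs in machines.items()}

# The easy work is centralized: combine the already-reduced partial counts.
final = reduce_pairs(
    pair for partial in partials.values() for pair in partial.items()
)
print(partials)
print(final)
```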
60 Is this the easiest way to count apples?
61 NOT
62
63 * relational database *
64
65 Tweet Text: I am so happy! | Tweet Location: Albuquerque, NM | User Home: New York, NY
Tweet Text: #FML #FML #FML | Tweet Location: Palo Alto, CA | User Home: San Francisco, CA
66 MAP (WITH MATH + SENTIMENT), producing (Distance in Miles, Sentiment Score):
(1808, +.9)
(33, -.6)
67 REDUCE:
(1808, +.9) (33, -.6)
68 RINSE AND REPEAT LIKE A MILLION TIMES
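A map step like this one might look roughly as follows. Everything specific here is an assumption made for illustration: the city coordinates are approximate, the distance is a straight-line (great-circle) estimate, and the two-word sentiment lexicon is a toy whose scores were picked to echo the slide. The distances it produces will therefore not exactly match the slide's figures.

```python
import math

# Approximate (latitude, longitude) pairs; assumed values for illustration only.
CITY_COORDS = {
    "Albuquerque, NM": (35.0844, -106.6504),
    "New York, NY": (40.7128, -74.0060),
    "Palo Alto, CA": (37.4419, -122.1430),
    "San Francisco, CA": (37.7749, -122.4194),
}

# Toy lexicon; a real sentiment model would be far more involved.
SENTIMENT_WORDS = {"happy": 0.9, "#fml": -0.6}

EARTH_RADIUS_MILES = 3958.8


def miles_between(city_a, city_b):
    """Great-circle distance between two known cities (haversine formula)."""
    lat1, lon1 = map(math.radians, CITY_COORDS[city_a])
    lat2, lon2 = map(math.radians, CITY_COORDS[city_b])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def sentiment(text):
    """Average the scores of any words found in the toy lexicon."""
    words = [w.strip("!.,?").lower() for w in text.split()]
    scores = [SENTIMENT_WORDS[w] for w in words if w in SENTIMENT_WORDS]
    return sum(scores) / len(scores) if scores else 0.0


def map_tweet(tweet):
    """Map step: one tweet in, one (distance from home, sentiment) pair out."""
    distance = round(miles_between(tweet["location"], tweet["home"]))
    return (distance, sentiment(tweet["text"]))


tweets = [
    {"text": "I am so happy!", "location": "Albuquerque, NM", "home": "New York, NY"},
    {"text": "#FML #FML #FML", "location": "Palo Alto, CA", "home": "San Francisco, CA"},
]
pairs = [map_tweet(t) for t in tweets]
print(pairs)
```

Once every tweet has been boiled down to a pair of numbers, a reduce step can aggregate them, for example by averaging sentiment within distance ranges.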
69 ... none of this is magic.
70 ... in fact, the magic part is just a precursor to doing the actual hard work.
71
72 Danah Boyd's Six Provocations for Big Data:
1. Automating Research Changes the Definition of Knowledge
2. Claims to Objectivity and Accuracy are Misleading
3. Bigger Data are Not Always Better Data
4. Not All Data Are Equivalent
5. Just Because it is Accessible Doesn't Make it Ethical
6. Limited Access to Big Data Creates New Digital Divides
73 What about THE FUTURE?
74 From RIGID to FLEXIBLE: HIERARCHICAL DATABASE MODEL, RELATIONAL DATABASE MODEL, DOCUMENT DATABASE MODEL
75 From RIGID to FLEXIBLE: HIERARCHICAL DATABASE MODEL, RELATIONAL DATABASE MODEL, DOCUMENT DATABASE MODEL, ?
76
77
78
79 Further Reading:
- Martin Fowler on NoSQL:
- Helpful Stack Overflow thread:
- Finding Friends with MapReduce:
- Choosing a Database That's Right for Your Business:
- Demystifying the Role of Big Data in Marketing: mar/12/big-data-marketing-demystified
- The NoSQL Movement:
- Big Data Tools Cost Too Much, Do Too Little: hadoop_no_sql_dont_believe_the_hype/
- Is Big Data an Economic Big Dud?:
- Six Provocations for Big Data:
More informationCrystal Reports. Overview. Contents. Table Linking in Crystal Reports
Overview Contents This document demonstrates the linking process in Crystal Reports (CR) 7 and later. This document discusses linking for PC-type databases, ODBC linking and frequently asked questions.
More informationSession 7 Fractions and Decimals
Key Terms in This Session Session 7 Fractions and Decimals Previously Introduced prime number rational numbers New in This Session period repeating decimal terminating decimal Introduction In this session,
More informationUnit 7 The Number System: Multiplying and Dividing Integers
Unit 7 The Number System: Multiplying and Dividing Integers Introduction In this unit, students will multiply and divide integers, and multiply positive and negative fractions by integers. Students will
More informationArchitectures for massive data management
Architectures for massive data management Apache Spark Albert Bifet albert.bifet@telecom-paristech.fr October 20, 2015 Spark Motivation Apache Spark Figure: IBM and Apache Spark What is Apache Spark Apache
More informationChapter 6. Foundations of Business Intelligence: Databases and Information Management
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
More informationORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process
ORACLE OLAP KEY FEATURES AND BENEFITS FAST ANSWERS TO TOUGH QUESTIONS EASILY KEY FEATURES & BENEFITS World class analytic engine Superior query performance Simple SQL access to advanced analytics Enhanced
More informationGoogle Cloud Platform The basics
Google Cloud Platform The basics Who I am Alfredo Morresi ROLE Developer Relations Program Manager COUNTRY Italy PASSIONS Community, Development, Snowboarding, Tiramisu' Reach me alfredomorresi@google.com
More informationAlexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data
INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are
More information7 Ways To Explode Your Profits as a Tint Professional and Change your Life Forever!
WINDOW FILM CUTTING SYSTEM 7 Ways To Explode Your Profits as a Tint Professional and Change your Life Forever! 2012 Tint Tek The automobile window tinting industry is a highly profitable trade and, for
More informationWhat Is Singapore Math?
What Is Singapore Math? You may be wondering what Singapore Math is all about, and with good reason. This is a totally new kind of math for you and your child. What you may not know is that Singapore has
More informationEvaluation Checklist Data Warehouse Automation
Evaluation Checklist Data Warehouse Automation March 2016 General Principles Requirement Question Ajilius Response Primary Deliverable Is the primary deliverable of the project a data warehouse, or is
More informationMapReduce and the New Software Stack
20 Chapter 2 MapReduce and the New Software Stack Modern data-mining applications, often called big-data analysis, require us to manage immense amounts of data quickly. In many of these applications, the
More informationIntroducing DocumentDB
David Chappell Introducing DocumentDB A NoSQL Database for Microsoft Azure Sponsored by Microsoft Corporation Copyright 2014 Chappell & Associates Contents Why DocumentDB?... 3 The DocumentDB Data Model...
More informationCS 564: DATABASE MANAGEMENT SYSTEMS
Fall 2013 CS 564: DATABASE MANAGEMENT SYSTEMS 9/4/13 CS 564: Database Management Systems, Jignesh M. Patel 1 Teaching Staff Instructor: Jignesh Patel, jignesh@cs.wisc.edu Office Hours: Mon, Wed 1:30-2:30
More informationUnderstanding the Value of In-Memory in the IT Landscape
February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to
More informationA Total Cost of Ownership Comparison of MongoDB & Oracle
A MongoDB White Paper A Total Cost of Ownership Comparison of MongoDB & Oracle August 2015 Table of Contents Executive Summary Cost Categories TCO for Example Projects Upfront Costs Initial Developer Effort
More informationNot Relational Models For The Management of Large Amount of Astronomical Data. Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF)
Not Relational Models For The Management of Large Amount of Astronomical Data Bruno Martino (IASI/CNR), Memmo Federici (IAPS/INAF) What is a DBMS A Data Base Management System is a software infrastructure
More informationBig Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料
Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 美 國 13 歲 學 生 用 Big Data 找 出 霸 淩 熱 點 Puri 架 設 網 站 Bullyvention, 藉 由 分 析 Twitter 上 找 出 提 到 跟 霸 凌 相 關 的 詞, 搭 配 地 理 位 置
More information