An Introduction To Graph Databases, presented by Leon Guzenda, Founder, Objectivity

1 An Introduction To Graph Databases
Presented by:
Leon Guzenda, Founder, Objectivity
Mark Maagdenberg, Sr. Sales Engineer, Objectivity
Paul DeWolf, Dir. Field Engineering, Objectivity
August 21, 2012

2 Overview
Introductions
Graph Theory
Commonly Used Graph Algorithms
Graph Databases
Current Implementations
Use Cases
Hands-On Tutorial

3 We Are From Objectivity Inc.
Company: Objectivity, Inc. is headquartered in Sunnyvale, CA. Established in 1988 to tackle database problems that network/hierarchical/relational and file-based technologies struggle with. Objectivity has over two decades of Big Data and NoSQL experience.
Products: Develops NoSQL platforms for managing and discovering relationships and patterns in complex data:
Objectivity/DB - an object database that manages localized, centralized or distributed databases
InfiniteGraph - a massively scalable graph database built on Objectivity/DB that enables organizations to find, store and exploit the relationships in their data
Markets: The Big Data market is projected to be around $12B in 2012, with a CAGR of 28% over the next five years. 40% per year data growth, cloud adoption, mobile usage and improved real-time analytics underpin Objectivity's growth opportunities as a Big Data analytics enabler.
Customers: Embedded in hundreds of enterprises, government organizations and products - millions of deployments.
Financials: Consistently generates increased revenues. Privately held by the employees and a few venture capital companies.
Copyright Objectivity, Inc. 2012

4 GRAPH THEORY

5 The History of Graph Theory
1736: Leonhard Euler writes a paper on the Seven Bridges of Königsberg
1845: Gustav Kirchhoff publishes his electrical circuit laws
1852: Francis Guthrie poses the Four Color Problem
1878: Sylvester publishes an article in Nature magazine that describes graphs
1936: Dénes Kőnig publishes a textbook on Graph Theory
1941: Ramsey and Turán define Extremal Graph Theory
1959: De Bruijn publishes a paper summarizing Enumerative Graph Theory
1959: Erdős, Rényi and Gilbert define Random Graph Theory
1969: Heinrich Heesch publishes a computer-oriented method for attacking the Four Color Problem (the theorem itself was proved by Appel and Haken in 1976)
2003: Commercial Graph Database products start appearing on the market

6 Graph Theory Terminology...
VERTEX: A single node in a graph data structure
EDGE: A connection between a pair of VERTICES
PROPERTIES: Data items that belong to a particular Vertex
WEIGHT: A quantity associated with a particular Edge
GRAPH: A collection of linked Vertex and Edge objects
Example:
Vertex 1 - City: San Francisco, Pop: 812,826
Edge 1 - Road: I-101, Miles: 47.8
Vertex 2 - City: San Jose, Pop: 967,487
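
The terminology above can be sketched as a small in-memory data structure. This is only an illustration, not how any particular graph database stores data: the class names `Vertex`, `Edge` and `Graph` and their methods are hypothetical, and treating the road's mileage as the edge weight is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A single node; `properties` holds the data items that belong to it."""
    name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A connection between a pair of vertices, with a weight and properties."""
    tail: Vertex
    head: Vertex
    properties: dict = field(default_factory=dict)
    weight: float = 1.0


@dataclass
class Graph:
    """A collection of linked Vertex and Edge objects."""
    vertices: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def add_vertex(self, name, **properties):
        vertex = Vertex(name, properties)
        self.vertices.append(vertex)
        return vertex

    def add_edge(self, tail, head, weight=1.0, **properties):
        edge = Edge(tail, head, properties, weight)
        self.edges.append(edge)
        return edge


# The example from the slide: two cities joined by a road.
g = Graph()
sf = g.add_vertex("Vertex 1", City="San Francisco", Pop=812_826)
sj = g.add_vertex("Vertex 2", City="San Jose", Pop=967_487)
road = g.add_edge(sf, sj, weight=47.8, Road="I-101")

print(road.tail.properties["City"], "->", road.head.properties["City"],
      f"({road.weight} miles via {road.properties['Road']})")
```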

7 ...Graph Theory Terminology...
SIMPLE/UNDIRECTED GRAPH: A Graph where each Vertex may be linked to one or more Vertex objects via Edge objects and each Edge object is connected to exactly two Vertex objects. Furthermore, neither Vertex connected to an Edge is more significant than the other.
DIRECTED GRAPH: A Graph in which one Vertex in a Vertex + Edge + Vertex group (an "Arc" or "Path") can be considered the Head of the Path and the other can be considered the Tail.
MIXED GRAPH: A Graph in which some paths are Undirected and others are Directed.

8 ...Graph Theory Terminology
LOOP: An Edge that is doubly-linked to the same Vertex
MULTIGRAPH: A Graph that allows multiple Edges and Loops
QUIVER: A Graph where Vertices are allowed to be connected by multiple Arcs. A Quiver may include Loops.
WEIGHTED GRAPH: A Graph where a quantity is assigned to an Edge, e.g. a Length assigned to an Edge representing a road between two Vertices representing cities.
HALF EDGE: An Edge that is only connected to a single Vertex
LOOSE EDGE: An Edge that isn't connected to any Vertices.
CONNECTIVITY: Two Vertices are Connected if it is possible to find a path between them.
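
As a rough illustration of these definitions, the sketch below checks an edge list for Loops and for multiple Edges between the same pair of Vertices, which is what separates a Simple Graph from a Multigraph. The function names and the sample edge list are hypothetical.

```python
from collections import Counter


def find_loops(edges):
    """Return the edges that start and end on the same vertex."""
    return [(u, v) for u, v in edges if u == v]


def find_parallel_edges(edges, directed=False):
    """Return {vertex pair: count} for pairs joined by more than one edge.

    In an undirected graph (A, B) and (B, A) are the same connection,
    so each pair is normalized by sorting before counting.
    """
    keys = [(u, v) if directed else tuple(sorted((u, v))) for u, v in edges]
    return {pair: n for pair, n in Counter(keys).items() if n > 1}


def is_simple(edges):
    """A simple graph has no loops and no parallel edges."""
    return not find_loops(edges) and not find_parallel_edges(edges)


edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "C")]

print(find_loops(edges))                          # [('C', 'C')]
print(find_parallel_edges(edges))                 # {('A', 'B'): 2}
print(find_parallel_edges(edges, directed=True))  # {}
print(is_simple(edges))                           # False
```

Read as Arcs (directed), A->B and B->A are distinct connections, which is why the directed check reports no parallel edges.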

9 COMMONLY USED GRAPH ALGORITHMS Mac Evans

10 Commonly Used Graph Algorithms...
CONNECTEDNESS: Check whether or not a set of nodes in a Graph are connected. All of the nodes in the graph below are connected, e.g. A to B, A to C via B etc.
SHORTEST PATH: The path between two nodes that visits the fewest intermediate nodes. In the graph above, A->B->C->D is shorter than A->B->C->B->D (disallowing loops)
NODE DEGREE: The degree of a node in a network is a count of the number of connections it has to other nodes. The degree distribution is the probability distribution of these degrees in the whole network. In the graph below, A and D have a node degree of 1. B and C have a node degree of 3.
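
A minimal sketch of the three ideas above, using breadth-first search on an unweighted, undirected graph. The slide's figure is not reproduced here, so the edge list is a hypothetical stand-in chosen so that A and D have degree 1 and B and C have degree 3; the function names are assumptions too.

```python
from collections import Counter, deque


def build_adjacency(edges):
    """Turn an undirected edge list into {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search: returns a path with the fewest hops, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj[node]):   # sorted only to make the result repeatable
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def is_connected(adj, nodes):
    """True if every node in `nodes` can be reached from the first one."""
    nodes = list(nodes)
    return all(shortest_path(adj, nodes[0], n) is not None for n in nodes[1:])


def degree_distribution(adj):
    """Fraction of nodes having each degree."""
    counts = Counter(len(neighbours) for neighbours in adj.values())
    return {degree: n / len(adj) for degree, n in counts.items()}


# Hypothetical graph: A-B, B-C, B-E, C-E, C-D
adj = build_adjacency([("A", "B"), ("B", "C"), ("B", "E"), ("C", "E"), ("C", "D")])

print(shortest_path(adj, "A", "D"))        # ['A', 'B', 'C', 'D']
print(is_connected(adj, adj))              # True
print({n: len(adj[n]) for n in sorted(adj)})
print(degree_distribution(adj))
```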

11 ...Commonly Used Graph Algorithms...
CENTRALITY: An assessment of the importance of a node within a network. Degree Centrality is the simplest, being a count of the number of connections that a node has. It may be expressed as Indegree (# of incoming connections) and Outdegree (# of outgoing connections).
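
Degree Centrality can be sketched by counting how often each node appears as the head or the tail of an arc. The arc list and the function name below are hypothetical.

```python
from collections import Counter


def degree_centrality(arcs):
    """Return {node: {"in": indegree, "out": outdegree, "total": both}}.

    Each arc is a (tail, head) pair, i.e. it points from tail to head.
    """
    outdegree = Counter(tail for tail, head in arcs)
    indegree = Counter(head for tail, head in arcs)
    nodes = set(outdegree) | set(indegree)
    return {
        n: {"in": indegree[n], "out": outdegree[n],
            "total": indegree[n] + outdegree[n]}
        for n in nodes
    }


arcs = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]
centrality = degree_centrality(arcs)

print(centrality["C"])   # {'in': 3, 'out': 0, 'total': 3}
print(centrality["A"])   # {'in': 0, 'out': 2, 'total': 2}
```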

12 ...Commonly Used Graph Algorithms...
CLOSENESS CENTRALITY: Closeness considers the shortest paths between nodes and assigns a higher value to nodes that can be used to reach most other nodes most quickly. In the graph below, node A has the greatest centrality as all other nodes can be reached in one hop, whereas others require 1 hop to A or 2 hops to any other node.
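
One common way to put a number on closeness is (number of other reachable nodes) / (sum of shortest-path distances to them). The sketch below uses that formula, which is an assumption since the slide does not give one, on a hypothetical star graph with A at the centre to mirror the description above.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def closeness(adj, node):
    """(reachable nodes other than `node`) / (sum of distances to them)."""
    dist = bfs_distances(adj, node)
    total = sum(d for other, d in dist.items() if other != node)
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical star graph: A is one hop from everything else.
star = {
    "A": {"B", "C", "D", "E"},
    "B": {"A"}, "C": {"A"}, "D": {"A"}, "E": {"A"},
}
scores = {n: closeness(star, n) for n in star}

print(scores["A"])                  # 1.0
print(max(scores, key=scores.get))  # A
```

Each outer node is 1 hop from A and 2 hops from the other three, so its score is 4 / 7, lower than A's 4 / 4.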

14 ...Commonly Used Graph Algorithms...
SHORTEST PATH: The path between two nodes that visits the fewest intermediate nodes. In the graph below, A->B->C->D is shorter than A->B->C->B->D (disallowing loops)
AVERAGE PATH LENGTH: The average of all path lengths between all pairs of nodes in a graph.
TRANSITIVE CLOSURE: The process of exploring a graph by traversing relationships until all nodes have been visited, but without revisiting nodes that are joined together in loops. In the graph above, A->B->C->D is a transitive closure.
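
A minimal sketch of the last two definitions. It assumes that "path length" means shortest-path length in hops, which is the usual reading of average path length, and it uses two hypothetical graphs: an undirected chain A-B-C-D, and a directed graph in which B and C are joined in a loop.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from `source`; keys are stored in the order nodes are visited."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:          # never revisit a node, so loops terminate
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total = pairs = 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


def transitive_closure(adj, start):
    """Every node reachable from `start`, in the order it was first visited."""
    return list(bfs_distances(adj, start))


# Undirected chain A-B-C-D (each edge listed in both directions).
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
# Directed graph with a loop between B and C.
looped = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": []}

print(average_path_length(chain))       # (1+2+3+1+2+1) / 6
print(transitive_closure(looped, "A"))  # ['A', 'B', 'C', 'D']
```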

15 ...Commonly Used Graph Algorithms...
GRAPH DIAMETER (or SPAN): The greatest distance between any pair of nodes in a graph. It is computed by finding the shortest path between each pair of nodes. The maximum of these path lengths is a measure of the diameter of the graph. The diameters of the two graphs below are 2 and 5.
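
The computation described above can be sketched as a breadth-first search from every node, keeping the largest distance seen. The two graphs on the slide are not reproduced here; the star and the six-node chain below are hypothetical stand-ins that happen to have diameters 2 and 5. The sketch assumes the graph is connected.

```python
from collections import deque


def undirected(edges):
    """Build {node: set of neighbours} from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def diameter(adj):
    """Longest shortest path between any pair of nodes (connected graph assumed)."""
    return max(max(bfs_distances(adj, source).values()) for source in adj)


star = undirected([("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")])
chain = undirected([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")])

print(diameter(star))   # 2
print(diameter(chain))  # 5
```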

16 ...Commonly Used Graph Algorithms...
BETWEENNESS CENTRALITY: A centrality measure of a node within a graph. Nodes that have a high probability of being visited on a randomly chosen shortest path between two randomly chosen nodes have a high betweenness. In the graph below, node D has the highest betweenness centrality.
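
Betweenness can be computed by counting, for every pair of nodes, the share of shortest paths that pass through each other node. The sketch below uses Brandes' algorithm for unweighted graphs, which is a choice made here rather than something the slide prescribes. The graph is a hypothetical stand-in for the missing figure, built so that D sits on most shortest paths.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality for an undirected, unweighted graph."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Phase 1: BFS from s, counting shortest paths (sigma) and predecessors.
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: walk back from the farthest nodes, accumulating dependencies.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each end.
    return {v: b / 2 for v, b in score.items()}


# Hypothetical graph: A, B and C hang off D; D-E-F is a tail.
adj = {
    "A": ["D"], "B": ["D"], "C": ["D"],
    "D": ["A", "B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
scores = betweenness(adj)

print(max(scores, key=scores.get))  # D
print(scores["D"], scores["E"])     # 9.0 4.0
```

D lies on the single shortest path between 9 of the 15 node pairs, and E on 4 of them, which is where the two scores come from.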

17 GRAPH DATABASES

18 Recognizing Graphs In Object Models... Tree Structures 1-to-Many Object Class A

19 ...Recognizing Graphs In Object Models... Tree Structures 1-to-Many Relationship Data Object Class A Object Class A

20 Recognizing Graphs In Object Models... Tree Structures 1-to-Many Relationship Data Object Class A Object Class A Graph (Network) Structures Many-to-Many Object Class A

21 Recognizing Graphs In Object Models... Tree Structures 1-to-Many Relationship Data Object Class A Object Class A Graph (Network) Structures Many-to-Many Relationship Data Object Class A Object Class A

22 Why Do We Need Graph DBMSs?... Relational Database Think about the SQL query for finding all links between the two blue rows... Good luck! Table_A Table_B Table_C Table_D Table_E Table_F Table_G Relational databases aren't good at handling complex relationships!

23 ...Graph DBMSs Are Designed To Handle Relationships Relational Database Think about the SQL query for finding all links between the two blue rows... Good luck! Table_A Table_B Table_C Table_D Table_E Table_F Table_G Objectivity/DB or InfiniteGraph - The solution can be found with a few lines of code A3 G4

24 Graph Databases
Data model: Node (Vertex) and Relationship (Edge) objects
Directed
May be a hypergraph (edges with multiple endpoints)
Examples: InfiniteGraph, Neo4j, OrientDB, AllegroGraph, TitanDB and Dex
[Diagram labels: VERTEX, 2, N, EDGE]

25 Graph DBMSs Use A Very Simple Object Model Tree Structures 1-to-Many Relationship Data Object Class A Object Class A Graph (Network) Structures Many-to-Many Relationship Data GRAPH MODEL EDGE Object Class A Object Class A VERTEX

26 Basic Capabilities Of Most Graph Databases... Rapid Graph Traversal Start

27 ...Basic Capabilities Of Most Graph Databases... Rapid Graph Traversal Inclusive or Exclusive Selection Start Start X X

28 ...Basic Capabilities Of Most Graph Databases Rapid Graph Traversal Inclusive or Exclusive Selection Start Start X X Find the Shortest or All Paths Between Objects Start Finish
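
The three capabilities can be sketched over a plain adjacency map. This is a generic illustration in Python, not the InfiniteGraph API: the function names, the `exclude` parameter and the small graph are all hypothetical.

```python
from collections import deque


def traverse(adj, start, exclude=frozenset()):
    """Breadth-first traversal from `start` that never enters an excluded vertex."""
    visited = [start]
    seen = {start} | set(exclude)
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(adj[node]):   # sorted only to make the order repeatable
            if nxt not in seen:
                seen.add(nxt)
                visited.append(nxt)
                queue.append(nxt)
    return visited


def all_simple_paths(adj, start, finish, path=None):
    """Yield every path from `start` to `finish` that visits no vertex twice."""
    path = (path or []) + [start]
    if start == finish:
        yield path
        return
    for nxt in sorted(adj[start]):
        if nxt not in path:
            yield from all_simple_paths(adj, nxt, finish, path)


adj = {
    "Start": {"X", "Y"},
    "X": {"Start", "Y", "Finish"},
    "Y": {"Start", "X", "Finish"},
    "Finish": {"X", "Y"},
}

print(traverse(adj, "Start"))                 # rapid traversal of everything reachable
print(traverse(adj, "Start", exclude={"X"}))  # exclusive selection: skip vertex X
paths = list(all_simple_paths(adj, "Start", "Finish"))
print(len(paths))                             # all paths between the two objects
print(min(paths, key=len))                    # one of the shortest paths
```

Enumerating all simple paths grows quickly with graph size, so a real system would normally bound the search depth or filter on edge types; that tuning is outside this sketch.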

29 CURRENT IMPLEMENTATIONS

30 Graph Databases Pre-2003

31 Graph Databases Post-2003 X

32 Graph Databases Compared [From OrientDB]
Feature: OrientDB, Neo4j, DEX, InfiniteGraph
License: Open Source Open Source Commercial Apache and Commercial
Query languages: Not available, only via Via Java API
Transaction support: ? ACID (plus lazy during bulk ingest)
Protocols: Embedded via Java API, remote as and Embedded via Java API and remote via REST? Embedded via Java API. Tinkerpop support.
Replication: Multi-Master Master-Slave No [No]
Self loops: Yes

33 Graph Databases Compared [UNSW] DATA STORAGE FEATURES

34 Graph Databases Compared [UNSW] OPERATION & MANIPULATION FEATURES

35 Graph Databases Compared [UNSW] GRAPH DATA STRUCTURES

36 Graph Databases Compared [UNSW] SCHEMA & INSTANCE REPRESENTATION

37 Graph Databases Compared [UNSW] QUERY FEATURES

38 Graph Databases Compared [UNSW] INTEGRITY CONSTRAINTS

39 Graph Databases Compared [UNSW] SUPPORT FOR ESSENTIAL GRAPH QUERIES

40 Graph Databases Pros and Cons
Strengths:
Extremely fast for connected data
Scales out, typically
Easy to query (navigation)
Simple data model
Weaknesses:
May not support distribution or sharding
Requires conceptual shift... a different way of thinking
[Diagram labels: VERTEX, 2, N, EDGE]

41 USE CASES

42 Example 1 - Market Analysis The 10 companies that control a majority of U.S. consumer goods brands

43 Example 2 - Demographics Used in social network analysis, marketing, medical research etc.

44 Example 3 - Seed To Consumer Tracking?

45 Example 4 - Ad Placement Networks
Smartphone Ad placement - based on the user's profile and location data captured by opt-in applications.
The location data can be stored and distilled in a key-value and column store hybrid database, such as Cassandra.
The locations are matched with geospatial data to deduce user interests.
As Ad placement orders arrive, an application built on a graph database, such as InfiniteGraph, matches groups of users with Ads:
Maximizes relevance for the user.
Yields maximum value for the advertiser and the placer.

46 Example 5 - Healthcare Informatics
Problem: Physicians need better electronic records for managing patient data on a global basis and for matching symptoms, causes, treatments and interdependencies to improve diagnoses and outcomes.
Solution: Create a database capable of leveraging existing architecture using NoSQL tools such as Objectivity/DB and InfiniteGraph that can handle data capture, symptoms, diagnoses, treatments, reactions to medications, interactions and progress.
Result: It works:
Diagnosis is faster and more accurate.
The knowledge base tracks similar medical cases.
Treatment success rates have improved.

47 Example 6 - Big Data Analytics

48 Example 7 - Visual Analytics

49 Advice: The Repository Matters A Lot
NEED | RDBMS | Key-Value | Column Family | Document Database | ODBMS | Graph Database
OLTP | YES | No | Maybe | No | Maybe | No
Text Handling | No | No | No | YES | Maybe | No
Multimedia | No | Maybe | No | Maybe | YES | Maybe
Engineering/Scientific | No | No | No | No | YES | Maybe
Business Intelligence | YES | No | Maybe | No | Maybe | Maybe
Log Processing | Maybe | No | Maybe | No | YES | Maybe
Connection Handling/Analysis | No | No | No | No | Maybe | YES

50 More Advice: Languages and Tools Matter Too
NEED | Repository | Language | BI Tools | Visual Analytics
OLTP | RDBMS | SQL, Java | YES | Maybe
Text | Document Database | Java, XML | No | Maybe
Multimedia | ODBMS | Java, C++ | No | Maybe
Eng/Science | ODBMS | C, C++, R, Fortran | Maybe | YES
Business Intelligence | RDBMS | Java, SQL, R | YES | YES
Log Processing | NoSQL, ODBMS | C++, R, Java, SQL
Connection Handling/Analysis | Graph Database | Java, C++, SPARQL
Maybe Maybe YES YES

51 A Polyglot Approach May Work Best
[Diagram labels: LANGUAGE, REPOSITORY, PROBLEM, ANALYTICS, BI TOOLS, GRAPH TOOLS, VISUAL ANALYTICS]

52 Hands On With A Graph Database
We'll be using InfiniteGraph today.
You'll need a Java development environment on your machine.
If you haven't downloaded InfiniteGraph already, please go to: [
We'll be covering a HelloGraph and a more complex sample program.


More information

Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise

Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise White Paper BY DATASTAX CORPORATION October 2013 1 Table of Contents Abstract 3 Introduction 3 The Growth in Multiple

More information

Oracle BI 11g R1: Build Repositories

Oracle BI 11g R1: Build Repositories Oracle University Contact Us: 1.800.529.0165 Oracle BI 11g R1: Build Repositories Duration: 5 Days What you will learn This Oracle BI 11g R1: Build Repositories training is based on OBI EE release 11.1.1.7.

More information

CitusDB Architecture for Real-Time Big Data

CitusDB Architecture for Real-Time Big Data CitusDB Architecture for Real-Time Big Data CitusDB Highlights Empowers real-time Big Data using PostgreSQL Scales out PostgreSQL to support up to hundreds of terabytes of data Fast parallel processing

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Big Data Management in the Clouds. Alexandru Costan IRISA / INSA Rennes (KerData team)

Big Data Management in the Clouds. Alexandru Costan IRISA / INSA Rennes (KerData team) Big Data Management in the Clouds Alexandru Costan IRISA / INSA Rennes (KerData team) Cumulo NumBio 2015, Aussois, June 4, 2015 After this talk Realize the potential: Data vs. Big Data Understand why we

More information

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН Zettabytes Petabytes ABC Sharding A B C Id Fn Ln Addr 1 Fred Jones Liberty, NY 2 John Smith?????? 122+ NoSQL Database

More information

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

More information

Performance and Scalability Overview

Performance and Scalability Overview Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics Platform. Contents Pentaho Scalability and

More information

! E6893 Big Data Analytics Lecture 9:! Linked Big Data Graph Computing (I)

! E6893 Big Data Analytics Lecture 9:! Linked Big Data Graph Computing (I) ! E6893 Big Data Analytics Lecture 9:! Linked Big Data Graph Computing (I) Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and

More information

Using IBM dashdb With IBM Embeddable Reporting Service

Using IBM dashdb With IBM Embeddable Reporting Service What this tutorial is about In today's mobile age, companies have access to a wealth of data, stored in JSON format. Leading edge companies are making key decision based on that data but the challenge

More information

How To Use Big Data For Telco (For A Telco)

How To Use Big Data For Telco (For A Telco) ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call

More information

Big Data Integration: A Buyer's Guide

Big Data Integration: A Buyer's Guide SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

MongoDB Developer and Administrator Certification Course Agenda

MongoDB Developer and Administrator Certification Course Agenda MongoDB Developer and Administrator Certification Course Agenda Lesson 1: NoSQL Database Introduction What is NoSQL? Why NoSQL? Difference Between RDBMS and NoSQL Databases Benefits of NoSQL Types of NoSQL

More information

Introduction to Apache Cassandra

Introduction to Apache Cassandra Introduction to Apache Cassandra White Paper BY DATASTAX CORPORATION JULY 2013 1 Table of Contents Abstract 3 Introduction 3 Built by Necessity 3 The Architecture of Cassandra 4 Distributing and Replicating

More information

Big Data Solutions. Portal Development with MongoDB and Liferay. Solutions

Big Data Solutions. Portal Development with MongoDB and Liferay. Solutions Big Data Solutions Portal Development with MongoDB and Liferay Solutions Introduction Companies have made huge investments in Business Intelligence and analytics to better understand their clients and

More information

Data Warehousing in the Age of Big Data

Data Warehousing in the Age of Big Data Data Warehousing in the Age of Big Data Krish Krishnan AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD * PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO Morgan Kaufmann is an imprint of Elsevier

More information

Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs

Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs Kousha Etessami U. of Edinburgh, UK Kousha Etessami (U. of Edinburgh, UK) Discrete Mathematics (Chapter 6) 1 / 13 Overview Graphs and Graph

More information

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & Innovation 04-08-2011 to the EC 8 th February, Luxembourg Your Atos business Research technologists. and Innovation

More information

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here> s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline

More information

An Approach to Implement Map Reduce with NoSQL Databases

An Approach to Implement Map Reduce with NoSQL Databases www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

Study concluded that success rate for penetration from outside threats higher in corporate data centers

Study concluded that success rate for penetration from outside threats higher in corporate data centers Auditing in the cloud Ownership of data Historically, with the company Company responsible to secure data Firewall, infrastructure hardening, database security Auditing Performed on site by inspecting

More information

Katta & Hadoop. Katta - Distributed Lucene Index in Production. Stefan Groschupf Scale Unlimited, 101tec. sg{at}101tec.com

Katta & Hadoop. Katta - Distributed Lucene Index in Production. Stefan Groschupf Scale Unlimited, 101tec. sg{at}101tec.com 1 Katta & Hadoop Katta - Distributed Lucene Index in Production Stefan Groschupf Scale Unlimited, 101tec. sg{at}101tec.com foto by: belgianchocolate@flickr.com 2 Intro Business intelligence reports from

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

Euler Paths and Euler Circuits

Euler Paths and Euler Circuits Euler Paths and Euler Circuits An Euler path is a path that uses every edge of a graph exactly once. An Euler circuit is a circuit that uses every edge of a graph exactly once. An Euler path starts and

More information

Graph Databases: Neo4j

Graph Databases: Neo4j Course NDBI040: Big Data Management and NoSQL Databases Practice 05: Graph Databases: Neo4j Martin Svoboda 5. 1. 2016 Faculty of Mathematics and Physics, Charles University in Prague Outline Graph databases

More information

How To Make Data Streaming A Real Time Intelligence

How To Make Data Streaming A Real Time Intelligence REAL-TIME OPERATIONAL INTELLIGENCE Competitive advantage from unstructured, high-velocity log and machine Big Data 2 SQLstream: Our s-streaming products unlock the value of high-velocity unstructured log

More information

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager

Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data: The Datafication Of Everything Thoughts Devices Processes Thoughts Things Processes Run the Business Organize data to do something

More information

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

More information

Modern IT Operations Management. Why a New Approach is Required, and How Boundary Delivers

Modern IT Operations Management. Why a New Approach is Required, and How Boundary Delivers Modern IT Operations Management Why a New Approach is Required, and How Boundary Delivers TABLE OF CONTENTS EXECUTIVE SUMMARY 3 INTRODUCTION: CHANGING NATURE OF IT 3 WHY TRADITIONAL APPROACHES ARE FAILING

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Distributed File Systems and NoSQL Database Distributed

More information

Decision Mathematics D1 Advanced/Advanced Subsidiary. Tuesday 5 June 2007 Afternoon Time: 1 hour 30 minutes

Decision Mathematics D1 Advanced/Advanced Subsidiary. Tuesday 5 June 2007 Afternoon Time: 1 hour 30 minutes Paper Reference(s) 6689/01 Edexcel GCE Decision Mathematics D1 Advanced/Advanced Subsidiary Tuesday 5 June 2007 Afternoon Time: 1 hour 30 minutes Materials required for examination Nil Items included with

More information

Big Data Management. Big Data Management. (BDM) Autumn 2013. Povl Koch September 30, 2013 29-09-2013 1

Big Data Management. Big Data Management. (BDM) Autumn 2013. Povl Koch September 30, 2013 29-09-2013 1 Big Data Management Big Data Management (BDM) Autumn 2013 Povl Koch September 30, 2013 29-09-2013 1 Overview Today s program 1. Little more practical details about this course 2. Recap from last time 3.

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale WHITE PAPER Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale Sponsored by: IBM Carl W. Olofson December 2014 IN THIS WHITE PAPER This white paper discusses the concept

More information

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing

More information

Enterprise Operational SQL on Hadoop Trafodion Overview

Enterprise Operational SQL on Hadoop Trafodion Overview Enterprise Operational SQL on Hadoop Trafodion Overview Rohit Jain Distinguished & Chief Technologist Strategic & Emerging Technologies Enterprise Database Solutions Copyright 2012 Hewlett-Packard Development

More information

Big Data Analytics in LinkedIn. Danielle Aring & William Merritt

Big Data Analytics in LinkedIn. Danielle Aring & William Merritt Big Data Analytics in LinkedIn by Danielle Aring & William Merritt 2 Brief History of LinkedIn - Launched in 2003 by Reid Hoffman (https://ourstory.linkedin.com/) - 2005: Introduced first business lines

More information

Cloud3DView: Gamifying Data Center Management

Cloud3DView: Gamifying Data Center Management Cloud3DView: Gamifying Data Center Management Yonggang Wen Assistant Professor School of Computer Engineering Nanyang Technological University ygwen@ntu.edu.sg November 26, 2013 School of Computer Engineering

More information

Graph/Network Visualization

Graph/Network Visualization Graph/Network Visualization Data model: graph structures (relations, knowledge) and networks. Applications: Telecommunication systems, Internet and WWW, Retailers distribution networks knowledge representation

More information