A Comparison of Current Graph Database Models




Renzo Angles
Universidad de Talca (Chile)
3rd Int. Workshop on Graph Data Management: Techniques and Applications (GDM 2012)
5 April, Washington DC, USA

Outline
1. Introduction
2. Current graph databases
3. Comparison of graph database models
4. Conclusions and future work

1. Introduction

Database models
A data model is a collection of conceptual tools used to model real-world entities and the relationships among them [Silberschatz et al. 1996].
A DB model consists of 3 components [Codd 1980]:
(1) Data structure types
(2) Query operators
(3) Integrity rules

Graph database model
1. Graph data structures
   Simple graphs (nodes + edges + labels + direction)
   Generalizations: nested graphs (hypernodes), hypergraphs (hyperedges), attributed graphs
2. Graph-oriented operations
   Simple functions (e.g., shortest path)
   Graph query language (operators)
   Domain-specific queries: graph pattern mining
3. Graph integrity constraints
   Schema-instance consistency
   Identification of nodes, attributes and relations
   Path constraints
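To make the first two components concrete, here is a minimal Python sketch of an attributed, directed, labeled graph with one "simple function" (shortest path by breadth-first search). The class name `AttributedGraph`, its methods, and the sample data are hypothetical and chosen only for illustration; they do not correspond to the API of any system discussed in this talk.

```python
from collections import deque


class AttributedGraph:
    """Directed, labeled graph whose nodes and edges carry attributes."""

    def __init__(self):
        self.nodes = {}   # node id -> attribute dict
        self.out = {}     # node id -> list of (edge label, target id, attribute dict)

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs
        self.out.setdefault(node_id, [])

    def add_edge(self, source, label, target, **attrs):
        # Both endpoints must already be identified as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        self.out[source].append((label, target, attrs))

    def shortest_path(self, start, goal):
        """Return the node ids on a fewest-edges path, or None if unreachable."""
        if start not in self.nodes or goal not in self.nodes:
            return None
        previous = {start: None}          # also serves as the visited set
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if current == goal:
                path = []
                while current is not None:
                    path.append(current)
                    current = previous[current]
                return path[::-1]
            for _label, target, _attrs in self.out[current]:
                if target not in previous:
                    previous[target] = current
                    queue.append(target)
        return None


# Hypothetical sample data.
g = AttributedGraph()
g.add_node("ana", kind="person")
g.add_node("bob", kind="person")
g.add_node("talca", kind="city")
g.add_edge("ana", "knows", "bob", since=2010)
g.add_edge("bob", "lives_in", "talca")
path = g.shortest_path("ana", "talca")
```

Because edges are directed, the reverse query `g.shortest_path("talca", "ana")` finds no path in this sample.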

Graph databases (before 2003)
[Timeline figure, 1975 to 2002. Systems shown: R&M, LDM, G-Base, O2, Tompa, GOOD, GROOVI, GMOD, PaMal, GraphDB, G-Log, GRAS, GOQL, GDM, W&X, Gram, GOAL, DGV, GGL, Hypernode, Simatic-XT, Hy+, Hypernode2, Hypernode3]
Source: R. Angles and C. Gutierrez. Survey of Graph Database Models. ACM Comp. Surveys, 2008.

Graph databases (after 2003)
[Timeline figure, 2004 onward. Systems shown: AllegroGraph, DEX, Neo4j, VertexDB, Hypergraph, CloudGraph, Pregel, InfiniteGraph, Trinity, Sones, Filament, GStore, Horton]

Motivation
1. What is the most suitable graph database?
   Empirical comparison: desirable but hard
   Benchmark: there is no standard one
   The application domain is also important
2. Objective: comparison of graph data models
   Independent of implementation
   Easier to evaluate (vs. empirical evaluation)
   Shows (in advance) the expressive power for data modeling

2. Current graph databases

Current graph databases
Graph database systems
   AllegroGraph (2005): Semantic Web (SW)-oriented database
   DEX (2007): bitmap-based graph database
   Neo4j (2007): disk-based transactional graph database
   HyperGraphDB (2010): hypergraph-based database
   InfiniteGraph (2010): distributed-oriented system
   Sones (2010): object-oriented database

Current graph databases
Graph stores
   VertexDB (2009): key-value disk store (TokyoCabinet)
   Filament (2010): graph library on PostgreSQL
   G-Store (2010): prototype
   Redis_graph (2010): implemented in Python
Work in progress
   Pregel (Google, 2009): vertex-based infrastructure for graphs
   Horton (Microsoft, 2010): transactional graph processing
   CloudGraph (2010): uses MySQL as backend
   Trinity (Microsoft, 2011): RAM-based key-value store

Current graph databases
Related DB technologies
   Web-oriented DBs: InfoGrid, FlockDB
   Document-oriented DBs: OrientDB
   Triple stores (RDF DBs): 4Store, Virtuoso, Bigdata
   Distributed graph processing: Angrapa, Apache Hama, Giraph, GoldenOrb, Phoebus, KDT, Signal Collect, HipG

3. Comparison of graph database models

Data storing features
Backend: RDB (Filament), BerkeleyDB (HyperGraphDB), TokyoCabinet (VertexDB)
Indexing: nodes, relations, attributes, triples
Data formats: GraphML, graphviz, N-Triples, RDF-XML
ACID (partial support): AllegroGraph, HyperGraphDB, InfiniteGraph, Neo4j

Operation and manipulation features
Languages: SPARQL 1.0/1.1, GraphQL (Sones)
Why is an API the main approach? Does it facilitate the development of applications?

Graph data structures
Node/edge attribution: implementation considerations

Schema and instance representation
Node/edge identification: object IDs vs. values
Support for complex relations (hyperedges = n-ary relations)
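The two points above can be illustrated with a small Python sketch. It assumes one possible representation: nodes identified by a system-assigned object ID (so two nodes may hold the same value), and a hyperedge stored as a label plus named roles, which makes it an n-ary relation. The names `Node`, `hyperedge`, and the sample data are hypothetical.

```python
import itertools

_next_oid = itertools.count(1)   # source of system-assigned object IDs


class Node:
    """Node identified by an object ID; its value is ordinary data."""

    def __init__(self, value):
        self.oid = next(_next_oid)
        self.value = value


def hyperedge(label, **roles):
    """Build an n-ary relation: a label plus any number of named participants."""
    return {"label": label, "roles": {role: node.oid for role, node in roles.items()}}


# Two distinct people who happen to share a value.
a = Node("Smith")
b = Node("Smith")
course = Node("Databases")
room = Node("A-101")

# A ternary relation that a single binary edge could not express directly.
teaches = hyperedge("teaches", teacher=a, course=course, room=room)
```

With value-based identification, `a` and `b` would collapse into one node; with object IDs they stay distinct even though `a.value == b.value`.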

Query features
Declarative QL: SPARQL, Prolog, Lisp, GraphQL
Reasoning: RDF(S), OWL
Analysis: Social Networking, graph statistics
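As a rough illustration of what "declarative" means here, the Python sketch below matches a single triple pattern against a set of edges: the caller states what to find, with variables marked by a leading "?", rather than coding a traversal through an API. This is a simplified stand-in written for this transcript, not SPARQL and not the query engine of any listed system; the function name `match` and the data are hypothetical.

```python
def match(triples, pattern):
    """Return one variable binding per triple that fits the pattern.

    Pattern terms starting with "?" are variables; other terms must match exactly.
    A variable that appears twice must bind to the same value both times.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break          # repeated variable bound to a different value
            elif term != value:
                break              # constant term does not match
        else:
            results.append(binding)
    return results


# Hypothetical sample data: (source, label, target) edges.
triples = [
    ("ana", "knows", "bob"),
    ("bob", "knows", "carla"),
    ("bob", "lives_in", "talca"),
]
friends_of_bob = match(triples, ("bob", "knows", "?who"))
```

A full graph query language would combine several such patterns with joins, paths, and filters; this sketch covers only the single-pattern case.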

Integrity constraints
Most systems are oriented toward being schema-less
Graph integrity constraints: of theoretical interest
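For readers unfamiliar with schema-instance consistency, the following Python sketch shows one way such a constraint could be checked, assuming the schema is given as a set of allowed (source type, edge label, target type) signatures. The function `schema_violations` and the sample schema are hypothetical; the systems surveyed here mostly do not enforce constraints of this kind.

```python
def schema_violations(schema, node_types, edges):
    """Return the edges whose (source type, label, target type) is not in the schema."""
    violations = []
    for source, label, target in edges:
        signature = (node_types.get(source), label, node_types.get(target))
        if signature not in schema:
            violations.append((source, label, target))
    return violations


# Hypothetical schema and instance.
schema = {("Person", "knows", "Person"), ("Person", "lives_in", "City")}
node_types = {"ana": "Person", "bob": "Person", "talca": "City"}
edges = [
    ("ana", "knows", "bob"),
    ("bob", "lives_in", "talca"),
    ("talca", "knows", "ana"),   # a City is not allowed as the source of "knows"
]
violations = schema_violations(schema, node_types, edges)
```

An edge that mentions a node with no declared type is also reported, because its signature contains `None` and cannot appear in the schema.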

Support for essential graph queries

4. Conclusions and future work

Conclusions
General balance of graph database models
Data structures:
   Several types of graph structures
   Good expressive power for data modeling
Query features:
   Restricted to providing APIs
   Good support for essential graph queries
   Lack of graph query languages
Integrity constraints:
   Basic notions of restrictions
   Oriented toward being schema-less

Future work
Empirical evaluation of graph databases
Development of a benchmark for GDBs
Comparison with other database technologies (in particular RDF databases)

Thanks for your attention! Questions?