Graph Databases Mean Business




Andreas Kollegger & Rik Van Bruggen
September 2012
© 2012 Neo Technology
http://neotechnology.com

Table of Contents

Graph Databases Mean Business
The Big Data Business
Understanding Graph Databases
Where are people using Graph Databases?
But where is the Business Value?
Conclusion

Graph Databases Mean Business

In this paper, we would like to explore some of the concepts, trends and technologies that surround us today in the age of Big Data. We will focus on how Graph Databases fit into these, and on the true business benefits of using graph database technologies: not developing and implementing technology for technology's sake, but enabling true, valuable business applications.

The Big Data Business

Big Data is the trend: the unrelenting, ever-increasing, warehouse-flooding stream of data from every fleeting tweet, instant photo and look-at-me video. It's not just user content, of course. There's data about the data, real-time streams of data, and privately held business and personal data, all interwoven in an incredible web of information.

NOSQL is the answer. Rising from practical need, engineers put aside the traditional databases that were faltering, declaring that there is Not Only SQL. Amazon needed a better shopping cart, so designed Dynamo. Facebook needed to manage your life inbox, so created Cassandra. Google needed to store, well, the Internet, so they designed Big Table. Graph databases like Neo4j arose the same way, to relate and track the flow of documents in a comprehensive content management system. From there, Neo4j evolved into a more general approach to managing all kinds of semi-structured, highly interconnected data sets: to managing Graphs.

As you can see, each of these solutions has broader application. In Big Data based business environments, data tends to get more and more interconnected, relations between datasets become more complex, and graph-based solutions become a great answer to managing this data environment's complexity.

Without wanting to discount the other NOSQL answers to the Big Data environment (Key-Value stores, Column-Family stores, and Document Databases), we would like to explore the concept of a Graph Database now, to give us some context and understanding before we start exploring why they mean business.

Understanding Graph Databases

The most general-purpose of the NOSQL data models, a Graph Database can store any kind of data. Take the trees in a Document Database and connect them to each other to share common information (Products have Reviews, written by People), and you've got a graph. The preceding NOSQL models are special, specific cases of a graph.

Graph Database Model

The word graph has very context-sensitive meanings. It could refer to bar charts, or scatter plots, or even vector artwork. But when talking about the database category, think about drawing on a whiteboard with circles and lines, sketching out example data for your application; you've made a graph, so save it just as it is.

A Graph is just data that has a defined structure. Classic computer science has many examples: lists, trees, maps, objects. All of these data structures are special cases of a Graph. A Graph is the general data structure for storing related data. The records in a Graph database are called Nodes, which are connected through Relationships that always have a direction. The traditional visual representation uses circles for Nodes and lines for Relationships. Something like this:

[Figure: two Nodes, one with name: Andreas and one with name: Emil, connected by a Relationship with since: 2009]

Within a database like Neo4j, both the Nodes and the Relationships can store data as Properties, which are simple key-value pairs. In the example, both Nodes have a name Property, and the Relationship has a since Property. In addition, the Relationship is given a Type (in all-caps, by convention) which indicates how the Nodes are related.

There is great power in this simplicity. For example, reviews can be attached to a product and to the author, who in turn can be related to other authors, so you can easily navigate back and forth through the data to meet new people, find similar interests, and discover related products that your friends also purchased. This navigation is how you query a Graph Database.

Querying a Graph Database

The real fun with a Graph starts when you query the database. A Traversal is a graph query which specifies the Paths to follow from a starting point. Neo4j's Cypher query language uses pattern-matching to describe a Traversal. In this next example, we have a tiny social network. To recommend a new friend for Andreas we could ask the question: who are the friends of Andreas' friends? Looking at the Graph, the answer is intuitive: just follow along the Relationships from Andreas to his friends, then to their friends, and return those Nodes, finding Sarah.

[Figure: a tiny social network of Nodes with name: Andreas, name: Peter, name: Max and name: Sarah]

The pattern match for that question looks like the shorthand you might write in an email discussing the problem. We want to find:

    (andreas)-->(friends)-->(friends_of_friends)

That ASCII art is exactly what you'd use in the MATCH clause of a Cypher query:

    START andreas=node:people(name = "Andreas")
    MATCH (andreas)-->(friends)-->(friends_of_friends)
    RETURN friends_of_friends

The START clause specifies where the traversal should begin, here using an Index called people that is conveniently available. Then the RETURN clause specifies the result set, as you'd expect.
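To make the two-hop pattern concrete outside of Neo4j, here is a minimal in-memory sketch in Python. It is an illustration only, not how Neo4j stores or traverses data. The class and function names are hypothetical, and so are the FRIEND Relationship Type and the exact edges between the four people, since the figure does not survive in this copy; the edges are assumed so that the traversal finds Sarah, as the text describes.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A record in the graph; Properties are simple key-value pairs."""
    properties: dict = field(default_factory=dict)


@dataclass(eq=False)
class Relationship:
    """A directed, typed connection between two Nodes, with its own Properties."""
    start: Node
    end: Node
    type: str
    properties: dict = field(default_factory=dict)


class Graph:
    """Hypothetical toy graph: just a list of Relationships."""

    def __init__(self):
        self.relationships = []

    def relate(self, start, end, type, **properties):
        rel = Relationship(start, end, type, properties)
        self.relationships.append(rel)
        return rel

    def outgoing(self, node):
        """Nodes reached by following one outgoing Relationship from `node`."""
        return [r.end for r in self.relationships if r.start is node]


def friends_of_friends(graph, person):
    """Mirror (person)-->(friends)-->(friends_of_friends): one result per path."""
    return [fof
            for friend in graph.outgoing(person)
            for fof in graph.outgoing(friend)]


# Assumed example data; the edges are a guess consistent with the text.
andreas = Node({"name": "Andreas"})
peter = Node({"name": "Peter"})
max_ = Node({"name": "Max"})
sarah = Node({"name": "Sarah"})

graph = Graph()
graph.relate(andreas, peter, "FRIEND")
graph.relate(andreas, max_, "FRIEND")
graph.relate(peter, sarah, "FRIEND")

print([n.properties["name"] for n in friends_of_friends(graph, andreas)])
```

The nested loop plays the role of the MATCH pattern: the outer step follows Relationships from the starting Node, and the inner step follows them once more. A real graph database would start from an index lookup, as the START clause does, instead of holding a Python reference to the starting Node.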

Similar to SQL, with some Graph-specific meaning, Cypher is comfortably familiar and sensible. You'll find the usual features: a WHERE clause for constraints, aggregation, ordering and more. For full details about Cypher, read the online Neo4j Manual.

Where are people using Graph Databases?

Of course, Neo4j is a natural choice as the default storage for any application. The Property Graph data model (Nodes and Relationships with Properties) can store and query data in the same form as it appears when you draw it on a whiteboard. No contortions or compromises required. Still, it is helpful to start with well-understood use cases as you learn the best practices for working with a graph.

Social Networking

The social network, or social graph, is the most intuitive use case for a Graph Database. The expanding relationships of friends, family, neighbors, and colleagues map neatly into a graph, where you can realize trust networks, find new friends, or understand how you're connected to someone.

Recommendations

Someone buys a book on bread baking. Could you guess what else might be a good recommendation for them to consider? Immediately, your mind may leap to baking supplies, other books by that author, or seemingly unrelated products, like a garden hose, that other users who bought that book typically end up buying. A graph can capture all the direct and nuanced relationships in any domain to make more meaningful recommendations.

Access Control

Consider the case of a cable TV company: millions of customers across different regions, with different promotions, agreements, and memberships. A customer adds a sports package through the web, then flips to the game in progress on their TV. What happens? Hopefully, the company has that complex, interleaved access control stored in a graph, so a simple query can resolve in real time whether that channel is available.

Network Management

Imagine hosting thousands of applications, each proxied by a load balancer, spread across multiple machines, connected to routers, which are connected to other routers, all with varying power requirements. To guarantee quality service, you need a database that can make sense of the complex interconnections and dependencies, coming up with immediate answers when one piece fails so you know how to keep everything running.

Spatial

The Traveling Salesman problem, the Seven Bridges of Königsberg, and getting the best flight on Priceline are all examples of common graph problems which are framed in terms of geography and location. With Neo4j Spatial, it is possible to import OpenStreetMap data and attach it to your own data, easily getting the benefit of a full GIS stack, or simple location-aware operations.

Bio-Tech

"Graphs are everywhere" has become something of a tagline for many graph database professionals, and nowhere is this more apparent than in the Bio-Tech industry. Whether you are trying to analyse protein structures, genome data, pharmaceutical dependencies, or patent implications of the latest research efforts, network-oriented data structures are almost everywhere in Bio-Tech. Graph Databases lend themselves extremely well to modelling these networks, and to extracting valuable information from them that could lead to the next life-saving drug, crop or pharma process.

Master Data Management

Organizations are rarely a neat hierarchical tree of relationships. Cross-departmental relationships, overlapping sales territories, suppliers, services, contractors, and more contribute to the densely connected machinery of a modern business. A Graph Database provides the flexibility to accommodate the complexity, and a language for keeping everyone on the same page.
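As one concrete illustration of the use cases above, the Recommendations example (the bread baking book and the garden hose) can be sketched as a two-hop walk over purchase relationships: from a product to the customers who bought it, then to whatever else those customers bought. This is a minimal Python sketch under stated assumptions, not a production recommender and not Neo4j code. The customer names, the purchase data and the `recommend` function are all hypothetical.

```python
from collections import Counter

# Hypothetical purchase data: each pair is one (customer)-[BOUGHT]->(product) edge.
purchases = [
    ("alice", "bread baking book"),
    ("alice", "garden hose"),
    ("bob", "bread baking book"),
    ("bob", "garden hose"),
    ("bob", "flour"),
    ("carol", "bread baking book"),
    ("carol", "garden hose"),
    ("dave", "garden hose"),
    ("dave", "lawn seed"),
]


def recommend(purchases, product, limit=3):
    """Rank other products by how many buyers of `product` also bought them."""
    bought_by = {}  # product -> set of customers who bought it
    basket = {}     # customer -> set of products they bought
    for customer, item in purchases:
        bought_by.setdefault(item, set()).add(customer)
        basket.setdefault(customer, set()).add(item)

    counts = Counter()
    # Hop 1: product -> its buyers. Hop 2: each buyer -> their other products.
    for customer in bought_by.get(product, ()):
        for other in basket[customer]:
            if other != product:
                counts[other] += 1
    return counts.most_common(limit)


print(recommend(purchases, "bread baking book"))
```

With this made-up data the garden hose ranks first, because all three buyers of the book also bought one, while lawn seed never appears because its only buyer did not buy the book. In a graph database the same question would be a traversal over stored relationships, with no need to rebuild lookup tables on each call.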

In General

A Graph Database is a good choice whenever you have related data. These few examples are obvious, highly connected data problems. Yet every relational schema in the world contains at least a few join tables, each one an indication that the data isn't just related, but has a relationship. There is great power and flexibility in realizing the value in relationships.

But where is the Business Value?

All of the above are very interesting, inspiring use cases for graph databases, but you are probably thinking by now: how does that apply to me, and why is it relevant? Where is the Value, with a capital V, in solving these problems with a Graph, and not with one of the other models that we mentioned earlier in this paper, SQL or NOSQL?

Ability

The nature of a graph database, and its capability to store and query very densely interconnected datasets, is something that we are not used to. Relational models have always tended to be anti-relational, in the sense that the more relations you wanted to use and navigate, the costlier it would be, both at design time and at run/query time. Joins have always been terribly expensive, and nested joins (where one would want to explore multiple relationships within one query execution) have always been near impossible. Ask any relational database, no matter how much hardware you throw at it, to run a 10-fold nested join query, and it will almost certainly come back begging you for mercy. Or not come back with an answer at all. Graph databases are very different in that way: they are capable of dealing with as many relationships as you might want to throw at them, with negligible performance degradation for every relationship that you add. This simple and pure ability to deal with densely interconnected datasets enables new business applications. Recommendation patterns that we would previously never see can now generate new business transactions. Fraud constructions that would previously go undetected now jump out at us, saving valuable time, effort, and cash along the way.

Performance

In the same way that graph databases often enable certain capabilities for specific use cases as described above, they just as often offer a very significant performance improvement over existing technologies. A classic example is the case where precomputed, periodically fed data warehouses, which provide business managers with information that is *by definition* outdated, can be replaced with near-real-time reporting systems. Data warehouses are often a necessary evil for many organisations, which have to put them in place in order to get acceptable performance from their (often relationally oriented) databases. Imagine what the business manager of a retail shop in full holiday shopping madness would say if we could guarantee that his or her sales performance figures were near real time. Being able to react fast is of the essence in many businesses, and Graph Databases enable this.

Cost

Imagine what that same business manager would say if we could tell them that the improved performance would actually come at a *lower* cost. Graph databases allow for cheaper IT systems, because they
- enable shorter project delivery cycles (see the Flexibility point below)
- avoid the need to put data warehousing band-aids on the broken systems that cannot deal with the densely connected datasets of today's business environment.

Flexibility

We all know the cliché: the only thing constant is change. And we have all lived its reality: in today's fast-moving business world, we have to be able to act quickly, react to changing circumstances, and be flexible in the way that we react, so that we can learn, adapt and improve our performance.
In the IT world, we have adopted things like Agile development methodologies for a reason: we understand that in the fast-paced business environment, our IT systems have to be equally fast-paced and able to adapt. IT systems can no longer feel like a straitjacket; they have to allow our business processes to evolve and adapt, in a flexible way. Graph databases, because of a) their schemaless nature, and b) their natural fit with the realities that we see around us, allow for that flexibility. Developers don't have to spend valuable time trying to model what they don't know, can grow their data models as their understanding of the problem domain grows, and can flexibly work with their data structures in many different situations. This in itself is a huge added value to any business manager who is conscious of the constant change around them.

Conclusion

A Graph Database like Neo4j is a NOSQL solution for making sense of the incredible connections found in all data, big and small. Graphs are everywhere. When they are stored in a mature, proven database like Neo4j, you can comfortably navigate the complexity and reap real, tangible rewards, both from an IT perspective and from a more down-to-earth, bottom-line-affecting business perspective.

Should you ever want to discuss some of these business applications for Graph Databases, please do not hesitate to visit www.neotechnology.com or www.neo4j.org, or email us at info@neotechnology.com.