Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores
- Merryl Hunt
Composite Software, October 2010
TABLE OF CONTENTS
- INTRODUCTION
- BUSINESS AND IT DRIVERS
- NOSQL DATA STORES LANDSCAPE
  - TABULAR / COLUMNAR DATA STORES
  - DOCUMENT STORES
  - GRAPH DATABASES
  - KEY/VALUE STORES
  - OBJECT AND MULTI-VALUE DATABASES
  - MISCELLANEOUS NOSQL SOURCES
- INTEGRATING NOSQL DATA STORES USING DATA VIRTUALIZATION
  - TABULAR/COLUMNAR DATA STORES
  - XML DOCUMENT STORES
  - KEY/VALUE STORES
- SUMMARY
INTRODUCTION

There is a trend in the data storage and management arena to consider data storage options beyond the traditional SQL-based relational database. The overall movement began in 2009 and was known as NoSQL (meaning "no SQL"), but that label has since evolved into NOSQL (meaning "not only SQL"). Unfortunately, both of these labels say more about what this class of data store isn't than what it is, and this is the source of ongoing confusion.

The general definition of a NOSQL data store is that it manages data that is not strictly tabular and relational, so it does not make sense to use SQL to create and retrieve the data. More specifically, NOSQL data stores are usually non-relational, distributed, open-source, and horizontally scalable, although individual NOSQL data stores are exceptions to each of these.

While NOSQL access standards have yet to fully develop, each implementation provides some sort of Java-based development API appropriate for accessing that type of NOSQL data. The Composite Data Virtualization Platform typically uses these APIs to access and integrate NOSQL data, and three kinds of NOSQL data sources are a natural fit for integration. This paper describes the primary NOSQL data sources in the market today and how to integrate them with other sources using the Composite Data Virtualization Platform.
BUSINESS AND IT DRIVERS

The main driver for the creation of NOSQL data stores was the emergence of web-scale data, i.e., the massive amounts of data at large web sites and services such as Amazon, Google, Yahoo!, and Facebook. A number of NOSQL data stores emerged from custom engineering work done at these large companies. Recently, predictive analytics, voice-of-the-customer, churn, fraud, and other big data use cases have emerged to further accelerate demand. Storing and processing this data revealed several specific motivations for these new data stores, including:

- Cost per Terabyte: Many of the NOSQL data sources were invented to handle web-scale data that is created in enormous volumes (e.g., web site click streams), and storing this much data in a traditional relational database would be expensive and inefficient. Many of the NOSQL data sources are open source and run on commodity hardware, making them considerably less expensive per terabyte than traditional databases from vendors like Oracle and Teradata.
- Distributed Processing: Web-scale data is so large that the traditional database approach to storage, indexing, and retrieval does not work very well with this class of data. NOSQL data sources introduce storage architectures that scale horizontally and parallel algorithms designed to process the distributed data efficiently (map-reduce being the most prominent example).
- Data Shape Appropriateness: Many successful web-based services have introduced data that is not efficiently represented as relational, motivating new data structures more appropriate to the data. For example, social media web sites employ graph databases to represent the social relationships inherent in these services.
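The map-reduce pattern named under Distributed Processing can be illustrated with a minimal single-process sketch. The click-stream records, page paths, and function names below are invented for illustration. A real deployment would run the map and reduce phases in parallel across many machines, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical click-stream records: (user, page) pairs.
CLICKS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/cart"),
    ("carol", "/home"),
    ("bob", "/cart"),
    ("alice", "/home"),
]


def map_phase(record):
    """Emit one (key, value) pair per click: the page and a count of 1."""
    _user, page = record
    yield page, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def page_counts(records):
    """Run map, shuffle, and reduce in sequence and return page -> clicks."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(page_counts(CLICKS))
```

Because each map call looks at one record and each reduce call looks at one key, both phases can in principle be spread over many nodes, which is what makes the pattern attractive for horizontally scaled storage.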
NOSQL DATA STORES LANDSCAPE

Although the original emergence of NOSQL data stores was motivated by web-scale data, the movement has grown to encompass a wide variety of data stores that just happen not to use SQL as their processing language (making it difficult to characterize exactly what a NOSQL data store is). There is no general agreement on the taxonomy of NOSQL data stores, but the categories below capture much of the landscape.

Tabular / Columnar Data Stores

Storing sparse tabular data, these stores look most like traditional tabular databases. Examples include Hadoop/HBase (Yahoo!), BigTable (Google), Hypertable, and VoltDB. Their primary data retrieval paradigm uses column filters, generally leveraging hand-coded map-reduce algorithms.

Document Stores

These NOSQL data sources store unstructured (e.g., text) or semi-structured (e.g., XML) documents. Examples include MongoDB, MarkLogic, and CouchDB. Their data retrieval paradigms vary widely, but documents can always be retrieved by a unique handle. XML data sources leverage XQuery. Text documents are indexed, facilitating keyword-search-style retrieval.

Graph Databases

These NOSQL sources store graph-oriented data with nodes, edges, and properties, and are commonly used to store associations in social networks. Examples include Neo4j, AllegroGraph, and FlockDB. Data retrieval focuses on retrieving associations from a particular node.

Key/Value Stores

These sources store simple key/value pairs, like a traditional hash table. They are further subdivided into in-memory and disk-based solutions. This category of NOSQL systems probably has the largest number of members, each embodying slightly different characteristics. Examples include Memcached, Cassandra (Facebook), SimpleDB, Dynamo (Amazon), Voldemort (LinkedIn), and Kyoto Cabinet. Their data retrieval paradigm is simple: given a key, return the value. Some offer more complex querying mechanisms that can look inside the value, but normally the value is considered opaque.

Object and Multi-value Databases

These types of stores preceded the NOSQL movement, but they have found new life as part of it. Object databases store objects (as in object-oriented programming). Multi-value databases store tabular data, but individual cells can store multiple values. Examples include Objectivity, GemStone, and UniData. Proprietary query languages are used to retrieve data.

Miscellaneous NOSQL Sources

Several other data stores can be classified as NOSQL stores, but they don't fit into any of the categories above. Examples include GT.M, IBM Lotus/Domino, and the ISIS family.
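The retrieval paradigms described for the categories above differ mainly in what the store lets a caller ask for. The sketch below uses plain in-memory Python structures as stand-ins. It does not use the API of any product named above, and every name and record in it is made up.

```python
# Key/value store: the value is opaque; the only question is
# "what is stored at this key?"
kv_store = {"session:42": b"\x00\x01opaque-bytes"}


def kv_get(key):
    """Return the value for `key`, or None if the key is absent."""
    return kv_store.get(key)


# Document store: documents are retrieved whole by a unique handle.
doc_store = {"doc-1": {"title": "Q3 report", "tags": ["finance", "2010"]}}


def doc_get(handle):
    """Return the document stored under `handle`."""
    return doc_store[handle]


# Graph store: nodes plus labelled edges; retrieval starts from a node.
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("alice", "FOLLOWS", "carol"),
    ("bob", "FOLLOWS", "alice"),
]


def neighbours(node, label):
    """Return the nodes reachable from `node` over edges with `label`."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]


# Sparse tabular store: each row holds only the columns it actually has.
sparse_table = {
    "row1": {"name": "Ann", "city": "Oslo"},
    "row2": {"name": "Raj"},  # no "city" column stored for this row
}


def column_filter(column, value):
    """Return the row keys whose `column` equals `value`."""
    return [k for k, cols in sparse_table.items() if cols.get(column) == value]
```

Only the last of these shapes maps directly onto rows and columns, which is one way to see why the tabular stores are the easiest to present through a SQL-oriented layer.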
INTEGRATING NOSQL DATA STORES USING DATA VIRTUALIZATION

The Composite Data Virtualization Platform provides a complete development and runtime environment for discovering, accessing, federating, abstracting, and delivering data from diverse sources. Access is typically done via standards-based protocols and APIs: for example, JDBC and ODBC for SQL-based sources, HTTP and SOAP for Web services, JMS for messages, and APIs for enterprise and cloud-based applications. Through these methods, source data is securely exposed from a single virtual location, regardless of how and where it is physically stored.

While NOSQL access standards have yet to fully develop, each implementation provides some sort of Java-based development API appropriate for accessing that type of NOSQL data. The Composite Data Virtualization Platform uses these APIs, as well as Composite's Custom Java Procedure (CJP) resource, to access and integrate NOSQL data. Three kinds of NOSQL systems are a particularly natural fit for this integration approach: Tabular/Columnar Data Stores, XML Document Stores, and Key/Value Stores. A more detailed integration approach for each of these is outlined below. Over time, as NOSQL leaders emerge and usage patterns solidify, Composite may elect to provide more in-depth integrations with particular NOSQL data stores through the creation of fully supported adapters.

Tabular/Columnar Data Stores

Because the original implementation of the Composite Data Virtualization Platform integrated tabular data, retrieving and processing data from this category of NOSQL data store is an easy fit. This approach leverages Composite's ability to incorporate table functions in the FROM clause of a SQL statement. That is, any Composite procedure resource that returns a cursor can be dropped into the View editor as a table, where it will show up in the FROM clause of the SQL statement.
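The idea of a procedure that returns a cursor and is then treated as a table can be sketched as follows. This is only an illustration in Python: an actual CJP would be written in Java against Composite's own interfaces, which are not reproduced here. The store contents, the `clicks_table` function, and its `user` argument are all hypothetical.

```python
from typing import Dict, Iterator, Optional, Tuple

# Stand-in for a sparse tabular NOSQL store: row key -> {column: value}.
# A real procedure would call the store's own client API instead.
CLICK_STORE: Dict[str, Dict[str, str]] = {
    "r1": {"user": "alice", "page": "/home", "country": "US"},
    "r2": {"user": "bob", "page": "/cart"},
    "r3": {"user": "alice", "page": "/cart", "country": "US"},
}

Row = Tuple[Optional[str], Optional[str], Optional[str]]


def clicks_table(user: Optional[str] = None) -> Iterator[Row]:
    """Hypothetical table function yielding (user, page, country) rows.

    The generator plays the role of the cursor. The optional `user`
    argument is applied while scanning, so rows that do not match are
    never handed to the caller.
    """
    for cols in CLICK_STORE.values():
        if user is not None and cols.get("user") != user:
            continue
        # Columns missing from a sparse row surface as None (SQL NULL).
        yield cols.get("user"), cols.get("page"), cols.get("country")


def pages_for(user: str):
    """Roughly: SELECT page FROM clicks_table(:user) ORDER BY page."""
    return sorted(p for _u, p, _c in clicks_table(user) if p is not None)
```

A caller that consumes `clicks_table` sees fixed columns in a fixed order, which is the property that lets such a procedure stand in for a table in a query.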
For a specific NOSQL data store, a collection of CJP table functions can be implemented that leverage the NOSQL system's Java API. Each CJP would provide access to a different table in the underlying NOSQL data store. The CJPs can take input arguments to filter the data from the table, further leveraging the NOSQL system's processing capability. The values of the filters can even be specified at run time from a client query by leveraging the virtual column capability of Views.

It is worth remembering that these tabular/columnar NOSQL data sources store very large data sets, so caution must be used on large queries. The table function implementation should ensure sufficient data reduction in the target data source by leveraging input parameters. Also, the processing of requests to these data sources can take a very long time (more like batch jobs than live queries), so employing some form of caching would probably be prudent.

This approach provides full access to the data in the underlying NOSQL system, and it will likely meet most near-term needs. There are, however, some disadvantages and inefficiencies in this approach. For example, all the columns specified in the CJP's cursor would always be retrieved, even if they weren't all necessary for the current query. Also, more generic filtering and aggregation might be possible with the underlying system, but the CJP provides only a limited interface to expose that capability to Composite. If a particular NOSQL tabular data store becomes quite popular, it would be an ideal candidate for Composite to develop a custom adapter that would fully integrate and leverage that specific data source's capabilities.

XML Document Stores

Because XML document stores use XQuery as their preferred data retrieval paradigm, the Composite Data Virtualization Platform leverages its embedded XQuery engines and native XML data type to retrieve and further process documents from this category of NOSQL data store.

For a specific NOSQL XML document store with a Java API, at least two CJPs are required. Both CJPs return an XML document that can be further manipulated by any of the upstream XML manipulation functionality (e.g., XSLT transformations). The first CJP would take a document handle (unique identifier) as its only input argument, and then leverage the API to retrieve and return that document. The second CJP would take an XQuery specification as its only input argument, and then leverage the API to execute the query and return the results as a single document. Of course, additional CJPs accepting more specific parameters could also be implemented, facilitating easier integration into multiple views.

This approach provides full access to the data in the underlying XML data source, and it will likely be sufficient for most needs.

Key/Value Stores

The Composite Data Virtualization Platform can integrate key/value stores in two ways. The first is through a custom SQL function. That is, a function can be created that takes the key as a parameter and returns the value. This function can then be used in multiple SQL statements throughout Composite.

In the second, Composite leverages the in-memory key/value store as a cache target. This is the primary use case typically described by our enterprise customers. This approach is best for small data sets or procedure results, but it doesn't work as well for large tabular data sets.
Further, this form of cache integration is often challenged by the impedance mismatch between cached tabular data and cached key/value data: the cached data is opaque inside the key/value store, so the entire set must be retrieved for processing. This form of integration is available today from our professional services organization.
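The first key/value approach, a SQL function that takes a key and returns the value, can be sketched with SQLite through Python's standard `sqlite3` module as a stand-in SQL engine. This is not Composite's function mechanism. The `kv_lookup` name, the `orders` table, and the store contents are assumptions made for the example.

```python
import sqlite3

# Stand-in for an external key/value store; values are treated as opaque.
KV_STORE = {"cust:1": "gold", "cust:2": "silver"}


def kv_lookup(key):
    """Given a key, return the value (None, i.e. SQL NULL, if absent)."""
    return KV_STORE.get(key)


def build_connection():
    """Create an in-memory database with the lookup function registered."""
    conn = sqlite3.connect(":memory:")
    # Register the Python callable as a one-argument SQL function.
    conn.create_function("kv_lookup", 1, kv_lookup)
    conn.execute("CREATE TABLE orders (id INTEGER, cust_key TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(10, "cust:1"), (11, "cust:2"), (12, "cust:9")],
    )
    return conn


def orders_with_tier(conn):
    """Use the function inside an ordinary SELECT, one lookup per row."""
    return conn.execute(
        "SELECT id, kv_lookup(cust_key) FROM orders ORDER BY id"
    ).fetchall()
```

Once registered, the same function can appear in any number of statements, in select lists or in WHERE clauses. Each row triggers its own lookup, so the pattern suits small result sets better than large scans.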
SUMMARY

NOSQL data stores are proliferating as a means of supporting web-scale data. Recently, predictive analytics, voice-of-the-customer, churn, fraud, and other big data use cases have emerged to further accelerate demand. There is a wide variety of NOSQL systems, each with its own set of use cases and advantages. Each NOSQL data store has a unique, non-standard API that can be used to access and integrate it.

The Composite Data Virtualization Platform is well suited for integrating data from these NOSQL sources with other data within and outside the enterprise. This paper describes integrations for three flavors of NOSQL data stores: Tabular/Columnar Data Stores, XML Document Stores, and In-Memory Key/Value Stores. Today, Composite can provide basic access to data from any of these NOSQL data stores with minimal programming, using standard resources. In the longer term, when leaders in particular areas of the NOSQL landscape emerge, Composite may provide deeper integrations through standard product adapters within the Composite Application Data Services product line.
ABOUT COMPOSITE SOFTWARE

Composite Software, Inc. is the data virtualization gold standard at ten of the top 20 banks, six of the top ten pharmaceutical companies, four of the top five energy firms, major media and technology organizations, and multiple government agencies. These are among the hundreds of global organizations with disparate, complex information environments that count on Composite to increase their data agility, cut costs, and reduce risk. Backed by nearly a decade of pioneering R&D, Composite is the data virtualization performance leader, scaling from project to enterprise for data federation, data warehouse extension, enterprise data sharing, and real-time and cloud computing data integration. Founded in 2002, Composite Software is a privately held, venture-funded corporation based in Silicon Valley. For more information, please visit
More informationPerformance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics platform. PENTAHO PERFORMANCE ENGINEERING
More informationBig Data Architectures. Tom Cahill, Vice President Worldwide Channels, Jaspersoft
Big Data Architectures Tom Cahill, Vice President Worldwide Channels, Jaspersoft Jaspersoft + Big Data = Fast Insights Success in the Big Data era is more than about size. It s about getting insight from
More informationwww.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage
www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization
More informationThe evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect
The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect IT Insight podcast This podcast belongs to the IT Insight series You can subscribe to the podcast through
More informationBig Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
More informationBig Data With Hadoop
With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
More informationA REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information
More informationHadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
More informationUsing Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM
Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that
More informationTable of Contents. Développement logiciel pour le Cloud (TLC) Table of Contents. 5. NoSQL data models. Guillaume Pierre
Table of Contents Développement logiciel pour le Cloud (TLC) 5. NoSQL data models Guillaume Pierre Université de Rennes 1 Fall 2012 http://www.globule.org/~gpierre/ Développement logiciel pour le Cloud
More informationHow To Improve Performance In A Database
Some issues on Conceptual Modeling and NoSQL/Big Data Tok Wang Ling National University of Singapore 1 Database Models File system - field, record, fixed length record Hierarchical Model (IMS) - fixed
More informationBIG DATA: STORAGE, ANALYSIS AND IMPACT GEDIMINAS ŽYLIUS
BIG DATA: STORAGE, ANALYSIS AND IMPACT GEDIMINAS ŽYLIUS WHAT IS BIG DATA? describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information
More informationHow To Store Data In Nosql
White paper Sopen source solutions for big data management Your business technologists. Powering progress Open Source Solutions for Big Data Management Big Data Management is becoming a key issue in the
More informationBIG DATA SOLUTION DATA SHEET
BIG DATA SOLUTION DATA SHEET Highlight. DATA SHEET HGrid247 BIG DATA SOLUTION Exploring your BIG DATA, get some deeper insight. It is possible! Another approach to access your BIG DATA with the latest
More informationEvolution to Revolution: Big Data 2.0
Evolution to Revolution: Big Data 2.0 An ENTERPRISE MANAGEMENT ASSOCIATES (EMA ) White Paper Prepared for Actian March 2014 IT & DATA MANAGEMENT RESEARCH, INDUSTRY ANALYSIS & CONSULTING Table of Contents
More informationWhy Big Data in the Cloud?
Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data
More informationPreparing Your Data For Cloud
Preparing Your Data For Cloud Narinder Kumar Inphina Technologies 1 Agenda Relational DBMS's : Pros & Cons Non-Relational DBMS's : Pros & Cons Types of Non-Relational DBMS's Current Market State Applicability
More informationQLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering
QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...
More informationCISC 432/CMPE 432/CISC 832 Advanced Database Systems
CISC 432/CMPE 432/CISC 832 Advanced Database Systems Course Info Instructor: Patrick Martin Goodwin Hall 630 613 533 6063 martin@cs.queensu.ca Office Hours: Wednesday 11:00 1:00 or by appointment Schedule:
More informationPerformance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics Platform. Contents Pentaho Scalability and
More informationRDF graph Model and Data Retrival
Distributed RDF Graph Keyword Search 15 2 Linked Data, Non-relational Databases and Cloud Computing 2.1.Linked Data The World Wide Web has allowed an unprecedented amount of information to be published
More informationBig Data & the Cloud: The Sum Is Greater Than the Parts
E-PAPER March 2014 Big Data & the Cloud: The Sum Is Greater Than the Parts Learn how to accelerate your move to the cloud and use big data to discover new hidden value for your business and your users.
More informationAn Approach to Implement Map Reduce with NoSQL Databases
www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh
More informationA Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel
A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated
More information2.1.5 Storing your application s structured data in a cloud database
30 CHAPTER 2 Understanding cloud computing classifications Table 2.3 Basic terms and operations of Amazon S3 Terms Description Object Fundamental entity stored in S3. Each object can range in size from
More informationBig Data Solutions. Portal Development with MongoDB and Liferay. Solutions
Big Data Solutions Portal Development with MongoDB and Liferay Solutions Introduction Companies have made huge investments in Business Intelligence and analytics to better understand their clients and
More informationWhite Paper: Big Data and the hype around IoT
1 White Paper: Big Data and the hype around IoT Author: Alton Harewood 21 Aug 2014 (first published on LinkedIn) If I knew today what I will know tomorrow, how would my life change? For some time the idea
More informationAn Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics
An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,
More informationStudy concluded that success rate for penetration from outside threats higher in corporate data centers
Auditing in the cloud Ownership of data Historically, with the company Company responsible to secure data Firewall, infrastructure hardening, database security Auditing Performed on site by inspecting
More informationTap into Hadoop and Other No SQL Sources
Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data
More information