WA2192 Introduction to Big Data and NoSQL EVALUATION ONLY
- Alyson Chambers
- 8 years ago
WA2192 Introduction to Big Data and NoSQL
Web Age Solutions Inc.
USA: Canada: Web:
The following terms are trademarks of other companies:

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
IBM, WebSphere, DB2 and Tivoli are trademarks of the International Business Machines Corporation in the United States, other countries, or both.
Other company, product, and service names may be trademarks or service marks of others.

For customizations of this book or other sales inquiries, please contact us at:
USA: , getinfousa@webagesolutions.com
Canada: toll free, getinfo@webagesolutions.com

Copyright 2013 Web Age Solutions Inc.

This publication is protected by the copyright laws of Canada, United States and any other country where this book is sold. Unauthorized use of this material, including but not limited to, reproduction of the whole or part of the content, re-sale or transmission through fax, photocopy or is prohibited. To obtain authorization for any such activities, please write to:
Web Age Solutions Inc.
439 University Ave Suite 820
Toronto Ontario, M5G 1Y8
Table of Contents

Chapter 1 - Defining Big Data
  Transforming Data into Business Information
  Gartner's Definition of Big Data
  More Definitions of Big Data
  Challenges Posed by Big Data
  The Cloud and Big Data
  The Business Value of Big Data
  Big Data: Hype or Reality?
  Big Data Quiz
  Big Data Quiz Answers
  Summary

Chapter 2 - NoSQL and Big Data Systems Overview
  Limitations of Relational Databases
  What are NoSQL (Not Only SQL) Databases?
  NoSQL Past and Present
  NoSQL Database Properties
  NoSQL Benefits
  NoSQL Database Storage Types
  The CAP Theorem
  Limitations of NoSQL Databases
  Big Data Sharding
  Sharding Example
  Amazon S3
  Amazon Storage SLAs
  Amazon Glacier
  Amazon S3 Security
  Data Lifecycle Management with Amazon S3
  Amazon S3 Cost Monitoring
  OpenStack Object Store (Swift)
  Components of Swift
  Google BigTable
  BigTable-based Applications
  BigTable Design
  Google App Engine
  Google App Engine Billing
  Google Cloud Storage
  Hadoop
  Hadoop's Core Components
  Hadoop Distributed File System
  Accessing HDFS
  HBase
  HBase design
  MemcacheDB
  2.33 MongoDB
  MongoDB Operational Intelligence
  MongoDB Use Cases
  Quiz
  Quiz Answers
  Summary

Chapter 3 - Big Data Business Intelligence and Analytics
  Comparison with other systems
  NoSQL Data Querying and Processing
  MapReduce programming model
  Example of Map & Reduce Operations using JavaScript
  Analyzing Big Data with Hadoop
  Hadoop's MapReduce
  Hadoop Streaming
  Making things simpler with Hadoop Pig Latin
  Example of a Pig Script in Batch Mode
  Amazon Elastic MapReduce
  Big Data in Google App Engine
  Example of Google AppEngine Java Datastore API
  MongoDB Data Model
  MongoDB Query Language (QL)
  The find and findone methods
  A MongoDB QL Example
  What is Hive
  Interfacing with Hive
  Business analytics with Hive
  The UnQL Specification
  Quiz
  Quiz Answers
  Summary

Chapter 4 - Big Data Real World Case Studies
  Yahoo Yahoo for Hadoop Yahoo!! Big Facebook Facebook Mailtrust (Rackspace's mail division)
  Summary

Chapter 5 - Adopting NoSQL
  Hype Cycle and Technology Adoption Model
  Barriers to Adoption
  Dismantling Barriers to Adoption
  Use Cases for NoSQL Database Systems
  Example Applications
  Industry trends
  Enterprise Big Data / NoSQL Offerings
  5.8 NoSQL Technology Adoption Action Plan
  Summary
Chapter 1 - Defining Big Data

Objectives
In this chapter, participants will learn about:
- Big Data definitions
- Challenges posed by Big Data
- How businesses can benefit from Big Data

1.1 Transforming Data into Business Information
- The success of an organization is predicated on its ability to convert raw data from various sources into useful business information
- As a rule, the more data is available, the more information can be harvested from it
- The amount of information that can be obtained from raw data is in direct proportion to the volume of the raw data (increasing the size of input data sets leads to a larger amount of useful information)
- Nowadays, data can be easily acquired, but it normally comes in unstructured forms
- In many instances, the [useful information]/[information noise] ratio in data sets is very low
- The quality of information harvested from the data depends on the sophistication of the data processing algorithm
- In many respects, getting business information is similar to extracting gold from ore
- OLAP and Data Mining systems (deployed in data warehouses) are the traditional tools used by organizations for extracting business intelligence from data

Notes
A person of average lifespan, literacy and cultural exposure processes about 650 million words. Ian Pearson, of British Telecom, estimated that over an 80-year lifespan we process 10 terabytes of data.
Source: The World As Information: Overload and Personal Design by Robert D. Abbott
1.2 Gartner's Definition of Big Data
- Gartner analyst Doug Laney defined three dimensions of data growth challenges: increasing volume (amount of data), velocity (speed of data in and out), and variety (range of data types and sources)
- In 2012, Gartner updated its definition as follows: "Big data are high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization."
- Volume: Data sizes accumulated in many organizations come to hundreds of terabytes, approaching the petabyte level
- Variety: Big Data comes in different formats, including unformatted (unstructured) data, and in various types such as text, audio, voice, VoIP, images, video, e-mails, web traffic log file entries, sensor byte streams, etc.
- Velocity: A high-traffic on-line banking web site can generate hundreds of TPS (transactions per second), each of which may need to be subjected to fraud detection analysis in real or near-real time
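To make the Velocity figure more concrete, the following minimal sketch works out what "hundreds of TPS" implies for a fraud detection step. The rate of 300 TPS and the worker counts are assumed values chosen only for illustration; they do not come from the course material.

```python
# Assumed rate: the text only says "hundreds of TPS"; 300 is a hypothetical value.
ASSUMED_TPS = 300
SECONDS_PER_DAY = 24 * 60 * 60


def transactions_per_day(tps: int) -> int:
    """Number of transactions produced in one day at a constant rate."""
    return tps * SECONDS_PER_DAY


def per_transaction_budget_ms(tps: int, parallel_workers: int = 1) -> float:
    """Average time (in ms) each fraud check may take if the given number
    of parallel workers is to keep up with the arrival rate."""
    return 1000.0 * parallel_workers / tps


if __name__ == "__main__":
    print("transactions per day:", transactions_per_day(ASSUMED_TPS))
    print("budget with 1 worker (ms):", per_transaction_budget_ms(ASSUMED_TPS))
    print("budget with 3 workers (ms):", per_transaction_budget_ms(ASSUMED_TPS, 3))
```

Under these assumptions a single worker has only a few milliseconds per transaction, which is one way to see why real or near-real time analysis at this rate usually calls for lightweight checks or parallel processing.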
1.3 More Definitions of Big Data
- There are different definitions of what Big Data is; however, one attribute of Big Data seems to be more representative than others: the data gets mystically morphed into the Big Data category when traditional systems and tools (e.g. databases, OLAP and data-mining systems used in data marts or warehouses) either become prohibitively expensive for handling the exponential growth of data volumes or are found unsuitable for the job
- Big Data is stored electronically and lends itself to machine-oriented processing
- Processing of Big Data requires new approaches and tooling support
- NoSQL (Not Only SQL) databases have appeared, in part, to address the challenges posed by Big Data

Notes
In some instances, Big Data sets may be seen as sparsely populated matrices or N-dimensional cubes with no rigid schema. A key-value (KV) data set is an example of schema-less data. KV data sets include an array of key-value pairs where each key is the name of an attribute (similar to a column name in relational databases) pointing to the actual data. This kind of data does not always lend itself to processing using conventional database systems.

1.4 Challenges Posed by Big Data
- Traditional relational database technologies are not very well suited to accommodate the volume, variety and velocity characteristics of Big Data, in part, due to:
  - Underlying rigid data model
  - The database server, for the most part, is deployed on a single node with a limited number of options for both vertical and horizontal scalability to accommodate over-capacity volumes
  - Databases are a poor choice for the elastic provisioning of computing power required for handling rapid spikes in data volumes and throughput without increasing response time
- There is a growing number of use cases for real-time data processing (lightweight analytics is often sufficient)
- It is no longer enough to just capture, store and process Big Data using batch-oriented analytics in an offline environment (the "data-at-rest" processing paradigm)
- Applications are required to provide real-time, in-place data analysis without moving the data to a warehouse (the "data-in-motion" processing paradigm)
- Many organizations are faced with the piling up of unprocessed data that has the potential to aid their business in making informed tactical and strategic decisions

Notes
In response to the introduction of the XML data type, many database vendors introduced a special XML column type for storing XML documents in their databases. Things are always changing, and now there is a new lightweight data-interchange format called JSON (JavaScript Object Notation), which is very popular with Web 2.0-style dynamic web sites. Are vendors now going to introduce a new column type to support the JSON format? The jury is still out on this one.
The database schema must be defined using a DDL (Data Description Language) during the database logical design phase; changes in the schema require recreating tables with the new structure.
An example of a system that provides real-time in-place data analysis without moving the data to a warehouse is the IBM InfoSphere Stream computing framework, which enables "continuous and extremely fast analysis of massive volumes of information-in-motion to help improve business insights and decision making".

1.5 The Cloud and Big Data
- Gone are the days when only large corporations could afford storing massive data sets
- Physical storage capacity is increasing while the cost of data storage goes down
- Commodity hard drives now have capacities of over 1 TB (a million million [10^12] bytes)
- Still, on-premise physical storage constitutes a significant factor in the Total Cost of Ownership (TCO) for organizations
- Cloud vendors offer services for storing Big Data sets (Swift from Rackspace and OpenStack, S3 from Amazon, HRD from Google App Engine, etc.). If required, in-place processing capabilities are also available
- In cases when data security / confidentiality is a concern, the data to be stored and processed in the Cloud needs to be sanitized or encrypted before it is uploaded

Notes
Cloud storage refers to any type of data storage that resides in the Cloud, including: services that provide database-like functionality; unstructured data services (file storage of digital media, for example); data synchronization services; or Network Attached Storage (NAS) services. Data services are often consumed in a pay-as-you-go model or, in this case, a pay-per-GB model (including both stored and transferred data).
Cloud storage offers a number of benefits, such as the ability to store and retrieve large amounts of data in any location at any time. Data storage services are fast, inexpensive, and almost infinitely scalable; however, reliability can be an issue, as even the best services do sometimes fail. Transaction support is also an issue with cloud-based storage systems, a significant problem that needs to be addressed for storage services to be widely used in the enterprise.
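The pay-per-GB model mentioned in the notes on cloud storage can be sketched as a simple calculation. This is only an illustration: the function name and both prices are hypothetical and are not taken from any vendor's price list; the one fact reused from the text is that 1 TB is 10^12 bytes.

```python
BYTES_PER_GB = 10**9
BYTES_PER_TB = 10**12  # "a million million" bytes, as stated in the text

# Hypothetical prices, for illustration only (not real vendor rates).
PRICE_PER_STORED_GB_MONTH = 0.10
PRICE_PER_TRANSFERRED_GB = 0.12


def monthly_cost(stored_bytes: int, transferred_bytes: int) -> float:
    """Pay-per-GB bill for one month: both stored and transferred data are charged."""
    stored_gb = stored_bytes / BYTES_PER_GB
    transferred_gb = transferred_bytes / BYTES_PER_GB
    return (stored_gb * PRICE_PER_STORED_GB_MONTH
            + transferred_gb * PRICE_PER_TRANSFERRED_GB)


if __name__ == "__main__":
    # 1 TB kept in storage, 50 GB transferred during the month.
    cost = monthly_cost(1 * BYTES_PER_TB, 50 * BYTES_PER_GB)
    print(f"estimated monthly cost: {cost:.2f}")
```

Because the bill grows linearly with both terms, a data set that is stored cheaply can still become expensive if it is transferred in and out of the Cloud frequently.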
1.6 The Business Value of Big Data
- Most organizations use just a fraction of the data available to them, as it is either too expensive to process or the business lacks the expertise to extract the relevant information
- Businesses that effectively leverage Big Data (including data that was originally discarded or not processed due to technology limitations) gain an advantage over their competitors
- Insights from Big Data help improve services and products, develop deeper customer relationships in a more agile and predictive manner, and uncover new monetization opportunities
- Since the storage cost of Big Data is in many cases not an issue, businesses may ask their IT departments to extend the retention period of some data feeds and come up with usage ideas later on
- Specialized Big Data solutions can offer real or near real-time analytics
- Overall, with Big Data, business agility is achieved
- New features can be incorporated into applications quickly and easily

1.7 Big Data: Hype or Reality?
- In its report "Hype Cycle for Cloud Computing, 2012", Gartner predicts that "Big Data will deliver transformational benefits to enterprises within two to five years, and by 2015 will enable enterprises adopting this technology to outperform competitors by 20% in every available financial metric."
- In the same report, Gartner places Big Data near the Peak of Inflated Expectations in the hype cycle, which can be defined as a phase that generates high amounts of enthusiasm and unrealistic expectations (i.e. what most people would call a buzzword).

1.8 Big Data Quiz
1. What are the three main characteristics of Big Data?
2. Name any one limitation of relational databases
3. What is the difference between "data-at-rest" and "data-in-motion" processing?

1.9 Big Data Quiz Answers
1. Volume, Variety and Velocity (V^3)
2. Rigid data model (as prescribed by a DDL)
3. "Data-at-rest" is a batch-oriented process running in offline settings, while "data-in-motion" refers to real-time, in-place data processing and analysis

1.10 Summary
- Nowadays, information can be easily acquired, but making effective use of it beyond what can be achieved with traditional technologies requires the introduction of new concepts, re-thinking the usefulness of data, and getting new tooling support
- Organizations are faced with a growing amount of unprocessed data that can and should be used more intelligently
- Businesses that have found ways to take advantage of Big Data are ahead of the competition
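The difference between "data-at-rest" and "data-in-motion" processing from this chapter can be illustrated with a minimal sketch. The records below are hypothetical schema-less key-value data (each record carries only the attributes it happens to have), and the per-user total is a stand-in for whatever analysis a real system would run; none of the names come from the course material.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Iterator

Record = Dict[str, Any]

# Hypothetical schema-less records: keys vary from record to record.
EVENTS = [
    {"user": "a", "amount": 20.0},
    {"user": "b", "amount": 5.0, "channel": "web"},
    {"user": "a", "amount": 7.5, "device": "mobile"},
]


def batch_totals(stored: Iterable[Record]) -> Dict[str, float]:
    """Data-at-rest: all records are already stored and are processed
    together in one offline pass; the result appears only at the end."""
    totals: Dict[str, float] = defaultdict(float)
    for record in stored:
        totals[record["user"]] += record["amount"]
    return dict(totals)


def streaming_totals(stream: Iterable[Record]) -> Iterator[Dict[str, float]]:
    """Data-in-motion: the running result is updated as each record
    arrives, and a fresh snapshot is available after every record."""
    totals: Dict[str, float] = defaultdict(float)
    for record in stream:
        totals[record["user"]] += record["amount"]
        yield dict(totals)


if __name__ == "__main__":
    print("batch result:", batch_totals(EVENTS))
    for snapshot in streaming_totals(iter(EVENTS)):
        print("running result:", snapshot)
```

Both functions arrive at the same final totals; what differs is when a usable answer exists. The batch version needs the complete data set first, whereas the streaming version can feed a decision (such as a fraud alert) after every incoming record.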
More informationBig Data Big Data/Data Analytics & Software Development
Big Data Big Data/Data Analytics & Software Development Danairat T. danairat@gmail.com, 081-559-1446 1 Agenda Big Data Overview Business Cases and Benefits Hadoop Technology Architecture Big Data Development
More informationData Services Advisory
Data Services Advisory Modern Datastores An Introduction Created by: Strategy and Transformation Services Modified Date: 8/27/2014 Classification: DRAFT SAFE HARBOR STATEMENT This presentation contains
More informationGigaSpaces Real-Time Analytics for Big Data
GigaSpaces Real-Time Analytics for Big Data GigaSpaces makes it easy to build and deploy large-scale real-time analytics systems Rapidly increasing use of large-scale and location-aware social media and
More informationOracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
More informationDatabases & Business Intelligence Part 1
Welcome back! We will have more fun. Databases & Business Intelligence Part 1 BUSA345 Lecture #8-1 Claire Hitosugi, PhD, MBA In the previous lecture We learned Define Open Source Software (OSS) and provide
More informationDatenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
More informationA B S T R A C T. Index Terms: Hadoop, Clickstream, I. INTRODUCTION
Big Data Analytics with Hadoop on Cloud for Masses Rupali Sathe,Srijita Bhattacharjee Department of Computer Engineering Pillai HOC College of Engineering and Technology, Rasayani A B S T R A C T Businesses
More informationBig Data at Cloud Scale
Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For
More informationOpen Source Technologies on Microsoft Azure
Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions
More informationForecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014
Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/
More informationMicrosoft Azure Data Technologies: An Overview
David Chappell Microsoft Azure Data Technologies: An Overview Sponsored by Microsoft Corporation Copyright 2014 Chappell & Associates Contents Blobs... 3 Running a DBMS in a Virtual Machine... 4 SQL Database...
More informationAre You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
More informationKeywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop
Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning
More informationHadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
More informationFoundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Wienand Omta Fabiano Dalpiaz 1 drs. ing. Wienand Omta Learning Objectives Describe how the problems of managing data resources
More informationLarge scale processing using Hadoop. Ján Vaňo
Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine
More informationThe 4 Pillars of Technosoft s Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
More informationBig Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
More informationIntegrating Big Data into the Computing Curricula
Integrating Big Data into the Computing Curricula Yasin Silva, Suzanne Dietrich, Jason Reed, Lisa Tsosie Arizona State University http://www.public.asu.edu/~ynsilva/ibigdata/ 1 Overview Motivation Big
More informationBig Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料
Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料 美 國 13 歲 學 生 用 Big Data 找 出 霸 淩 熱 點 Puri 架 設 網 站 Bullyvention, 藉 由 分 析 Twitter 上 找 出 提 到 跟 霸 凌 相 關 的 詞, 搭 配 地 理 位 置
More informationCloud Computing and Big Data What Technical Writers Need to Know
Cloud Computing and Big Data What Technical Writers Need to Know Greg Olson, Senior Director Black Duck Software For the Society of Technical Writers Berkeley Chapter Black Duck 2014 Agenda Introduction
More informationwww.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach
www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach Nic Caine NoSQL Matters, April 2013 Overview The Problem Current Big Data Analytics Relationship Analytics Leveraging
More informationBIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
More informationBig Data Defined Introducing DataStack 3.0
Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...
More informationDATA MINING WITH HADOOP AND HIVE Introduction to Architecture
DATA MINING WITH HADOOP AND HIVE Introduction to Architecture Dr. Wlodek Zadrozny (Most slides come from Prof. Akella s class in 2014) 2015-2025. Reproduction or usage prohibited without permission of
More informationBig Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect
on AWS Services Overview Bernie Nallamotu Principle Solutions Architect \ So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze
More informationAligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap
Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed
More informationApache Hadoop: The Big Data Refinery
Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data
More informationNoSQL for SQL Professionals William McKnight
NoSQL for SQL Professionals William McKnight Session Code BD03 About your Speaker, William McKnight President, McKnight Consulting Group Frequent keynote speaker and trainer internationally Consulted to
More informationwww.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage
www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization
More information