Data Science and Big Data: Below the Surface and Implications for Governance
- Verity Charles
- 10 years ago
Transcription
1 Data Science and Big Data: Below the Surface and Implications for Governance
Randy Soper
The views expressed are those of the author and do not reflect the official position or policy of the Defense Intelligence Agency, the Department of Defense or its components, or the United States Government.
2 A (Typical?) Data Science/Big Data Story
From Scott Adams's Dilbert:
- Pointy-Haired Boss: technically (and managerially) clueless, always chasing the latest buzzword
- Dogbert: high-paid consultant, questionable ethical framework
3 A (Typical?) Data Science/Big Data Story
"Companies seem to be really excited about the Big Data-thingy... maybe I should contract out for some of that?"
Dogbert the Data Scientist
4 A (Typical?) Data Science/Big Data Story
"I used big data and machine learning to build a predictive analytics capability for your inventory flows for just-in-time delivery, and I also developed a dashboard based on customer sentiment analysis of social media feeds to push alerts to your sales staff about real-time regional trends of interest in your product line."
Dogbert the Data Scientist
5 A (Typical?) Data Science/Big Data Story
Is the P.H.B. happy???
6 A (Typical?) Data Science/Big Data Story
Of course!!!
7 A (Typical?) Data Science/Big Data Story
But (beyond our understanding of the personalities involved here):
- There are concepts we need to understand
- And questions we should be asking
8 What's Really Going On? Let's Unpack This
9 What's Really Going On? Let's Unpack This
What's a data scientist?
10 Data Science is a Team Sport
- Subject matter knowledge/domain expert
- IT skills (development/infrastructure)
- Statistics/mathematical skills
The Data Science Venn Diagram by D. Conway; Booz Allen Hamilton; and others
11 Data Science is a Team Sport
- The Unicorn (rare and wonderful)
12 What's Really Going On? Let's Unpack This
Is this what we need? What's our requirement?
13 Start with the Requirements
- Data science/big data is about infrastructure, data, data pre-processing and aggregation, analytic tools, data scientists, analytic techniques, and an actionable deliverable product
- Buy systems, buy data, buy tools, hire talent?
- The data and the tools are the shiny objects
- First step: what are my business objectives?
- These should drive everything (architecture, data, tools)
14 What's Really Going On? Let's Unpack This
What's big data?
15 Big Data is???
- Big data may be more than just a lot of data
- Big data isn't just unstructured data/NoSQL/Hadoop (although these are frequently powerful components!)
- Big data is fundamentally about the three (four) V's: Volume, Variety, Velocity, (Veracity)
16 The V's of Big Data
- Volume
  - Corporate data warehousing
  - Log data, sensor data (IoT)
  - Social media
  - Document corpus
  - Speech, image, video
  - Etc., etc., etc.
- Variety
  - Structured, semi-structured, unstructured
- Velocity
  - Rate of ingest, rate of analysis, decision automation
- Veracity
  - Untrusted/unknown source, untreated data
  - (Doesn't define big data like the others, but frequently accompanies it)
17 Not Only SQL (NoSQL)
- Relational Database Management System (RDBMS)
  - Emphasis on ACID properties (Atomicity, Consistency, Isolation, and Durability)
- NoSQL
  - Schema-free ("V" = variety!)
  - High performance (no joins!)
  - Scalable ("V" = volume!)
- NoSQL does not address the velocity "V"
18 Not Only SQL (NoSQL)
- NoSQL: CouchDB, Accumulo
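The schema-free point can be illustrated with a minimal Python sketch. It is only an analogy: the standard-library sqlite3 module stands in for an RDBMS and a plain list of dicts stands in for a document store. Neither is CouchDB or Accumulo, and the table and field names (customers, twitter, tags) are hypothetical.

```python
import sqlite3

# Relational side: the schema is declared up front and every row must fit it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Alice"))


def insert_with_unknown_column():
    """Try to store a field that the declared schema does not contain."""
    try:
        conn.execute(
            "INSERT INTO customers (id, name, twitter) VALUES (2, 'Bob', '@bob')"
        )
        return "accepted"
    except sqlite3.OperationalError:
        # SQLite refuses the statement because the column does not exist.
        return "rejected"


# Document side (stand-in): each record carries its own structure.
documents = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob", "twitter": "@bob", "tags": ["vip"]},
]


def fields_seen(docs):
    """Collect every field name that appears in any document."""
    return sorted({key for doc in docs for key in doc})


relational_result = insert_with_unknown_column()
document_fields = fields_seen(documents)
```

The relational insert is rejected until the schema is altered, while the document list absorbs the new fields without any change. That flexibility is the "variety" benefit; the cost is that consistency checks move into the application.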
19 Hadoop / MapReduce
[Diagram labels: Master node, Cluster]
- Distributed computation on commodity hardware (Intel/AMD x86 processors) across a cluster, using key-value pair operations
- Data and compute collocated
- Scalable, schema-free; suitable for NoSQL computation
- Redundant storage; resistant to node failure
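The key-value pair model can be sketched in a few lines of Python. This is a single-process illustration of the map, shuffle, and reduce steps using the familiar word-count example; it does not use the Hadoop API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) key-value pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a real cluster each document (or block) would be mapped on the
    # node that stores it; here everything runs in one process.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each map call looks only at its own document and each reduce call looks only at one key, both steps can in principle be spread across many machines, which is what makes the model scale.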
20 Big Data vs. Data Science
- Data science: application of IT capability, domain knowledge, and/or statistics to obtain new business value from (conglomerations of) data
- Big data: data challenges involving one or more of the V's
21 Big Data vs. Data Science
- All but the most pedestrian of big data problems use data science.
- Not all data science problems involve the V's of big data.
22 What's Really Going On? Let's Unpack This
What are data science techniques?
23 The Work of Data Science
- Data acquisition
  - Internal (policy, transfer)
  - Purchase
  - Stream: e.g., social media, etc., exposed via Application Programming Interface (API)
- Data manipulation: extract, transform, load (ETL); aggregation
  - Data lake
- Natural language parsing (including sentiment analysis)
- Statistics
  - Characteristics: ordinal/Likert data, mixed inputs categories, geospatial data, binary/yes-no results, etc.
  - Special regressions (e.g., logistic)
- Numerical techniques, including supervised/unsupervised machine learning
  - Random forests, clustering, Bayesian analysis, deep-learning neural nets, Monte Carlo simulations
- Visualization, sense-making
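As one concrete instance of the techniques listed above, sentiment analysis in its simplest form can be sketched as lexicon-based scoring. The word lists and function name below are hypothetical and far smaller than any real lexicon; production systems typically use trained models instead.

```python
# Tiny illustrative lexicons (assumed for this sketch, not from the slides).
POSITIVE = {"love", "great", "good", "excellent"}
NEGATIVE = {"hate", "bad", "broken", "terrible"}


def sentiment_score(text):
    """Return (# positive words) - (# negative words) for one message."""
    # Strip common punctuation and lowercase each whitespace-separated token.
    words = [w.strip(".,!?").lower() for w in text.split()]
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives
```

A score above zero suggests a favorable message and below zero an unfavorable one. A scheme this simple misses negation and sarcasm, which is one reason the veracity of social media analytics deserves scrutiny.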
24 Data Science Ecosystem
- Product delivery
  - Written report
  - Alerts/dashboards
  - Exposed API
  - Analytic tools and discoverable data
- Analytic tools
  - Machine learning
  - Natural language processing
  - Regression
  - Visualization
- Data conditioning & aggregation
  - Data lake
  - ETL
  - Manual munging & wrangling
  - Parsing
  - Tagging
- Data sources
  - Owned batch data
  - Owned streaming data (log, sensor, etc.)
  - External/purchased data (batch)
  - External streaming data
- Infrastructure
25 Social Media as Customer Data
- Twitter exposes 1% of all tweets on a public, no-charge API
- 100% of tweets are available as a live stream through a paid service
- If you tweet, you are Twitter's product!
- Companies use this for real-time, geolocated information about customer (and competitor) behavior
26 [Figure: Comparative Word Clouds of ISACA International and ISACA NCAC Official Twitter Feeds]
27 What's Really Going On? Let's Unpack This
What are big data/data science products or enterprise delivery options?
28 Variety of Mechanisms to Deliver Data Science Value
- Streaming information, processed at scale in real time?
  - May want to consider real-time alerting for immediate decisions
  - But need to make sure the decision-making framework and personnel are prepared to capitalize
- Other, more traditional options may be just as viable
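Real-time alerting on a stream can be sketched as a sliding-window counter. This is a minimal illustration under assumed parameters: the class name, the window length, and the threshold are all hypothetical, and a production system would sit behind a streaming platform rather than a Python loop.

```python
from collections import deque


class RateAlert:
    """Signal when more than `threshold` events fall within `window` seconds."""

    def __init__(self, window, threshold):
        self.window = window
        self.threshold = threshold
        self.times = deque()  # timestamps of events still inside the window

    def observe(self, timestamp):
        """Record one event and return True if the alert condition holds."""
        self.times.append(timestamp)
        # Drop events that have aged out of the window.
        while self.times and self.times[0] <= timestamp - self.window:
            self.times.popleft()
        return len(self.times) > self.threshold


alert = RateAlert(window=60, threshold=2)
results = [alert.observe(t) for t in (0, 10, 20, 200)]
```

Here the third event within a minute trips the alert, and the late fourth event resets it. The technical part is small; the governance question raised on the slide is who receives the alert and what they are authorized to do with it.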
29 Predictive Analytics
- Business/government moving from using data for retrospective understanding (patterns, sense-making, visualization) to predictive tools for proactive response
- Predictive models built from statistical analysis
- Still primarily a future state
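The shift from retrospective to predictive use can be illustrated with the simplest statistical model, an ordinary least-squares trend line. The sketch below is a toy under stated assumptions: the weekly demand figures are invented for illustration, the function names are hypothetical, and real predictive models involve far more validation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, next_x):
    """Predict y at next_x by extrapolating the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return slope * next_x + intercept


# Invented example data: units sold in weeks 1 through 4.
weeks = [1, 2, 3, 4]
demand = [10, 12, 14, 16]
next_week = forecast(weeks, demand, 5)
```

Fitting the line is the retrospective step (describing weeks 1 to 4); evaluating it at week 5 is the predictive step that could drive an inventory decision before the demand arrives.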
30 Some Thoughts on Governance/Controls
Warning: big data may mean new data sources, data sharing, and data policies
31 Data Science and Big Data Can Mean
- Unprecedented sharing of data
- Unprecedented accumulations of data
- Investment to purchase data
- Corporate recognition of increased business value in data and in more kinds of data
- Direct sale of or exposure of data or direct derivatives
What are your use-case specific best controls?
32 Data Science and Big Data Can Mean
"I want to lay out three things that the private sector can do today that will protect them from the vast majority of attacks, from the Chinese and elsewhere. One: Patch IT software obsessively. Two: Segment your data. A single breach shouldn't give attackers access to a mother lode of proprietary data. Three: Pay attention to the threat bulletins that DHS and FBI put out. And, if there's a fourth commandment, it's this: Teach folks what spear phishing looks like."
- Director of National Intelligence Clapper at the 2015 International Conference on Cyber Security
34 Some Thoughts on Governance/Controls
Warning: uncontrolled data science = development tools in production environment against live data
35 Doing Data Science
Many commercial and open source data science/big data capabilities are available, focused on log-file analysis, visualization, business analytics, data integration, democratization of analytics, etc.
36 It's about playing with the data!
- Initially, and for evolution/maintenance, data scientists will want to bring flexible analytics to real business data
37 It's about playing with the data!
- The domain of business value discovery for data science
38 It's about playing with the data!
- Excel
- Hadoop: MapReduce, Pig, Hive
- MATLAB
- Python
- R
39 Some Thoughts on Governance/Controls
Warning: application of ill-defined/broad concepts could lead to inconsistent/non-repeatable results in key business processes
40 Some Thoughts on Governance/Controls
Warning: analysis of ROI on start-up big data/data science efforts is especially challenging, but needs to be baked in
41 Questions?
42 Backup
43 Governance/Controls Bonus: What's this Internet of Things (IoT)???
Imagine your car navigation, calendar, clock, and coffee maker having the ability to communicate. You have a high-priority, early-morning meeting. Your navigation system knows that there's a major traffic accident and your commute will be longer than normal. Therefore your clock automatically resets your wakeup alarm earlier, and your coffee maker resets your auto-brew time earlier, to get you to your meeting on time!
Now ask: what are the IT security implications of this degree of connectedness?
Integrating a Big Data Platform into Government:
Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government
This Symposium brought to you by www.ttcus.com
This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data
End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ
End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
Keywords Big Data, NoSQL, Relational Databases, Decision Making using Big Data, Hadoop
Volume 4, Issue 1, January 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Transitioning
Understanding traffic flow
White Paper A Real-time Data Hub For Smarter City Applications Intelligent Transportation Innovation for Real-time Traffic Flow Analytics with Dynamic Congestion Management 2 Understanding traffic flow
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON
BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing
Transforming the Telecoms Business using Big Data and Analytics
Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe
The 4 Pillars of Technosoft s Big Data Practice
beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed
Are You Ready for Big Data?
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
The Internet of Things and Big Data: Intro
The Internet of Things and Big Data: Intro John Berns, Solutions Architect, APAC - MapR Technologies April 22 nd, 2014 1 What This Is; What This Is Not It s not specific to IoT It s not about any specific
Exploiting Data at Rest and Data in Motion with a Big Data Platform
Exploiting Data at Rest and Data in Motion with a Big Data Platform Sarah Brader, [email protected] What is Big Data? Where does it come from? 12+ TBs of tweet data every day 30 billion RFID tags
Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
Data Refinery with Big Data Aspects
International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data
How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns
How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization
Information Builders Mission & Value Proposition
Value 10/06/2015 2015 MapR Technologies 2015 MapR Technologies 1 Information Builders Mission & Value Proposition Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns
BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics
BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are
Big Data Are You Ready? Jorge Plascencia Solution Architect Manager
Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data: The Datafication Of Everything Thoughts Devices Processes Thoughts Things Processes Run the Business Organize data to do something
How To Make Data Streaming A Real Time Intelligence
REAL-TIME OPERATIONAL INTELLIGENCE Competitive advantage from unstructured, high-velocity log and machine Big Data 2 SQLstream: Our s-streaming products unlock the value of high-velocity unstructured log
The Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems
Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems Volker Markl [email protected] dima.tu-berlin.de dfki.de/web/research/iam/ bbdc.berlin Based on my 2014 Vision Paper On
5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014
5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for
Testing Big data is one of the biggest
Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing
HDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
Business Intelligence for Big Data
Business Intelligence for Big Data Will Gorman, Vice President, Engineering May, 2011 2010, Pentaho. All Rights Reserved. www.pentaho.com. What is BI? Business Intelligence = reports, dashboards, analysis,
Introduction to Big Data! with Apache Spark" UC#BERKELEY#
Introduction to Big Data! with Apache Spark" UC#BERKELEY# So What is Data Science?" Doing Data Science" Data Preparation" Roles" This Lecture" What is Data Science?" Data Science aims to derive knowledge!
2015 Analyst and Advisor Summit. Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist
2015 Analyst and Advisor Summit Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist Agenda Key Facts Offerings and Capabilities Case Studies When to Engage
Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.
Big Data Analytics 1 Priority Discussion Topics What are the most compelling business drivers behind big data analytics? Do you have or expect to have data scientists on your staff, and what will be their
Big Data Analytics Roadmap Energy Industry
Douglas Moore, Principal Consultant, Architect June 2013 Big Data Analytics Energy Industry Agenda Why Big Data in Energy? Imagine Overview - Use Cases - Readiness Analysis - Architecture - Development
How the oil and gas industry can gain value from Big Data?
How the oil and gas industry can gain value from Big Data? Arild Kristensen Nordic Sales Manager, Big Data Analytics [email protected], tlf. +4790532591 April 25, 2013 2013 IBM Corporation Dilbert
Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap
Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed
Big Data. Fast Forward. Putting data to productive use
Big Data Putting data to productive use Fast Forward What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize
Big Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
Addressing Open Source Big Data, Hadoop, and MapReduce limitations
Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing hadoop distributions Going enterprise with Hadoop 2 How Big are Data?
Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services
Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the
Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...
Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data
Big Data Big Data/Data Analytics & Software Development
Big Data Big Data/Data Analytics & Software Development Danairat T. [email protected], 081-559-1446 1 Agenda Big Data Overview Business Cases and Benefits Hadoop Technology Architecture Big Data Development
Testing 3Vs (Volume, Variety and Velocity) of Big Data
Testing 3Vs (Volume, Variety and Velocity) of Big Data 1 A lot happens in the Digital World in 60 seconds 2 What is Big Data Big Data refers to data sets whose size is beyond the ability of commonly used
Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
Improving Data Processing Speed in Big Data Analytics Using. HDFS Method
Improving Data Processing Speed in Big Data Analytics Using HDFS Method M.R.Sundarakumar Assistant Professor, Department Of Computer Science and Engineering, R.V College of Engineering, Bangalore, India
Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum
Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms
Microsoft SQL Server 2012 with Hadoop
Microsoft SQL Server 2012 with Hadoop Debarchan Sarkar Chapter No. 1 "Introduction to Big Data and Hadoop" In this package, you will find: A Biography of the author of the book A preview chapter from the
Datenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
How To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 [email protected] www.scch.at Michael Zwick DI
Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate
Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate Description The Helzberg School of Management has launched two graduate-level certificates: one in Data
Cloud Big Data Architectures
Cloud Big Data Architectures Lynn Langit QCon Sao Paulo, Brazil 2016 About this Workshop Real-world Cloud Scenarios w/aws, Azure and GCP 1. Big Data Solution Types 2. Data Pipelines 3. ETL and Visualization
Cloud Integration and the Big Data Journey - Common Use-Case Patterns
Cloud Integration and the Big Data Journey - Common Use-Case Patterns A White Paper August, 2014 Corporate Technologies Business Intelligence Group OVERVIEW The advent of cloud and hybrid architectures
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?
Big Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD
Big Analytics for Space Exploration, Entrepreneurship and Policy Opportunities Tiffani Crawford, PhD Big Analytics Characteristics Large quantities of many data types Structured Unstructured Human Machine
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
Fight fire with fire when protecting sensitive data
Fight fire with fire when protecting sensitive data White paper by Yaniv Avidan published: January 2016 In an era when both routine and non-routine tasks are automated such as having a diagnostic capsule
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner [email protected] @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily
Embedded inside the database. No need for Hadoop or customcode. True real-time analytics done per transaction and in aggregate. On-the-fly linking IP
Operates more like a search engine than a database Scoring and ranking IP allows for fuzzy searching Best-result candidate sets returned Contextual analytics to correctly disambiguate entities Embedded
SURVEY REPORT DATA SCIENCE SOCIETY 2014
SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses
Hadoop implementation of MapReduce computational model. Ján Vaňo
Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed
DATA EXPERTS MINE ANALYZE VISUALIZE. We accelerate research and transform data to help you create actionable insights
DATA EXPERTS We accelerate research and transform data to help you create actionable insights WE MINE WE ANALYZE WE VISUALIZE Domains Data Mining Mining longitudinal and linked datasets from web and other
BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE
BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE Current technology for Big Data allows organizations to dramatically improve return on investment (ROI) from their existing data warehouse environment.
How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru
Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connect to Hadoop? Customer Case
Data Modeling for Big Data
Data Modeling for Big Data by Jinbao Zhu, Principal Software Engineer, and Allen Wang, Manager, Software Engineering, CA Technologies In the Internet era, the volume of data we deal with has grown to terabytes
HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica
HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica So What's the market's definition of Big Data? Datasets whose volume, velocity, variety
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated
Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012
Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics
Big Data Technology. Choochart Haruechaiyasak, Ph.D.
Big Data Technology Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology
SAP and Hortonworks Reference Architecture
SAP and Hortonworks Reference Architecture Hortonworks. We Do Hadoop. June 2014 Hortonworks Inc. 2011-2014. All Rights Reserved A Modern Data Architecture With SAP DATA SYSTEMS APPLICATIONS Statistical
An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP Oracle ESG Data Systems Architecture
An Integrated Big Data & Analytics Infrastructure June 14, 2012 Robert Stackowiak, VP ESG Data Systems Architecture Big Data & Analytics as a Service Components Unstructured Data / Sparse Data of Value
WHITE PAPER. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract
WHITE PAPER Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the
North Highland Data and Analytics. Data Governance Considerations for Big Data Analytics
North Highland Data and Analytics Data Governance Considerations for Big Data Analytics Agenda Traditional BI/Analytics vs. Big Data Analytics Types of Data Requiring Governance Key Considerations Information Framework Organizational
Analyzing Big Data with AWS
Analyzing Big Data with AWS Peter Sirota, General Manager, Amazon Elastic MapReduce @petersirota What is Big Data? Computer generated data Application server logs (web sites, games) Sensor data (weather,
Big Data for Investment Research Management
IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment Management firms turn big data into actionable
Big Data and Analytics in Government
Big Data and Analytics in Government Nov 29, 2012 Mark Johnson Director, Engineered Systems Program Agenda What Big Data Is Government Big Data Use Cases Building a Complete Information Solution Conclusion
P4.1 Reference Architectures for Enterprise Big Data Use Cases Romeo Kienzler, Data Scientist, Advisory Architect, IBM Germany, Austria, Switzerland
P4.1 Reference Architectures for Enterprise Big Data Use Cases Romeo Kienzler, Data Scientist, Advisory Architect, IBM Germany, Austria, Switzerland IBM Center of Excellence for Data Science, Cognitive
BIG DATA STRATEGY. Rama Kattunga Chair at American institute of Big Data Professionals. Building Big Data Strategy For Your Organization
BIG DATA STRATEGY Rama Kattunga Chair at American institute of Big Data Professionals Building Big Data Strategy For Your Organization In this session What is Big Data? Prepare your organization Building
A New Era Of Analytic
Penang egovernment Seminar 2014 A New Era Of Analytic Megat Anuar Idris Head, Project Delivery, Business Analytics & Big Data Agenda Overview of Big Data Case Studies on Big Data Big Data Technology Readiness
The Future of Data Management with Hadoop and the Enterprise Data Hub
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA
Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA http://kzhang6.people.uic.edu/tutorial/amcis2014.html August 7, 2014 Schedule I. Introduction to big data
Big Data for everyone Democratizing big data with the cloud. Steffen Krause Technical Evangelist @AWS_Aktuell
Big Data for everyone Democratizing big data with the cloud Steffen Krause Technical Evangelist @AWS_Aktuell Does this Data make me look big? Overview Designing big data solutions in
How To Use Big Data For Business
Big Data Maturity - The Photo and The Movie Mike Ferguson Managing Director, Intelligent Business Strategies BA4ALL Big Data & Analytics Insight Conference Stockholm, May 2015 About Mike Ferguson Mike
Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth
MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager
Internals of Hadoop Application Framework and Distributed File System
International Journal of Scientific and Research Publications, Volume 5, Issue 7, July 2015 Internals of Hadoop Application Framework and Distributed File System Saminath.V, Sangeetha.M.S Abstract- Hadoop
BIG DATA TECHNOLOGY. Hadoop Ecosystem
BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big
PALANTIR CYBER An End-to-End Cyber Intelligence Platform for Analysis & Knowledge Management
PALANTIR CYBER An End-to-End Cyber Intelligence Platform for Analysis & Knowledge Management INTRODUCTION Traditional perimeter defense solutions fail against sophisticated adversaries who target their
Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO. Big Data Everywhere Conference, NYC November 2015
Addressing Risk Data Aggregation and Risk Reporting Ben Sharma, CEO Big Data Everywhere Conference, NYC November 2015 Agenda 1. Challenges with Risk Data Aggregation and Risk Reporting (RDARR) 2. How a
Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges
Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges James Campbell Corporate Systems Engineer HP Vertica Big
Towards Smart and Intelligent SDN Controller
Towards Smart and Intelligent SDN Controller - Through the Generic, Extensible, and Elastic Time Series Data Repository (TSDR) YuLing Chen, Dell Inc. Rajesh Narayanan, Dell Inc. Sharon Aicler, Cisco Systems
An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise
An Oracle White Paper October 2011 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5
This Conference brought to you by www.ttcus.com
This Conference brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com U.S. Army Intelligence and Security Command Army
Tap into Hadoop and Other No SQL Sources
Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data
