Big Data Analytics
V. Christophides, Department of Computer Science, University of Crete


What to Do with Big Data?
- Data contains value and knowledge
- But to extract the knowledge, data needs to be stored, managed, and ANALYZED
- Data analysis includes:
  - mining and summarizing large datasets
  - extracting knowledge from past data
  - predicting trends in future data
- Closely related terms: Data Mining, Big Data, Data Analytics, Data Science

A Bit of Terminology
- Data mining is the old big data: an overused term covering anything from
  - collecting, storing, curating and visualizing data, to
  - machine learning / AI (which predates the term data mining), and
  - non-ML data mining (as in "knowledge discovery", where the focus is on new knowledge, not on learning existing knowledge)
- "Business intelligence" and "business analytics" are marketing terms stressing that more data leads to better business decisions (periodic reporting as well as ad hoc queries, importance of tools and dashboards)
- Most "Big Data" today isn't ML: it's Extract, Transform, Load (ETL), so it is replacing data warehousing (except in computational advertising)
- Business Intelligence applies descriptive statistics to data with high information density in order to measure things, detect trends, etc.
- Big Data applies inductive statistics to data with low information density, whose huge volume makes it possible to infer laws (regressions) and thus gives Big Data, within the limits of inferential reasoning, some predictive capabilities (called Deep Analytics)

Data Analysis: ERP & CRM Examples
- What is the most effective distribution channel?
- Who are our lowest/highest margin customers?
- Who are my customers and what products are they buying?
- What product promotions have the biggest impact on revenue?
- What impact will new products/services have on revenue and margins?
- Which customers are most likely to go to the competition?
Source: Agrawal et al., VLDB 2010 Tutorial

Data Analysis Examples in Urban Computing
- What would the impacts of a fare change be?
- Where are our lowest/highest margin passengers?
- What is the distribution of trip lengths?
- What is the quickest route from midtown to downtown at 4pm on Monday?
- What impact will the introduction of additional medallions have?
- Where should drivers go to get passengers?
Source: Agrawal et al., VLDB 2010 Tutorial

Data Analysis in Computational Advertising & Marketing
- Computational advertising finds the best match between a given user in a given context and a suitable advertisement
- In 2011, over $100 billion was spent on online advertising (eMarketer)
- A modern advertising analytics platform will:
  - build behavioral profiles on 100 million plus individuals
  - use full statistical models (not rules) for targeting
  - re-analyze all of the data each night
  - serve 10,000s of ads per second using statistical models
  - respond in < 100 ms (with analytics in < 10 ms)
  - use real-time geolocation data
  - do analytics at machine speed
  - be driven by an analyst with only modest training
Source: A. Broder, V. Josifovski, Introduction to Computational Advertising, Autumn

Recommendation as Data Mining
- Estimate a utility function that automatically predicts how much a user will like an item
- Based on:
  - past behavior
  - relations to other users
  - item similarity
  - context:
    - web search context (SERP advertising)
    - web page content context (content match advertising and banners)
    - mobile, ambient context
Source: X. Amatriain, B. Mobasher, The Recommender Problem Revisited, KDD 2014 Tutorial

Data-Information-Knowledge-Wisdom Hierarchy
- Cognitive knowledge: "know-what"
- Advanced skills: "know-how"
- Systems understanding: "know-why"
- Self-motivated creativity: "care-why"

Interestingness of Patterns
- Interestingness criteria:
  - Understandable: humans should be able to interpret the pattern easily
  - Valid: holds on new or test data with some certainty
  - Useful: it should be possible to act on the pattern
  - Unexpected: non-obvious to the system
  - Confirmatory: validates some hypothesis that an analyst seeks to confirm
- Objective vs. subjective interestingness measures:
  - Objective: based on statistics and structures of patterns, e.g., support, confidence, etc.
  - Subjective: based on the user's beliefs about the data, e.g., unexpectedness, novelty, actionability, etc.

The Data Analysis Spectrum
[Figure, source: Gartner. Types of analytics plotted by value against difficulty.]
- Monitoring (dashboards, scorecards): What is happening?
- Descriptive analytics: What happened?
- Diagnostic analytics: Why did it happen?
- Predictive analytics: What might happen?
- Prescriptive analytics: How can we make it happen?

Data Mining Methods
- Use some variables to predict unknown or future values of other variables
- Find human-interpretable patterns that describe the data

Large-Scale, Real-World Analytics
Question | Method
How can I identify fraudulent activity? | Variable selection, logistic regression
How do I segment my customers? | K-means clustering
Does this product appeal to some segments more than others? | Log-likelihood
Which campaign is working better? | Mann-Whitney U test
How is product ownership distributed across customer segments? | SQL, cumulative distribution functions
How do I target my marketing efforts towards customers most likely to churn? | Logistic regression
What new products should I offer my customers? | Cosine similarity, k-nearest neighbors
What are my customers saying about the new product launch? | MapReduce, NLP, sparse vectors
Source: Steven Hillion, V.P. Analytics, EMC Data Computing Division, Tools and Technologies for Big Data

Algorithmic vs. Statistical Perspectives
Computer scientists:
- Data: a record of everything that happened
- Goal: process the data to find interesting patterns and associations
- Methodology: develop approximation algorithms under different models of data access, since the goal is typically computationally hard
Statisticians:
- Data: a particular random instantiation of an underlying process describing unobserved patterns in the world
- Goal: extract information about the world from noisy data
- Methodology: make inferences (perhaps about unseen events) by positing a model that describes the random variability of the data around the deterministic model

The Two Perspectives are NOT Incompatible
- Statistical/probabilistic ideas are central to recent work on developing improved randomized algorithms for matrix problems
- Intractable optimization problems on graphs/networks yield to approximation when assumptions are made about network participants
- In boosting (a statistical technique that fits an additive model by minimizing an objective function with a method such as gradient descent), the computation parameter (i.e., the number of iterations) also serves as a regularization parameter
Source: Michael W. Mahoney, Algorithmic and Statistical Perspectives on Large-Scale Data Analysis, Stanford University, Feb

Data Mining: Different Cultures
- Data mining overlaps with:
  - Databases (DB): large-scale data, simple queries
  - Machine Learning (ML): small data, complex models
  - Computer Science Theory: (randomized) algorithms
- Different cultures:
  - To a DB person, data mining is an extreme form of analytic processing: queries that examine large amounts of data. The result is the query answer
  - To an ML person, data mining is the inference of models. The result is the parameters of the model
- Big Data calls for a cross-culture curriculum stressing:
  - scalability
  - algorithms
  - computing architectures
  - automation for handling large data
[Figure: data mining at the overlap of statistics, machine learning/AI/pattern recognition, and database systems.]

Small Data are Good but
Source: Peter Cochrane, Big-Data Mining: The differences, gains and application areas

No Data Like More Data!

Underlying Motivation: The Law of Large Numbers!
- In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times
- According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed [Wikipedia]
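The law of large numbers can be observed in a short simulation. The sketch below is not part of the original slides: it averages simulated rolls of a fair six-sided die, whose expected value is 3.5. The function name `sample_mean` and the fixed seed are assumptions chosen only to make the run repeatable.

```python
import random


def sample_mean(n_trials: int, seed: int = 0) -> float:
    """Average of n_trials simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    total = sum(rng.randint(1, 6) for _ in range(n_trials))
    return total / n_trials


if __name__ == "__main__":
    # Print the sample average for increasingly many trials.
    for n in (10, 1_000, 100_000):
        print(n, sample_mean(n))
```

With more trials the printed averages should typically land closer to 3.5, although any single run with few trials can deviate noticeably.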

A Note on the Meaningfulness of Mined Patterns
- A big data-mining risk is that you may discover patterns that are meaningless
- Bonferroni's principle: (roughly) if you look in more places for interesting patterns than your amount of data will support, you are bound to find crap
- Example: the Rhine Paradox
  - Joseph Rhine was a parapsychologist in the 1950s who hypothesized that some people had Extra-Sensory Perception (ESP)
  - He devised an experiment where subjects were asked to guess 10 hidden cards: red or blue
  - He discovered that almost 1 in 1000 had ESP: they were able to get all 10 right! (By pure chance, a subject guesses all 10 cards correctly with probability 1/2^10 = 1/1024.)
  - He told these people they had ESP and called them in for another test of the same type
  - Alas, he discovered that almost all of them had lost their ESP
  - What did he conclude?

Extracting Knowledge From (Big) Data
- Exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and make predictions
- KDD: Knowledge Discovery from Databases
  - an iterative and interactive process
  - the data exploration process itself needs to be managed
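The arithmetic behind the Rhine paradox can be checked directly. The following is a minimal sketch, assuming each card guess is an independent fair coin flip; the function names and the cohort size of 1000 subjects are illustrative assumptions, not figures from Rhine's study.

```python
from fractions import Fraction


def p_all_correct(n_cards: int = 10) -> Fraction:
    """Probability of guessing every red/blue card correctly by pure chance."""
    return Fraction(1, 2) ** n_cards


def expected_lucky(n_subjects: int, n_cards: int = 10) -> Fraction:
    """Expected number of subjects who guess all cards right by chance."""
    return n_subjects * p_all_correct(n_cards)


def p_at_least_one(n_subjects: int, n_cards: int = 10) -> float:
    """Probability that at least one subject is right on every card."""
    return 1.0 - float(1 - p_all_correct(n_cards)) ** n_subjects


if __name__ == "__main__":
    print(p_all_correct())        # chance for a single subject
    print(expected_lucky(1000))   # expected "ESP" subjects among 1000
    print(p_at_least_one(1000))   # chance that somebody appears to have ESP
```

Under these assumptions a perfect score is expected from about one subject per thousand without any ESP, and the same subject has the same 1-in-1024 chance on the retest, which is why the "ability" disappears.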

We've Moved into a New Era of Data Analytics
- Volume: 12+ terabytes of tweets created daily
- Velocity: 5+ million trade events per second
- Variety: 100s of different types of data
- Veracity: only 1 in 3 decision makers trust their information

Big Data: Small Analysis vs. Big Analysis
- If you want to analyze the whole set by accessing the data several times, it can be much harder
  - many trial-and-error steps, easy to get lost
- Most existing data mining/ML methods were designed without considering data access and communication of intermediate results
  - they use the data iteratively, assuming it is readily available
- How to integrate these two worlds?
  - analysis methods: efficient in analyzing/mining data, but do not scale
  - big-data platforms: efficient in managing big data, but do not analyze or mine the data

Big Data: Small Analysis vs. Big Analysis
- Lots of intensive computations for complex math operations need new support in parallel/distributed settings:
  - matrix multiplication
  - QR decomposition (QR factorization)
  - Singular Value Decomposition (SVD)
  - linear regression
- So we are facing many challenges:
  - methods are not ready
  - tools are not convenient
  - platforms change rapidly

Big Data Analytics
- This is an ongoing research topic
- Roughly, there are two types of approaches:
  - parallelize existing (single-machine) algorithms
  - design new algorithms specifically for distributed settings
  - of course, there are things in between
- To achieve technical breakthroughs in big-data analytics, we should know both algorithms and systems well, and consider them together
[Figure: layered stack of Big Data Platform, Deep Analytics, and Focused Services (Deep Insights).]

The Evolution of Business Intelligence
[Figure: stages of business intelligence along a timeline (1990s, 2000s, 2010s), with axes labeled Speed and Scale.]
- BI Reporting, OLAP & Data Warehouse: Business Objects, SAS, Informatica, Cognos, other SQL reporting tools
- Interactive Business Intelligence & In-memory RDBMS: QlikView, Tableau, HANA
- Big Data, Batch Processing & Distributed Data Store: Hadoop/Spark; HBase/Cassandra
- Big Data, Real Time & Single View: graph databases

Example: Matrix-Matrix Product on One Machine
- Have you ever worried about how a math operation is calculated on one computer?
- Probably not: you can use Excel, statistical software (e.g., R, SAS), and many other tools, and we seldom care how these tools work internally
- Consider a simple operation like the matrix-matrix product C = AB, where c[i][j] is the sum over k of a[i][k] * b[k][j]
- A segment of C code (assume n = m here):

    /* Naive triple loop: c = a * b for n x n matrices */
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++) {
            c[i][j] = 0;
            for (k = 0; k < n; k++)
                c[i][j] += a[i][k] * b[k][j];
        }

Source: Chih-Jen Lin, Big-data Analytics: Challenges and Opportunities, Department of Computer Science, National Taiwan University, August 30

Example: Matrix-Matrix Product (Cont'd)
- For 3000 x 3000 matrices:

    $ gcc -O3 mat.c
    $ time ./a.out
    3m24.843s

- But on Matlab (single-thread mode):

    $ matlab -singlecompthread
    >> tic; c = a*b; toc
    Elapsed time is seconds

- How can Matlab be so much faster than our code?
- The fast implementation comes from some deep R&D
- Matlab calls optimized BLAS (Basic Linear Algebra Subprograms), developed in the 1980s and 1990s
- Our implementation is slow because the data are often not in fast memory when the computation needs them

Example: Matrix-Matrix Product (Cont'd)
[Figure: memory hierarchy; going down the hierarchy, levels increase in capacity and decrease in speed.]
- Optimized BLAS: try to make data available in a higher level of memory
- You don't waste time frequently moving data
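One common way to keep data in a higher level of memory is to process the matrices in small square blocks (loop tiling). The sketch below only illustrates the loop structure and is not how any particular BLAS library is implemented; the function names and the `block` parameter are assumptions. In pure Python it will not show the speed-up, since the benefit comes from cache behavior in compiled code.

```python
def matmul_naive(a, b):
    """Triple-loop product of two n x n matrices given as lists of lists."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=2):
    """Same product, computed one block x block tile at a time."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):          # tile row of c
        for kk in range(0, n, block):      # tile along the shared dimension
            for jj in range(0, n, block):  # tile column of c
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        a_ik = a[i][k]     # reused across the inner loop
                        for j in range(jj, min(jj + block, n)):
                            c[i][j] += a_ik * b[k][j]
    return c
```

Each tile of `a`, `b`, and `c` is small enough to be reused many times before the computation moves on, which is the property a cache-aware implementation exploits.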

Example: Matrix-Matrix Product (Cont'd)
- Optimized BLAS uses block algorithms
- If we compare the number of page faults (cache misses):
  - ours: much larger
  - block: much smaller
- For big-data analytics, we want to run mathematical algorithms (classification and clustering) on a complicated architecture (a distributed system)
- But we are at roughly the point in time before optimized BLAS was developed

Big Data Mining Challenges
- Analytics architecture: combine historical with real-time data at the same time
- Statistical significance: achieve significant statistical results, and not be fooled by randomness
- Distributed mining: parallelize data mining techniques with practical and theoretical guarantees
- Stream mining: learn from evolving data streams
- Hidden big data: currently only 3% of the potentially useful data is tagged, and even less is analyzed
- Sampling and compression: subsampling is easy on one machine, but may not be in a distributed setting

Big Data Mining Projects
- Apache Mahout: open-source package on Hadoop for data mining and machine learning
- Revolution R (R-Hadoop): extensions to the R package to run on Hadoop

Other Aspects of BIG Data
- Bigger data are not always better data
- Big will evolve/change
- Not all data are equivalent
- Just because it is accessible doesn't make it ethical

References
- Jure Leskovec, CS246: Mining Massive Datasets, Stanford University, 2014
- Mohamed Eltabakh, CS525: Special Topics in DBs, Large-Scale Data Management, Advanced Analytics on Hadoop, Spring 2013
- Chih-Jen Lin, Big-data Analytics: Challenges and Opportunities, Department of Computer Science, National Taiwan University, August 30, 2014
- Evgueni Smirnov, Knowledge Discovery and Data Mining, Maastricht School on Data Mining, Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands, August 27 - August 30

Big Data Processing and Analysis Framework

What Matters when Mining Data
Source: J. Leskovec, Stanford CS246: Mining Massive Datasets

Data-Mining Tasks
- Classification task
- Regression task
- Clustering task
- Association-rule task

Classification Task
- Given: a collection of instances (the training set)
  - each instance is represented by a set of attributes, one of which is the class attribute
- Find: a classifier for the class attribute as a function of the values of the other attributes
- Goal: previously unseen instances should be assigned a class as accurately as possible

Example 1
Training set:

    Tid  Refund  Marital Status  Taxable Income  Cheat
    1    Yes     Single          125K            No
    2    No      Married         100K            No
    3    No      Single          70K             No
    4    Yes     Married         120K            No
    5    No      Divorced        95K             Yes
    6    No      Married         60K             No
    7    Yes     Divorced        220K            No
    8    No      Single          85K             Yes
    9    No      Married         75K             No
    10   No      Single          90K             Yes

Test set:

    Refund  Marital Status  Taxable Income  Cheat
    No      Single          75K             ?
    Yes     Married         50K             ?
    No      Married         150K            ?
    Yes     Divorced        90K             ?
    No      Single          40K             ?
    No      Married         80K             ?

A classifier is learned from the training set and then applied to the test set.

Example 2: Fraud Detection
- Goal: predict fraudulent cases in credit card transactions
- Approach:
  - use credit card transactions and the information on the account holder as attributes: when does a customer buy, what does he buy, how often does he pay on time, etc.
  - label past transactions as fraud or fair transactions; this forms the class attribute
  - learn a model for the class of the transactions
  - use this model to detect fraud by observing credit card transactions on an account
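The learn-then-apply workflow of Example 1 can be sketched with a k-nearest-neighbors classifier over the ten training records. This is a swapped-in illustrative technique, not the classifier used in the slides; the mixed distance (one unit per categorical mismatch plus the income gap divided by 100) and the names `TRAINING`, `distance`, and `predict` are assumptions.

```python
from collections import Counter

# Training set from Example 1:
# (refund, marital status, taxable income in thousands, cheat)
TRAINING = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]


def distance(x, y):
    """Assumed mixed distance: 1 per categorical mismatch + income gap / 100."""
    return (x[0] != y[0]) + (x[1] != y[1]) + abs(x[2] - y[2]) / 100


def predict(instance, k=3):
    """Majority class among the k training records closest to `instance`."""
    nearest = sorted(TRAINING, key=lambda row: distance(instance, row))[:k]
    votes = Counter(row[3] for row in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # First record of the test set: Refund = No, Single, 75K
    print(predict(("No", "Single", 75), k=3))
```

For the first test record, this sketch answers "Yes" with k = 3 and "No" with k = 1, which shows how sensitive such a small training set is to modeling choices.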

Regression Task
- Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency
- Examples:
  - predicting sales amounts of a new product based on advertising expenditure
  - predicting wind velocities as a function of temperature, humidity, air pressure, etc.
  - time series prediction of stock market indices

Clustering Task
- Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that:
  - data points in one cluster are more similar to one another
  - data points in separate clusters are less similar to one another
- Equivalently:
  - intra-cluster distances are minimized
  - inter-cluster distances are maximized
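The clustering objective above (small intra-cluster distances) can be illustrated with Lloyd's k-means iteration. This is a minimal sketch under assumptions: squared Euclidean distance as the dissimilarity measure, caller-supplied initial centroids, and illustrative function names. The slides do not prescribe a specific clustering algorithm.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [mean_point(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters
```

A call such as `kmeans(points, [(0.0, 0.0), (10.0, 10.0)])` returns the final centroids together with the points assigned to each; the result depends on the initial centroids, which is a known limitation of this method.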

Example: Market Segmentation
- Goal: subdivide a market into distinct subsets of customers, where any subset may conceivably be selected as a market target to be reached with a distinct marketing mix
- Approach:
  - collect different attributes of customers based on their geographical and lifestyle-related information
  - find clusters of similar customers
  - measure the clustering quality by observing buying patterns of customers in the same cluster vs. those from different clusters

Association-Rule Task
- Given a set of records, each of which contains some number of items from a given collection
- Produce dependency rules which will predict the occurrence of an item based on occurrences of other items

    TID  Items
    1    Bread, Coke, Milk
    2    Beer, Bread
    3    Beer, Coke, Diaper, Milk
    4    Beer, Bread, Diaper, Milk
    5    Coke, Diaper, Milk

- Rules discovered:
  - Milk --> Coke
  - Diaper, Milk --> Beer
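The two rules can be scored against the five transactions using support and confidence, the usual objective measures for association rules. This is a small sketch; the function names are assumptions, and no minimum thresholds are implied by the slides.

```python
# The five transactions from the table above.
TRANSACTIONS = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]


def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def support(itemset):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset) / len(TRANSACTIONS)


def confidence(lhs, rhs):
    """Among transactions containing `lhs`, the fraction also containing `rhs`."""
    return count(lhs | rhs) / count(lhs)


if __name__ == "__main__":
    print(support({"Milk", "Coke"}), confidence({"Milk"}, {"Coke"}))
    print(support({"Diaper", "Milk", "Beer"}),
          confidence({"Diaper", "Milk"}, {"Beer"}))
```

On this data, Milk --> Coke has support 3/5 and confidence 3/4, while Diaper, Milk --> Beer has support 2/5 and confidence 2/3.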

Example: Supermarket Shelf Management
- Goal: identify items that are bought together by sufficiently many customers
- Approach: process the point-of-sale data collected with barcode scanners to find dependencies among items
- A classic rule: if a customer buys diapers and milk, then he is very likely to buy beer
- So, don't be surprised if you find six-packs stacked next to diapers!

Big-data Analytics: Challenges and Opportunities

Big-data Analytics: Challenges and Opportunities Big-data Analytics: Challenges and Opportunities Chih-Jen Lin Department of Computer Science National Taiwan University Talk at 台 灣 資 料 科 學 愛 好 者 年 會, August 30, 2014 Chih-Jen Lin (National Taiwan Univ.)

More information

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web

More information

EMC Greenplum Driving the Future of Data Warehousing and Analytics. Tools and Technologies for Big Data

EMC Greenplum Driving the Future of Data Warehousing and Analytics. Tools and Technologies for Big Data EMC Greenplum Driving the Future of Data Warehousing and Analytics Tools and Technologies for Big Data Steven Hillion V.P. Analytics EMC Data Computing Division 1 Big Data Size: The Volume Of Data Continues

More information

Foundations of Artificial Intelligence. Introduction to Data Mining

Foundations of Artificial Intelligence. Introduction to Data Mining Foundations of Artificial Intelligence Introduction to Data Mining Objectives Data Mining Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees Present

More information

Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining

Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining Introduction to Artificial Intelligence G51IAI An Introduction to Data Mining Learning Objectives Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees

More information

Big Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs

Big Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs 1 Big Data Analytics Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs Montevideo, 22 nd November 4 th December, 2015 INFORMATIQUE

More information

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining Introduction of Information Visualization and Visual Analytics Chapter 4 Data Mining Books! P. N. Tan, M. Steinbach, V. Kumar: Introduction to Data Mining. First Edition, ISBN-13: 978-0321321367, 2005.

More information

Data Mining: Introduction

Data Mining: Introduction Data Mining: Introduction Introducing the course How the course is organized How students are evaluated Deadlines Data Mining [Chapt. 1 of course book] What is it about? The KDD process Relations to other

More information

Analytics on Big Data

Analytics on Big Data Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

Advanced In-Database Analytics

Advanced In-Database Analytics Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??

More information

Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D.

Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D. Data Mining on Social Networks Dionysios Sotiropoulos Ph.D. 1 Contents What are Social Media? Mathematical Representation of Social Networks Fundamental Data Mining Concepts Data Mining Tasks on Digital

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Architectures for Big Data Analytics A database perspective

Architectures for Big Data Analytics A database perspective Architectures for Big Data Analytics A database perspective Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum

More information

Integrating a Big Data Platform into Government:

Integrating a Big Data Platform into Government: Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government

More information

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Mining Big Data. Pang-Ning Tan. Associate Professor Dept of Computer Science & Engineering Michigan State University

Mining Big Data. Pang-Ning Tan. Associate Professor Dept of Computer Science & Engineering Michigan State University Mining Big Data Pang-Ning Tan Associate Professor Dept of Computer Science & Engineering Michigan State University Website: http://www.cse.msu.edu/~ptan Google Trends Big Data Smart Cities Big Data and

More information

Using Data Mining and Machine Learning in Retail

Using Data Mining and Machine Learning in Retail Using Data Mining and Machine Learning in Retail Omeid Seide Senior Manager, Big Data Solutions Sears Holdings Bharat Prasad Big Data Solution Architect Sears Holdings Over a Century of Innovation A Fortune

More information

Data Mining. Yeow Wei Choong Anne Laurent

Data Mining. Yeow Wei Choong Anne Laurent Data Mining Yeow Wei Choong Anne Laurent Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused Web data, e-commerce purchases at department/ grocery stores Bank/Credit Card

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

CoolaData Predictive Analytics

CoolaData Predictive Analytics CoolaData Predictive Analytics 9 3 6 About CoolaData CoolaData empowers online companies to become proactive and predictive without having to develop, store, manage or monitor data themselves. It is an

More information

Navigating Big Data business analytics

Navigating Big Data business analytics mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what

More information

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems

Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems Volker Markl volker.markl@tu-berlin.de dima.tu-berlin.de dfki.de/web/research/iam/ bbdc.berlin Based on my 2014 Vision Paper On

More information

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Big Data Analytics DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Tom Haughey InfoModel, LLC 868 Woodfield Road Franklin Lakes, NJ 07417 201 755 3350 tom.haughey@infomodelusa.com

More information

CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining. Chengkai Li University of Texas at Arlington Spring 2016

CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining. Chengkai Li University of Texas at Arlington Spring 2016 CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining Chengkai Li University of Texas at Arlington Spring 2016 Big Data http://dilbert.com/strip/2012-07-29 Big Data http://www.ibmbigdatahub.com/infographic/four-vs-big-data

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

Data Mining + Business Intelligence. Integration, Design and Implementation

Data Mining + Business Intelligence. Integration, Design and Implementation Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution

More information

Machine Learning over Big Data

Machine Learning over Big Data Machine Learning over Big Presented by Fuhao Zou fuhao@hust.edu.cn Jue 16, 2014 Huazhong University of Science and Technology Contents 1 2 3 4 Role of Machine learning Challenge of Big Analysis Distributed

More information

DATA MINING - 1DL105, 1Dl111

DATA MINING - 1DL105, 1Dl111 1 DATA MINING - 1DL105, 1Dl111 Fall 2006 An introductory class in data mining http://www.it.uu.se/edu/course/homepage/infoutv/ht06 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,

More information

Some vendors have a big presence in a particular industry; some are geared toward data scientists, others toward business users.

Some vendors have a big presence in a particular industry; some are geared toward data scientists, others toward business users. Bonus Chapter Ten Major Predictive Analytics Vendors In This Chapter Angoss FICO IBM RapidMiner Revolution Analytics Salford Systems SAP SAS StatSoft, Inc. TIBCO This chapter highlights ten of the major

More information

Data Warehouse design

Data Warehouse design Data Warehouse design Design of Enterprise Systems University of Pavia 10/12/2013 2h for the first; 2h for hadoop - 1- Table of Contents Big Data Overview Big Data DW & BI Big Data Market Hadoop & Mahout

More information

Quick Introduction of Data Mining Techniques

Quick Introduction of Data Mining Techniques Quick Introduction of Data Mining Techniques *Sources partially from Introduction to Data Mining, by P.-N. Tan, M. Steinbach, V. Kumar, Addison-Wesley, 2005. Main Data Mining Techniques Link Analysis Associations

More information

The Data Mining Process

The Data Mining Process Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

More information

HIGH PERFORMANCE ANALYTICS FOR TERADATA

HIGH PERFORMANCE ANALYTICS FOR TERADATA HIGH PERFORMANCE ANALYTICS FOR TERADATA BORN AND BRED IN FINANCIAL SERVICES AND HEALTHCARE. DECADES OF EXPERIENCE IN PARALLEL PROGRAMMING AND ANALYTICS. FOCUSED ON MAKING DATA SCIENCE HIGHLY PERFORMING

ANALYTICS IN BIG DATA ERA

ANALYTICS IN BIG DATA ERA ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyright 2012, SAS Institut

2015 Analyst and Advisor Summit. Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist

2015 Analyst and Advisor Summit. Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist 2015 Analyst and Advisor Summit Advanced Data Analytics Dr. Rod Fontecilla Vice President, Application Services, Chief Data Scientist Agenda Key Facts Offerings and Capabilities Case Studies When to Engage

The 4 Pillars of Technosoft's Big Data Practice

The 4 Pillars of Technosoft's Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft's Big Practice Overview Businesses have long managed

Outline. What is Big data and where they come from? How we deal with Big data?

Outline. What is Big data and where they come from? How we deal with Big data? What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014 Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

Why is Internal Audit so Hard?

Why is Internal Audit so Hard? Why is Internal Audit so Hard? 2 2014 Why is Internal Audit so Hard? 3 2014 Why is Internal Audit so Hard? Waste Abuse Fraud 4 2014 Waves of Change 1 st Wave Personal Computers Electronic Spreadsheets

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

Mammoth Scale Machine Learning!

Mammoth Scale Machine Learning! Mammoth Scale Machine Learning! Speaker: Robin Anil, Apache Mahout PMC Member. OSCON '10, Portland, OR, July 2010. Quick Show of Hands: Are you fascinated about ML? Have you used ML? Do you have Gigabytes

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

Using In-Memory Data Fabric Architecture from SAP to Create Your Data Advantage

Using In-Memory Data Fabric Architecture from SAP to Create Your Data Advantage SAP HANA Using In-Memory Data Fabric Architecture from SAP to Create Your Data Advantage Deep analysis of data is making businesses like yours more competitive every day. We've all heard the reasons: the

Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata

Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata Up Your R Game James Taylor, Decision Management Solutions Bill Franks, Teradata Today's Speakers James Taylor Bill Franks CEO Chief Analytics Officer Decision Management Solutions Teradata 7/28/14 3 Polling

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

Using Tableau Software with Hortonworks Data Platform

Using Tableau Software with Hortonworks Data Platform Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data SOLUTION BRIEF Understanding Your Customer Journey by Extending Adobe Analytics with Big Data Business Challenge Today's digital marketing teams are overwhelmed by the volume and variety of customer interaction

Let the data speak to you. Look Who's Peeking at Your Paycheck. Big Data. What is Big Data? The Artemis project: Saving preemies using Big Data

Let the data speak to you. Look Who's Peeking at Your Paycheck. Big Data. What is Big Data? The Artemis project: Saving preemies using Big Data CS535 Big Data W1.A.1 CS535 BIG DATA W1.A.2 Let the data speak to you Medication Adherence Score How likely people are to take their medication, based on: How long people have lived at the same address

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms Data Mining Techniques for CRM Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining, Addison Wesley; and Cios / Pedrycz / Swiniarski

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum

Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Trends and Research Opportunities in Spatial Big Data Analytics and Cloud Computing NCSU GeoSpatial Forum Siva Ravada Senior Director of Development Oracle Spatial and MapViewer 2 Evolving Technology Platforms

Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction

Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19th-21st August 2015 AFRALTI 1 Objectives Describe

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zettabytes (10^21 bytes) this

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

Big Data Means at Least Three Different Things. Michael Stonebraker

Big Data Means at Least Three Different Things. Michael Stonebraker Big Data Means at Least Three Different Things. Michael Stonebraker The Meaning of Big Data - 3 V's Big Volume With simple (SQL) analytics With complex (non-SQL) analytics Big Velocity Drink from a fire

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH 205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

Advanced analytics at your hands

Advanced analytics at your hands 2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has increased dramatically (Big Data) and we are able to store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

White Paper. Redefine Your Analytics Journey With Self-Service Data Discovery and Interactive Predictive Analytics

White Paper. Redefine Your Analytics Journey With Self-Service Data Discovery and Interactive Predictive Analytics White Paper Redefine Your Analytics Journey With Self-Service Data Discovery and Interactive Predictive Analytics Contents Self-service data discovery and interactive predictive analytics... 1 What does

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

What Does Big Data Mean and Who Will Win? Michael Stonebraker

What Does Big Data Mean and Who Will Win? Michael Stonebraker What Does Big Data Mean and Who Will Win? Michael Stonebraker The Meaning of Big Data - 3 V's Big Volume Business stuff with simple (SQL) analytics Business stuff with complex (non-SQL) analytics Science

The Future of Business Analytics is Now! 2013 IBM Corporation

The Future of Business Analytics is Now! 2013 IBM Corporation The Future of Business Analytics is Now! 1 The pressures on organizations are at a point where analytics has evolved from a business initiative to a BUSINESS IMPERATIVE More organizations are using analytics

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

1. Introduction to Data Mining

1. Introduction to Data Mining 1. Introduction to Data Mining Road Map What is data mining Steps in data mining process Data mining methods and subdomains Summary 2 Definition ([Liu 11]) Data mining is also called Knowledge Discovery

Francois Ajenstat, Tableau Stephanie McReynolds, Aster Data Steve Wooledge, Aster Data

Francois Ajenstat, Tableau Stephanie McReynolds, Aster Data Steve Wooledge, Aster Data Deep Data Exploration: Find Patterns in Your Data Faster & Easier Curt Monash, Founder and President, Monash Research Francois Ajenstat, Tableau Stephanie McReynolds, Aster Data Steve Wooledge, Aster

Scalable Machine Learning - or what to do with all that Big Data infrastructure

Scalable Machine Learning - or what to do with all that Big Data infrastructure - or what to do with all that Big Data infrastructure TU Berlin blog.mikiobraun.de Strata+Hadoop World London, 2015 1 Complex Data Analysis at Scale Click-through prediction Personalized Spam Detection

SURVEY REPORT DATA SCIENCE SOCIETY 2014

SURVEY REPORT DATA SCIENCE SOCIETY 2014 SURVEY REPORT DATA SCIENCE SOCIETY 2014 TABLE OF CONTENTS Contents About the Initiative 1 Report Summary 2 Participants Info 3 Participants Expertise 6 Suggested Discussion Topics 7 Selected Responses

Understanding the Value of In-Memory in the IT Landscape

Understanding the Value of In-Memory in the IT Landscape February 2012 Understanding the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to

The Impact of Big Data on Classic Machine Learning Algorithms. Thomas Jensen, Senior Business Analyst @ Expedia

The Impact of Big Data on Classic Machine Learning Algorithms. Thomas Jensen, Senior Business Analyst @ Expedia The Impact of Big Data on Classic Machine Learning Algorithms Thomas Jensen, Senior Business Analyst @ Expedia Who am I? Senior Business Analyst @ Expedia Working within the competitive intelligence unit

Advanced Big Data Analytics with R and Hadoop

Advanced Big Data Analytics with R and Hadoop REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional

How To Understand Business Intelligence

How To Understand Business Intelligence An Introduction to Advanced PREDICTIVE ANALYTICS BUSINESS INTELLIGENCE DATA MINING ADVANCED ANALYTICS An Introduction to Advanced. Where Business Intelligence Systems End... and Predictive Tools Begin

Pentaho Data Mining Last Modified on January 22, 2007

Pentaho Data Mining Last Modified on January 22, 2007 Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org

Sunnie Chung. Cleveland State University

Sunnie Chung. Cleveland State University Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:

AMIS 7640 Data Mining for Business Intelligence

AMIS 7640 Data Mining for Business Intelligence The Ohio State University The Max M. Fisher College of Business Department of Accounting and Management Information Systems AMIS 7640 Data Mining for Business Intelligence Autumn Semester 2013, Session

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics

BIG DATA & ANALYTICS. Transforming the business and driving revenue through big data and analytics BIG DATA & ANALYTICS Transforming the business and driving revenue through big data and analytics Collection, storage and extraction of business value from data generated from a variety of sources are

Big Data at Spotify. Anders Arpteg, Ph D Analytics Machine Learning, Spotify

Big Data at Spotify. Anders Arpteg, Ph D Analytics Machine Learning, Spotify Big Data at Spotify Anders Arpteg, Ph D Analytics Machine Learning, Spotify Quickly about me Quickly about Spotify What is all the data used for? Quickly about Spark Hadoop MR vs Spark Need for (distributed)

Map-Reduce for Machine Learning on Multicore

Map-Reduce for Machine Learning on Multicore Map-Reduce for Machine Learning on Multicore Chu, et al. Problem The world is going multicore New computers - dual core to 12+-core Shift to more concurrent programming paradigms and languages Erlang,

Machine Learning Big Data using Map Reduce

Machine Learning Big Data using Map Reduce Machine Learning Big Data using Map Reduce By Michael Bowles, PhD Where Does Big Data Come From? -Web data (web logs, click histories) -e-commerce applications (purchase histories) -Retail purchase histories

This Symposium brought to you by www.ttcus.com

This Symposium brought to you by www.ttcus.com This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data

<Insert Picture Here> Oracle Retail Data Model Overview

<Insert Picture Here> Oracle Retail Data Model Overview Oracle Retail Data Model Overview The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into

Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015

Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015 Hadoop MapReduce and Spark Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015 Outline Hadoop Hadoop Import data on Hadoop Spark Spark features Scala MLlib MLlib

Hexaware E-book on Predictive Analytics

Hexaware E-book on Predictive Analytics Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

Data mining for prediction

Data mining for prediction Data mining for prediction Prof. Gianluca Bontempi Département d'Informatique Faculté de Sciences ULB Université Libre de Bruxelles email: gbonte@ulb.ac.be Outline Extracting knowledge from observations.

Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

Introduction to Data Mining and Business Intelligence Lecture 1/DMBI/IKI83403T/MTI/UI

Introduction to Data Mining and Business Intelligence Lecture 1/DMBI/IKI83403T/MTI/UI Introduction to Data Mining and Business Intelligence Lecture 1/DMBI/IKI83403T/MTI/UI Yudho Giri Sucahyo, Ph.D, CISA (yudho@cs.ui.ac.id) Faculty of Computer Science, University of Indonesia Objectives

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

Conjugating data mood and tenses: Simple past, infinite present, fast continuous, simpler imperative, conditional future perfect

Conjugating data mood and tenses: Simple past, infinite present, fast continuous, simpler imperative, conditional future perfect Matteo Migliavacca (mm53@kent) School of Computing Conjugating data mood and tenses: Simple past, infinite present, fast continuous, simpler imperative, conditional future perfect Simple past - Traditional
