Sunnie Chung. Cleveland State University


1 Sunnie Chung Cleveland State University

2 Data Scientist, Big Data Processing, Data Mining

3 A data scientist sits at the intersection of computer scientists and statisticians, with knowledge of data mining and Big Data processing.
Skills:
- Handle Big Data: collect, process, and extract value from Big Data (giant and diverse data sets)
- Understand, visualize, and present findings to non-data scientists
- Create data-driven solutions that boost profits, reduce costs, and even help save the world

4 And tackle Big Data projects on every level
Big Data and cloud projects are on every CEO's to-do list:
- The Defense Department
- NASA: predict earthquakes (especially after Nepal's earthquake)
- NSA, Homeland Security: predict and prevent terrorist acts
- Internet start-ups
- Financial institutions

5
- Volume: an unprecedentedly huge volume of data, fueled by web-based business, social networking, and microblogs (e.g., click streams captured in web server logs). Example: eBay processes 8 petabytes of data per night.
- Various structures of data (or no structure): structured (database, data warehouse), semi-structured (web pages), and, most of the time, unstructured (web server logs, sensor data).
- Velocity: new data is generated at an unprecedentedly high rate, e.g., streaming Twitter messages. Machine-generated data streaming in from smart devices, sensors, monitors, and meters needs Big Data analytics.

6 Numerous new analytics and business intelligence opportunities, such as:
- Fraud detection
- Customer profiling
- Customer loyalty analysis
All of these directly affect business revenue and critical business decisions.

7
- Identify field-specific motives/purposes
- Identify the nature of the Big Data source and data-specific processes
- Decide on the IT infrastructure for Big Data processing systems
  - Public cloud or private cloud
  - Which MPP Big Data systems should be built for our specific Big Data source and volume
- Execute data analytics
  - Data source modeling
  - Apply data mining strategies
  - Research solutions
  - Implement Big Data processing steps for solutions/strategies
  - Analyze results/interpretation, with feedback

8 Massively Parallel Processing (MPP)
- Parallel Data Warehouse (PDW) systems: Oracle, IBM, Teradata, Microsoft
- Hadoop systems with MapReduce: Google, Yahoo, Facebook, Twitter, LinkedIn
- Hybrid of both
- MPP systems on cloud: Amazon, Google, Microsoft, Oracle
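To make the MapReduce model named on this slide concrete, here is a minimal sketch in plain Python. It is not the Hadoop API: the log format, the sample lines, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are assumptions made up for illustration. It counts page hits from click-stream log lines using the three steps of the model: map each record to key/value pairs, group the pairs by key, then reduce each group.

```python
from collections import defaultdict

# Hypothetical web-server log lines in an assumed "<client> <url>" format.
LOG_LINES = [
    "10.0.0.1 /home",
    "10.0.0.2 /cart",
    "10.0.0.1 /cart",
    "10.0.0.3 /home",
    "10.0.0.2 /home",
]


def map_phase(line):
    """Map step: emit one (url, 1) pair per log line."""
    _client, url = line.split()
    yield url, 1


def shuffle(pairs):
    """Shuffle step: group all values by key between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on a single machine."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


hits = run_job(LOG_LINES)
print(hits)
```

In a real MPP deployment the map and reduce calls would run in parallel on many nodes and the shuffle would move data across the network; this sketch only shows the data flow.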

9 MPP System
- Virtual Machine (VM)
- Cloud types: Cloud as Service, Cloud as Platform, Cloud as Service
- Amazon Elastic Cloud, Google Cloud, Microsoft Cloud (Azure)

10
- Anomaly detection: the identification of unusual data records that might be interesting, or data errors that require further investigation.
- Association rule learning (dependency modelling): searches for relationships between variables. For example, a supermarket might gather data on customer purchasing habits. Using association rule learning, the supermarket can determine which products are frequently bought together and use this information for marketing purposes. This is sometimes referred to as market basket analysis.
- Clustering: the task of discovering groups and structures in the data that are in some way or another "similar", without using known structures in the data.
- Classification: the task of generalizing known structure to apply to new data. For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
- Regression: attempts to find a function that models the data with the least error.
- Summarization: providing a more compact representation of the data set, including visualization and report generation.
- Results validation
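The market basket analysis mentioned above can be sketched in a few lines of Python. This is a simplified illustration restricted to pairs of items, not a full algorithm such as Apriori; the baskets, the thresholds, and the function name `pair_rules` are all assumptions invented for the example. For a rule "A implies B", support is the fraction of baskets containing both items, and confidence is the fraction of baskets containing A that also contain B.

```python
from collections import Counter
from itertools import combinations

# Hypothetical supermarket baskets (illustrative data only).
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def pair_rules(baskets, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs passing both thresholds."""
    n = len(baskets)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in baskets for item in basket)
    # How many baskets contain each unordered pair of items.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(BASKETS)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Counting every pair directly is fine for a handful of baskets; on large data sets the candidate pairs are usually pruned first, which is where the MPP systems discussed in this deck come in.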

11
- Statistics: Naive Bayes, clustering (more than 25 years old)
- Machine learning: classification algorithms such as decision trees and neural networks (more than 20 years old)
- Database: association rule mining, data warehouse OLAP (more than 15 years old)
- Big Data processing: the most current, still evolving at a fast rate

12
- Databases: advanced modern databases and data processing strategies
- Big Data processing with:
  - Parallel Data Warehouse and OLAP (Online Analytical Processing)
  - MapReduce/Hadoop-based MPP systems
- Statistics
- Data mining
  - Research from databases: association rule mining
  - Research from statistics: clustering
  - Research from machine learning: neural networks
  - And more on recent developments

13 MPP Systems
- PDW-based systems: Oracle, IBM, Teradata, Microsoft PDW
- In-memory NewSQL systems
- Hadoop/MapReduce-based systems and NoSQL systems: MongoDB, Pig Latin, HBase, Hive, and many others
- Cloud: Big Data processing systems on cloud (Google Cloud, Amazon Cloud, Microsoft Azure, Oracle, IBM)


15
Major commercial:
- SAS Enterprise Miner
- Microsoft Business Intelligence data analytics tool using databases
Popular free open source:
- R / MapR: R is a programming language and software environment for statistical computing, data mining, and graphics. GNU Project.
- Weka: a suite of machine learning software applications written in the Java programming language
- UIMA (Unstructured Information Management Architecture): a component framework for analyzing unstructured content such as text, audio, and video, originally developed by IBM

16
On Databases
- CIS 530: Database Concept and Modern Database Processing
- CIS 611: Advanced Data Processing Techniques in PDW (Parallel Data Warehouse) and OLAP
On Big Data Processing and Management Systems
- CIS 612: Big Data Processing Systems and Modern Database Programming (Hadoop and MapReduce, VM (Virtual Machine), Cloud)
- CIS 695: Practicum in Data Analytics and Big Data Processing (scheduled to be created in Spring 2016)
- CIS 696: One more new course will be created on recent research

17 Data Mining
- CIS 660: Data Mining Techniques from Database, Statistics and Machine Learning
- EEC 525 Data Mining: Web Data Mining Techniques from Database
- CIS 667: Bioinformatics (possibly)

18 Math and Statistics
Graduate Certificate in Applied Predictive Modeling
- MTH 521: Time Series Analysis
- MTH 531: Categorical Data Analysis
- MTH 537: Operation Research
- MTH 567: Applied Linear Models I
- MTH 638: Operation Research II
- MTH 668: Applied Linear Models II
- MTH 675: Applied Multivariate Statistics

19 Business Analytics Certificates
Focus on the SAS certificate with the SAS Enterprise Miner tool
- BUS 575: Introduction to Business Analytics
- BUS 600: Applied Business Analytics
- BUS 601: Managing Databases for Business Analytics
- BUS 602: Strategy for Business Analytics
- BUS 603: SAS for Data and Statistical Analysis
- BUS 604: Advanced Business Analytics I
- BUS 606: Practicum in Business Analytics

20
- Explorys by IBM (website: https://www.explorys.com/): data analytics/Big Data processing on health and wellness data; data analytics for Cleveland Clinic (Teradata PDW), MetroHealth
- Progressive: Big Data processing on auto insurance, Hadoop-based MPP systems
- PNC (Teradata MPP PDW): Big Data processing systems on financial data

21
- Hadoop Big Data Processing Workshop/Meetup: the EECS Dept of CSU is planning to host the meeting annually to connect our students to local Big Data companies
- Data Scientist Group: regular webinars on advanced data analytics topics

22 Current Research/Publications at CSU (by Sunnie Chung)
- Research on the problems in developing MPP systems
- Research on integrating Big Data Management Systems (BDBMS), the most recent research trend
- Research on data mining for machine fault detection

23
- 10 out of 23 programs are master's degrees in business analytics
  - Limited to basic statistics and marketing/business-oriented data mining tools only (SAS, MS BI data analysis tool)
- Data scientist oriented programs (typical East Coast theory-oriented programs: Columbia, NYU, DePaul, etc.)
  - Focus on predictive analysis skills (math and stats) and the computational theory of machine learning algorithms
  - Lack practical data processing courses or Big Data systems/cloud; not many courses are available
- Good data analytics programs with a good balance of core subjects, analytic skills, and practicum
  - Northwestern University
  - Indiana University Bloomington
  - Carnegie Mellon

24
- MSIA 401 Statistical Methods for Data Mining
- MSIA 431 Analytics for Big Data
- MSIA 489 Industry Practicum
- MSIA Predictive Models for Credit Risk Management
- MSIA Healthcare Analytics
- MSIA Intro to Java Programming
- MSIA Social Networks Analysis
- MSIA 490 Intro to Databases & Information Retrieval
- MSIA 411 Data Visualization
- MSIA 420 Predictive Analytics
- MSIA 421 Data Mining
- MSIA 430 Introduction to Data Warehousing and Workflow Management
- MSIA Text Analytics
- MSIA Topics in Analytics with Python
- MSIA 440 Optimization and Heuristics

25 Two-year Master of Data Science/Data Analytics, or hybrid: Master of Data Science and Computer Information Science
Good balance of courses on core subjects:
- Big Data Processing Application
- Advanced Database
- Advanced Algorithm
- Statistics
- Data Mining
- Security in Network System
- Information Visualization
- Cloud Computing
A variety of good related courses is available.

26 MSIT Business Intelligence & Data Analytics Curriculum
Prerequisite: OOP programming courses and 3 years of working experience
Core courses (60 units required):
- Database Management
- Statistics for IT Managers
- Economic Analysis
- Data Warehousing
- Privacy in the Digital Age
- Exploring and Visualizing Data
- Data Mining
- Analytics and Business Intelligence
- Advanced Business Analytics (6 units)

27 30 credit hours in 2 years
- CIS 530: Database Concept and Modern Database Processing
- CIS 611: Advanced Data Processing Techniques in Parallel Data Warehouse and OLAP
- CIS 612: Big Data Processing Systems and Information Retrieval (Hadoop and MapReduce, VM (Virtual Machine), Cloud)
- CIS 695: Practicum in Data Analytics and Big Data Processing (in Spring 2016)
- CIS 660: Data Mining Techniques from Database, Statistics and Machine Learning
- EEC 525 Data Mining: Web Data Mining Techniques from Database
- CIS 660: Advanced Algorithm
- CIS 340: System Programming
- CIS 260: Java Programming
- CIS 675: Information Security
- EEC 693: Network Security and Privacy
- Applied Predictive Modeling:
  - MTH 531: Categorical Data Analysis
  - MTH 567: Applied Linear Models I
  - MTH 668: Applied Linear Models II
  - MTH 675: Applied Multivariate Statistics
- BUS 603: SAS for Data and Statistical Analysis
- BUS 604: Advanced Business Analytics I
- BUS 606: Practicum in Business Analytics

28 Data Visualization

Sunnie Chung. Cleveland State University

Sunnie Chung. Cleveland State University Sunnie Chung Cleveland State University They are very new technologies to Computer Science in rise of Web Service on Internet (IoT) They were fast developed and fast evolving Research and Developments

More information

Cleveland State University

Cleveland State University Cleveland State University CIS 695 Big Data Processing and Data Analytics (3-0-3) 2016 Section 51 Class Nbr. 5493. Tues, Thur TBA Prerequisites: CIS 505 and CIS 530. CIS 612, CIS 660 Preferred. Instructor:

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

More information

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

Cleveland State University

Cleveland State University Cleveland State University CIS 612 Modern Database Processing & Big Data (3-0-3) Fall 2015 Section 50 Class Nbr. 5378. Tues, Thu 4:30 5:45 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred. Instructor:

More information

An interdisciplinary model for analytics education

An interdisciplinary model for analytics education An interdisciplinary model for analytics education Raffaella Settimi, PhD School of Computing, DePaul University Drew Conway s Data Science Venn Diagram http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Cleveland State University

Cleveland State University Cleveland State University CIS 612 Modern Database Programming & Big Data Processing (3-0-3) Fall 2014 Section 50 Class Nbr. 2670. Tues, Thur 4:00 5:15 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred.

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

This Symposium brought to you by www.ttcus.com

This Symposium brought to you by www.ttcus.com This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance.

Keywords Big Data; OODBMS; RDBMS; hadoop; EDM; learning analytics, data abundance. Volume 4, Issue 11, November 2014 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analytics

More information

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard Hadoop and Relational base The Best of Both Worlds for Analytics Greg Battas Hewlett Packard The Evolution of Analytics Mainframe EDW Proprietary MPP Unix SMP MPP Appliance Hadoop? Questions Is Hadoop

More information

The 4 Pillars of Technosoft s Big Data Practice

The 4 Pillars of Technosoft s Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed

More information

Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal

Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal Information has gone from scarce to super-abundant. That brings huge new benefits. The Economist

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia Monitis Project Proposals for AUA September 2014, Yerevan, Armenia Distributed Log Collecting and Analysing Platform Project Specifications Category: Big Data and NoSQL Software Requirements: Apache Hadoop

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

The Inside Scoop on Hadoop

The Inside Scoop on Hadoop The Inside Scoop on Hadoop Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. Orion.Gebremedhin@Neudesic.COM B-orgebr@Microsoft.com @OrionGM The Inside Scoop

More information

Big Data Storage Challenges for the Industrial Internet of Things

Big Data Storage Challenges for the Industrial Internet of Things Big Data Storage Challenges for the Industrial Internet of Things Shyam V Nath Diwakar Kasibhotla SDC September, 2014 Agenda Introduction to IoT and Industrial Internet Industrial & Sensor Data Big Data

More information

ADVANCED ANALYTICS AND FRAUD DETECTION THE RIGHT TECHNOLOGY FOR NOW AND THE FUTURE

ADVANCED ANALYTICS AND FRAUD DETECTION THE RIGHT TECHNOLOGY FOR NOW AND THE FUTURE ADVANCED ANALYTICS AND FRAUD DETECTION THE RIGHT TECHNOLOGY FOR NOW AND THE FUTURE Big Data Big Data What tax agencies are or will be seeing! Big Data Large and increased data volumes New and emerging

More information

Taking Data Analytics to the Next Level

Taking Data Analytics to the Next Level Taking Data Analytics to the Next Level Implementing and Supporting Big Data Initiatives What Is Big Data and How Is It Applicable to Anti-Fraud Efforts? 2 of 20 Definition Gartner: Big data is high-volume,

More information

Big Data and Analytics: Challenges and Opportunities

Big Data and Analytics: Challenges and Opportunities Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

P4.1 Reference Architectures for Enterprise Big Data Use Cases Romeo Kienzler, Data Scientist, Advisory Architect, IBM Germany, Austria, Switzerland

P4.1 Reference Architectures for Enterprise Big Data Use Cases Romeo Kienzler, Data Scientist, Advisory Architect, IBM Germany, Austria, Switzerland P4.1 Reference Architectures for Enterprise Big Data Use Cases Romeo Kienzler, Data Scientist, Advisory Architect, IBM Germany, Austria, Switzerland IBM Center of Excellence for Data Science, Cognitive

More information

Applications for Big Data Analytics

Applications for Big Data Analytics Smarter Healthcare Applications for Big Data Analytics Multi-channel sales Finance Log Analysis Homeland Security Traffic Control Telecom Search Quality Manufacturing Trading Analytics Fraud and Risk Retail:

More information

Data Warehousing and Data Mining in Business Applications

Data Warehousing and Data Mining in Business Applications 133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Big Data Specialized Studies

Big Data Specialized Studies Information Technologies Programs Big Data Specialized Studies Accelerate Your Career extension.uci.edu/bigdata Offered in partnership with University of California, Irvine Extension s professional certificate

More information

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY

DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Big Data Analytics DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Tom Haughey InfoModel, LLC 868 Woodfield Road Franklin Lakes, NJ 07417 201 755 3350 tom.haughey@infomodelusa.com

More information

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the

More information

W H I T E P A P E R. Building your Big Data analytics strategy: Block-by-Block! Abstract

W H I T E P A P E R. Building your Big Data analytics strategy: Block-by-Block! Abstract W H I T E P A P E R Building your Big Data analytics strategy: Block-by-Block! Abstract In this white paper, Impetus discusses how you can handle Big Data problems. It talks about how analytics on Big

More information

III Big Data Technologies

III Big Data Technologies III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

Hexaware E-book on Predictive Analytics

Hexaware E-book on Predictive Analytics Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Application and practice of parallel cloud computing in ISP. Guangzhou Institute of China Telecom Zhilan Huang 2011-10

Application and practice of parallel cloud computing in ISP. Guangzhou Institute of China Telecom Zhilan Huang 2011-10 Application and practice of parallel cloud computing in ISP Guangzhou Institute of China Telecom Zhilan Huang 2011-10 Outline Mass data management problem Applications of parallel cloud computing in ISPs

More information

Predictive Analytics: Too Important to Ignore The six secrets to success with predictive analytics

Predictive Analytics: Too Important to Ignore The six secrets to success with predictive analytics Predictive Analytics: Too Important to Ignore The six secrets to success with predictive analytics Webinar December 18, 2013 Sponsored by: Tony Cosentino VP & Research Director, Business Analytics Ventana

More information

Building Analytics and Big Data Capabilities Tom Davenport CDB Annual Conference May 23, 2012

Building Analytics and Big Data Capabilities Tom Davenport CDB Annual Conference May 23, 2012 Building Analytics and Big Data Capabilities Tom Davenport CDB Annual Conference May 23, 2012 A Bright Idea Informatics/Analytics on Small and Big Data It works for: Old companies (GE, P&G, Marriott, Bank

More information

Big Data. Lyle Ungar, University of Pennsylvania

Big Data. Lyle Ungar, University of Pennsylvania Big Data Big data will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus. McKinsey Data Scientist: The Sexiest Job of the 21st Century -

More information

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics Please note the following IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

The Big Data Market: Business Case, Market Analysis & Forecasts 2015-2020

The Big Data Market: Business Case, Market Analysis & Forecasts 2015-2020 Brochure More information from http://www.researchandmarkets.com/reports/2983902/ The Big Data Market: Business Case, Market Analysis & Forecasts 2015-2020 Description: Big Data refers to a massive volume

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

Big Data Defined Introducing DataStack 3.0

Big Data Defined Introducing DataStack 3.0 Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...

More information

Are You Ready for Big Data?

Are You Ready for Big Data? Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?

More information

Getting Started Practical Input For Your Roadmap

Getting Started Practical Input For Your Roadmap Getting Started Practical Input For Your Roadmap Mike Ferguson Managing Director, Intelligent Business Strategies BA4ALL Big Data & Analytics Insight Conference Stockholm, May 2015 About Mike Ferguson

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Play with Big Data on the Shoulders of Open Source

Play with Big Data on the Shoulders of Open Source OW2 Open Source Corporate Network Meeting Play with Big Data on the Shoulders of Open Source Liu Jie Technology Center of Software Engineering Institute of Software, Chinese Academy of Sciences 2012-10-19

More information

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW Roger Breu PDW Solution Specialist Microsoft Western Europe Marcus Gullberg PDW Partner Account Manager Microsoft Sweden

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

White Paper. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

This white paper is focused on discussing the challenges facing large scale data processing and the

Data Refinery with Big Data Aspects

International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www.irphouse.com/ijict.htm Data

Bringing Big Data to People

Microsoft's modern data platform: SQL Server 2014, Analytics Platform System, Microsoft Azure HDInsight. Data Platform: Everyone should have access to the data they need. Process

Modernizing Your Data Warehouse for Hadoop

Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

Data Mining Solutions for the Business Environment

Database Systems Journal vol. IV, no. 4/2013 21 Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

Azure Data Lake Analytics

Compose and orchestrate data services at scale Fully managed service to support orchestration of data movement and processing Connect to relational or non-relational data

Il mondo dei DB Cambia: Tecnologie e opportunità

Giorgio Raico Pre-Sales Consultant Hewlett-Packard Italiana 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:

VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

Introduction to Big Data with Apache Spark. UC Berkeley

So What is Data Science? Doing Data Science. Data Preparation. Roles. This Lecture: What is Data Science? Data Science aims to derive knowledge

Harnessing the power of advanced analytics with IBM Netezza

IBM Software Information Management White Paper How an appliance approach simplifies the use of advanced analytics

Statistics for BIG data

Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

Big data for the Masses The Unique Challenge of Big Data Integration

White Paper. Table of contents: Executive Summary; 1. Big Data: a Big Term; 1.1. The Big Data; 1.2. The Big Technology

Executive Summary. Introduction. Defining Big Data. The Importance of Big Data. Building a Big Data Platform

Contents: Executive Summary; Introduction; Defining Big Data; The Importance of Big Data; Building a Big Data Platform; Infrastructure Requirements; Solution Spectrum; Oracle's Big Data

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees

International Journal of Advancements in Research & Technology, Volume 3, Issue 5, May-2014 18 ISSN 2278-7763. BIG DATA: A New Technology

Farah DeebaHasan Student, M.Tech.(IT) Anshul Kumar Sharma Student, M.Tech.(IT)

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

White Paper: Datameer's User-Focused Big Data Solutions

CTOlabs.com May 2012 A White Paper providing context and guidance you can use Inside: Overview of the Big Data Framework Datameer's Approach Consideration

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Big Data transforms Business. Data created every minute. Source: http://mashable.com/2012/06/22/data-created-every-minute/

Firebird meets NoSQL (Apache HBase) Case Study

Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI

Big Data. Fast Forward. Putting data to productive use

What is big data, and why should you care? Get familiar with big data terminology, technologies, and techniques. Getting started with big data to realize

Role of Cloud Computing in Big Data Analytics Using MapReduce Component of Hadoop

Kanchan A. Khedikar Department of Computer Science & Engineering Walchand Institute of Technology, Solapur, Maharashtra,

The Future of Data Management with Hadoop and the Enterprise Data Hub

Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah Cloudera Snapshot Founded 2008, by former employees of Employees

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation

Big Data and Telecom Analytics Market: Business Case, Market Analysis & Forecasts 2014-2019

Brochure. More information from http://www.researchandmarkets.com/reports/2643647/ Description: Big Data refers

Data Science Certificate Program

Information Technologies Programs Accelerate Your Career extension.uci.edu/datascience Offered in partnership with University of California, Irvine Extension's professional

TABLE OF CONTENTS 1 Chapter 1: Introduction 2 Chapter 2: Big Data Technology & Business Case 3 Chapter 3: Key Investment Sectors for Big Data

1 Chapter 1: Introduction 1.1 Executive Summary 1.2 Topics Covered 1.3 Key Findings 1.4 Target Audience 1.5 Companies Mentioned 2 Chapter 2: Big Data Technology & Business Case 2.1 Defining

Predictive Analytics Certificate Program

Information Technologies Programs Accelerate Your Career Offered in partnership with: University of California, Irvine Extension's professional certificate and

Big Data and Industrial Internet

Keijo Heljanko Department of Computer Science and Helsinki Institute for Information Technology HIIT School of Science, Aalto University keijo.heljanko@aalto.fi 16.6-2015

Register on projectbotticelli.com. Introduction to BI & Big Data DAX MDX Data Mining

Objectives The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary

SAP and Hortonworks Reference Architecture

Hortonworks. We Do Hadoop. June 2014. Hortonworks Inc. 2011-2014. All Rights Reserved. A Modern Data Architecture With SAP: DATA SYSTEMS, APPLICATIONS, Statistical

Tax Fraud in Increasing

Preventing Fraud with Through Analytics Satya Bhamidipati Data Scientist Business Analytics Product Group Copyright 2014 Oracle and/or its affiliates. All rights reserved. Tax Fraud in Increasing 27%

BIG DATA IN THE CLOUD: CHALLENGES AND OPPORTUNITIES. MARY-JANE SULE & PROF. MAOZHEN LI, BRUNEL UNIVERSITY, LONDON

Overview * Introduction * Multiple faces of Big Data * Challenges of Big Data * Cloud Computing

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business

Instructor: Kunpeng Zhang (kzhang@rmsmith.umd.edu) Lecture-Discussions:

Big Data: Tools and Technologies in Big Data

Jaskaran Singh Student Lovely Professional University, Punjab Varun Singla Assistant Professor Lovely Professional University, Punjab ABSTRACT Big data can