1 Big Data Analytics: Where Is It Going and How Can It Be Taught at the Undergraduate Level? Dr. Frank Lee, Chair, ECE/CS/IT, New York Institute of Technology, Old Westbury, NY 11568
2 Topics This talk describes: a new course, CSCI 372 Big Data Analytics; the CS program concentration in Big Data Management and Analytics; and our partnership with IBM to use IBM systems in this concentration.
3 The Program Goals The CS program is designed to allow students to gain theoretical knowledge and apply it to developing an in-depth specialization in one area of concentration. It prepares graduates to be creative, inquisitive, analytical, and detail-oriented.
4 The Undergraduate CS Curriculum The CS curriculum consists of 57 credits in CS-related courses: 36 credits in CS core courses (12 courses); 6 credits of CS electives (2 courses); one final senior project (3 credits); and a 12-credit concentration in either Network Security or Big Data Management and Analytics.
5 The Undergraduate CS Curriculum The 12 CS core courses are: CSCI 125 Computer Programming I; CSCI 155 Computer Organization and Architecture; CSCI 185 Computer Programming II; CSCI 235 Elements of Discrete Structures; CSCI 260 Data Structures; CSCI 270 Probability and Statistics for CS; CSCI 312 Theory of Computation; CSCI 318 Programming Language Concepts; CSCI 330 Operating Systems; CSCI 335 Design and Analysis of Algorithms; CSCI 345 Computer Networks; CSCI 380 Introduction to Software Engineering.
6 The Undergraduate CS Curriculum CSCI 455 Senior Project: All students in the CS program are required to complete a substantial project that draws on the full extent of the technical skills and knowledge gained throughout the curriculum. CS Program Concentration: By the end of the first term of their junior year, computer science majors must select a 12-credit concentration in Network Security or Big Data Management and Analytics.
7 The CS Concentration in Big Data Management and Analytics To meet the increasing big data challenges and opportunities, the CS department has: revised its CS program concentration in Big Data Management and Analytics and created a new course, CSCI 372: Big Data Analytics.
8 The CS Concentration in Big Data Management and Analytics Students in this concentration must choose 4 courses from the following: CSCI 365 Information Retrieval; CSCI 372 Big Data Analytics; CSCI 401 Database Interfaces and Programming; CSCI 415 Introduction to Data Mining; CSCI 405 Distributed Database Systems.
9 The CS Concentration in Big Data Management and Analytics This concentration focuses on the management and analysis of big data. It provides students with deep analytic skills to design and implement information systems. It equips students with both the technical knowledge and analytic acumen necessary to extract meaning from big data.
10 Our Partnership with IBM Our partnership with IBM starts with the System z Academic Initiative. It is prepared to: assist and enable NYIT to use and teach IBM Enterprise Systems; and connect IBM clients with NYIT to hire students learning critical systems skills.
11 Our Partnership with IBM Since IBM is a leader in the big data analytics area, the CS faculty decided to address this growing demand by engaging with IBM in their Academic Initiative and by introducing a course in big data analytics to be included in the concentration.
12 CSCI 372: Big Data Analytics The new course will embrace the IBM Academic Initiative and introduce the IBM InfoSphere systems. In partnership with IBM, it will provide remote access to the Enterprise System and to data for a hands-on lab experience with big data analytics.
13 CSCI 372: Big Data Analytics Course contents: the basics of data analysis (e.g., R); the tools of big data analytics (e.g., Hadoop, MapReduce, Pig, Hive); the analysis of unstructured data using NoSQL and Hadoop/MapReduce; IBM InfoSphere systems; and application areas including finance, banking, defense, and health.
14 Textbook To cover the theoretical fundamentals, the first nine weeks of coursework are based on the textbook written by Alex Holmes (ISBN: ).
15 Reference To learn IBM InfoSphere systems, we study: Paul Zikopoulos, Dirk deRoos, Krishnan Parasuraman, Thomas Deutsch, James Giles, David Corrigan, Harness the Power of Big Data: The IBM Big Data Platform, McGraw-Hill, 2013.
16 CSCI 372: Big Data Analytics Topics: Introduction to Big Data Analytics: Big Data overview; relational databases and data mining; cloud and Big Data architectures. Basics of Data Analysis: introduction to R; analyzing and exploring data with R; statistics for model building and evaluation.
17 CSCI 372: Big Data Analytics Introduction to Data Analytics Tools: the Hadoop architecture; the Hadoop Distributed File System (HDFS); MapReduce; using the Pig platform to create MapReduce programs; using the Hive data warehouse system to query and analyze large data sets; other related Hadoop technologies.
18 CSCI 372: Big Data Analytics Analysis of Unstructured Data: using MapReduce/Hadoop for analyzing unstructured data; using NoSQL; scale up vs. scale out.
19 CSCI 372: Big Data Analytics Data Lab Exercises using IBM InfoSphere BigInsights: What is IBM InfoSphere BigInsights? Downloading BigInsights; installing BigInsights 2.0; setting up a Hadoop cluster on the IBM SmartCloud Enterprise; BigInsights Web Console overview.
20 CSCI 372: Big Data Analytics Lab Exercises: Stream Computing using IBM InfoSphere Streams: What is IBM InfoSphere Streams? Downloading Streams 3.0; installing Streams 3.0; introducing the Streams Studio graphical editing environment; introducing the InfoSphere Streams runtime environment; introducing the data visualization capabilities in the Streams Console; use cases.
21 Course Work: Week 1 Reading: Ch. 1 (Textbook); Ch. 1, 2 (Reference); Topic: Big Data; Applying Big Data to Business Problems; Data Storage and Analysis; A Brief History of Hadoop; Apache Hadoop and the Hadoop Ecosystem; Exercises: Downloading and installing Hadoop
22 Course Work: Week 2 Reading: Ch. 2 (Textbook); Topic: Running Hadoop; Moving data in and out of Hadoop; Exercises: #1 Using Sqoop to import/export data from/to MySQL
23 Course Work: Week 3 Reading: Ch. 3, 4 (Textbook); Topic: Data serialization: working with text and beyond; Applying MapReduce patterns to big data; Exercises: #2 MapReduce with HBase as a data source, #3 Integrating Protocol Buffers with MapReduce
24 Course Work: Week 4 Reading: Ch. 5, 6 (Textbook); Topic: Streamlining HDFS for big data; Diagnosing and tuning performance problems; Exercises: #4 Compression with HDFS, MapReduce, Pig, and Hive, #5 Using stack dumps to discover unoptimized user code
25 Course Work: Week 5 Reading: Ch. 7 (Textbook) Topic: Utilizing data structures and algorithms Exercises: #6 Calculate PageRank over a web graph
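As an illustration of what exercise #6 computes, the following is a minimal single-machine sketch of PageRank by power iteration in plain Python. It is not the textbook's Hadoop implementation; the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the tiny example graph are all assumptions chosen for illustration. In a MapReduce setting, each pass of the outer loop would roughly correspond to one job: the map step emits each page's rank share to its out-links, and the reduce step sums the shares per page.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Approximate PageRank for a graph given as {page: [pages it links to]}.

    Assumes every link target also appears as a key in `graph`.
    """
    nodes = list(graph)
    n = len(nodes)
    # Start from a uniform distribution over all pages.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, links in graph.items():
            # A page with no out-links spreads its rank over all pages.
            targets = links if links else nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web graph.
web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
```

Because page "c" is linked from both "a" and "b", it ends up with a higher rank than "b", and the ranks always sum to 1.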
26 Course Work: Week 6 Reading: Ch. 8 (Textbook) Topic: Integrating R and Hadoop for statistics and more Exercises: #7 Calculate the cumulative moving average for stocks using RHadoop.
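The quantity in exercise #7 can be sketched without a cluster. The cumulative moving average after n prices is the mean of the first n prices, and it can be updated incrementally as CMA_n = CMA_(n-1) + (x_n - CMA_(n-1)) / n. The plain-Python sketch below assumes a hypothetical list of closing prices; the course exercise itself uses RHadoop, which this example does not attempt to reproduce.

```python
def cumulative_moving_average(prices):
    """Return the running mean after each price, in order."""
    averages = []
    mean = 0.0
    for n, price in enumerate(prices, start=1):
        # Incremental update: shift the mean toward the new price by 1/n.
        mean += (price - mean) / n
        averages.append(mean)
    return averages


# Hypothetical closing prices for one stock.
closing_prices = [10.0, 20.0, 30.0]
cma = cumulative_moving_average(closing_prices)
```

The incremental form only needs the previous average and a count, so the input can be processed as a stream without keeping the full price history.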
27 Course Work: Week 7 Reading: Ch. 9 (Textbook) Topic: Predictive analytics with Mahout; Exercises: #8 Using Mahout to train and test a spam classifier
29 Course Work: Week 9 Reading: Ch. 10, 11 (Textbook) Topic: Hacking with Hive; Programming pipelines with Pig Exercises: #9 Tuning Hive joins, #10 Combining data in Pig, #11 Pig optimizations
30 Course Work: Week 10 Reading: Ch. 3, 4 (Reference) Topic: The IBM PureData Systems: A Big Data Platform for High Performance Deep Analytics Exercises: none
31 Course Work: Week 11 Reading: Ch. 5 (Reference) Topic: IBM's Enterprise Hadoop: InfoSphere BigInsights Exercises: Installation of IBM InfoSphere BigInsights 2.0; Case study No. 1
32 Course Work: Week 12 Reading: Ch. 6 (Reference) Topic: Real-Time Analytical Processing with IBM InfoSphere Streams Exercises: Installation of IBM InfoSphere Streams 3.0; Case study No. 2
33 Course Work: Week 13 Reading: Ch. 7 (Reference) Topic: Unlocking Big Data: Data Exploration and Discovery Exercises: Case study No. 3
34 Course Work: Week 14 Reading: Ch. 8, 9 (Reference) Topic: Text Analysis: The IBM Big Data Analytic Accelerators Exercises: Text Analysis: The IBM Big Data Analytic Accelerators
35 Learning Outcomes At the completion of this course, the students will be able to: 1. Describe the characteristics of the Big Data model vs. the Relational Database model. 2. Use the programming language R for Big Data analysis. 3. Use the MapReduce algorithm as a data mining tool. 4. Distinguish between the Map step and Reduce step of the MapReduce algorithm. 5. Analyze data with Hadoop. 6. Describe the key attributes and differences of the NoSQL database model vs. Relational database model.
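Learning outcome 4, the distinction between the Map step and the Reduce step, can be illustrated with the classic word-count example. The sketch below simulates the three phases in plain Python on one machine; it does not use Hadoop, and the function names (`map_step`, `shuffle`, `reduce_step`, `word_count`) are hypothetical names chosen for this illustration.

```python
from collections import defaultdict


def map_step(line):
    """Map: turn one input record into (key, value) pairs, here (word, 1)."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key (the framework does this in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_step(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_step(line)]
    return dict(reduce_step(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data", "Big analytics"])
```

The map step looks at one record at a time and knows nothing about other records, while the reduce step sees every value for a single key. That separation is what allows both steps to run in parallel across many machines.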
36 Assessment LOs 1, 4, and 6 will be assessed using essay and homework questions that test the student's understanding of the Big Data model, the MapReduce algorithm, and the NoSQL database model. LOs 2, 3, and 5 will be assessed with programming projects that include data analysis and data mining using R, Hadoop, and MapReduce.
37 Assessment LOs 1, 3, 4, and 5 will be assessed through exam questions. These questions will assess the student's ability to use the MapReduce algorithm and to perform basic statistical data analysis using R and Hadoop.
38 Conclusion The new course and concentration meet the demands of industry. They also meet the Accreditation Board for Engineering and Technology (ABET) criteria, which were the primary challenge for the CS faculty. The new concentration will embrace the IBM z Enterprise Academic Initiative and will introduce the IBM systems.
39 Conclusion The partnership with IBM will, in addition to providing remote access to the IBM systems, provide access to data for a hands-on lab experience with big data analytics. The Big Data Management and Analytics concentration and lab experience will give our students an important advantage in the data engineering marketplace.