Cleveland State University
|
|
- Sharlene Gregory
- 8 years ago
- Views:
Transcription
1 Cleveland State University CIS 695 Big Data Processing and Data Analytics (3-0-3) 2016 Section 51 Class Nbr Tues, Thur TBA Prerequisites: CIS 505 and CIS 530. CIS 612, CIS 660 Preferred. Instructor: Dr. Sunnie S. Chung Office Location: FH211 Phone: Webpage: Office Time: Tues, Thurs 2:00 3:30 PM and 5:45 6:45 PM (or by appointment) Class Location: TBA Section 51 Tue & Thu TBA Key Concepts: Big Data Processing and Parallel Computing, Google s MapReduce and Aphathe Hadoop, Hadoop/Map Reduce Based Big Data Processing Systems: Google s Big Table, Facebook HBase, Hive, Apathe Pig Latin, Key Value Store Systems: MongoDB, Cassandra, Cloud Computing Systems: Google Cloud, Amazon Elastic Cloud, Parallel Data Warehouse Based Big Data Processing Systems, Data Analytics using OLAP Cube, Text Data Analytics, Data Analytic for Big Data, Building Big Data Processing Infrastructures for Data Analytics, Data Analytics in Twitter, LinkedIn, and Practicum in Data Analytics: Case Study from Progressive, Explorys by IBM and Data Think List of Required Materials: Microsoft SQL Server 2012, Microsoft Visual Studio 2012 or any higher Microsoft SQL Server Business Intelligence Data Analytic Tool 2012 OLAP Server and SQL Server Data Tool They are available at the Microsoft Academic Alliance program: Adventure Works 2012 Data Warehouse Database for SQL Server 2012 Will be directed in class Applied Analytics Using SAS Enterprise Miner Apathe Pig Latin: WEKA: R and MapR Detailed Instructions on Installation and Setting Up for Big Data Processing Systems will be given in Class. Text Book: 1. Will be given in class 2. List of Selected Industry R&D Papers on Data Analytics and Big Data Processing will be given in class Supplement Text Book: 1. Data Mining with Microsoft SQL Server BI Data Tool (2008 or 2012), Jamie MacLennan, Bogdan Crivat, Publisher: Wiley; 1 Edition ISBN-10: ISBN-13: Edition: 1 Official Calendar Please consult the page Final exam: Thur, Dec 11 4:00-6:00 PM.
2 Grading: The course grade is based on a student's overall performance through the entire Semester. The final grade is distributed among the following components: 1. Exams (Midterm & Final) 45% (15% Midterm, 30% Final) 2. Computer Labs 25% (about 3-4 Assignments) 3. 1 Project on Big data processing: 2-3 person group project (20%) 4. Research Topic Presentation : 10% -- Tentative A 94% + A: Outstanding (student's performance is genuinely excellent) A- 90% - 93% B+ 87% - 89% B 80% - 86% B: Very Good (student's performance is clearly commendable but not necessarily outstanding) B- 70% - 79% C <70% C: Good (student's performance meets every course requirement and is acceptable; not distinguished) D F < 60% D: Below Average (student's performance fails to meet course objectives and standards) F: Failure (student's performance is unacceptable) Examination Policy: Students are allowed to bring to the tests a summary page (standard letter size) with their own notes. During the exams: (1) the use of books, cell phones, calculators, or any electronic devices is prohibited, and (2) students must not share any materials. Make-Up Exam Policy: No makeup exams will be given unless notified and agreed to in advance. Requests will be considered only in case of exceptional demonstrated need. Homework Policy: The students are expected to attend all classes. The students are responsible for collecting the notes, handouts and any other course material distributed during the class period. All assignments must be individually and independently completed and must represent the effort of the student turning in the assignment. Should two or more students turn in substantially the same solution or output, in the judgment of the instructor, the solution will be considered group effort. All involved in group effort homework will receive a zero grade for that assignment. A student turning in a group effort assignment more than once will automatically receive an F grade for the course. Late Assignment: All lab assignments are due at the beginning of class on the date specified. Laboratory Assignments handed in after the class has begun will be accepted with a 25% grade penalty for up to a week and then not accepted at all. All laboratory assignments must be completed. Failure to do so will lower your course grade one additional letter grade. Student Conduct: Students are expected to do their own work. Academic misconduct, student misconduct, cheating and plagiarism will not be tolerated. Violations will be subject to disciplinary action as specified in the CSU Student Conduct Code. A copy can be obtained on the web page at: or by contacting Valerie Hinton Hannah, Judicial Affairs Officer in the Department of Student Life (MC 106 v.hintonhannah@csuohio.edu ). For more information consult the following web page CSU Judicial Affairs available at Course Schedule: The tentative schedule of topics and their order of coverage is given below. The schedule and topics to be covered may vary depending upon the students progress made. Week of Topic Reading
3 1-2 Big Data Processing and Parallel Computing Google s MapReduce and Aphathe Hadoop, Listed Papers Google Map Reduce Tutorial Apache Hadoop File System MapReduce: Simplified Data Processing on Large Clusters by Jeffrey Dean (Google) and Sanjay Ghemawat (Google) in the proceedings of OSDI Apathy Hadoop in White Papers by Apache, Yahoo Lab 1: MapReduce Programming on Hadoop 3 5 Hadoop/Map Reduce Based Big Data Processing Systems: Google s Big Table Data Models for Unstructured Data Processing, Listed Papers Bigtable: A Distributed Storage System for Structured Data, by Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Google, Inc. in the Proceedings of OSDI Facebook HBase, Hive Data Warehousing and Analytics Infrastructure at Facebook, in SIGMOD 2010 by Ashish Thusoo (Facebook), et al, Petabyte Scale Databases and Storage Systems Deployed at Facebook, in SIGMOD 2013 by Dhruba Borthakur Apathe Pig Latin Pig Latin: A Not-So-Foreign Language for Data Processing in SIGMOD Key Value Stores MongoDB, Cassandra Cloud Google Cloud Amazon Elastic Cloud
4 Lab2:Processing Streaming Twitter Logging Data to build Data Warehouse or Mongo DB on Hadoop 6 7 Parallel Data Warehouse Based Big Data Processing Systems: Data Warehouse and OLAP - Decision Support Technology - On Line Analytical Processing - Star Schema - OLAP Aggregation Operators: Data Cube, Roll Up, Drill Down - Building BI Data Analytics using DW and OLAP - MDX, DMX Listed papers. An Overview of Data Warehousing and OLAP Technology by Surajit Chaudhuri (Microsoft) and Umeshwar Dayal (HP Labs), in the proceedings of IEEE 1995 Data Cube: A Relational Aggregation Operator Generalizing Group By, Cross Tab, and SubTotals by Jim Gray (Microsoft), et al, in the proceedings of IEEE 1996 MS PDW Optimization Query Optimization in Microsoft SQL Server Parallel Data Warehouse in SIGMOD 2012 by Srinath Shankar, Microsoft, David DeWitt, Microsoft, César Galindo-Legaria, Microsoft, et al. Parallel Data Warehouse with OLAP Query Processing: Microsoft Extended PDW with Map Reduce and Hadoop : Oracle, Teradata Columnar Databases : SAP HANA Databases Extended PDW with Columnar Data Processing :Teradata, Microsoft PDW on Cloud Azure Cloud System by Micrsoft published in IEEE campbell Big Data Processing System with PDW on Cloud Big Data Processing Techniques for Data Analytic Tasks MapReduce Join Algorithms Pig Latin Job Execution Steps and Operators: Filter By, Group, Cogroup Unstructured Data Processing: Data Transformation from Unstructured/Semi Structure to Object Exchange Model (JASON, XML) or Relation/CSV Introduction to Information Retrieval and Web Logging Data Processing Text Data Analytics, Data Analytic for Big Data Optimization Techniques for Big Data Processing. Lab3: Unstructured Logging Data Procession to JSON format for Object Exchange Model (OEM)
5 10-11 Data Analytics in Twitter and LinkedIn: Twitter: Fast Data in the Era of Big Data: Twitter s Real-Time Related Query Suggestion Architecture Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, Jimmy Lin (Twitter, Inc.) Listed Papers, LinkedIn: The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah (LinkedIn) pdf Avatara: OLAP for Webscale Analytics Products Lili Wu, et al. (LinkedIn) Lab4: Data Analytics with Steaming Twitter Messages Practicum in Data Analytics: Case Study Progressive Explorys by IBM Data Think 14 Data Analytics and Security Text mining Intrusion detection in Network and System Database Security Security in Cloud Presentation of Significant Research Papers on Data Analytics and Big Data Processing: List of will be given in class. NOTE: The instructor reserves the right to retain, for pedagogical reasons, either the original or a copy of your work submitted either individually or as a group project for this class. Students' names will be deleted from any retained items.
Cleveland State University
Cleveland State University CIS 612 Modern Database Processing & Big Data (3-0-3) Fall 2015 Section 50 Class Nbr. 5378. Tues, Thu 4:30 5:45 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred. Instructor:
More informationCleveland State University
Cleveland State University CIS 612 Modern Database Programming & Big Data Processing (3-0-3) Fall 2014 Section 50 Class Nbr. 2670. Tues, Thur 4:00 5:15 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred.
More informationSunnie Chung. Cleveland State University
Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:
More informationAn Industrial Perspective on the Hadoop Ecosystem. Eldar Khalilov Pavel Valov
An Industrial Perspective on the Hadoop Ecosystem Eldar Khalilov Pavel Valov agenda 03.12.2015 2 agenda Introduction 03.12.2015 2 agenda Introduction Research goals 03.12.2015 2 agenda Introduction Research
More informationSunnie Chung. Cleveland State University
Sunnie Chung Cleveland State University They are very new technologies to Computer Science in rise of Web Service on Internet (IoT) They were fast developed and fast evolving Research and Developments
More informationData Warehouses and Business Intelligence ITP 487 (3 Units) Fall 2013. Objective
Data Warehouses and Business Intelligence ITP 487 (3 Units) Objective Fall 2013 While the increased capacity and availability of data gathering and storage systems have allowed enterprises to store more
More informationFast Data in the Era of Big Data: Twitter s Real-
Fast Data in the Era of Big Data: Twitter s Real- Time Related Query Suggestion Architecture Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, Jimmy Lin Presented by: Rania Ibrahim 1 AGENDA Motivation
More informationApplication Development. A Paradigm Shift
Application Development for the Cloud: A Paradigm Shift Ramesh Rangachar Intelsat t 2012 by Intelsat. t Published by The Aerospace Corporation with permission. New 2007 Template - 1 Motivation for the
More informationBIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
More informationBig Data Technologies Compared June 2014
Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development
More informationUSC Viterbi School of Engineering
USC Viterbi School of Engineering INF 551: Foundations of Data Management Units: 3 Term Day Time: Spring 2016 MW 8:30 9:50am (section 32411D) Location: GFS 116 Instructor: Wensheng Wu Office: GER 204 Office
More informationA Tour of the Zoo the Hadoop Ecosystem Prafulla Wani
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to
More informationThe Big Data Ecosystem at LinkedIn. Presented by Zhongfang Zhuang
The Big Data Ecosystem at LinkedIn Presented by Zhongfang Zhuang Based on the paper The Big Data Ecosystem at LinkedIn, written by Roshan Sumbaly, Jay Kreps, and Sam Shah. The Ecosystems Hadoop Ecosystem
More informationNative Connectivity to Big Data Sources in MSTR 10
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
More informationThe Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn
The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn Presented by :- Ishank Kumar Aakash Patel Vishnu Dev Yadav CONTENT Abstract Introduction Related work The Ecosystem Ingress
More informationPELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS ADVANCED DATABASE MANAGEMENT SYSTEMS CSIT 2510
PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS ADVANCED DATABASE MANAGEMENT SYSTEMS CSIT 2510 Class Hours: 2.0 Credit Hours: 3.0 Laboratory Hours: 2.0 Revised: Fall 2012 Catalog Course Description:
More informationBUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business
BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business Instructor: Kunpeng Zhang (kzhang@rmsmith.umd.edu) Lecture-Discussions:
More informationThe Inside Scoop on Hadoop
The Inside Scoop on Hadoop Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. Orion.Gebremedhin@Neudesic.COM B-orgebr@Microsoft.com @OrionGM The Inside Scoop
More informationDepartment of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15
Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases Lecture 15 Big Data Management V (Big-data Analytics / Map-Reduce) Chapter 16 and 19: Abideboul et. Al. Demetris
More informationEnhancing Massive Data Analytics with the Hadoop Ecosystem
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 3, Issue 11 November, 2014 Page No. 9061-9065 Enhancing Massive Data Analytics with the Hadoop Ecosystem Misha
More informationSAP BusinessObjects Business Intelligence 4.1 One Strategy for Enterprise BI. May 2013
SAP BusinessObjects Business Intelligence 4.1 One Strategy for Enterprise BI May 2013 SAP s Strategic Focus on Business Intelligence Core Self-service Mobile Extreme Social Core for innovation Complete
More informationA Comparative Study on Operational Database, Data Warehouse and Hadoop File System T.Jalaja 1, M.Shailaja 2
RESEARCH ARTICLE A Comparative Study on Operational base, Warehouse Hadoop File System T.Jalaja 1, M.Shailaja 2 1,2 (Department of Computer Science, Osmania University/Vasavi College of Engineering, Hyderabad,
More informationBig Data and Hadoop with components like Flume, Pig, Hive and Jaql
Abstract- Today data is increasing in volume, variety and velocity. To manage this data, we have to use databases with massively parallel software running on tens, hundreds, or more than thousands of servers.
More informationAGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this
More informationTapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?
More informationBig Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012
Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation
More informationBig Data With Hadoop
With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
More informationBringing Big Data to People
Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process
More informationHadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN
Hadoop MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Understanding Hadoop Understanding Hadoop What's Hadoop about? Apache Hadoop project (started 2008) downloadable open-source software library (current
More informationCan the Elephants Handle the NoSQL Onslaught?
Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented
More informationYou should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.
What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees
More informationSQL Server 2012 Business Intelligence Boot Camp
SQL Server 2012 Business Intelligence Boot Camp Length: 5 Days Technology: Microsoft SQL Server 2012 Delivery Method: Instructor-led (classroom) About this Course Data warehousing is a solution organizations
More informationData Management Course Syllabus
Data Management Course Syllabus Data Management: This course is designed to give students a broad understanding of modern storage systems, data management techniques, and how these systems are used to
More informationArchitecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics
More informationANALYTICS CENTER LEARNING PROGRAM
Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals
More informationBig Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
More informationITP 300: Database Web Development. Database Web Development (Monday section) http://webdev.usc.edu/itp300m Fall 2012 Course 32031 3 Units
ITP 300: Database Web Development Course: Lecture/Lab: Instructor: Database Web Development (Monday section) http://webdev.usc.edu/itp300m Fall 2012 Course 32031 3 Units Mondays from 2 4:50 p.m. in KAP267
More informationAzure Data Lake Analytics
Azure Data Lake Analytics Compose and orchestrate data services at scale Fully managed service to support orchestration of data movement and processing Connect to relational or non-relational data
More informationBig Data Analytics. Lucas Rego Drumond
Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline
More informationBig Data and Apache Hadoop s MapReduce
Big Data and Apache Hadoop s MapReduce Michael Hahsler Computer Science and Engineering Southern Methodist University January 23, 2012 Michael Hahsler (SMU/CSE) Hadoop/MapReduce January 23, 2012 1 / 23
More informationCollaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!
More informationAdvanced Database Management MISM Course F14-95704 A Fall 2014
Advanced Database Management MISM Course F14-95704 A Fall 2014 Carnegie Mellon University Instructor: Randy Trzeciak Office: Software Engineering Institute / CERT CIC Office hours: By Appointment Phone:
More informationESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
More informationMicrosoft Big Data. Solution Brief
Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,
More informationBig Data and Industrial Internet
Big Data and Industrial Internet Keijo Heljanko Department of Computer Science and Helsinki Institute for Information Technology HIIT School of Science, Aalto University keijo.heljanko@aalto.fi 16.6-2015
More informationImplementing Data Models and Reports with Microsoft SQL Server 20466C; 5 Days
Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Implementing Data Models and Reports with Microsoft SQL Server 20466C; 5
More informationSQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse
SQL Server 2012 PDW Ryan Simpson Technical Solution Professional PDW Microsoft Microsoft SQL Server 2012 Parallel Data Warehouse Massively Parallel Processing Platform Delivers Big Data HDFS Delivers Scale
More informationCourse Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning
Course Outline: Course: Implementing a Data with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning Duration: 5.00 Day(s)/ 40 hrs Overview: This 5-day instructor-led course describes
More informationData Modeling for Big Data
Data Modeling for Big Data by Jinbao Zhu, Principal Software Engineer, and Allen Wang, Manager, Software Engineering, CA Technologies In the Internet era, the volume of data we deal with has grown to terabytes
More informationCSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop)
CSE 590: Special Topics Course ( Supercomputing ) Lecture 10 ( MapReduce& Hadoop) Rezaul A. Chowdhury Department of Computer Science SUNY Stony Brook Spring 2016 MapReduce MapReduce is a programming model
More informationChapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related
Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related Summary Xiangzhe Li Nowadays, there are more and more data everyday about everything. For instance, here are some of the astonishing
More informationTap into Hadoop and Other No SQL Sources
Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data
More informationAnalytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
More informationCSE-E5430 Scalable Cloud Computing Lecture 2
CSE-E5430 Scalable Cloud Computing Lecture 2 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 14.9-2015 1/36 Google MapReduce A scalable batch processing
More informationLEARNING SOLUTIONS website milner.com/learning email training@milner.com phone 800 875 5042
Course 20467A: Designing Business Intelligence Solutions with Microsoft SQL Server 2012 Length: 5 Days Published: December 21, 2012 Language(s): English Audience(s): IT Professionals Overview Level: 300
More informationPlay with Big Data on the Shoulders of Open Source
OW2 Open Source Corporate Network Meeting Play with Big Data on the Shoulders of Open Source Liu Jie Technology Center of Software Engineering Institute of Software, Chinese Academy of Sciences 2012-10-19
More informationBig Data: Tools and Technologies in Big Data
Big Data: Tools and Technologies in Big Data Jaskaran Singh Student Lovely Professional University, Punjab Varun Singla Assistant Professor Lovely Professional University, Punjab ABSTRACT Big data can
More informationDepartment of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 14
Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases Lecture 14 Big Data Management IV: Big-data Infrastructures (Background, IO, From NFS to HFDS) Chapter 14-15: Abideboul
More informationImplementing Data Models and Reports with Microsoft SQL Server
Course 20466C: Implementing Data Models and Reports with Microsoft SQL Server Course Details Course Outline Module 1: Introduction to Business Intelligence and Data Modeling As a SQL Server database professional,
More informationCISC 432/CMPE 432/CISC 832 Advanced Database Systems
CISC 432/CMPE 432/CISC 832 Advanced Database Systems Course Info Instructor: Patrick Martin Goodwin Hall 630 613 533 6063 martin@cs.queensu.ca Office Hours: Wednesday 11:00 1:00 or by appointment Schedule:
More informationMicrosoft 20466 - Implementing Data Models and Reports with Microsoft SQL Server
1800 ULEARN (853 276) www.ddls.com.au Microsoft 20466 - Implementing Data Models and Reports with Microsoft SQL Server Length 5 days Price $4070.00 (inc GST) Version C Overview The focus of this five-day
More informationAn Approach to Implement Map Reduce with NoSQL Databases
www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh
More information6500:305- Business Analytics Fall 2014
6500-305 Fall 2014 Page 1 College of Business Administration, UA 6500:305- Business Analytics Fall 2014 Instructor: B. Vijayaraman (Vijay) Office: CBA 357 Office Hours: Mon/Wed from 1:00 pm to 2:00 pm;
More informationCourse Design Document. IS417: Data Warehousing and Business Analytics
Course Design Document IS417: Data Warehousing and Business Analytics Version 2.1 20 June 2009 IS417 Data Warehousing and Business Analytics Page 1 Table of Contents 1. Versions History... 3 2. Overview
More informationSENG 520, Experience with a high-level programming language. (304) 579-7726, Jeff.Edgell@comcast.net
Course : Semester : Course Format And Credit hours : Prerequisites : Data Warehousing and Business Intelligence Summer (Odd Years) online 3 hr Credit SENG 520, Experience with a high-level programming
More informationDATA MINING WITH HADOOP AND HIVE Introduction to Architecture
DATA MINING WITH HADOOP AND HIVE Introduction to Architecture Dr. Wlodek Zadrozny (Most slides come from Prof. Akella s class in 2014) 2015-2025. Reproduction or usage prohibited without permission of
More informationwww.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage
www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization
More informationThis Symposium brought to you by www.ttcus.com
This Symposium brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data
More informationIntroduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
More informationApriori-Map/Reduce Algorithm
Apriori-Map/Reduce Algorithm Jongwook Woo Computer Information Systems Department California State University Los Angeles, CA Abstract Map/Reduce algorithm has received highlights as cloud computing services
More informationData processing goes big
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
More informationBig Data and Analytics (Fall 2015)
Big Data and Analytics (Fall 2015) Core/Elective: MS CS Elective MS SPM Elective Instructor: Dr. Tariq MAHMOOD Credit Hours: 3 Pre-requisite: All Core CS Courses (Knowledge of Data Mining is a Plus) Every
More informationAlejandro Vaisman Esteban Zimanyi. Data. Warehouse. Systems. Design and Implementation. ^ Springer
Alejandro Vaisman Esteban Zimanyi Data Warehouse Systems Design and Implementation ^ Springer Contents Part I Fundamental Concepts 1 Introduction 3 1.1 A Historical Overview of Data Warehousing 4 1.2 Spatial
More informationSo What s the Big Deal?
So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data
More informationInfomatics. Big-Data and Hadoop Developer Training with Oracle WDP
Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools
More informationBig Data Explained. An introduction to Big Data Science.
Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of
More informationCSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS
CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS COURSE OVERVIEW & STRUCTURE Fall 2015 Marion Neumann ABOUT Marion Neumann email: m dot neumann at wustl dot edu office: Jolley Hall 403 office hours:
More informationWA2192 Introduction to Big Data and NoSQL EVALUATION ONLY
WA2192 Introduction to Big Data and NoSQL Web Age Solutions Inc. USA: 1-877-517-6540 Canada: 1-866-206-4644 Web: http://www.webagesolutions.com The following terms are trademarks of other companies: Java
More informationIntroduction to Big Data! with Apache Spark" UC#BERKELEY#
Introduction to Big Data! with Apache Spark" UC#BERKELEY# So What is Data Science?" Doing Data Science" Data Preparation" Roles" This Lecture" What is Data Science?" Data Science aims to derive knowledge!
More informationBig Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth
MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager steve.gonzales@thinkbiganalytics.com
More informationBig Data on Microsoft Platform
Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4
More informationHadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
More informationHurtownie Danych i Business Intelligence: Big Data
Hurtownie Danych i Business Intelligence: Big Data Robert Wrembel Politechnika Poznańska Instytut Informatyki Robert.Wrembel@cs.put.poznan.pl www.cs.put.poznan.pl/rwrembel Outline Introduction to Big Data
More informationA STUDY ON HADOOP ARCHITECTURE FOR BIG DATA ANALYTICS
A STUDY ON HADOOP ARCHITECTURE FOR BIG DATA ANALYTICS Dr. Ananthi Sheshasayee 1, J V N Lakshmi 2 1 Head Department of Computer Science & Research, Quaid-E-Millath Govt College for Women, Chennai, (India)
More informationApplications for Big Data Analytics
Smarter Healthcare Applications for Big Data Analytics Multi-channel sales Finance Log Analysis Homeland Security Traffic Control Telecom Search Quality Manufacturing Trading Analytics Fraud and Risk Retail:
More informationStep by Step: Big Data Technology. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 25 August 2015
Step by Step: Big Data Technology Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 25 August 2015 Data Sources IT Infrastructure Analytics 2 B y 2015, 20% of Global 1000 organizations
More information#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld
Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case
More informationBig Data Scenario mit Power BI vs. SAP HANA Gerhard Brückl
Big Data Scenario mit Power BI vs. SAP HANA Gerhard Brückl About me Gerhard Brückl Working with Microsoft BI since 2006 Started working with SAP HANA in 2013 focused on Analytics and Reporting Blog: email:
More informationHadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777
Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777 Course Outline Module 1: Introduction to Data Warehousing This module provides an introduction to the key components of a data warehousing
More informationHadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard
Hadoop and Relational base The Best of Both Worlds for Analytics Greg Battas Hewlett Packard The Evolution of Analytics Mainframe EDW Proprietary MPP Unix SMP MPP Appliance Hadoop? Questions Is Hadoop
More informationBig Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect
Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate
More informationHadoop & its Usage at Facebook
Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System dhruba@apache.org Presented at the The Israeli Association of Grid Technologies July 15, 2009 Outline Architecture
More informationFast Data in the Era of Big Data: Tiwtter s Real-Time Related Query Suggestion Architecture
Fast Data in the Era of Big Data: Tiwtter s Real-Time Related Query Suggestion Architecture Gilad Mishne, Jeff Dalton, Zhenghua Li, Aneesh Sharma, Jimmy Lin Adeniyi Abdul 2522715 Agenda Abstract Introduction
More informationLearn how to store and analyze Big Data Learn about the cloud and its services for Big Data
CS-495/595 Big Data: Syllabus Spring 2015 Wed. 4:20PM - 7:00PM Constant Hall 1043 Instructor: Dr. Cartledge http://www.cs.odu.edu/ ccartled/teaching Big data is quadrupling every year!! Everyone is creating
More informationKeywords Big Data Analytic Tools, Data Mining, Hadoop and MapReduce, HBase and Hive tools, User-Friendly tools.
Volume 3, Issue 8, August 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Significant Trends
More informationData Warehouse design
Data Warehouse design Design of Enterprise Systems University of Pavia 10/12/2013 2h for the first; 2h for hadoop - 1- Table of Contents Big Data Overview Big Data DW & BI Big Data Market Hadoop & Mahout
More informationOpen Source Technologies on Microsoft Azure
Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions
More informationIntegrating analytics into the Graduate DEGREE curriculum
Dakota State University 1 Integrating analytics into the Graduate DEGREE curriculum IBM Workshop: Smarter Analytics August 15, 2013 Amit Deokar Associate Professor Dakota State University Madison, South
More information