Cleveland State University



Similar documents
Cleveland State University

Cleveland State University

Data Warehouses and Business Intelligence ITP 487 (3 Units) Fall Objective

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS ADVANCED DATABASE MANAGEMENT SYSTEMS CSIT 2510

Sunnie Chung. Cleveland State University

USC Viterbi School of Engineering

CISC 432/CMPE 432/CISC 832 Advanced Database Systems

Objectives of Lecture 1. Class and Office Hours. Labs and TAs. CMPUT 391: Introduction. Introduction

Technology and Online Computer Access Requirements: Lake-Sumter State College Course Syllabus

Database Web Development ITP 300 (3 Units)

Sunnie Chung. Cleveland State University

Advanced Database Management MISM Course F A Fall 2014

CSE 412/598 Database Management Spring 2012 Semester Syllabus

CS 649 Database Management Systems. Fall 2011

Syllabus for Course : Database Systems Engineering at Kinneret College

Native Connectivity to Big Data Sources in MSTR 10

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business

Instructor: Michael J. May Semester 1 of The course meets 9:00am 11:00am on Sundays. The Targil for the course is 12:00pm 2:00pm on Sundays.

Big Data Technologies Compared June 2014

ITP 300: Database Web Development. Database Web Development (Monday section) Fall 2012 Course Units

Objectives of Lecture 1. Labs and TAs. Class and Office Hours. CMPUT 391: Introduction. Introduction

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

BUS Computer Concepts and Applications for Business Fall 2012

CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS

ISQS 3358 BUSINESS INTELLIGENCE FALL 2014

IS Management Information Systems

Lake-Sumter Community College Course Syllabus

Syllabus. HMI 7437: Data Warehousing and Data/Text Mining for Healthcare

How To Learn Data Analytics

Course Design Document. IS417: Data Warehousing and Business Analytics

College of Charleston School of Business DSCI : Management Information Systems Fall 2014

Business Computer Applications CGS 1100 Course Syllabus. Course Title:

CS 564: DATABASE MANAGEMENT SYSTEMS

INFO 2130 Introduction to Business Computing Fall 2014

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

CS 425 Software Engineering

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

COURSE NUMBER AND TITLE: Management Information Systems Concepts

Workshop on Hadoop with Big Data

Dealing with Data Especially Big Data

ACADEMIC YEAR CALENDAR FALL SEMESTER First Half-Semester Courses

INFORMATION TECHNOLOGY EDUCATION PROGRAMMING & ANALYSIS COURSE SYLLABUS. Instructor: Debbie Reid. Course Credits: Office Location:

Name of Module: Big Data ECTS: 6 Module-ID: Person Responsible for Module (Name, Mail address): Angel Rodríguez, arodri@fi.upm.es

How To Pass A Management Course At Anciento State University

Northeastern University Online College of Professional Studies Course Syllabus

Web-Based Database Applications ITP 300x (3 Units)

Enterprise Information Systems ITP 320x (4 Units)

Tap into Hadoop and Other No SQL Sources

General Psychology. Course Syllabus

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

CSC 341, section 001 Principles of Operating Systems Spring 2015 Monday/Wednesday 1:00 PM 2:15 PM

ANALYTICS CENTER LEARNING PROGRAM

ORANGE COUNTY COMMUNITY COLLEGE FINAL as of MARCH 10, 2015 ACADEMIC YEAR CALENDAR FALL SEMESTER 2015

INTRODUCTION TO CASSANDRA

CS 425 Software Engineering. Course Syllabus

PRINCIPLES OF FINANCIAL ACCOUNTING/ACC 120 N1WA FALL SEMESTER 2015

MIS636 AWS Data Warehousing and Business Intelligence Course Syllabus

IST359 INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

Academic Calendars. Term I (20081) Term II (20082) Term III (20083) Weekend College. International Student Admission Deadlines

165 17% C: points Attendance 35 4% D: Total % F: 600 & below

MAT 117: College Algebra Fall 2013 Course Syllabus

CSC-570 Introduction to Database Management Systems

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level?

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

COMM Interpersonal Communication Course Syllabus Fall 2013

Belk College of Business Administration, University of North Carolina at Charlotte. INFO : BUSINESS ANALYTICS Fall 2015

Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Course Syllabus PEHR Sports Marketing, Game Management & Promotions Dixie State College of Utah Fall 2012

Introduction to Database Systems CS4320/CS5320. CS4320/4321: Introduction to Database Systems. CS4320/4321: Introduction to Database Systems

CS 5890: Introduction to Data Science Syllabus, Utah State University, Fall

CSE 544 Principles of Database Management Systems. Magdalena Balazinska (magda) Spring 2006 Lecture 1 - Class Introduction

Data Modeling for Big Data

Course Syllabus for Commercial Photography 1

Learn how to store and analyze Big Data Learn about the cloud and its services for Big Data

OPERATIONS, BUSINESS ANALYTICS & INFORMATION SYSTEMS

Introduction to Criminal Justice Fall 2012 CJS

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford

COURSE SYLLABUS. Enterprise Information Systems and Business Intelligence

Testing 3Vs (Volume, Variety and Velocity) of Big Data

Application Tool for Experiments on SQL Server 2005 Transactions

Syllabus for Accounting 300 Applied Managerial Accounting California State University Channel Islands Fall 2004

A very short talk about Apache Kylin Business Intelligence meets Big Data. Fabian Wilckens EMEA Solutions Architect

BUSN 1250 Fall 2015 Syllabus/Lesson Plan **Disclaimer Statements** ****Instructor reserves the right to change the syllabus and/or lesson plan as

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

Can the Elephants Handle the NoSQL Onslaught?

CS 425 Software Engineering. Course Syllabus

Ursuline College Accelerated Program

Upon successful completion of this course, a student will meet the following outcomes:

Canisius College Richard J. Wehle School of Business Department of Marketing & Information Systems Spring 2015

How To Handle Big Data With A Data Scientist

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

ISB 205 Management Software Fall 2014 Semester

6500:305- Business Analytics Fall 2014

Systems and Internet Marketing Syllabus Fall 2012 Department of Management, Marketing and International Business

SENG 520, Experience with a high-level programming language. (304) , Jeff.Edgell@comcast.net

Transcription:

Cleveland State University CIS 612 Modern Database Programming & Big Data Processing (3-0-3) Fall 2014 Section 50 Class Nbr. 2670. Tues, Thur 4:00 5:15 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred. Instructor: Dr. Sunnie S. Chung Office Location: BU346 Phone: 216 687 4732 Email: sschung.cis@gmail.com Webpage: http://grail.csuohio.edu/~sschung Office Time: Tues, Thurs 1:30 3:30 PM and 5:15 5:45 PM (or by appointment) Class Location: BU 0207 Section 50 Tue & Thu 4:00-5:15 PM Catalog Description: Detailed study of modern database programming and non-relational database systems for big data processing. First, the course presents practical approaches to modern database programming and its applications. The course advances with study of semi-structured/non-structured databases, XML data processing, XPATH, and XQuery. The course continues study of advanced features of modern databases with data warehouse and OLAP, data mining algorithms and applications. The course extends study with architectures and features of modern parallel computing systems for big data processing Parallel Data Warehouse (PDW) with OLAP and Map Reduce with Hadoop. It continues an exploration of applications and tools of big data processing systems with HIVE, HBase, PigLatin, Sparks, and NoSQL. Key Concepts: Modern database programming, web data processing, semi-structured/unstructured databases, XML data processing, XPath, XQuery, Parallel Data Warehouse (PDW), OLAP Cube, data mining algorithms, big data processing, Map Reduce, Hadoop, Hive, HBase, PigLatin, Sparks, NoSQL, and Cloud Computing. Expected Outcomes: Upon successful completion of the course, the student will be able to create modern database applications to process non-traditional data - web data, semi-structured/unstructured data, XML data. The student will be able to design data warehouse, create OLAP Cubes and write analytical OLAP queries. The student further advances skills for modern database applications by obtaining data mining concepts and techniques to create data mining applications with the data warehouse system. The student will be able to have comprehensive understanding of problems of big data processing and approaches for solutions with modern database systems. The student will further extend knowledge with architectures and features of modern parallel computing systems for big data processing Map Reduce and Hadoop. The student will be exposed to major tools to create Map Reduce applications for big data processing systems. Finally, the student will be able to obtain comprehensive knowledge and analytical skills to build an infrastructure for big data processing and its applications. List of Required Materials: Any RDBMS: Oracle Database 11g or higher - This is available at http://www.oracle.com/us/downloads/index.html Microsoft SQL Server, Microsoft Visual Studio 2012 or any higher Microsoft SQL Server 2012 Business Intelligence OLAP Server, SQL Server Data Tool They are available at the Microsoft Academic Alliance program: http://e5.onthehub.com/webstore/productsbymajorversionlist.aspx?ws=31b9929b-c09b-e011-969d- 0030487d8897 Text Book: 1. "Fundamentals of Database Systems". Elmasri / Navathe. Ed. Addison/Wesley Pub Co. 6th Edition, (2007). ISBN-10: 0321369572

2. Data Mining Concepts and Techniques Jiawei Han / Micheline Kamber, Morgan Kaufmann Publishers, (2001) ISBN 1-55860-489-8 Supplement Text Book: 1. Database Management Systems 3rd Edition by Raghu Ramakrishnan and Johannes Gehrke. Ed. McGraw-Hill. Available at http://pages.cs.wisc.edu/~dbbook/openaccess/thirdedition/supporting_ material.htm 2. Oracle Database 11g PL/SQL Programming (Osborne ORACLE Press Series) by D.CS. Michael McLaughlin Official Calendar Please consult the page http://www.csuohio.edu/enrollmentservices/registrar/calendar/index.html Final exam: Tues, Dec 10 4:00-6:00 PM. Term begins Aug 26 First Weekday Class Aug 26 Last Day to Add Sep 1 Last Day to Drop Sep 6 Last Day to Withdraw Nov 1 Last Day of Classes Dec 6 Final Exams Dec 9-14 Commencement Dec 16 Fall Incomplete Deadline May 3 Labor Day (University Holiday) Sep 2 Columbus Day (University Holiday) Oct 14 Veterans Day (Tuesday no classes) Nov 12 Thanksgiving Recess Nov 28 Dec 1 Grading: The course grade is based on a student's overall performance through the entire Semester. The final grade is distributed among the following components: 1. Exams (Exam 1, 2 & Final) 55% (15% each Exam, 25% Final) 2. Computer Projects 35% (about 4-5 lab assignments) 3. Research Topic Presentation: 10% A 94% + A: Outstanding (student's performance is genuinely excellent) A- 90% - 93% B+ 87% - 89% B 80% - 86% B: Very Good (student's performance is clearly commendable but not necessarily outstanding) B- 70% - 79% C <70% C: Good (student's performance meets every course requirement and is acceptable; not distinguished) D F < 60% D: Below Average (student's performance fails to meet course objectives and standards) F: Failure (student's performance is unacceptable) Examination Policy: Students are allowed to bring to the tests a summary page (standard letter size) with their own notes. During the exams: (1) the use of books, cell phones, calculators, or any electronic devices is prohibited, and (2) students must not share any materials. Make-Up Exam Policy: No makeup exams will be given unless notified and agreed to in advance. Requests will be considered only in case of exceptional demonstrated need. Homework Policy: The students are expected to attend all classes. The students are responsible for collecting the notes, handouts and any other course material distributed during the class period. All assignments must be individually and independently completed and must represent the effort of the student turning in the assignment. Should two or more students turn in substantially the same solution or output, in the judgment of the instructor, the solution will be considered group effort. All involved in group effort

homework will receive a zero grade for that assignment. A student turning in a group effort assignment more than once will automatically receive an F grade for the course. Late Assignment: All lab assignments are due at the beginning of class on the date specified. Laboratory Assignments handed in after the class has begun will be accepted with a 25% grade penalty for up to a week and then not accepted at all. All laboratory assignments must be completed. Failure to do so will lower your course grade one additional letter grade. Student Conduct: Students are expected to do their own work. Academic misconduct, student misconduct, cheating and plagiarism will not be tolerated. Violations will be subject to disciplinary action as specified in the CSU Student Conduct Code. A copy can be obtained on the web page at: http://www.csuohio.edu/studentlife/studentcodeofconduct.pdf or by contacting Valerie Hinton Hannah, Judicial Affairs Officer in the Department of Student Life (MC 106 email v.hintonhannah@csuohio.edu ). For more information consult the following web page CSU Judicial Affairs available at http://www.csuohio.edu/studentlife/jaffairs/faq.html Course Schedule: The schedule of topics and their order of coverage is given below. The schedule and topics to be covered may vary depending upon the progress made. Week of Topic Reading 1,2 Review: Database Foundations DBMS Architecture, SQL Query Processing Complex Queries, Views Elmasri. Chp. 8,10,11 2, 3, 4 Database Programming: Embedded SQL, Dynamic SQL, PHP Stored Procedure, User Defined Function (UDF), User Defined Type (UDT), User Defined Aggregate (UDA), Table Function, Database Triggers 4, 5, 6 Modern Databases: Enhanced Data Models for Advanced Applications Semi Structured and Unstructured Databases: XML Data Processing: - XML Schema, Syntax/Semantics, Protocol - XPath - XQuery Introduction to Information Retrieval and Web Search 7, 8 Review of Query Processing Techniques and Operators: Join, Projection, Selection, Group By, Aggregations Review of Query Optimization Techniques 8, 9 Data Warehouse and OLAP: - Decision Support Technology - On Line Analytical Processing - Star Schema - OLAP (Aggregation) Operators: Data Cube, Roll Up, Drill Down. OLAP Window Functions Elmasri, Elmasri 12, 19, 20, 21 Listed papers. Ramakrishnan 11, 12. Selected Papers J. Han Chap 1,2,3. Listed Papers,

An Overview of Data Warehousing and OLAP Technology by Surajit Chaudhuri (Microsoft) and Umeshwar Dayal (HP Labs), in the proceedings of IEEE 1995 Data Cube: A Relational Aggregation Operator Generalizing Group By, Cross Tab, and SubTotals by Jim Gray (Microsoft), et al, in the proceedings of IEEE 1996 9, 10, 11 Data Mining o Main Concepts o Support and Confidence o APRIORI Algorithm and Optimized Algorithms o Frequent Pattern Tree Algorithm o Association Rules o Decision Tree Algorithm o Lift Advanced Topics for Data Mining Applications J Han Chap 6. Listed Papers, 12, 13, 14 14, 15, 16 Big Data Processing and Parallel Computing: Google Map Reduce Tutorial Apache Hadoop File System Parallel Data Warehouse with OLAP Query Processing Pig Latin on Apathe Hadoop by Yahoo and Apathe Data Warehouse HIVE with Hadoop by Facebook HBase SPARKS MongoDB SQL vs NoSQL Map Reduce Join Algorithms Extended PDW for MR/Hadoop : Oracle, Teradata Presentation of Significant Database Industry Research Papers on Big Data Processing: List of Selected Papers will be given in class. Technical Topics: Select one and prepare a 25 min talk on the subject. 1. Semistructured/Unstructured Data Processing using Structured Data Model a. FaceBook b. EBay 2. Data Warehousing and Analytics Infrastructure at Facebook 3. Parallel Computing for Big Data Processing: Google Map Reduce with Apache Hadoop Parallel Data Warehouse (PDW) Columnar Databases : SAP Hana 4. MapReduce: Simplified Data Processing on Large Clusters by Google 5. Lammal, Ralf. Google's MapReduce Programming Model Revisited. 6. Open Source Apache Hadoop 7. Pig Latin, Hbase, Hive 8. Map Reduce Join Algorithmes, 9. Data Partition Techniques 10. SQL vs NoSQL 11. Processing MR/Hadoop with PDW : Oracle, Teradata 12. Information Retrieval: How Google Search Engine Listed papers. Selected Papers

Works 13. Cloud Computing. More New Topics to come here. NOTE: The instructor reserves the right to retain, for pedagogical reasons, either the original or a copy of your work submitted either individually or as a group project for this class. Students' names will be deleted from any retained items.