CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS

1 CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS COURSE OVERVIEW & STRUCTURE Fall 2015 Marion Neumann

2 ABOUT
Marion Neumann
  m dot neumann at wustl dot edu
  office: Jolley Hall 403
  office hours: THU 11:00am-1pm
Course website: /cse-427/
Please use Piazza (piazza.com/wustl/fall2015/cse427/home) for any questions about the course!
Sign up here: piazza.com/wustl/fall2015/cse427
8/25/15

3 LECTURES AND HOMEWORKS
Tuesday & Thursday 2:30-4:00pm in Cupples II / L009
Homework assignments
  Assigned on THU (before 5pm)
  Due the following THU (before 2:30pm)
  Use the SVN repository for submissions → find instructions on how to use it on the course webpage
TA office hours
  Kunyao Liu: WED 5:00-7:00pm in Jolley 431
  Paul Scheid: TUE 9:30-11:30am in Jolley 431

4 IN-CLASS EXAMS
2 in-class exams
  Each counts for 25% of total class performance
Dates:
  Final: 16 Dec 2015
  Midterm: 13 Oct 2015 or 15 Oct 2015

5 GRADING POLICY
Grading Summary
  50% homework assignments
  25% midterm
  25% final
Lecture participation is beneficial
  Black/white board notes
  Hands-on/practical examples
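As a rough illustration of how the grading summary combines into one number, here is a minimal sketch. Only the three weights come from the slide; the 0-100 scale, the function name `course_score`, and the sample scores are assumptions made for the example, and the slide does not say how individual homework scores are averaged.

```python
# Weights taken from the grading summary above; everything else is illustrative.
WEIGHTS = {"homework": 0.50, "midterm": 0.25, "final": 0.25}


def course_score(homework, midterm, final):
    """Weighted sum of component scores (each assumed to be on a 0-100 scale)."""
    components = {"homework": homework, "midterm": midterm, "final": final}
    return sum(WEIGHTS[name] * score for name, score in components.items())


# Hypothetical scores: 0.50*90 + 0.25*80 + 0.25*70
print(course_score(homework=90, midterm=80, final=70))  # 82.5
```

Because the two exams together carry the same weight as all homework combined, a ten-point change in the homework average moves the total as much as a ten-point change on both exams.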

6 LATE POLICY, COLLABORATION AND ACADEMIC DISHONESTY
Late Policy
Your homework assignments must be turned in on time. No late assignments will be accepted except under extraordinary circumstances. I will grant the occasional extension, but you must make your extension request at least two days before the deadline. There are absolutely no makeup quizzes or assignments for any reason.
Collaboration Policy
You are encouraged to discuss the course material with other students. Discussing the material and the general form of solutions to the labs is a key part of the class. Since, for many of the assignments, there is no single right answer, talking to other students and to the TAs is a good thing. However, everything that you turn in should be your own work, unless we tell you otherwise. If you talk about assignments with another student, then you need to explicitly tell us on the hand-in. You are not allowed to copy answers or parts of answers from anyone else, or from material you find on the Internet. This will be considered willful cheating, and will be dealt with according to the official collaboration policy. Your solutions will be compared to the solutions of other students and solutions available ONLINE!
Academic Dishonesty
Unless explicitly instructed otherwise, everything that you turn in for this course must be your own work. If you willfully misrepresent someone else's work as your own, you are guilty of cheating. Cheating, in any form, will not be tolerated in this class. There is zero tolerance of Academic Dishonesty. I will be actively searching for academic dishonesty on all homework assignments, quizzes, and exams. If you are guilty of cheating on any assignment or exam, you will receive an F in the course and be referred to the School of Engineering Discipline Committee. In severe cases, this can lead to expulsion from the University, as well as possible deportation for international students. If you copy from anyone in the class, both parties will be penalized, regardless of which direction the information flowed.
This is your only warning.
08/24/2015

7 COURSE OBJECTIVE
Introduction to big data, applied parallel computing, MapReduce, Hadoop, big data technologies/tools, large-scale data management and analysis, large-scale machine learning, large-scale network/graph analysis, handling large feature spaces
Contents may be subject to changes!

8 TOPICS TO BE COVERED (SYLLABUS)
PART I: Data Storage and Analysis
MapReduce
  General introduction
  Practical use of Hadoop MapReduce
  Algorithms using MapReduce
Data Analysis
  Hadoop Pig, Hive, and Impala
Data Management
  HDFS
  Hadoop tools (Crunch, Sqoop, Flume)

9 TOPICS TO BE COVERED (SYLLABUS)
PART II: Algorithms
Data Algorithms
  Introduction to Apache Spark
  Sorting/secondary sort
  Recommendation engines
Large-scale Machine Learning
  Clustering in MapReduce and Spark
  Classification using MapReduce and Spark
  Introduction to Apache Mahout
  Large-scale support vector machines(*)

10 TOPICS TO BE COVERED (SYLLABUS)
PART III: Structured and High-dimensional Data
Graph Data
  Link Analysis using PageRank
  Introduction to Apache Giraph (GraphLab(*))
  Social network analysis(*)
Information Retrieval/Finding Similar Items
  Big feature spaces
  Document retrieval
  Locality-sensitive hashing
(*) we might not have time to talk about this

11 BACKGROUND & PREREQUISITES
Programming
  Java*, Python**, or Perl***
  (SQL) databases & computer architecture
Algorithms
  sorting
  hashing
  CSE 241
Maths
  matrices, linear algebra
  probabilities
  graphs
  machine learning (classification, clustering, SVMs)
  (SVD, PCA)
* fully supported  ** supported  *** not supported

12 COURSE MATERIALS
The content of this class is derived largely from the Cloudera Developer Training for Apache Hadoop and Cloudera Data Analyst Training: Using Pig, Hive, and Impala with Hadoop, which are made available to Washington University through the Cloudera Academic Partnership program. Further materials are adapted from the Mining of Massive Datasets book and class taught at Stanford by Jure Leskovec.
Books
  Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeff Ullman (available online!)
  Hadoop: The Definitive Guide by Tom White
  Data Algorithms: Recipes for Scaling Up with Hadoop and Spark by Mahmoud Parsian

13 SLIDE LAYOUT
Notes!
  Note: These are usually useful.
Questions?
  Question: What are your expectations of the class?
Examples
  Quick calculations or examples: Small examples, ideas/thoughts, or calculations will appear in blue boxes.

14 SLIDE LAYOUT (2)
Advantages, benefits, properties
Problems and challenges
more data!
even more data
New Section
Additional Reading
  further readings
  videos/video lectures
  I will consider the materials to be course content.

15 SUMMARY
All relevant information can be found on the course webpage: /cse-427/
Ask all questions on Piazza!
Question: Do you have any questions?

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business

BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business Instructor: Kunpeng Zhang (kzhang@rmsmith.umd.edu) Lecture-Discussions:

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Learn how to store and analyze Big Data Learn about the cloud and its services for Big Data

Learn how to store and analyze Big Data Learn about the cloud and its services for Big Data CS-495/595 Big Data: Syllabus Spring 2015 Wed. 4:20PM - 7:00PM Constant Hall 1043 Instructor: Dr. Cartledge http://www.cs.odu.edu/ ccartled/teaching Big data is quadrupling every year!! Everyone is creating

More information

CS 1340 Sec. A Time: TR @ 8:00AM, Location: Nevins 2115. Instructor: Dr. R. Paul Mihail, 2119 Nevins Hall, Email: rpmihail@valdosta.

CS 1340 Sec. A Time: TR @ 8:00AM, Location: Nevins 2115. Instructor: Dr. R. Paul Mihail, 2119 Nevins Hall, Email: rpmihail@valdosta. CS 1340 Sec. A Time: TR @ 8:00AM, Location: Nevins 2115 Course title: Computing for Scientists, Spring 2015 Instructor: Dr. R. Paul Mihail, 2119 Nevins Hall, Email: rpmihail@valdosta.edu Class meeting

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

Big Data Systems CS 5965/6965 FALL 2015

Big Data Systems CS 5965/6965 FALL 2015 Big Data Systems CS 5965/6965 FALL 2015 Today General course overview Expectations from this course Q&A Introduction to Big Data Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2015.html

More information

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social

More information

Cleveland State University

Cleveland State University Cleveland State University CIS 612 Modern Database Programming & Big Data Processing (3-0-3) Fall 2014 Section 50 Class Nbr. 2670. Tues, Thur 4:00 5:15 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred.

More information

Big Data Course Highlights

Big Data Course Highlights Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like

More information

DSBA6100-U01 And U90 - Big Data Analytics for Competitive Advantage (Cross listed as MBAD7090, ITCS 6100, HCIP 6103) Fall 2015

DSBA6100-U01 And U90 - Big Data Analytics for Competitive Advantage (Cross listed as MBAD7090, ITCS 6100, HCIP 6103) Fall 2015 DSBA6100-U01 And U90 - Big Data Analytics for Competitive Advantage (Cross listed as MBAD7090, ITCS 6100, HCIP 6103) Fall 2015 As created and co-taught by Dr. Wlodek and Dr. Chandra, 2015-2025 Dr. Wlodek

More information

CS 5890: Introduction to Data Science Syllabus, Utah State University, Fall 2015 http://digital.cs.usu.edu/~kyumin/cs5890/

CS 5890: Introduction to Data Science Syllabus, Utah State University, Fall 2015 http://digital.cs.usu.edu/~kyumin/cs5890/ CS 5890: Introduction to Data Science Syllabus, Utah State University, Fall 2015 http://digital.cs.usu.edu/~kyumin/cs5890/ 1. Credits: 3 a. Class Meets: Tuesday and Thursday 1:30pm - 2:45pm, Old Main (MAIN)

More information

MAT 103B College Algebra Part I Winter 2016 Course Outline and Syllabus

MAT 103B College Algebra Part I Winter 2016 Course Outline and Syllabus MAT 103B College Algebra Part I Winter 2016 Course Outline and Syllabus Instructor: Meeting Venue: Email: Caren LeVine Monday/Wednesday 6pm 7:50pm, E106 celevine@mail.ltcc.edu Office Hours (Outside The

More information

CSCI-599 DATA MINING AND STATISTICAL INFERENCE

CSCI-599 DATA MINING AND STATISTICAL INFERENCE CSCI-599 DATA MINING AND STATISTICAL INFERENCE Course Information Course ID and title: CSCI-599 Data Mining and Statistical Inference Semester and day/time/location: Spring 2013/ Mon/Wed 3:30-4:50pm Instructor:

More information

CSE532 Theory of Database Systems Course Information. CSE 532, Theory of Database Systems Stony Brook University http://www.cs.stonybrook.

CSE532 Theory of Database Systems Course Information. CSE 532, Theory of Database Systems Stony Brook University http://www.cs.stonybrook. CSE532 Theory of Database Systems Course Information CSE 532, Theory of Database Systems Stony Brook University http://www.cs.stonybrook.edu/~cse532 Course Description The 3 credits course will cover advanced

More information

Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015

Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Syllabus for MATH 191 MATH 191 Topics in Data Science: Algorithms and Mathematical Foundations Department of Mathematics, UCLA Fall Quarter 2015 Lecture: MWF: 1:00-1:50pm, GEOLOGY 4645 Instructor: Mihai

More information

Introduction to Data Science: CptS 483-06 Syllabus First Offering: Fall 2015

Introduction to Data Science: CptS 483-06 Syllabus First Offering: Fall 2015 Course Information Introduction to Data Science: CptS 483-06 Syllabus First Offering: Fall 2015 Credit Hours: 3 Semester: Fall 2015 Meeting times and location: MWF, 12:10 13:00, Sloan 163 Course website:

More information

02-201: Programming for Scientists

02-201: Programming for Scientists 1. Course Information 1.1 Course description 02-201: Programming for Scientists Carl Kingsford Fall 2015 Provides a practical introduction to programming for students with little or no prior programming

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

CSE 6040 Computing for Data Analytics: Methods and Tools. Lecture 1 Course Overview

CSE 6040 Computing for Data Analytics: Methods and Tools. Lecture 1 Course Overview CSE 6040 Computing for Data Analytics: Methods and Tools Lecture 1 Course Overview DA KUANG, POLO CHAU GEORGIA TECH FALL 2014 Fall 2014 CSE 6040 COMPUTING FOR DATA ANALYSIS 1 Course Staff Instructor Da

More information

ITG Software Engineering

ITG Software Engineering Introduction to Cloudera Course ID: Page 1 Last Updated 12/15/2014 Introduction to Cloudera Course : This 5 day course introduces the student to the Hadoop architecture, file system, and the Hadoop Ecosystem.

More information

Big Data Management and Analytics

Big Data Management and Analytics Big Data Management and Analytics Lecture Notes Winter semester 2015 / 2016 Ludwig-Maximilians-University Munich Prof. Dr. Matthias Renz 2015 Based on lectures by Donald Kossmann (ETH Zürich), as well

More information

How To Learn Data Analytics

How To Learn Data Analytics COURSE DESCRIPTION Spring 2014 COURSE NAME COURSE CODE DESCRIPTION Data Analytics: Introduction, Methods and Practical Approaches INF2190H The influx of data that is created, gathered, stored and accessed

More information

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level?

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level? Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level? Dr. Frank Lee Chair, ECE/CS/IT New York Institute of Technology Old Westbury, NY 11568 Topics This talk describes:

More information

Data Analyst Program- 0 to 100

Data Analyst Program- 0 to 100 Development Data Analyst Program- 0 to 100 Master the Data Analysis tools like Pig and hive Data Science Build a recommendation engine 1 Data Analyst Program- 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

Dealing with Data Especially Big Data

Dealing with Data Especially Big Data Dealing with Data Especially Big Data INFO-GB-2346.30 Spring 2016 Very Rough Draft Subject to Change Professor Norman White Background: Most courses spend their time on the concepts and techniques of analyzing

More information

WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley

WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley Disclaimer: This material is protected under copyright act AnalytixLabs, 2011. Unauthorized use and/ or duplication of this material or

More information

Estimating PageRank Values of Wikipedia Articles using MapReduce

Estimating PageRank Values of Wikipedia Articles using MapReduce Estimating PageRank Values of Wikipedia Articles using MapReduce Due: Sept. 30 Wednesday 5:00PM Submission: via Canvas, individual submission Instructor: Sangmi Pallickara Web page: http://www.cs.colostate.edu/~cs535/assignments.html

More information

BIG DATA - HADOOP PROFESSIONAL amron

BIG DATA - HADOOP PROFESSIONAL amron 0 Training Details Course Duration: 30-35 hours training + assignments + actual project based case studies Training Materials: All attendees will receive: Assignment after each module, video recording

More information

Los Angeles Pierce College. SYLLABUS Math 227: Elementary Statistics. Fall 2011 T Th 4:45 6:50 pm Section #3307 Room: MATH 1400

Los Angeles Pierce College. SYLLABUS Math 227: Elementary Statistics. Fall 2011 T Th 4:45 6:50 pm Section #3307 Room: MATH 1400 Los Angeles Pierce College SYLLABUS Math 227: Elementary Statistics Fall 2011 T Th 4:45 6:50 pm Section #3307 Room: MATH 1400 Instructor: Pauline Pham Office hours: T Th: 4:00 4:35 PM, Room Math 1409X

More information

CS 425 Software Engineering. Course Syllabus

CS 425 Software Engineering. Course Syllabus Department of Computer Science and Engineering College of Engineering, University of Nevada, Reno Fall 2013 CS 425 Software Engineering Course Syllabus Lectures: Instructor: Office hours: Catalog description:

More information

CS 425 Software Engineering. Course Syllabus

CS 425 Software Engineering. Course Syllabus Department of Computer Science and Engineering College of Engineering, University of Nevada, Reno Fall 2015 CS 425 Software Engineering Course Syllabus Lectures: TR, 9:30 10:45 am, LEG-212 Instructor:

More information

Unified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia

Unified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia Unified Big Data Processing with Apache Spark Matei Zaharia @matei_zaharia What is Apache Spark? Fast & general engine for big data processing Generalizes MapReduce model to support more types of processing

More information

Canisius College Computer Science Department Computer Programming for Science CSC107 & CSC107L Fall 2014

Canisius College Computer Science Department Computer Programming for Science CSC107 & CSC107L Fall 2014 Canisius College Computer Science Department Computer Programming for Science CSC107 & CSC107L Fall 2014 Class: Tuesdays and Thursdays, 10:00-11:15 in Science Hall 005 Lab: Tuesdays, 9:00-9:50 in Science

More information

MAT 183 - Elements of Modern Mathematics Syllabus for Spring 2011 Section 100, TTh 9:30-10:50 AM; Section 200, TTh 8:00-9:20 AM

MAT 183 - Elements of Modern Mathematics Syllabus for Spring 2011 Section 100, TTh 9:30-10:50 AM; Section 200, TTh 8:00-9:20 AM MAT 183 - Elements of Modern Mathematics Syllabus for Spring 2011 Section 100, TTh 9:30-10:50 AM; Section 200, TTh 8:00-9:20 AM Course Instructor email office ext. Thomas John, Ph.D. thjohn@syr.edu 224

More information

USC Viterbi School of Engineering

USC Viterbi School of Engineering USC Viterbi School of Engineering INF 551: Foundations of Data Management Units: 3 Term Day Time: Spring 2016 MW 8:30 9:50am (section 32411D) Location: GFS 116 Instructor: Wensheng Wu Office: GER 204 Office

More information

CS 207 - Data Science and Visualization Spring 2016

CS 207 - Data Science and Visualization Spring 2016 CS 207 - Data Science and Visualization Spring 2016 Professor: Sorelle Friedler sorelle@cs.haverford.edu An introduction to techniques for the automated and human-assisted analysis of data sets. These

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

INFO/CS 4302 Web Information Systems. FT 2012 Week 1: Course Introduction

INFO/CS 4302 Web Information Systems. FT 2012 Week 1: Course Introduction INFO/CS 4302 Web Information Systems FT 2012 Week 1: Course Introduction Who We Are - Instructors Bernhard Haslhofer Theresa Velden bh392@cornell.edu Office hours: TUE / THU 1:30-3:00 tav6@cornell.edu

More information

Video Game Programming ITP 380 (4 Units)

Video Game Programming ITP 380 (4 Units) Video Game Programming ITP 380 (4 Units) Objective This course provides students with an in-depth introduction to technologies and techniques used in the game industry today. At semester s end, students

More information

CSE 562 Database Systems

CSE 562 Database Systems UB CSE Database Courses CSE 562 Database Systems CSE 462 Database Concepts Introduction CSE 562 Database Systems Some slides are based or modified from originals by Database Systems: The Complete Book,

More information

Ali Ghodsi Head of PM and Engineering Databricks

Ali Ghodsi Head of PM and Engineering Databricks Making Big Data Simple Ali Ghodsi Head of PM and Engineering Databricks Big Data is Hard: A Big Data Project Tasks Tasks Build a Hadoop cluster Challenges Clusters hard to setup and manage Build a data

More information

Office: D-116-9. Instructor: Vanessa Jones. Phone: (714) 628-4948. Office Hours: Monday & Wednesday 1:30pm-2:30pm. Email: Jones Vanessa@sccollege.

Office: D-116-9. Instructor: Vanessa Jones. Phone: (714) 628-4948. Office Hours: Monday & Wednesday 1:30pm-2:30pm. Email: Jones Vanessa@sccollege. Fall Semester 2015 Santiago Canyon College: Mathematics & Sciences Division (Room SC-210) MATH 80: Intermediate Algebra (Section Number 10247) Tuesday & Thursday 10:30 am-1:00pm (Room SC-110) Instructor:

More information

BIG DATA HADOOP TRAINING

BIG DATA HADOOP TRAINING BIG DATA HADOOP TRAINING DURATION 40hrs AVAILABLE BATCHES WEEKDAYS (7.00AM TO 8.30AM) & WEEKENDS (10AM TO 1PM) MODE OF TRAINING AVAILABLE ONLINE INSTRUCTOR LED CLASSROOM TRAINING (MARATHAHALLI, BANGALORE)

More information

Email: justinjia@ust.hk Office: LSK 5045 Begin subject: [ISOM3360]...

Email: justinjia@ust.hk Office: LSK 5045 Begin subject: [ISOM3360]... Business Intelligence and Data Mining ISOM 3360: Spring 2015 Instructor Contact Office Hours Course Schedule and Classroom Course Webpage Jia Jia, ISOM Email: justinjia@ust.hk Office: LSK 5045 Begin subject:

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

CAS CS 565, Data Mining

CAS CS 565, Data Mining CAS CS 565, Data Mining Course logistics Course webpage: http://www.cs.bu.edu/~evimaria/cs565-10.html Schedule: Mon Wed, 4-5:30 Instructor: Evimaria Terzi, evimaria@cs.bu.edu Office hours: Mon 2:30-4pm,

More information

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data Fundamentals Ed 1 NEW Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big

More information

CSCD18: Computer Graphics

CSCD18: Computer Graphics CSCD18: Computer Graphics Professor: Office: Office hours: Teaching Assistant: Office hours: Lectures: Tutorials: Website: Leonid Sigal lsigal@utsc.utoronto.ca ls@cs.toronto.edu Room SW626 Monday 12:00-1:00pm

More information

Communicating with the Elephant in the Data Center

Communicating with the Elephant in the Data Center Communicating with the Elephant in the Data Center Who am I? Instructor Consultant Opensource Advocate http://www.laubersoltions.com sml@laubersolutions.com Twitter: @laubersm Freenode: laubersm Outline

More information

Big Data and Analytics (Fall 2015)

Big Data and Analytics (Fall 2015) Big Data and Analytics (Fall 2015) Core/Elective: MS CS Elective MS SPM Elective Instructor: Dr. Tariq MAHMOOD Credit Hours: 3 Pre-requisite: All Core CS Courses (Knowledge of Data Mining is a Plus) Every

More information

CSE 40437/60437 - Social Sensing and Cyber- Physical Systems - Spring 2015

CSE 40437/60437 - Social Sensing and Cyber- Physical Systems - Spring 2015 CSE 40437/60437 - Social Sensing and Cyber- Physical Systems - Spring 2015 Instructor Prof. Dong Wang dwang5 at nd dot edu Office Hours: Tue 3:15-5:15 PM, 214B Cushing Hall TA: Chao Huang chuang7 at nd

More information

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

More information

Voice: (276) 619-4352 and (813) 507-9956 E-mail: bnorton@hgs.k12.va.us Office Hours: by appointment

Voice: (276) 619-4352 and (813) 507-9956 E-mail: bnorton@hgs.k12.va.us Office Hours: by appointment A. Linwood Holton Governor s School INTRODUCTION TO ENGINEERING METHODS and COMPUTER PROGRAMMING Course Syllabus Instructor: Dr. Bruce C. Norton Voice: (276) 619-4352 and (813) 507-9956 E-mail: bnorton@hgs.k12.va.us

More information

BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview

BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM. An Overview BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM An Overview Contents Contents... 1 BIG DATA SERIES: HADOOP DEVELOPER TRAINING PROGRAM... 1 Program Overview... 4 Curriculum... 5 Module 1: Big Data: Hadoop

More information

B490 Mining the Big Data. 0 Introduction

B490 Mining the Big Data. 0 Introduction B490 Mining the Big Data 0 Introduction Qin Zhang 1-1 Data Mining What is Data Mining? A definition : Discovery of useful, possibly unexpected, patterns in data. 2-1 Data Mining What is Data Mining? A

More information

ANGELO STATE UNIVERSITY/GLEN ROSE HIGH SCHOOL DUAL CREDIT ALGEBRA II AND COLLEGE ALGEBRA/MATH 1302 2015-2016

ANGELO STATE UNIVERSITY/GLEN ROSE HIGH SCHOOL DUAL CREDIT ALGEBRA II AND COLLEGE ALGEBRA/MATH 1302 2015-2016 ANGELO STATE UNIVERSITY/GLEN ROSE HIGH SCHOOL DUAL CREDIT ALGEBRA II AND COLLEGE ALGEBRA/MATH 1302 2015-2016 I. INSTRUCTOR MRS. JAMI LOVELADY Office: 504 Tutorial Hours: Mornings Monday through Friday

More information

Programming for Big Data

Programming for Big Data Long Title: Language of Instruction: Programming for English Module Code: H8BGD Credits: 5 NFQ Level: LEVEL 8 Field of Study: Software and applications development and analysis Taxonomy: Blooms Module

More information

Cleveland State University

Cleveland State University Cleveland State University CIS 695 Big Data Processing and Data Analytics (3-0-3) 2016 Section 51 Class Nbr. 5493. Tues, Thur TBA Prerequisites: CIS 505 and CIS 530. CIS 612, CIS 660 Preferred. Instructor:

More information

1.00 Lecture 1. Course information Course staff (TA, instructor names on syllabus/faq): 2 instructors, 4 TAs, 2 Lab TAs, graders

1.00 Lecture 1. Course information Course staff (TA, instructor names on syllabus/faq): 2 instructors, 4 TAs, 2 Lab TAs, graders 1.00 Lecture 1 Course Overview Introduction to Java Reading for next time: Big Java: 1.1-1.7 Course information Course staff (TA, instructor names on syllabus/faq): 2 instructors, 4 TAs, 2 Lab TAs, graders

More information

ISM 4210: DATABASE MANAGEMENT

ISM 4210: DATABASE MANAGEMENT GENERAL INFORMATION: ISM 4210: DATABASE MANAGEMENT COURSE SYLLABUS Class Times: Tuesday, Thursday 9:35 11:30 AM Class Location: HVNR 240 Professor: Dr. Aditi Mukherjee Office; Phone: STZ 360, 39-20648

More information

Pierce College Online Math. Math 115. Section #0938 Fall 2013

Pierce College Online Math. Math 115. Section #0938 Fall 2013 1 Pierce College Online Math Math 115 Section #0938 Fall 2013 Class meets in room 1512 Mon. & Wed. 1:30pm 2:55pm Instructor: Dr. Forkeotes Office: 1409F Office hours: Mon.Wed.12:30-1:30pm, M-Th 6:45pm

More information

Prerequisite Math 115 with a grade of C or better, or appropriate skill level demonstrated through the Math assessment process, or by permit.

Prerequisite Math 115 with a grade of C or better, or appropriate skill level demonstrated through the Math assessment process, or by permit. Summer 2016 Math 125 Intermediate Algebra Section 0179, 5 units Online Course Syllabus Instructor Information Instructor: Yoon Yun Email: yunyh@lamission.edu Phone: (818)364-7691 MyMathLab: MyMathLab.com

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

L1: Introduction to Hadoop

L1: Introduction to Hadoop L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General

More information

Lake-Sumter Community College Course Syllabus. STA 2023 Course Title: Elementary Statistics I. Contact Information: Office Hours:

Lake-Sumter Community College Course Syllabus. STA 2023 Course Title: Elementary Statistics I. Contact Information: Office Hours: Lake-Sumter Community College Course Syllabus Course / Prefix Number: STA 2023 Course Title: Elementary Statistics I CRN: 10105 (T TH) 10106 (M W) Credit: 3 Term: Fall 2011 Course Catalog Description:

More information

Hadoop Job Oriented Training Agenda

Hadoop Job Oriented Training Agenda 1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module

More information

BUS 1950-002-008 Computer Concepts and Applications for Business Fall 2012

BUS 1950-002-008 Computer Concepts and Applications for Business Fall 2012 BUS 1950-002-008 Computer Concepts and Applications for Business Fall 2012 Instructor: Contact Information: Susan Kling Office: 4505 Lumpkin Hall Phone: 217-581-8547 Email: SJKling@eiu.edu Course Website:

More information

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture. Big Data Hadoop Administration and Developer Course This course is designed to understand and implement the concepts of Big data and Hadoop. This will cover right from setting up Hadoop environment in

More information

CENTRAL COLLEGE Department of Mathematics COURSE SYLLABUS

CENTRAL COLLEGE Department of Mathematics COURSE SYLLABUS MATH 1314: College Algebra Fall 2010 / Tues-Thurs 7:30-9:00 pm / Gay Hall Rm 151 / CRN: 47664 INSTRUCTOR: CONFERENCE TIMES: CONTACT INFORMATION:

Dixie State college Family and Consumer Science Syllabus fall 2011 COURSE INFORMATION

Dixie State college Family and Consumer Science Syllabus fall 2011 Course Number: NFS 2990 Course Name: CULINARY ARTS COURSE INFORMATION Credit Hours: Prerequisites: 3 CREDIT HOURS none Dates: Aug 22 --

KENNESAW STATE UNIVERSITY GRADUATE COURSE PROPOSAL OR REVISION, Cover Sheet (10/02/2002)

KENNESAW STATE UNIVERSITY GRADUATE COURSE PROPOSAL OR REVISION, Cover Sheet (10/02/2002) Course Number/Program Name ACS 7420 Algorithm Design for Big Data Department Computer Science Degree Title (if applicable)

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia

Monitis Project Proposals for AUA September 2014, Yerevan, Armenia Distributed Log Collecting and Analysing Platform Project Specifications Category: Big Data and NoSQL Software Requirements: Apache Hadoop

Hadoop Development & BI- 0 to 100

Development Master the Data Analysis tools like Pig and hive Data Science Hadoop Development & BI- 0 to 100 Build a recommendation engine Hadoop Development - 0 to 100 HADOOP SCHOOL OF TRAINING Basics

CSE452 Computer Graphics

CSE452 Computer Graphics Spring 2015 CSE452 Introduction Slide 1 Welcome to CSE452!! What is computer graphics? About the class CSE452 Introduction Slide 2 What is Computer Graphics? Modeling Rendering

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

Lecture 10: HBase Claudia Hauff (Web Information Systems) ti2736b-ewi@tudelft.nl

Big Data Processing, 2014/15 Lecture 10: HBase Claudia Hauff (Web Information Systems) ti2736b-ewi@tudelft.nl 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind the

Course #1506/ Course Syllabus Beginning College Algebra

501 West College Drive Brainerd, MN 5640 Headings Red = CLC syllabus Blue = High School Info. Black = additional info. Pierz Healy High School 112 Kamnic Street Pierz, MN 56364 Course #1506/ Course Syllabus

Statistics W4240: Data Mining Columbia University Spring, 2014

Statistics W4240: Data Mining Columbia University Spring, 2014 Version: January 30, 2014. The syllabus is subject to change, so look for the version with the most recent date. Course Description Massive

MIS 310: Management Information Systems (Spring 2015)

Syllabus MIS 310: Management Information Systems (Spring 2015) Instructor: Dr. Minder Chen, Professor of MIS Email: Minder.Chen@csuci.edu Phone number: 805-437-2683 Class Location: Smith Decision Center

CMPT 165 INTRODUCTION TO THE INTERNET AND THE WORLD WIDE WEB

CMPT 165 INTRODUCTION TO THE INTERNET AND THE WORLD WIDE WEB Unit 0 Course Introduction Slides based on course material SFU Icons their respective owners 1 How many activities in your life make use of

Required Textbook: Sciarra, Dorothy June, Dorsey, Anne G., Developing and Administering a Child Care and Education Program, 7th Edition.

CD 137 Syllabus for Spring, 2013 A 3 unit course taught exclusively online, with online orientation completed the first week of the semester Section #0817 Administration of

Math 161A-01: College Algebra and Trigonometry I Meeting Days: MW 9:31am 11:30am Room : D9

Math 161A-01: College Algebra and Trigonometry I Meeting Days: MW 9:31am 11:30am Room : D9 INSTRUCTOR INFORMATION: Name: Steve S. Lam, Associate Professor Contact No: 735-5600 Office Hrs.: MW 8:30am 9:30am

CS 425 Software Engineering

Department of Computer Science and Engineering College of Engineering, University of Nevada, Reno Fall 2009 CS 425 Software Engineering Lectures: Instructors: Office hours: Catalog description: Course

George Washington University Department of Psychology PSYC 001: General Psychology

George Washington University Department of Psychology PSYC 001: General Psychology Course Syllabus Fall 2006 Times & Place Section 14 (CRN #70754) Tues & Thurs: 11:10am 12:25pm: Corcoran #302 Section 15

OPERATIONS, BUSINESS ANALYTICS & INFORMATION SYSTEMS

IT Architecture and Networking IS-3040-001 Spring 2015 Office : 523 Lindner Hall Telephone : 513-556-7058 E-mail : Robert.Rokey@uc.edu Office Hours: by appointment. TEXT: Englander, Irv. The Architecture

Instructions for Labs

CSE 241 Algorithms and Data Structures Jan 12, 2014 Instructions for Labs 1 Basic Information For each lab, you must check in the code before class on the day when the lab is due. The code will be written

After completing SI 539, students will have a working personal portfolio website in production.

SI 539, Fall 2014 Complex Web Design Lecture: Friday: 1:00pm 3:00pm *Must leave by 3:15 Discussion Sections Varies Office Hours*: Tues: 11:35 12:35 Wed mornings *Please check my Google Calendar for updates

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zetabytes =.08

Syllabus for Course 1-02-326: Database Systems Engineering at Kinneret College

Syllabus for Course 1-02-326: Database Systems Engineering at Kinneret College Instructor: Michael J. May Semester 2 of 5769 1 Course Details The course meets 9:00am 11:00am on Wednesdays. The Targil for

EMPORIA STATE UNIVERSITY SCHOOL OF BUSINESS Department of Accounting and Information Systems. IS213 A Management Information Systems Concepts

EMPORIA STATE UNIVERSITY SCHOOL OF BUSINESS Department of Accounting and Information Systems IS213A Course Syllabus Spring 2013 MISSION STATEMENT: The School of Business prepares a diverse student body

Lecture 1: Course Introduction

Lecture 1: Course Introduction CSE 123: Computer Networks Alex C. Snoeren First Discussion Friday 10/4! Lecture 1 Overview Class overview Expected outcomes Structure of the course Policies and procedures

Web-Based Database Applications ITP 300x (3 Units)

Web-Based Database Applications ITP 300x (3 Units) Objective Examination of the architecture and use of database-enabled web sites. Define the foundation for using relational databases on the web. Architectural

CIS 4301 - Information and Database Systems I. Course Syllabus Spring 2015

CIS 4301 - Information and Database Systems I 1. General Info Credits: Three Section: 7776 Prerequisite: CIS 3020 or CIS 3023, COT 3100 Instructor: Prof. Daisy Zhe Wang Meeting Times: M W F 9:35AM to 10:25AM

INFSCI 1017 Implementation of Information Systems

INFSCI 1017 Implementation of Information Systems Time: Thursdays 6:00 8:30 Location: Information Science Building, Room 411 Instructor: Dmitriy Babichenko Office Hours: Tuesdays, 3-5PM Wednesday, 3-5PM

#TalendSandbox for Big Data

Evaluation of Apache Hadoop with the #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

Big Data Analytics. Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs

Big Data Analytics Genoveva Vargas-Solar http://www.vargas-solar.com/big-data-analytics French Council of Scientific Research, LIG & LAFMIA Labs Montevideo, 22 nd November 4 th December, 2015 INFORMATIQUE