BUDT 758B-0501: Big Data Analytics (Fall 2015) Decisions, Operations & Information Technologies Robert H. Smith School of Business

Instructor: Kunpeng Zhang
Lecture-Discussions: Monday/Wednesday, 12:30-1:45 PM, Room: VMH 1520
Office Hour: Monday, 9:30-11:30 AM, Room: VMH 4316
Textbook: Mining of Massive Datasets (Hardcopy: Amazon.com; E-version: freely available here)

About the course

As web technology and mobile use rapidly evolve, people are becoming more and more enthusiastic about interacting, sharing, and communicating with each other through different social platforms, communities, and media. In recent years, this collective intelligence has spread to many different domains, with particular focus on e-commerce, healthcare, and social networks, causing the volume of user-generated data to expand exponentially. Distilling knowledge from such a large amount of unstructured, dynamically changing data is an extremely difficult task without the help of distributed techniques. Typical data include millions of online customer reviews, social comments from Facebook, Twitter, and other popular social platforms, shopping transaction records, mobile messages, financial news, climate data, and others.

BUDT 758 (Big Data Analytics) is a graduate-level class that introduces most of the state-of-the-art concepts and techniques in big data analytics and data management. Most current intelligent marketing decisions are based on analyzing user-generated data, such as the sentiments of comments and customer reviews, purchase transaction records, and user friendship networks. As business data takes on the 3Vs (volume, variety, and velocity), distributed techniques for analyzing and managing data have been widely and successfully deployed in many areas. In addition, knowledge of business big data analytics can make us more competitive in our future careers.
This course has some prerequisites:
- data mining and information retrieval techniques (optional);
- basic computer programming skills (Java or Python is preferred);
- basic college-level math knowledge (probability/statistics/matrices).

Since big data is a newly emerging topic that has been evolving quickly, we do not have a specific, fixed curriculum. The main format of this course will be teaching, class discussion, hands-on case studies, and projects. In this course, we will cover the basic concepts of the big data framework introduced by Apache: Hadoop and MapReduce. More importantly, we will cover how to solve big data problems using the right distributed algorithms. The ultimate goal of this course is to master the basic big data analytical techniques and tools for solving business problems through hands-on experience and projects.

What this course offers:
- Installation and configuration of Hadoop in a multi-node environment.
- Basic concepts and ideas about Big Data.
- Introduction to the MapReduce framework.
- Distributed algorithms: Recommender Systems, Clustering, Classification, Topic Models, and Network Analysis.
- Distributed data management and NoSQL techniques: Apache Hive and Apache Pig.
- Hands-on experience with big data analysis to solve business problems.

What this course does NOT offer:
- This is NOT a machine learning or data mining course. We will touch on very few details of machine learning algorithms. If you want to learn the principles of learning algorithms, I recommend taking a statistical machine learning class and an optimization in machine learning class, which are usually offered by the computer science department.
- This is NOT a programming course. We assume you have basic programming skills and are familiar with how to interact with Linux/Unix systems (such as how to create folders, delete files, and execute files from the command line).

Lab sessions

This course has a lab component. The labs give you a chance to get hands-on experience with the computer and with programming. The instructor, TA, or your fellow classmates can help you get through the bugs. Most labs will involve the use of popular distributed data analysis algorithms (machine learning). In total, we will have about 7 labs, as listed below. For most labs, you need to submit a lab report before the next lab (the due date may change depending on the difficulty of the lab).
- How to configure and install a Hadoop environment on a multi-node cluster
- How to set up and use Amazon EC cloud
- How to write and run a basic MapReduce program using Java or Python
- K-means algorithm for clustering under Mahout
- Recommender system algorithm under Mahout
- Topic modeling algorithm under Mahout
- Social network analysis

Assignments

We have 2 homework assignments. These assignments are drawn mainly from the lectures.
They could cover basic MapReduce, Frequent Itemset Mining, Decision Trees, K-Means, Recommendation System Algorithms, Topic Models, Locality Sensitive Hashing, some network analysis, or data management (NoSQL). These assignments will help you understand the concepts and ideas you have learned in class.

Plagiarism Policy: Inevitably in a programming course, it seems that a few people will turn in work that is not their own. You should understand that it is usually easy to detect copying of programs, even when a program is modified to try to disguise its source. Copying a program, or letting someone else copy your program, is a form of academic dishonesty, and the penalties can be found here.
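To make the "basic MapReduce" topic listed among the assignment areas more concrete, here is a minimal sketch in plain Python, assuming a word-count task. It only simulates the map, shuffle, and reduce phases in a single process and does not use Hadoop. The function names (map_phase, shuffle, reduce_phase, word_count) are illustrative assumptions and are not taken from the course materials.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for each word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in a single process (no cluster involved)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical input lines, for illustration only.
    print(word_count(["big data big ideas", "data beats ideas"]))
```

In a real Hadoop job the framework performs the shuffle step and distributes the map and reduce calls across machines; the sketch keeps everything local so the data flow is easy to follow.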

Class project

There is a class project for each group. Each group may have at most 3 members. Two formats are acceptable: a consulting case study or a runnable system (frontend + backend). For the case study, each group will be assigned a case (most involve real data and problems from industry). For the system, you can use existing online datasets or download your own datasets from online resources such as Facebook, Twitter, Yelp, Amazon.com, Yahoo financial news, etc., and then run existing big data analysis algorithms to show some interesting results.

Grading

Your final grade for the course will be composed of the following items:
Attendance: 5% * 1 = 5%
Class project: 35% * 1 = 35%
Lab report: 10% * 4 = 40%
Assignments: 10% * 2 = 20%

Letter grades are assigned as follows:
Points Letter Grade Percentage A A A B B B C C C D D D F Below 60

Attendance, etc.

I assume that you understand the importance of attending class. While I do not check attendance in every lecture, I expect you to be present unless circumstances make that impossible. If you miss your project presentation without an extremely good excuse, you will receive a grade of ZERO for it. If you think you have an excuse for missing your presentation, please discuss it with me, in advance if possible. If I judge that your excuse is reasonable, I will, depending on the circumstances, either give you a make-up presentation or average your other grades so that the missing grade does not count against you.

Although it should not need to be said, I expect you to maintain a reasonable level of decorum in class. This means that there is usually no eating or drinking in class. Please turn off your cell phone, and please do not walk in late or walk in and out of the room during lecture.

Disability Services

The Office of Disability Services works to ensure the accessibility of UMD programs, classes, and services to students with disabilities.
Services are available for students who have documented disabilities, including vision or hearing impairments and emotional or physical disabilities. Students with disability/access needs or questions may contact the Office of Disability Services at (301)

Office Hours, Email, WWW

I am on campus most days, and you are welcome to come in anytime you can find me there. My office hour is Monday afternoon, 4:00-6:00 PM, but your office visits are certainly not restricted to my regular office hours (appointments by email are preferred outside regular office hours). My email address is kpzhang@umd.edu. Email is a good way to communicate with me, since I usually answer messages within a day of receiving them. The home page for this course will be up soon. It will contain a weekly guide to the course and links to the corresponding readings. We also use ELMS to post announcements, lectures, and assignments during the semester.

Tentative Schedule

Here is a tentative schedule of lectures, readings, and labs for this course. We will try to keep approximately to this schedule. We will not cover every topic in every section, but I recommend that you read the first seven chapters of the book in their entirety if you are really interested in learning Java. (Note that we may change the schedule during the semester. Chapters are in the book: Mining of Massive Datasets.)

08/24 & 08/26
  Topics: Introduction to Big Data
  Readings: Chapter 1. Data Mining

08/31 & 09/02 & 09/09
  Topics: Configuration and Installation of Hadoop
  Readings: Hadoop Cluster Setup; Running Hadoop on Linux (Single-Node-Cluster); Running Hadoop on Linux (Multi-Node-Cluster); Examples

09/14 & 09/16
  Topics: Basic Hadoop Programming: MapReduce
  Readings: MapReduce Tutorial; MapReduce: Simplified Data Processing on Large Clusters; Chapter 2: Large-Scale File Systems and Map-Reduce

09/21 & 09/23
  Topics: Frequent Itemsets and Association Rules
  Readings: Chapter 6: Frequent itemsets

09/28 & 09/30
  Topics: K-means and Hierarchical K-means Clustering
  Readings: Chapter 7: Clustering

10/05 & 10/07
  Topics: Collaborative Filtering
  Readings: Chapter 9: Recommendation systems; Item-based Collaborative Filtering

10/12 & 10/14
  Topics: Vector Similarity; Locality Sensitive Hashing (LSH)
  Readings: Chapter 3: Finding Similar Items; Cosine Similarity

10/19 & 10/21
  Topics: Latent Dirichlet Allocation (LDA)
  Readings: Latent Dirichlet Allocation; Finding Scientific Topics; Studying the History of Ideas Using Topic Models

10/26 & 10/28
  Topics: Sentiment Identification
  Readings: Opinion Mining and Sentiment Analysis

11/02 & 11/04
  Topics: Network Analysis
  Readings: Chapter 10: Analysis of Social Networks; Community Detection in graphs

11/09 & 11/11
  Topics: Amazon EMR and Spark
  Readings: Amazon Elastic MapReduce; Spark

11/16 & 11/18
  Topics: Distributed Data Management
  Readings: Apache HBase

11/23 & 11/30
  Topics: Distributed Data Management
  Readings: Apache Pig

12/02 & 12/07 & 12/09
  Topics: Project presentation
  Readings: TBD
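As a worked illustration of the weights listed in the Grading section (attendance 5%, class project 35%, four lab reports at 10% each, two assignments at 10% each), the following Python sketch combines item scores into a course total. It assumes every item is scored on a 0-100 scale, which the syllabus does not state; the function name final_grade and the sample scores are hypothetical.

```python
# Weights in percentage points, taken from the Grading section of this syllabus.
WEIGHTS = {
    "attendance": 5,    # 5% * 1
    "project": 35,      # 35% * 1
    "lab_report": 10,   # 10% each, 4 reports counted
    "assignment": 10,   # 10% each, 2 assignments
}


def final_grade(attendance, project, lab_reports, assignments):
    """Return the weighted course total on a 0-100 scale.

    Every score is assumed to be on a 0-100 scale (an assumption made for
    this sketch, not something the syllabus specifies).
    """
    if len(lab_reports) != 4 or len(assignments) != 2:
        raise ValueError("expected 4 lab report scores and 2 assignment scores")
    weighted = (
        WEIGHTS["attendance"] * attendance
        + WEIGHTS["project"] * project
        + WEIGHTS["lab_report"] * sum(lab_reports)
        + WEIGHTS["assignment"] * sum(assignments)
    )
    return weighted / 100


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(final_grade(100, 80, [90, 85, 70, 95], [88, 92]))
```

The weights are kept as whole percentage points and divided by 100 only at the end, which avoids rounding surprises from summing fractional weights.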

CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS

CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS CSE 427 CLOUD COMPUTING WITH BIG DATA APPLICATIONS COURSE OVERVIEW & STRUCTURE Fall 2015 Marion Neumann ABOUT Marion Neumann email: m dot neumann at wustl dot edu office: Jolley Hall 403 office hours:

More information

Learn how to store and analyze Big Data Learn about the cloud and its services for Big Data

Learn how to store and analyze Big Data Learn about the cloud and its services for Big Data CS-495/595 Big Data: Syllabus Spring 2015 Wed. 4:20PM - 7:00PM Constant Hall 1043 Instructor: Dr. Cartledge http://www.cs.odu.edu/ ccartled/teaching Big data is quadrupling every year!! Everyone is creating

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

IST565 M001 Yu Spring 2015 Syllabus Data Mining

IST565 M001 Yu Spring 2015 Syllabus Data Mining IST565 M001 Yu Spring 2015 Syllabus Data Mining Draft updated 10/28/2014 Instructor: Professor Bei Yu Classroom: Hinds 117 Email: byu.teaching@gmail.com Class time: 3:45-5:05 Wednesdays Office: Hinds 320

More information

Cleveland State University

Cleveland State University Cleveland State University CIS 612 Modern Database Programming & Big Data Processing (3-0-3) Fall 2014 Section 50 Class Nbr. 2670. Tues, Thur 4:00 5:15 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred.

More information

A Professional Big Data Master s Program to train Computational Specialists

A Professional Big Data Master s Program to train Computational Specialists A Professional Big Data Master s Program to train Computational Specialists Anoop Sarkar, Fred Popowich, Alexandra Fedorova! School of Computing Science! Education for Employable Graduates: Critical Questions

More information

CSCI-599 DATA MINING AND STATISTICAL INFERENCE

CSCI-599 DATA MINING AND STATISTICAL INFERENCE CSCI-599 DATA MINING AND STATISTICAL INFERENCE Course Information Course ID and title: CSCI-599 Data Mining and Statistical Inference Semester and day/time/location: Spring 2013/ Mon/Wed 3:30-4:50pm Instructor:

More information

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required. What is this course about? This course is an overview of Big Data tools and technologies. It establishes a strong working knowledge of the concepts, techniques, and products associated with Big Data. Attendees

More information

MIS 484-4 Big Data Information Systems

MIS 484-4 Big Data Information Systems MIS 484-4 Big Data Information Systems Chetan (Chet) Kumar, PhD Associate Professor of Information Systems California State University San Marcos ckumar@csusm.edu COURSE DESCRIPTION The aim of this course

More information

SYLLABUS MAC 1105 COLLEGE ALGEBRA Spring 2011 Tuesday & Thursday 12:30 p.m. 1:45 p.m.

SYLLABUS MAC 1105 COLLEGE ALGEBRA Spring 2011 Tuesday & Thursday 12:30 p.m. 1:45 p.m. SYLLABUS MAC 1105 COLLEGE ALGEBRA Spring 2011 Tuesday & Thursday 12:30 p.m. 1:45 p.m. Instructor: Val Mohanakumar Office Location: Office Phone #: 253 7351 Email: vmohanakumar@hccfl.edu Webpage: http://www.hccfl.edu/faculty-info/vmohanakumar.aspx.

More information

How To Learn To Use Big Data

How To Learn To Use Big Data Information Technologies Programs Big Data Specialized Studies Accelerate Your Career extension.uci.edu/bigdata Offered in partnership with University of California, Irvine Extension s professional certificate

More information

Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate

Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate Data Science and Business Analytics Certificate Data Science and Business Intelligence Certificate Description The Helzberg School of Management has launched two graduate-level certificates: one in Data

More information

COMP9321 Web Application Engineering

COMP9321 Web Application Engineering COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Big Data and Analytics: Challenges and Opportunities

Big Data and Analytics: Challenges and Opportunities Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif

More information

Cleveland State University

Cleveland State University Cleveland State University CIS 695 Big Data Processing and Data Analytics (3-0-3) 2016 Section 51 Class Nbr. 5493. Tues, Thur TBA Prerequisites: CIS 505 and CIS 530. CIS 612, CIS 660 Preferred. Instructor:

More information

PRACTICAL DATA SCIENCE

PRACTICAL DATA SCIENCE PRACTICAL DATA SCIENCE INFO-GB.3359.10 Fall 2013 SYLLABUS Professors Josh Attenberg Office; Hours Wednesdays 2-3, KMC 8-171 & By appointment Email jattenbe@stern.nyu.edu Emails should have subject tag:

More information

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. Introduction p. xvii Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. 9 State of the Practice in Analytics p. 11 BI Versus

More information

Sunnie Chung. Cleveland State University

Sunnie Chung. Cleveland State University Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:

More information

B490 Mining the Big Data. 0 Introduction

B490 Mining the Big Data. 0 Introduction B490 Mining the Big Data 0 Introduction Qin Zhang 1-1 Data Mining What is Data Mining? A definition : Discovery of useful, possibly unexpected, patterns in data. 2-1 Data Mining What is Data Mining? A

More information

Email: justinjia@ust.hk Office: LSK 5045 Begin subject: [ISOM3360]...

Email: justinjia@ust.hk Office: LSK 5045 Begin subject: [ISOM3360]... Business Intelligence and Data Mining ISOM 3360: Spring 2015 Instructor Contact Office Hours Course Schedule and Classroom Course Webpage Jia Jia, ISOM Email: justinjia@ust.hk Office: LSK 5045 Begin subject:

More information

Problem Solving Hands-on Labware for Teaching Big Data Cybersecurity Analysis

Problem Solving Hands-on Labware for Teaching Big Data Cybersecurity Analysis , 22-24 October, 2014, San Francisco, USA Problem Solving Hands-on Labware for Teaching Big Data Cybersecurity Analysis Teng Zhao, Kai Qian, Dan Lo, Minzhe Guo, Prabir Bhattacharya, Wei Chen, and Ying

More information

POSTGRAD PLACEMENTS. Placements are an integral part of the Masters programmes, so international students will not require additional work visas.

POSTGRAD PLACEMENTS. Placements are an integral part of the Masters programmes, so international students will not require additional work visas. POSTGRAD PLACEMENTS COMPUTATIONAL FINANCE DATA SCIENCE AND ANALYTICS MACHINE LEARNING KEY INFORMATION Placements can start in the middle of June 2015 or later and must finish by the middle of June 2016

More information

Introduction to Big Data! with Apache Spark" UC#BERKELEY#

Introduction to Big Data! with Apache Spark UC#BERKELEY# Introduction to Big Data! with Apache Spark" UC#BERKELEY# This Lecture" The Big Data Problem" Hardware for Big Data" Distributing Work" Handling Failures and Slow Machines" Map Reduce and Complex Jobs"

More information

KENNESAW STATE UNIVERSITY GRADUATE COURSE PROPOSAL OR REVISION, Cover Sheet (10/02/2002)

KENNESAW STATE UNIVERSITY GRADUATE COURSE PROPOSAL OR REVISION, Cover Sheet (10/02/2002) KENNESAW STATE UNIVERSITY GRADUATE COURSE PROPOSAL OR REVISION, Cover Sheet (10/02/2002) Course Number/Program Name ACS 7420 Algorithm Design for Big Data Department Computer Science Degree Title (if applicable)

More information

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

More information

Data Analyst Program- 0 to 100

Data Analyst Program- 0 to 100 Development Data Analyst Program- 0 to 100 Master the Data Analysis tools like Pig and hive Data Science Build a recommendation engine 1 Data Analyst Program- 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

Hadoop Development & BI- 0 to 100

Hadoop Development & BI- 0 to 100 Development Master the Data Analysis tools like Pig and hive Data Science Hadoop Development & BI- 0 to 100 Build a recommendation engine Hadoop Development - 0 to 100 HADOOP SCHOOL OF TRAINING Basics

More information

MATH 1900, ANALYTIC GEOMETRY AND CALCULUS II SYLLABUS

MATH 1900, ANALYTIC GEOMETRY AND CALCULUS II SYLLABUS MATH 1900, ANALYTIC GEOMETRY AND CALCULUS II SYLLABUS COURSE TITLE: Analytic Geometry and Calculus II CREDIT: 5 credit hours SEMESTER: Spring 2010 INSTRUCTOR: Shahla Peterman OFFICE: 353 CCB PHONE: 314-516-5826

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

STAT 1403 College Algebra Dr. Myron Rigsby Fall 2013 Section 0V2 crn 457 MWF 9:00 am

STAT 1403 College Algebra Dr. Myron Rigsby Fall 2013 Section 0V2 crn 457 MWF 9:00 am MATH 1403 College Algebra/ Rigsby/ Fall 2013 Page 1 Credit Hours: 3 Lecture Hours: 3 University of Arkansas Fort Smith 5210 GRAND AVENUE P.O. BOX 3649 FORT SMITH, AR 72913-3649 479-788-7000 Syllabus and

More information

BIG DATA TOOLS. Top 10 open source technologies for Big Data

BIG DATA TOOLS. Top 10 open source technologies for Big Data BIG DATA TOOLS Top 10 open source technologies for Big Data We are in an ever expanding marketplace!!! With shorter product lifecycles, evolving customer behavior and an economy that travels at the speed

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

How To Learn Data Analytics

How To Learn Data Analytics COURSE DESCRIPTION Spring 2014 COURSE NAME COURSE CODE DESCRIPTION Data Analytics: Introduction, Methods and Practical Approaches INF2190H The influx of data that is created, gathered, stored and accessed

More information

Consulting and Systems Integration (1) Networks & Cloud Integration Engineer

Consulting and Systems Integration (1) Networks & Cloud Integration Engineer Ericsson is a world-leading provider of telecommunications equipment & services to mobile & fixed network operators. Over 1,000 networks in more than 180 countries use Ericsson equipment, & more than 40

More information

Dealing with Data Especially Big Data

Dealing with Data Especially Big Data Dealing with Data Especially Big Data INFO-GB-2346.30 Spring 2016 Very Rough Draft Subject to Change Professor Norman White Background: Most courses spend their time on the concepts and techniques of analyzing

More information

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social

More information

COURSE DESCRIPTION. Required Course Materials COURSE REQUIREMENTS

COURSE DESCRIPTION. Required Course Materials COURSE REQUIREMENTS Communication Studies 2061 Business and Professional Communication Instructor: Emily Graves Email: egrave3@lsu.edu Office Phone: 225-578-???? Office Location: Coates 144 Class Meeting Times and Locations:

More information

Napa Valley College Fall 2015 Math 106-67528: College Algebra (Prerequisite: Math 94/Intermediate Alg.)

Napa Valley College Fall 2015 Math 106-67528: College Algebra (Prerequisite: Math 94/Intermediate Alg.) 1 Napa Valley College Fall 2015 Math 106-67528: College Algebra (Prerequisite: Math 94/Intermediate Alg.) Room 1204 Instructor: Yolanda Woods Office: Bldg. 1000 Rm. 1031R Phone: 707-256-7757 M-Th 9:30-10:35

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level?

Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level? Big Data Analytics: Where is it Going and How Can it Be Taught at the Undergraduate Level? Dr. Frank Lee Chair, ECE/CS/IT New York Institute of Technology Old Westbury, NY 11568 Topics This talk describes:

More information

Big Data Presentation of the course

Big Data Presentation of the course Academic year 2014/2015 Big Data Presentation of the course Prof. Riccardo Torlone Università Roma Tre 2 A new course Second year at Roma Tre First university course on Big Data in Italy We will experiment

More information

Let the data speak to you. Look Who s Peeking at Your Paycheck. Big Data. What is Big Data? The Artemis project: Saving preemies using Big Data

Let the data speak to you. Look Who s Peeking at Your Paycheck. Big Data. What is Big Data? The Artemis project: Saving preemies using Big Data CS535 Big Data W1.A.1 CS535 BIG DATA W1.A.2 Let the data speak to you Medication Adherence Score How likely people are to take their medication, based on: How long people have lived at the same address

More information

Data Science Certificate General Information About Completion

Data Science Certificate General Information About Completion Data Science Certificate General Information About Completion Introduction This guide is designed to help you form expectations about the program you are beginning as well as point you in the direction

More information

Big Data and Analytics (Fall 2015)

Big Data and Analytics (Fall 2015) Big Data and Analytics (Fall 2015) Core/Elective: MS CS Elective MS SPM Elective Instructor: Dr. Tariq MAHMOOD Credit Hours: 3 Pre-requisite: All Core CS Courses (Knowledge of Data Mining is a Plus) Every

More information

Integrating a Big Data Platform into Government:

Integrating a Big Data Platform into Government: Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government

More information

WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley

WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley WROX Certified Big Data Analyst Program by AnalytixLabs and Wiley Disclaimer: This material is protected under copyright act AnalytixLabs, 2011. Unauthorized use and/ or duplication of this material or

More information

CSci 538 Articial Intelligence (Machine Learning and Data Analysis)

CSci 538 Articial Intelligence (Machine Learning and Data Analysis) CSci 538 Articial Intelligence (Machine Learning and Data Analysis) Course Syllabus Fall 2015 Instructor Derek Harter, Ph.D., Associate Professor Department of Computer Science Texas A&M University - Commerce

More information

L1: Introduction to Hadoop

L1: Introduction to Hadoop L1: Introduction to Hadoop Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revision: December 1, 2014 Today we are going to learn... 1 General

More information

Predictive Analytics Certificate Program

Predictive Analytics Certificate Program Information Technologies Programs Predictive Analytics Certificate Program Accelerate Your Career Offered in partnership with: University of California, Irvine Extension s professional certificate and

More information

COURSE DESCRIPTION OBJECTIVE:

COURSE DESCRIPTION OBJECTIVE: Course Number: OMIS Instructor: Dr. Akshay Bhagwatwar Course Title: Social Media Analytics Semester: Fall 2016 Classroom: Barsema Hall xxx Credit Value: 3 Class Hours: Office: Barsema Hall 328P Office

More information

Syllabus: IST451. Division of Business and Engineering. Penn State Altoona

Syllabus: IST451. Division of Business and Engineering. Penn State Altoona Syllabus: IST451 Division of Business and Engineering Penn State Altoona Course Title 1. IST451: Network Security-Spring 2012 2. Section 001 3. Credits: 3 Meeting Times 1. Lectures: Mondays and Wednesdays

More information

Big Data and Scripting Systems build on top of Hadoop

Big Data and Scripting Systems build on top of Hadoop Big Data and Scripting Systems build on top of Hadoop 1, 2, Pig/Latin high-level map reduce programming platform interactive execution of map reduce jobs Pig is the name of the system Pig Latin is the

More information

CSE 40437/60437 - Social Sensing and Cyber- Physical Systems - Spring 2015

CSE 40437/60437 - Social Sensing and Cyber- Physical Systems - Spring 2015 CSE 40437/60437 - Social Sensing and Cyber- Physical Systems - Spring 2015 Instructor Prof. Dong Wang dwang5 at nd dot edu Office Hours: Tue 3:15-5:15 PM, 214B Cushing Hall TA: Chao Huang chuang7 at nd

More information

Turtle Mountain Community College

Turtle Mountain Community College Turtle Mountain Community College Fall Semester- 2013 CIS 104 I: Microcomputer Database-Access Course Dates: August 20 rd to December 6 th Instructor: Marlin Allery (staff) E-mail: mallery@tm.edu Office

More information
