SYLLABUS DSCI 4520 Introduction to Data Mining Fall 2016

CLASS (DAY/TIME): Wednesdays 6:30-9:20, BLB 070
INSTRUCTOR: Dr. Nick Evangelopoulos
OFFICE HRS: TW 1:00-2:00pm at BLB 365D, T 5:30-6:30pm at Frisco
CONTACT INFO:
    OFFICE PHONE: 940-565-3056
    E-MAIL (preferred): Nick.Evangelopoulos@unt.edu

Textbooks in printed and PDF file format
Kattamuri Sarma, Predictive Modeling with SAS Enterprise Miner, Second Edition, SAS Press 2013, ISBN: 978-1-60764-767-6 (required, printed text)
Data Mining Using SAS Enterprise Miner, A Case Study Approach, 3rd Edition, SAS Publishing 2013 (required, free PDF), or ISBN 978-1-61290-638-6 (optional, printed text)
Getting Started with SAS Enterprise Miner 14.1, SAS Publishing 2016 (required, free PDF)
Getting Started with SAS Text Miner 13.2, SAS Publishing 2014 (required, free PDF)
Getting Started with SAS Enterprise Miner 5.3, SAS Pub. 2008 (recommended, free PDF)

Software
IBM SPSS Statistics 22, IBM SPSS Modeler 15, SAS Enterprise Miner 13.2, SAS Text Miner 13.2. All of these are available at the CoB lab, physically and via VMWare.

Web Site
http://www.cob.unt.edu/itds/courses/dsci4520/dsci4520.htm
http://www.cob.unt.edu/itds/faculty/evangelopoulos/evangelopoulos.htm

Purpose of the Course
This course deals with the problem of extracting information from large databases and designing data-based decision support systems. The extracted knowledge is subsequently used to support human decision-making in the areas of summarization, prediction, and the explanation of observed phenomena (e.g. patterns, trends, and customer behavior). Techniques such as visualization, statistical analysis, decision trees, and neural networks can be used to discover relationships and patterns that shed light on business problems. This course will examine methods for transforming massive amounts of data into new and useful information, uncovering factors that affect purchasing patterns, and identifying potential profitable investments and opportunities.
Learning Objectives
1. Understand the problems and opportunities when dealing with extremely large databases.
2. Review data visualization software used for interpreting complex patterns in multidimensional data. Learn to identify what information is useful and what is not.
3. Provide an understanding of predictive models and algorithms, as well as exploratory algorithms.
4. Examine all phases of decision making, including discovery and data query, data analysis and confirmation, presentation, and implementation of results.

© 2003-2016 Nicholas Evangelopoulos

Class Attendance
Regular class attendance and informed participation are expected.

Academic Integrity
This course adheres to the UNT policy on academic integrity. The policy can be found at http://vpaa.unt.edu/academic-integrity.htm. If you engage in academic dishonesty related to this class, you will receive a failing grade on the test or assignment, or a failing grade in the course. In addition, the case may be referred to the Dean of Students for appropriate disciplinary action.

Students with Disabilities
The College of Business complies with the Americans with Disabilities Act in making reasonable accommodations for qualified students with disabilities. If you have an established disability and would like to request accommodation, please see your instructor as soon as possible. You will need to register with the UNT Office for Disability Accommodation.

Deadlines
Dates of drop deadlines, final exams, etc., are published in the university catalog and the schedule of classes. Please be sure you stay informed about these dates.

Student Perceptions of Teaching (SPOT)
Student Perceptions of Teaching (SPOT) utilizes IASystem and is a requirement for all organized classes at UNT. This short Web-based survey will be available to you at the end of the semester, providing you a chance to comment on how this class is taught. I am very interested in this feedback from my students, as I work to continually improve my teaching. I consider SPOT to be an important part of your class participation.

Cell Phones
As a courtesy to your instructor and to your fellow classmates, you are asked to set your cell phone to vibrate. In case of a personal emergency, if you must use your cell phone, you are asked to step out of the classroom.
Incomplete Grade (I)
The grade of "I" is not given except for rare and very unusual emergencies, as per University guidelines. An "I" grade cannot be used as a substitute for poor performance in class. If you think you will not be able to complete the class satisfactorily, please drop the course.

Campus Closures
Should UNT close campus, it is your responsibility to keep checking your official UNT e-mail account (EagleConnect) to learn whether, and how, your instructor plans to modify class activities. This may include changing assignment due dates, rescheduling quizzes and exams, etc.

Point Allocation

Component                                            DSCI 4520   DSCI 5240
Homework exercises (8 exercises)                     25%         22.5%
In-class quizzes (8 quizzes, 3 dropped)              5%          4.5%
Mid-term Exam (in-class)                             25%         22.5%
Final Exam (take-home)                               20%         18.0%
Project (4 individual parts and 1 group part)        25%         22.5%
Graduate Presentation                                -           10.0%
TOTAL                                                100%        100.0%
Bonus for attending/giving a graduate presentation   0.5%        0.5%
Bonus for graduate presentation Option C             -           up to 20%

Letter Grades:
90% or more = A
80% or more = B
70% or more = C
60% or more = D
Below 60% = F

Homework Exercises
There will be 8 homework exercises that you will have to turn in. Exercises will use IBM SPSS Statistics, IBM SPSS Modeler, SAS Enterprise Miner, and SAS Text Miner. The homework exercises ask you to perform certain types of analysis, capture screen shots, and answer questions. Related handouts and PowerPoint slides with data descriptions, step-by-step instructions, and assignment details will be available on Blackboard. HW7 closely follows the text in Getting Started with SAS Text Miner, referred to as the GSTM text below. Homework is turned in electronically using Blackboard, in the form of a report document. If you turn in your HW report late, 50% of HW credit is awarded.

HW1. Multiple Regression for TargetD using IBM SPSS Statistics. MYRAW data.
HW2. Logistic Regression for TargetB using IBM SPSS Statistics. Small sample effects. MYRAW data.
HW3. Overview of SEMMA process in SAS Enterprise Miner. Decision Tree and Logistic Regression. Model comparison. HMEQ data.
HW4. Scoring, Reporting in SAS EM. HMEQ data.
HW5. Clustering in SAS EM. SHOESTORE data.
HW6. Association Analysis in SAS EM. ASSOCS data.
HW7. Text Analytics in SAS Text Miner. Text cleanup, synonyms, stop list, topic extraction, and predictive modeling using text data. VAEREXT data. Based on the GSTM text.
HW8. Introduction to IBM SPSS Modeler. Decision Tree and Logistic Regression. HMEQ data.
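As a rough illustration of how the DSCI 4520 point allocation, the late-homework rule, and the letter-grade cutoffs combine, the arithmetic can be sketched in Python. This is only a sketch under stated assumptions, not the instructor's official calculation: it assumes all scores are on a 0-100 scale, that the 8 homework exercises count equally, that "3 dropped" means the three lowest quiz scores are dropped, and that the bonus is added directly to the final percentage. All function and variable names are hypothetical.

```python
# Undergraduate (DSCI 4520) weights from the point allocation; they sum to 100%.
WEIGHTS = {
    "homework": 0.25,
    "quizzes": 0.05,
    "midterm": 0.25,
    "final": 0.20,
    "project": 0.25,
}


def homework_average(scores, late_flags):
    """Average the homework scores, awarding 50% credit to late reports.

    Assumption: every homework exercise carries equal weight.
    """
    adjusted = [s * 0.5 if late else s for s, late in zip(scores, late_flags)]
    return sum(adjusted) / len(adjusted)


def quiz_average(scores, dropped=3):
    """Average the quiz scores after dropping some of them.

    Assumption: the dropped quizzes are the lowest-scoring ones.
    """
    kept = sorted(scores)[dropped:]
    return sum(kept) / len(kept)


def course_percent(components, bonus=0.0):
    """Weighted course percentage plus any bonus points (assumed additive)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS) + bonus


def letter_grade(percent):
    """Map a course percentage to a letter using the syllabus cutoffs."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percent >= cutoff:
            return letter
    return "F"


# Hypothetical student: one late homework, three missed quizzes.
components = {
    "homework": homework_average([100] * 8, [False] * 7 + [True]),
    "quizzes": quiz_average([100, 90, 80, 70, 60, 0, 0, 0]),
    "midterm": 85,
    "final": 90,
    "project": 88,
}
total = course_percent(components, bonus=0.5)
print(f"{total:.2f}% -> {letter_grade(total)}")
```

In this sketch a single late homework lowers the homework component from 100 to 93.75, while the three missed quizzes cost nothing because they fall within the dropped scores.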
Graduate Presentations
This course meets with DSCI 5240, in the same classroom, at the same time. Graduate students who are enrolled in DSCI 5240 have a graduate presentation assignment. As an undergraduate student, you will have a chance to attend graduate presentations as a member of the audience. Such attendance is optional. There is a small extra credit if you decide to attend the graduate presentations.

Term Project
This course has a term project. You will be asked to analyze data related to the KDD-Cup 98, an international competition for professional data miners. The data set will be available on Blackboard. Handouts describing what you have to do will be distributed in class. During the first 4 parts you will work individually and submit your work as a Word document that includes screen shots from Enterprise Miner and answers to various questions as described on the handouts. You will turn in your reports by uploading them on Blackboard. Grading and late penalty policies for PR1-PR4 are the same as for HW1-HW8.

During the last part of the project you will form groups. Each group will include at least one undergraduate student and at least one graduate student. The maximum group size will be 6. Groups will be self-managed. If the group is not satisfied with a member's contribution, it may choose to dismiss that person from the group. In such a case, an alternative individual assignment will be given to the dismissed group member. The group will turn in a single PR5 report, listing all group member names, in printed hard copy format (i.e., brought to class, not uploaded on Blackboard). A summary of the project parts follows below.

Topic                                                         Work type
PR1. Open the data, produce statistics and graphs             Individual
PR2. Decision Trees                                           Individual
PR3. Regression                                               Individual
PR4. Neural Networks                                          Individual
PR5. Final Written Report. Comparison and evaluation of       Group
     3 models (Decision tree, logistic regression,
     neural net).

DSCI 4520/5240 TIME SCHEDULE
Fall 2016, Denton section

The schedule below is a tentative outline for the semester. It is meant to be a guide and several items are subject to change. Certain topics may be stressed more or less than indicated.

Date      Topics                                              Assignment due
Aug. 31   Intro to Data Mining, Ch 1
          Multiple Linear Regression
Sept. 7   Logistic Regression, Ch 6                           HW1 (regression in SPSS, MYRAW)
          Stepwise Procedure
Sept. 14  SEMMA, CRISP-DM,                                    HW2 (Log Reg in SPSS, MYRAW)
          Model comparison, Ch 7
Sept. 21  Scoring & Deployment                                HW3 (SEMMA, HMEQ)
Sept. 28  Decision Trees, Sarma Ch 4                          HW4 (scoring, HMEQ)
Oct. 5    Decision Trees, Sarma Ch 4                          PR1 (data explor., DONOR_RAW1)
Oct. 12   Neural Networks, Sarma Ch 5                         PR2 (trees, DONOR_RAW1, 2, 3)
Oct. 19   Clustering Analysis                                 PR3 (reg, DONOR_RAW)
Oct. 26   Review for Exam 1                                   PR4 (neural nets, DONOR_RAW)
Nov. 2    *** Exam 1 (in-class) ***                           Grad Pres. signup deadline (type C only)
Nov. 9    Association Analysis                                HW5 (clustering, SHOESTORE)
Nov. 16   Text Mining, Sarma Ch 9                             Grad Pres. signup deadline (types A-B)
                                                              HW6 (market basket, ASSOCS)
Nov. 23   Buffer lecture (use as needed)                      Grad Presentations are due by 5PM
          (Alternative date for grad student presentations)   HW7 (text mining, VAEREXT)
                                                              0.5 pt. extra credit for attendance (UG st.)
Nov. 30   Grad student presentations                          HW8 (SPSS Modeler, HMEQ)
          Buffer lecture (use as needed)                      0.5 pt. extra credit for attendance (UG st.)
Dec. 7    Course review                                       PR5 (project report, hard copy)
          Take-home final handed out                          0.5 pt. extra credit for attendance
Dec. 14   *** Final Exam (take-home, due 11:59PM on Blackboard) ***