UWT TCSS 555 Data Mining Course Syllabus, Spring 2014




Instructor: Senjuti Basu Roy
Location: TLB 115
Class sessions: MW 4:15-6:20
Instructor Office: CP 229
Office Hours: TBD (email: senjutib@uw.edu)
Class mailing list
Class moodle website

Course Overview: Welcome. The data mining course presents methods and systems for mining varied data and discovering knowledge from data. After detailing a data mining system architecture and tasks, the course examines and compares specific methods in data mining, such as data preparation, classification, clustering, and text mining. Several applications are detailed, and tools to build new applications are provided. The task of knowledge discovery is then outlined as a higher-level goal of data mining. Familiarity with statistics and database systems, in particular database design, is expected.

The primary objectives of the course are:
- Understand algorithms and methods of data mining.
- Develop data mining programs and applications.
- Program using available data mining tools and general-purpose languages.
- Understand analysis, metrics, visualization and navigation of data mining results.
- Learn how to use a few commercial data mining tools (amongst a selection of: RStudio, MATLAB, JUNG library, Weka).

The outcomes of the course are: Upon successful completion of the course, students are able to:
- Explain the basic principles of the primary data mining techniques.
- Explain the difference between data mining, data warehousing, machine learning, etc.
- Design mining models and manage databases to enable data mining technologies as part of larger systems.
- Describe issues facing the latest trends in data mining.

Prerequisites: (Under)Graduate standing in CSS. TCSS 445 or equivalent database systems design experience at work or at internships. You should be proficient in database design and have an understanding of basic database system implementation techniques. In addition, a basic understanding of probability and statistics is desirable. We also recommend that all students have prior experience with at least one programming language such as C/C++/Java/C# and have demonstrated ability to code algorithms and data structures. Students with no programming experience or interest should not take this course.

Texts: Data Mining: Concepts and Techniques (3rd edition), by Jiawei Han, Micheline Kamber and Jian Pei, Morgan Kaufmann, 2011.
Supplementary Text: Prof. Zaki's text; Massive Data Mining by Jure Leskovec et al. (available on moodle).
Additional Readings: The links to additional readings will be made available on moodle during the term. You may bring your readings to each class session so that you can refer to them during discussion. You will additionally do a number of readings of your own choosing for the project and the accompanying research paper that you will write. Your instructors may assign specific readings to individual students.

Course Content: There will be several aspects to this course:
1. Instructor lectures.
2. Individual assignments: Assignments should be done individually and submitted online on moodle by the due date.
3. Class notes: Everyone takes class notes. You can form an ad hoc group, meet in person or collaborate online to discuss and then submit your class notes. Bring a hard copy to class each Monday. The notes should read like a chapter with your explanation of the concepts discussed in class that week.
4. In-class discussion sessions or in-class labs (every Wednesday) on non-technical reading, data mining algorithms and implementation techniques.
5. Guest lectures: Guest lecturers from industry may be invited, depending on schedule and availability, to describe the latest research trends and implementation issues in commercial systems.
6. Class quiz: Each Wednesday the class will begin with a short quiz based on the reading assignments and class discussion from the previous week. This quiz is to ensure that you are taking the readings seriously and are absorbing the material presented in class.
7. Class project: A class project will be assigned and will be completed individually.
8. Midterm exam.

Class Schedule: (may change; updated version always on moodle)

Week           Topics covered                            Readings (text; other)
Week 1         Introduction and Math Foundations         Ch. 1, 2; Linear Algebra
Week 2         Data Preprocessing, Warehousing, OLAP     Ch. 3, 4
Week 3         Frequent Pattern Mining                   Ch. 6
Week 4         Classification: basic concepts            Ch. 8, 9
Week 5         Clustering: basic concepts                Ch. 10
Week 6         Recommendation Systems                    Ch. 13.3.5; supplementary reading
Week 7, 8      Mining Social Network Graphs              Ch. 10, Massive Data Mining, by Jure Leskovec
Week 9         Outlier detection                         Ch. 12
Week 10 & 11   Evaluating Data Mining Models             Ch. 31 (Zaki text), IR evaluation techniques

Class Attendance and participation: Attendance and participation are expected and will be counted towards your final grade (5%). We will not take attendance each day, but since the class size is relatively small, absences will not be hard to notice. Early departures will not be counted as attending. Each student may miss up to one class with no penalty. This policy is not in effect if there is a flu epidemic (see last section).

Class Etiquette: It is sometimes useful to distance ourselves from technology to obtain an environment of quality in-class discussion. Hence, kindly minimize the use of personal laptop computers in the classroom unless appropriate. Please turn off all cell phones and PDAs unless taking notes. If placed on vibrate mode, please ensure that the buzz does not disturb your neighbors during the sessions. Please bring a pen (black or blue) and a writing pad to class to take notes. Come prepared to the class in the following manner: Read the assigned reading several days prior to the session on that topic.

Guest lecturers will be invited. If you have suggestions for guest lecturers, you can email the instructor about them. Maintain a sense of decorum and respect for the guest speaker. They are providing a valuable volunteer service by giving us the benefit of their expertise. We can make their visit meaningful for them only if we have read about the topic and come prepared to hear them. We can demonstrate our preparedness by asking meaningful questions that advance their research interests. Hence, be extra careful in designing your questions for these days. Give these papers extra attention.

All assignments are to be submitted to moodle on the date due, unless otherwise specified in class. If a hard-copy hand-in is required, make sure that your name is on each sheet that is handed in and that any hand-in with multiple pages is stapled. As a backup, you should also upload your submissions to the appropriate slots on moodle to facilitate grade recording. Failure to submit or upload to moodle prior to the deadline will result in automatic denial of grade for that work. To keep the grading process on time, extensions are not available and should not be requested.

Additional Material for writing support: You may choose current significant (science- or technology-related where applicable) news stories from the press to support your writing in the class. Provide full bibliographic information for each story in APA style.

Bibliographic style: Use APA Style for bibliographic references and citations in everything that you write. Make sure to use the special format for electronic references.

Grading Scheme*: I reserve the right to make small adjustments to grade weights, or to add small assignments as the need arises.

Item                                                     Grade (% of final grade)   Due date
Homework Assignments (total 3)                           3 x 7 = 21%                Posted on Fridays; due on the Friday two weeks later
                                                                                    (HW-1 posting date: 12th April; HW-2 posting date: 3rd May; HW-3 posting date: 24th May)
Reading, Class Notes, class participation (every week)   5 + 5%                     Each week
Reading-based Class Quiz (7 total)                       7 x 4 = 28%                Each week
Final Project                                            25%                        June 10 before midnight (posting date: May 5)
Midterm                                                  16%                        May 1 (2 hours, followed by another 2 hours of regular class activities)

Additionally, there will be 2 extra-credit homework assignments, each worth 5%.
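The weights in the grading scheme above sum to 100% (21 + 10 + 28 + 25 + 16), with up to 10% more available from the two extra-credit homework assignments. As a rough sketch of that arithmetic, the Python snippet below combines per-item scores into a final percentage. The function and dictionary names are hypothetical, and two points are assumptions rather than stated course policy: extra credit is simply added on top of the weighted total, and the 30% late penalty is read as removing 30% of the score earned on the late item.

```python
# Weights (percent of final grade) as listed in the grading scheme.
WEIGHTS = {
    "homework": 21.0,             # 3 assignments x 7%
    "notes_participation": 10.0,  # 5% + 5%
    "quizzes": 28.0,              # 7 quizzes x 4%
    "final_project": 25.0,
    "midterm": 16.0,
}


def final_percentage(scores, extra_credit=0.0):
    """Combine per-item scores (fractions in [0, 1]) into a final percentage.

    extra_credit is in percentage points (0 to 10 for two 5% homeworks).
    Assumption: extra credit is added on top of the weighted total.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    weighted = sum(WEIGHTS[item] * scores[item] for item in WEIGHTS)
    return weighted + extra_credit


def apply_late_penalty(score, days_late):
    """Assumption: a submission up to one day late keeps 70% of its score;
    anything later is not accepted and scores zero."""
    if days_late <= 0:
        return score
    if days_late <= 1:
        return score * 0.7
    return 0.0


if __name__ == "__main__":
    # Hypothetical student: homework handed in one day late, no extra credit.
    example = {
        "homework": apply_late_penalty(0.9, days_late=1),
        "notes_participation": 1.0,
        "quizzes": 0.8,
        "final_project": 0.9,
        "midterm": 0.75,
    }
    print(f"Final percentage: {final_percentage(example):.2f}")
```

Keeping the weights in one dictionary makes it easy to check that they still sum to 100 if the instructor adjusts them during the term.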

*30% late submission penalty if submitted within one day of the deadline. Beyond that, late submissions are not accepted.

Grading will not be on a curve. The correspondence between percentage scores and letter grades is given in the grading matrix:

Grade  GPA      Score     Grade  GPA      Score     Grade  GPA      Score
A      4.0      98-100    B+     3.4      88-89     C+     2.4      77
       3.9      95-97            3.3      87               2.3      76
A-     3.8      93-94            3.2      86               2.2      75
       3.7      92        B      3.1      85        C      2.0-2.1  70-75
       3.6      91               3.0      83-84     C-     1.8-1.9  68-69
       3.5      90               2.9      82               1.5-1.7  65-67
                          B-     2.8      81               1.2-1.4  62-64
                                 2.7      80        D      0.9-1.1  60-62
                                 2.6      79        E      0.0-0.8  <60
                                 2.5      78

Grading Policy Details:

Attendance/Class Participation:
- 5 points: complete attendance (1 absence is allowed if you email the instructor ahead of the class and you don't have an assigned duty in class), actively contribute to the sessions, enrich the class experience, help others in learning the concepts, and contribute to projects and discussions.
- 3 points: all of the above except for lesser attendance.
- 1 point: some participation, irregular attendance.
- 0 points: irregular attendance, no interaction or participation.

Quiz: Each quiz is graded based on its questions. No extra credit is awarded.

Assignments: Each assignment is graded based on its questions and the points therein. Complete work mostly qualifies for full grade. Partial credit depends on the amount of work completed. The assignments are meant to enhance your understanding and help improve your hands-on skills using data mining tools and languages such as R. These skills are central to success as a data miner in industry, and all assignments use real-world datasets. You should look out for and discuss new things you discover as you try to solve the problems.

General Student Post-conditions:
1. Experience with designing and implementing programs to handle data of significant size (about Million records), using appropriate data mining algorithms, data structures, code organization, and documentation.
2. Familiarity with characteristics of data mining methods with respect to worst-case analysis and its influence on choices of data structures, algorithms, and program design.

Characteristics of a below average (C) outgoing student:
1. Able to implement and document a mediocre quality data mining algorithm when given guidance for how to design and implement it.
2. Knowledge of data mining algorithms: how they are defined, how they work.
3. Knowledge of when to employ appropriate statistical measures that have been discussed in the course.

Characteristics of an average (B) outgoing student: All of the above and the following:
1. Able to implement and document an above average quality data mining algorithm when given some guidance for how to design and implement it.
2. Understanding of when to employ what data mining algorithm in situations that are simple variations of those described in the course.
3. Has rudimentary ability to generate and evaluate different optimization alternatives based on the criteria described in the course.
4. Has rudimentary ability to reason about a program's performance.

Characteristics of an above average (A) outgoing student: All of the above and the following:
1. Able to implement and document a high quality, high complexity (applicable in the real world) program when given little guidance for how to design and implement it.
2. Understanding of when to employ appropriate data mining algorithms, even in novel situations.
3. Has a strong ability to generate and evaluate different design alternatives based on the criteria described in the course.
4. Has a strong ability to reason about the data mining solution's performance and ability to adapt to new data.

Services

Support for students with disabilities: DISABILITY SUPPORT SERVICES (Student Health and Wellness - SHAW): The University of Washington Tacoma is committed to making physical facilities and instructional programs accessible to students with disabilities. Disability Support Services (DSS) functions as the focal point for coordination of services for students with disabilities. In compliance with Title II of the Americans with Disabilities Act, any enrolled student at UW Tacoma who has an appropriately documented physical, emotional, or mental disability that "substantially limits one or more major life activities [including walking, seeing, hearing, speaking, breathing, learning and working]," is eligible for services from DSS. If you are wondering if you may be eligible for accommodations on our campus, please contact the DSS reception desk at 692-4828, or visit http://www.tacoma.washington.edu/studentaffairs/shw/dss_about.cfm

CSS Mentors: The CSS mentors staff the Science Lab (SCIENCE 106) throughout the week. They can provide help with specific questions about specific classes. Please note, however, that they will not do your homework for you. Instead, they will help you when you get stuck (either in programming or in homework) and help you develop the reasoning skills you need to solve future problems. See the CSS Mentors website for information about when the mentors are in the labs and other information.

Center for Teaching, Learning & Technology: The Center for Teaching, Learning & Technology offers academic and technical support for students at all levels of expertise: review, upper division, graduate and TA. For your writing, reading, study skills and public speaking needs, please make an appointment online at www.tacoma.washington.edu/ctlt/ or visit KEY 202. For your math needs, assistance is available on a drop-in basis, Monday through Thursday, hours to be posted. For multimedia or video projects, please visit the Multimedia Lab located in MAT 251. For student software training, please register at www.tacoma.washington.edu/ctlt/training/student/index2.cfm

Safety Escorts: Safety escorts are available to accompany you to your vehicle Monday through Thursday from 5:00pm to 10:30pm, except holidays, breaks, and summer quarter. Dial #300 from a non-campus phone or #333 on a campus telephone and a Campus Safety Escort will walk you safely to your vehicle.

In case of emergency, follow your professor's instructions. When an alarm sounds, evacuate the building immediately. MATT, CP, WG, GWP, and BB buildings assemble in the Cragle Parking Lot south of the library. BHS, WCG, and DOU buildings assemble near the transit station (next to the Pinkerton Building on Broadway, across from Spaghetti Factory). Pinkerton occupants go to the convention center parking lot north of Pinkerton. For more information about emergency procedures, please go to: http://www.tacoma.washington.edu/safety/

Emergency Phoning: From campus phones, report emergencies by dialing 9-911 and state the T-number that is on a sticker on the phone; from non-campus phones dial 911. Building location numbers are posted on all buildings. For assistance with non-emergencies call Campus Safety at 2-4416 from a campus phone, and 253-692-4416 from a non-campus phone.

Inclement Weather: In the event of inclement weather, UWT's hotline at 253-383-INFO indicates whether classes have been cancelled. Please see the inclement weather page at: http://www.tacoma.washington.edu/policies_procedures/inclement_weather.pdf for more information.

Additional Policies

Collaboration: All assignments in this course are to be done, as indicated earlier, individually or in your own group(s). This does not mean that you cannot discuss anything about this course with others. What it does mean is that anything that you hand in must accurately represent your knowledge and work. It is expected that all students will read and follow guidelines as detailed in the UW Student Conduct Code.

Plagiarism: This class will heavily involve the use of the written works of others. Your own written work will involve discussing the ideas of others. When using the ideas of others, it is important to acknowledge whose ideas you are using, and to clearly distinguish the ideas of others from your own. To convey the impression, whether inadvertently or deliberately, that another's work is your own is called plagiarism. Plagiarism is a serious offense in the university. Prof. Tenenberg has written a guideline on plagiarism and how to avoid it (i.e. by scrupulously citing your sources), and I expect that you will abide by it. Although this guideline is geared toward the use of others' computer programs, it applies equally well to other kinds of text, such as those that you will use in this class.