Syllabus. HMI 7437: Data Warehousing and Data/Text Mining for Healthcare




1. Instructor

Illhoi Yoo, Ph.D.
Office: 404 Clark Hall
Email: muteaching@gmail.com
Office hours: TBA
Classroom: TBA
Class hours: TBA

2. Prerequisites

Students are expected to have taken the following courses before registering for this course:
- Algorithm Design/Programming I and II (2 semesters, or CS 1050 and 2050, or equivalent)
- Database Management Systems I (CS 4380/7380 or equivalent)

3. Course Background

A huge amount of clinical, biomedical, genomic, and healthcare data has been produced and collected in disparate repository systems. Most of these data have piled up on an unprecedented scale without any analysis. Using data warehouse and data/text mining technologies, we can systematically integrate and reorganize multiple heterogeneous data sources to extract meaningful hidden patterns from them. Ultimately, these technologies enable us to transform raw healthcare data into healthcare enterprise decisions. Thus, one of the ultimate goals of data warehousing and data/text mining in our field could be to enable physicians to make clinical decisions based on each patient's genome, or even to facilitate personalized medicine.

4. Course Description

This course provides an introduction to the basic concepts of data warehouse and data/text mining and creates an understanding of why we need those technologies and how they can be applied to healthcare problems.

University of Missouri-Columbia

5. Course Objectives

- Understanding major concepts of data warehouse (DW) and data/text mining (DM/TM)
- Understanding why we need DW and DM/TM for healthcare
- Understanding how DW, DM, and TM are related to or different from information retrieval (IR), information extraction (IE), database query, statistics, and machine learning (ML)
- Introducing major DM algorithms such as decision trees, clustering, classification, association, etc.
- Understanding how these algorithms can be applied to biomedical/healthcare problems. For example, you would
  o Forecast a patient's disease(s) based on the patient's current record
    - More efficient disease prevention
    - Cost-effective medical examination
  o Determine which medical examinations are more accurate to diagnose a specific disease
    - Multiple cheap medical exams could be more accurate to diagnose diseases than an expensive medical exam.
    - Cost-effective medical examination
- Demonstrating major DW and DM/TM solutions available in the market/internet
  o Commercial DM/TM solutions and research-oriented DM/TM solutions
- Introducing the importance of ontologies in DM/TM
- Introducing how life science, healthcare, and health insurance companies have used DM through their DM success stories

6. Course Projects

The course project must be done individually. The written project report must be submitted in electronic form only, such as MS Word. There are four kinds of course projects, since students' needs for the course could be very different.

- Survey projects
  o Conducting a comprehensive survey on any DM/TM applications
    - Searching articles in the qualified journals or proceedings (see Appendix) that discuss how DM/TM has been applied to healthcare systems and summarizing them
- Research-oriented (theoretical) projects
  o Must incorporate new/novel ideas in DW, DM, or TM
  o The report should be in the style of a scientific paper, including the related work.
- Implementation (programming) projects
  o You can implement existing or novel DM/TM algorithms.
  o You should include the source code, the screen captures, test data sets, the limitations of the programs, etc.
- Students' own projects
  o Must be deliverable

Alternatively, the instructor can recommend a course project to students based on their background and current occupation, if students want.

Course Project Procedure and Deadline

Deadline              Submission
9/24/07               Proposal submission; you will be notified of its approval within a few days
Weeks 5-13            Working on your course project
Week 14 (11/26/07)    Course project report submission
Week 15               Revising project report based on instructor's review
12/3/07               Final course project report submission

7. Evaluation

- 20% Assignments
  o HW1: Business Intelligence success stories in healthcare
    - 10% of course grade
    - Given in 4th week
  o Lab1: Data/Text Mining Lab
    - 10% of course grade
    - Given in 11th week
- 25% Mid-term exam
- 25% Final-term exam
- 30% Project: project report (15%) and final revised submission (15%)
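The weights in Section 7 can be combined into a single course percentage. The following is a minimal sketch of that arithmetic only; the component names, the 0-100 scoring scale, and the example scores are assumptions for illustration and are not part of the syllabus.

```python
# Weights taken from Section 7 (Evaluation); together they cover 100% of the grade.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "hw1": 0.10,             # HW1: BI success stories in healthcare
    "lab1": 0.10,            # Lab1: Data/Text Mining Lab
    "midterm": 0.25,         # Mid-term exam
    "final_exam": 0.25,      # Final-term exam
    "project_report": 0.15,  # Project report
    "project_final": 0.15,   # Final revised submission
}


def course_grade(scores):
    """Return the weighted course percentage from per-component scores (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, assuming every component is graded out of 100.
example_scores = {
    "hw1": 90,
    "lab1": 80,
    "midterm": 70,
    "final_exam": 85,
    "project_report": 88,
    "project_final": 92,
}

print(f"Weighted course grade: {course_grade(example_scores):.2f}%")
```

With these assumed scores the weighted total is 9 + 8 + 17.5 + 21.25 + 13.2 + 13.8 = 82.75%.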

8. Course Schedule

Dates                 Topics                                                      Reading
Week 1  (8/27/07)     Database I
Week 2  (9/4/07)      Database II
Week 3  (9/10/07)     Introduction to data mining                                 [HK]Ch1, [RG]Ch1,2,5
Week 4  (9/17/07)     Business Intelligence (BI) success stories in healthcare
Week 5  (9/24/07)     Data Preprocessing                                          [HK]Ch2, [RG]Ch5
Week 6  (10/1/07)     Data Warehouse (DW) and OLAP                                [HK]Ch3, [RG]Ch6
Week 7  (10/8/07)     Mid-term exam
Week 8  (10/15/07)    Data Mining Algorithm: Association Rules                    [HK]Ch5, [RG]Ch2,3
Week 9  (10/22/07)    Data Mining Algorithm: Classification                       [HK]Ch6, [RG]Ch2,3
Week 10 (10/29/07)    Data Mining Algorithm: Clustering                           [HK]Ch7, [RG]Ch2,3
Week 11 (11/5/07)     Data Mining Lab                                             [RG]Ch4,A,B
Week 12 (11/12/07)    Text Mining for MEDLINE                                     [HK]Ch10
Week 13 (11/19/07)    Thanksgiving
Week 14 (11/26/07)    Course Project Report Submission;
                      Revising Project Report based on peer-reviews
Week 15 (12/3/07)     Final Course Project Report Submission
Week 16 (12/10/07)    Final-term exam

The instructor will provide lecture notes for each topic shown in the course schedule above. The lecture notes draw on the required and recommended textbooks (listed in Section 10 below). To keep the notes self-contained and for students' convenience, they include tables and figures from those references; you should note the citations (e.g., [HK], [HMS]) in the lecture notes. The lecture notes are based mainly on [HK].
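As a small illustration of the Week 8 topic (association rules), the sketch below computes the two standard measures, support and confidence, for a single rule. The patient records and item names are invented toy data, not taken from the course materials, and the function names are assumptions made for this example.

```python
# Toy transactions: each set holds the findings recorded for one hypothetical patient.
records = [
    {"hypertension", "diabetes", "obesity"},
    {"hypertension", "obesity"},
    {"diabetes", "obesity"},
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes", "obesity"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent together) divided by support of antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support(set(antecedent) | set(consequent), transactions) / base


# Rule under test: {hypertension, diabetes} -> {obesity}
lhs = {"hypertension", "diabetes"}
rhs = {"obesity"}
print(f"support(lhs)      = {support(lhs, records):.2f}")
print(f"support(lhs+rhs)  = {support(lhs | rhs, records):.2f}")
print(f"confidence(rule)  = {confidence(lhs, rhs, records):.2f}")
```

In this toy data the antecedent appears in 3 of 5 records and the full rule in 2 of 5, so the confidence is 2/3. Algorithms such as Apriori, covered in [HK], search for all rules whose support and confidence exceed chosen thresholds rather than checking one rule at a time.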

9. Useful Resources

9.1 Data/Text Mining tools

Name: iData Analyzer (MS Excel-based data mining tool)
How to get it: Data Mining: A Tutorial-based Primer
Note: The CD includes two real medical data sets as well as other real data sets: a cardiology patient dataset from a VA Medical Center in CA and a spine clinic dataset.

Name: MineSet
How to get it: http://www.purpleinsight.com

Name: Oracle
How to get it: http://www.oracle.com/solutions/business_intelligence/index.html

Name: Microsoft
How to get it: http://www.microsoft.com/sql/technologies/dm/default.mspx

Name: SAS
How to get it: http://www.sas.com/technologies/bi/

Name: WEKA
How to get it: http://www.cs.waikato.ac.nz/ml/weka/
Note: Research-oriented data mining tool (open source SW)

9.2 Data/Text Mining Tutorials

- SQL Server 2005 Data Mining Tutorial
  o http://msdn2.microsoft.com/en-us/library/ms167167.aspx
- SQL Server 2005 Data Mining Concepts
  o http://msdn2.microsoft.com/en-us/library/ms174949.aspx
- Solving Business Problems with Oracle Data Mining
  o http://www.oracle.com/technology/obe/obe10gdb/bidw/odm/odm.htm

9.3 Success Stories

Study how BI has been used in these companies.
- SAS Customer Success in Healthcare and Health Insurance
  o http://www.sas.com/success/industry.html#healthcare
- Oracle Business Intelligence (BI) Customers
  o http://www.oracle.com/customers/solutions/bi.html
- Success Stories for Microsoft products
  o http://www.microsoft.com/casestudies/

9.4 Training

- National Center for Biotechnology Information (NCBI)'s PowerScripting: FREE 4-day course. You will learn how to take advantage of NCBI databases using programming languages.
  o http://www.ncbi.nlm.nih.gov/class/powertools/eutils/course.html

10. References

Required Textbooks:
- [HK] Data Mining - Concepts and Techniques by Jiawei Han and Micheline Kamber, Second Edition, Morgan Kaufmann, 2006, ISBN 1-55860-901-6
- [RG] Data Mining - A Tutorial-based Primer by Richard J. Roiger and Michael W. Geatz, Addison Wesley, 2003, ISBN 0-201-74128-8

Recommended Textbooks:
- [HMS] Principles of Data Mining by D. Hand, H. Mannila, and P. Smyth, MIT Press, 2001, ISBN 0-262-08290-X
- [WF] Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten and Eibe Frank, Second Edition, Morgan Kaufmann, 2005, ISBN 0-12-088407-0
- [TSK] Introduction to Data Mining by P. Tan, M. Steinbach, and V. Kumar, Pearson Education, 2006, ISBN 0-321-32136-7
- [Dunham] Data Mining - Introductory and Advanced Topics by Margaret H. Dunham, Prentice Hall, 2003, ISBN 0-13-088892-3
- [KR] The Data Warehouse Toolkit by R. Kimball and M. Ross, Wiley, 2002, ISBN 0-471-20024-7
- [AM] Text Mining for Biology and Biomedicine, S. Ananiadou and J. McNaught (editors), Artech House, ISBN 1-58053-984-x

11. Academic Dishonesty

Academic integrity is fundamental to the activities and principles of a university. All members of the academic community must be confident that each person's work has been responsibly and honorably acquired, developed, and presented. Any effort to gain an advantage not given to all students is dishonest whether or not the effort is successful. The academic community regards breaches of the academic integrity rules as extremely serious matters. Sanctions for such a breach may include academic sanctions from the instructor, including failing the course for any violation, to disciplinary sanctions ranging from probation to expulsion. When in doubt about plagiarism, paraphrasing, quoting, collaboration, or any other form of cheating, consult the course instructor.

12. Statement for ADA

If you need accommodations because of a disability, if you have emergency medical information to share with me, or if you need special arrangements in case the building must be evacuated, please inform me immediately. Please see me privately after class, or at my office.

Office location: 404 Clark Hall
Office hours: by appointment

To request academic accommodations (for example, a notetaker), students must also register with the Office of Disability Services (http://disabilityservices.missouri.edu), S5 Memorial Union, 882-4696. It is the campus office responsible for reviewing documentation provided by students requesting academic accommodations, and for accommodations planning in cooperation with students and instructors, as needed and consistent with course requirements. For other MU resources for students with disabilities, click on "Disability Resources" on the MU homepage.

13. Statement for Intellectual Pluralism

The University community welcomes intellectual diversity and respects student rights. Students who have questions concerning the quality of instruction in this class may address concerns to either the Departmental Chair or Divisional leader or Director of the Office of Students Rights and Responsibilities (http://osrr.missouri.edu/). All students will have the opportunity to submit an anonymous evaluation of the instructor(s) at the end of the course.

Appendix: Qualified Journals and Proceedings

MU Libraries: http://mulibraries.1cate.com/

- ACM TRANSACTIONS ON INFORMATION SYSTEMS (4.529¹) ISSN: 1046-8188
- ARTIFICIAL INTELLIGENCE IN MEDICINE (1.882) ISSN: 0933-3657
- COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE (0.788) ISSN: 0169-2607
- COMPUTERS IN BIOLOGY AND MEDICINE (1.358) ISSN: 0010-4825
- DATA MINING AND KNOWLEDGE DISCOVERY (2.105) ISSN: 1384-5810
- IEEE INTELLIGENT SYSTEMS (2.56) ISSN: 1541-1672
- IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE (1.376) ISSN: 1089-7771
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (1.758) ISSN: 1041-4347
- INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS (1.374) ISSN: 1386-5056
- JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION (4.339) ISSN: 1067-5027
- MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING (1.028) ISSN: 0140-0118
- MEDICAL DECISION MAKING (1.822) ISSN: 0272-989X
- Journal of Biomedical Informatics (2.388) ISSN: 1532-0464
- BMC Bioinformatics (4.96) ISSN: 1471-2105
- AMIA Proceedings
- Conference on Knowledge Discovery in Data (KDD) Proceedings (visit the ACM Digital Library)
- Etc.

¹ The number in parentheses is the journal's impact factor (IF), which has been used as a measure of the importance of a journal.