Data Privacy and Biomedicine Syllabus - Page 1 of 6




Course: Data Privacy in Biomedicine (BMIF-380 / CS-396)
Instructor: Bradley Malin, Ph.D. (b.malin@vanderbilt.edu)
Semester: Spring 2015
Time: Mondays & Wednesdays, 3:10-4:25pm
Location: Featheringill Hall, Room 313
Website: http://www.hiplab.org/courses/bmif380/
Office Hours: By appointment

DESCRIPTION

The integration of information technology into biomedical environments has enabled unprecedented advances in the collection, storage, analysis, and rapid dissemination of patient-specific data to physicians and researchers. Given the potential wealth of such detailed data for further advances in healthcare, many organizations associated with the healthcare domain share, or anticipate sharing, their collections for various purposes related to quality assurance, public health, and research. However, in the face of today's complex networked environments, many organizations find it increasingly difficult to share biomedical data due to concerns about patient privacy. For instance, how can we share patient-specific data without revealing the identity of the patient? Security practices, such as role-based access control and encrypted communications, ensure authentication and secure communications, but they do not necessarily stem the leakage of inferences from the data after it has been accessed or transmitted.

Thus, this course is concerned with the analysis and protection of data privacy, with a focus on the idiosyncrasies and regulatory framework associated with biomedical information. The goal of this course is to introduce students to the computational challenges, as well as formal privacy protection solutions, for data privacy in healthcare and biological research environments. The topology of data privacy is a highly interdisciplinary landscape, and material in this course will touch on issues and methodologies from bioinformatics, cryptography, data mining, databases, distributed systems, law, machine learning, medical informatics, policy, and statistics.

OBJECTIVES

After this course, students will be able to analyze data privacy issues from three non-exclusive perspectives:

1. Data Detectives: Oftentimes data is shared with false beliefs about privacy and data protection. From this perspective, students will learn how seemingly private information can be learned using automated strategies.
2. Data Protectors: Students will learn how to construct privacy protection technologies that provide formal computational guarantees of privacy in data collection and sharing.
3. Technology Policy Designers: Computational models provide a basis for protection, but in order to implement such technology in the real world, it must support, and not circumvent, existing policy specification. From this perspective, students will learn how to develop privacy protection solutions which complement policy regulations.

PREREQUISITES

Required: Students are expected to have proficiency in designing and writing software programs. There is no programming language requirement for this class, though experience with object orientation is beneficial.

Recommended: Students should be comfortable with learning about basic statistics, data structures, and algorithm analysis. When appropriate, quantitative and computational methodology will be reviewed. Knowledge of, and prior experience with, security principles is NOT a prerequisite for this course.

GRADING

Criteria                                          Percent of Grade
Project                                           50%
  Initial Proposal (due March 16)                 (5%)
  Status Report (due March 31)                    (15%)
  Final Report & Presentation (due April 27)      (30%)
Homework Assignments (3 assignments, 10% each)    30%
Reading Summaries                                 10%
Class Participation                               10%
Total                                             100%

Required Reading Assignments: There is no primary textbook for this course. Reading assignments will be selected from various periodicals. Students will be required to read and submit brief summaries of assigned readings. Your summaries should be no longer than one page in length. Readings will be made available online or as in-class handouts at least one lecture before they are due. Your summaries will be graded on a {check-minus, check, check-plus} scale.

check-minus (or 1 point): You skimmed the assigned reading and barely understood, or summarized, its meaning and implications.
check (or 2 points): You demonstrated that you read the material by providing a reasonable account of its contents, its strengths, and weaknesses.
check-plus (or 3 points): You provided a critical assessment of the reading and showed insight regarding the reading's topic.

These summaries constitute a total of 10% of your final grade. An average score of check (i.e., 2 points) will provide the student with the full 10%. An average score greater than check (i.e., greater than 2 points) will entitle the student to extra credit, with a maximum of 5 additional percentage points on their final grade. You must email your summaries to b.malin@vanderbilt.edu before the beginning of class.

Project: In lieu of a final exam, each student must complete an independent project on a data privacy issue in biomedicine. Projects should investigate a topic of interest to the student, and must demonstrate analysis and critical thinking in data privacy. The project will require a significant commitment and contribute a substantial part of the final grade. A list of sample project topics will be made available and reviewed in class.

Honesty Policy: From the Vanderbilt Student Handbook, HONESTY is a commitment to refrain from lying, cheating, and stealing. Recognizing that dishonesty undermines community trust, stifles the spirit of scholarship, and threatens a safe environment, we expect ourselves to be truthful in academic endeavors, in relationships with others, and in pursuit of personal development. You are permitted, and encouraged, to discuss homework assignments with other students. However, you must do your own work and submit your own solutions.

TOPIC AND SCHEDULE OVERVIEW (Tentative and Subject to Change)

Part 1 (Jan 5 and 7): Course Overview, a Little Policy, and a Whole Lot of Data

In the first class, we'll go over ground rules for the course and review the syllabus. If time permits, we'll begin a discussion on what data privacy is (and is not). We'll investigate how it relates to data security principles, such as authorization, access control, and authentication. In the second (though first full) class, we will survey various ideologies and legal and policy precedents for privacy in modern healthcare environments and society. Who collects medical information, and when do patients have control over their privacy? Can policy and specification of privacy protections be automated?

Part 2 (Jan 12 and 14): De-identification & Re-identification

This week we will look at how seemingly de-identified medical information can be re-identified to the individuals from whom it was collected. In the process, students will learn statistical techniques for characterizing uniqueness in data, both at elemental and population levels of granularity. We'll look at how personal information is available in many different resources, both on- and offline. Where is this information? How do we automatically capture and organize it for privacy assessments? We will look at various information repositories, such as vital records and statistics (including birth records, death records, marriage records, and court documents) and the Social Security Death Index. We will also review how systems are set up to track people through identifiers and traceable elements. As an example, we will discuss the potential of unique numbers for persistent patient identifiers, the history of the Social Security Number in the United States, and how recent policy changes in the United States influence these opportunities.

Part 3 (Jan 19, 21, and 26): Record Linkage

There is no class on Jan 19 (Martin Luther King Day). This week we will investigate concepts and methodology associated with the linkage of data in disparate databases. Methods will be drawn from a deterministic perspective.
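To make the deterministic perspective concrete, the following is a minimal sketch, not course material: it links a toy "de-identified" clinical extract to a public roster by exact match on shared quasi-identifiers. The records, field names (`zip`, `dob`, `sex`), and helper functions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: a "de-identified" clinical extract (no names) and a
# public roster (e.g., a voter list) that shares the same quasi-identifiers.
clinical = [
    {"zip": "37203", "dob": "1961-07-04", "sex": "F", "diagnosis": "asthma"},
    {"zip": "37203", "dob": "1975-02-11", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "37212", "dob": "1988-10-30", "sex": "F", "diagnosis": "migraine"},
]
roster = [
    {"name": "A. Smith", "zip": "37203", "dob": "1961-07-04", "sex": "F"},
    {"name": "B. Jones", "zip": "37203", "dob": "1975-02-11", "sex": "M"},
    {"name": "C. Brown", "zip": "37203", "dob": "1975-02-11", "sex": "M"},
]
KEY = ("zip", "dob", "sex")  # assumed linkage key


def link_exact(records, reference, key=KEY):
    """Deterministic linkage: pair each record with all reference entries
    whose key fields match exactly."""
    index = defaultdict(list)
    for ref in reference:
        index[tuple(ref[f] for f in key)].append(ref)
    return [(rec, index.get(tuple(rec[f] for f in key), [])) for rec in records]


def unique_links(links):
    """Keep only records that match exactly one reference entry."""
    return [(rec, cands[0]) for rec, cands in links if len(cands) == 1]


links = link_exact(clinical, roster)
reidentified = unique_links(links)
for rec, person in reidentified:
    print(person["name"], "->", rec["diagnosis"])
```

In this toy data only the first clinical record links to a single roster entry; the second matches two people and the third matches none, which is why uniqueness of the key in the population matters for re-identification risk.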

We will also discuss how linkage methods can be automated and their application within electronic medical record systems. The second part of this section will be dedicated to more sophisticated strategies of record linkage. It will move beyond basic deterministic methods and will look at more formal probabilistic strategies, with a particular emphasis on expectation-maximization (EM) frameworks.

Part 4 (Jan 28, Feb 2, 4, and 9): Access Control Models and Auditing

This section of the course will look into how access control frameworks, and particularly roles, can be defined. We will look into formal models of access specification, how to design roles in a manner that meets organizational needs, and algorithms for automatically constructing role hierarchies through simple data mining strategies. While access control provides a framework for specifying who is entitled to access what information and when, there are many situations in which access control cannot be sufficiently specified or must be circumvented to enable timely care of patients. In this section of the course, we will also look at how information in the access logs of electronic medical records, and in the records of the patients themselves, can be mined for auditing purposes. Portions of this section of the course will be taught by Dr. You Chen.

Part 5 (Feb 11, 16, 18, 23): Anonymization

In this part of the course, we will begin to turn the tables and shift from re-identification to anonymization frameworks, which can be designed to explicitly prevent such attacks. We will begin with formal models of anonymity, in which guarantees are provided on the extent to which data can be exploited for linkage and re-identification purposes. In particular, students will learn about the k-map and k-anonymity models, as well as algorithmic approaches to transform biomedical data to satisfy a formal anonymity model.
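As a rough illustration of the k-anonymity model named above, the sketch below measures the size of the smallest group of records sharing the same quasi-identifier values, before and after a simple generalization. The table, the choice of quasi-identifiers, and the `generalize` rule (3-digit ZIP prefix, age by decade) are assumptions made for this example, not an algorithm from the course.

```python
from collections import Counter


def k_of(table, quasi_identifiers):
    """Return the largest k for which the table is k-anonymous, i.e. the
    size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in table)
    return min(groups.values())


def generalize(row):
    """Hypothetical generalization: keep a 3-digit ZIP prefix and bucket
    age into decades."""
    decade = (row["age"] // 10) * 10
    return {**row, "zip": row["zip"][:3] + "**", "age": f"{decade}-{decade + 9}"}


# Toy table; "zip" and "age" are treated as the quasi-identifiers.
table = [
    {"zip": "37203", "age": 34, "diagnosis": "asthma"},
    {"zip": "37205", "age": 37, "diagnosis": "diabetes"},
    {"zip": "37212", "age": 52, "diagnosis": "migraine"},
    {"zip": "37215", "age": 58, "diagnosis": "asthma"},
]
QI = ("zip", "age")

k_before = k_of(table, QI)                           # every row is unique
k_after = k_of([generalize(r) for r in table], QI)   # rows now pair up
print(k_before, k_after)
```

Generalization trades precision for protection: coarser values yield larger groups, at the cost of less useful data.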
We'll explore heuristic and approximation algorithms to achieve efficient anonymization. Students will also be exposed to graph-based methods for modeling the re-identification and anonymization problem. These approaches will be presented in the context of multiple types of data encountered in the biomedical realm, including relational and set-valued data types. Portions of this section of the course will be taught by Dr. Raymond Heatherly.

Part 6 (Feb 25 and March 9): Natural Language Scrubbing

In this section of the course, we'll focus on de-identification in the context of free text (e.g., doctor's notes, laboratory reports, discharge reports, and more). How can we de-identify text information? Can we ever achieve anonymized text? This section of the course will review various data-intensive methods for processing natural language to detect and remove, or scrub, personal identifiers from clinical text.

Part 7 (March 10): Ethical Reasoning

This part considers some of the ethical issues in the application of data privacy technologies. Simply because you can build a re-identification technology doesn't mean that you should use it, does it? We'll investigate ethical reasoning and governance models for dual-use technologies.

Part 8 (March 16, 18, 23, and 30): High-Dimensionality Data and Beyond Anonymization

In the first lecture of the week, students will learn about the homogeneity attack against anonymization algorithms, also known as attribute disclosure. We will then explore frameworks and algorithms to mitigate this attack in the context of health information sharing for various secondary use cases. In the second part of this week, we'll explore how higher-dimensional data can be exploited in various settings to perform attribute disclosure. We will particularly focus on:

- High-throughput technologies that are becoming ingrained in the clinical environment, and the collection and sharing of biological information, such as DNA data. We will investigate ways in which patient identity in genomic data is protected, how it is re-identified, and how it can be formally protected.
- Geospatial technologies that are becoming standard practice in public health and epidemiology settings. These problems require geographic information regarding the presence of clinically interesting cases to detect potential outbreaks and bioterrorist activities. However, the sharing of geographic and spatiotemporal information may lead to re-identification. We will investigate various approaches by which such information may be protected during data sharing.
- Social networks that are becoming popular settings for integrating disease-based communities, interacting with support groups, and performing population-based pharmacoepidemiology studies.

Project Status Report Presentations (March 25; note this is in the middle of Part 8)

This day will be dedicated to student projects. Students will write a short summary of their problem statement and initial research design, and make a short presentation on the status of their projects for an in-class evaluation.
Part 9 (April 1, 3, and 8): Image and Mobile Privacy

This portion of the course will look into images and video streams, which are increasingly used for monitoring and surveillance in health care environments, such as managed care facilities. We will investigate several procedures and principles for removing personally identifying features, e.g., an individual's face, from video streams. We will also investigate how images (e.g., a picture of a face), which are a special case of video streams, can be protected using formal models of anonymity.

Part 10 (April 13 and 15): Privacy Preserving Data Mining

This section of the course will move beyond traditional de-identification and anonymization models. First, we will look into basic variations of secure multiparty computation (SMC). The traditional application of cryptography is framed from a two-party viewpoint in which two participants, Alice and Bob, exchange information, such as a patient's medical record, over an unsecured channel. An extension to the traditional model is secure multiparty computation, which is concerned with the interaction of two or more participants that need to exchange information to construct a result without revealing private information. Next, we will go into further depth regarding how such protocols can be adapted to support record linkage frameworks without revealing the identities of the corresponding patients.

Week 14 (April 20): Student Final Presentations

The final lecture will be dedicated to student presentations on their final projects.
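As an illustration of the secure multiparty computation setting described in Part 10, here is a minimal sketch of a secure sum based on additive secret sharing, simulated in a single process. It assumes semi-honest parties that do not collude and a public modulus larger than any possible total; the hospital counts and all function names are hypothetical, and this is one textbook-style construction rather than the specific protocols covered in the course.

```python
import secrets

MODULUS = 2**61 - 1  # assumed public modulus, larger than any possible total


def share(value, n_parties, modulus=MODULUS):
    """Split value into n additive shares that sum to value mod modulus.
    Any n-1 of the shares are uniformly random and reveal nothing alone."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(private_values, modulus=MODULUS):
    """Simulate the protocol: each party deals shares of its input, every
    party publishes only the sum of the shares it received, and the total
    is recovered from those partial sums."""
    n = len(private_values)
    # dealt[i][j] is the share that party i sends to party j.
    dealt = [share(v, n, modulus) for v in private_values]
    # Party j adds up what it received and reveals only this partial sum.
    partials = [sum(dealt[i][j] for i in range(n)) % modulus for j in range(n)]
    return sum(partials) % modulus


counts = [12, 7, 30]  # hypothetical per-hospital case counts
total = secure_sum(counts)
print(total)
```

Each hospital learns the combined count but sees only random-looking shares of the other hospitals' inputs, which is the sense in which a result is constructed without revealing private information.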