ENSURING ANONYMITY WHEN SHARING DATA. Dr. Khaled El Emam Electronic Health Information Laboratory & uottawa

Similar documents
The De-identification of Personally Identifiable Information

De-Identification 101

De-Identification of Health Data under HIPAA: Regulations and Recent Guidance" " "

HIPAA POLICY REGARDING DE-IDENTIFICATION OF PROTECTED HEALTH INFORMATION AND USE OF LIMITED DATA SETS

The De-identification Maturity Model Authors: Khaled El Emam, PhD Waël Hassan, PhD

PRIVACY BREACH MANAGEMENT POLICY

UPMC POLICY AND PROCEDURE MANUAL

A Q&A with the Commissioner: Big Data and Privacy Health Research: Big Data, Health Research Yes! Personal Data No!

How to De-identify Data. Xulei Shirley Liu Department of Biostatistics Vanderbilt University 03/07/2008

Winthrop-University Hospital

HIPAA-Compliant Research Access to PHI

De-Identification of Clinical Data

Clinical Study Reports Approach to Protection of Personal Data

Privacy Committee. Privacy and Open Data Guideline. Guideline. Of South Australia. Version 1

Understanding De-identification, Limited Data Sets, Encryption and Data Masking under HIPAA/HITECH: Implementing Solutions and Tackling Challenges

HIPAA-P06 Use and Disclosure of De-identified Data and Limited Data Sets

De-identification Koans. ICTR Data Managers Darren Lacey January 15, 2013

Protecting Personal Health Information in Research: Understanding the HIPAA Privacy Rule

Producing official statistics via voluntary surveys the National Household Survey in Canada. Marc. Hamel*

Everett School Employee Benefit Trust. Reportable Breach Notification Policy HIPAA HITECH Rules and Washington State Law

Memorandum. Factual Background

4. No accounting of disclosures is required with respect to disclosures of PHI within a Limited Data Set.

IRB Application for Medical Records Review Request

Health Data De-Identification by Dr. Khaled El Emam

Dispelling the Myths Surrounding De-identification:

Introduction to The Privacy Act

What is Covered by HIPAA at VCU?

What is Covered under the Privacy Rule? Protected Health Information (PHI)

Statistics Canada s National Household Survey: State of knowledge for Quebec users

HIPAA COMPLIANCE. What is HIPAA?

CLOUD COMPUTING & THE PATRIOT ACT: A RED HERRING?

HIPAA-G04 Limited Data Set and Data Use Agreement Guidance

Degrees of De-identification of Clinical Research Data

Using 'Big Data' to Estimate Benefits and Harms of Healthcare Interventions

Societal benefits vs. privacy: what distributed secure multi-party computation enable? Research ehelse April Oslo

Conducting Surveys: A Guide to Privacy Protection. Revised January 2007 (updated to reflect A.R. 186/2008)

Surname First Initials. City Province/State Postal Code/Zip. Home telephone # Cell # Business telephone #

Privacy & Big Data: Enable Big Data Analytics with Privacy by Design. Datenschutz-Vereinigung von Luxemburg Ronald Koorn DRAFT VERSION 8 March 2014

IHE IT Infrastructure Handbook. De-Identification. Published

Framework for Compliance Audits Under the Employment Equity Act

Subject Access Request (SAR) Procedure

Ann Cavoukian, Ph.D.

INDIANA UNIVERSITY SCHOOL OF OPTOMETRY HIPAA COMPLIANCE PLAN TABLE OF CONTENTS. I. Introduction 2. II. Definitions 3

HIPAA 101: Privacy and Security Basics

How To Protect Your Data In European Law

Big Data, Not Big Brother: Best Practices for Data Analytics Peter Leonard Gilbert + Tobin Lawyers

Breach Notification Policy

THE STATE OF DATA SHARING FOR HEALTHCARE ANALYTICS : CHANGE, CHALLENGES AND CHOICE

HIPAA Privacy and Security Rules: A Refresher. Marilyn Freeman, RHIA California Area HIPAA Coordinator California Area HIM Consultant

NAME, M.D.C.M., F.R.C.S

International Electronic Funds Transfer Report

Best Practice Guidelines for Managing the Disclosure of De-Identified Health Information

Internationally Educated Medical Radiation Technologists APPLICATION for ASSESSMENT

UNDERSTANDING THE HIPAA/HITECH BREACH NOTIFICATION RULE 2/25/14

BUSINESS ASSOCIATE AGREEMENT HIPAA Protected Health Information

Legal Insight. Big Data Analytics Under HIPAA. Kevin Coy and Neil W. Hoffman, Ph.D. Applicability of HIPAA

(Big) Data Anonymization Claude Castelluccia Inria, Privatics

Health Insurance Portability and Accountability Policy 1.8.4

Data Driven Approaches to Prescription Medication Outcomes Analysis Using EMR

PROTECTION OF PERSONAL INFORMATION BILL

POLICY AND PROCEDURE MANUAL

PERSONAL INFORMATION PRIVACY POLICY FOR EMPLOYEES AND VOLUNTEERS [ABC SCHOOL]

Data De-identification and Anonymization of Individual Patient Data in Clinical Studies A Model Approach

A GUIDE TO SCREENING AND SELECTION IN EMPLOYMENT.

HIPAA COMPLIANCE INFORMATION. HIPAA Policy

UPMC POLICY AND PROCEDURE MANUAL

Compliance Program and HIPAA Training For First Tier, Downstream and Related Entities

METLIFE SINGLE LIFE RELEVANT LIFE POLICY TERMS AND CONDITIONS

NORTH CAROLINA COMMUNITY CARE INC. Privacy Policy Manual

Application Form for Registration as a Social Worker

MCDONOUGH CENTER FOR FAMILY DENTISTRY, LLC

North Shore LIJ Health System, Inc. Facility Name

HIPAA and You The Basics

Registration and Licensure as a Pharmacy Technician

Canadian Survey on Disability (CSD)

Transcription:

ENSURING ANONYMITY WHEN SHARING DATA Dr. Khaled El Emam Electronic Health Information Laboratory & uottawa

ANONYMIZATION

Motivations for Anonymization Obtaining patient consent/authorization not practical for large databases and introduces bias Limiting principles / minimal necessary Contractual obligations Maintain public / consumer / client trust Costs of breach notification Rising discipline of re-identification attacks

http://www.plosone.org/article/info%3adoi%2f10.1371%2fjournal.pone.0028071

Anonymization = Risk Management

No Zero Risk

Anonymization Standards

Anonymization Methods & Examples

Direct & Indirect Identifiers Examples of direct identifiers: Name, address, telephone number, fax number, MRN, health card number, health plan beneficiary number, VID, license plate number, email address, photograph, biometrics, SSN, SIN, device number, clinical trial record number Examples of quasi identifiers: sex, date of birth or age, geographic locations (such as postal codes, census geography, information about proximity to known or unique landmarks), language spoken at home, ethnic origin, total years of schooling, marital status, criminal history, total income, visible minority status, profession, event dates, number of children, high level diagnoses and procedures

2 12

2 13

2 14

2 15

Landscape

Landscape A process that removes the association between the identifying data and the data subject. (Source ISO/TS 25237:2008)

Landscape Reducing the risk of identifying a data subject to a very small level through the application of a set of data transformation techniques without any concern for the analytics utility of the data.

Landscape Removal of fields from a data set

Landscape A particular type of anonymization that both removes the association with a data subject and adds an association between a particular set of characteristics to the data subject and one or more pseudonyms (Source: ISO/TS 25237:2008)

Landscape Replacing a value in the data with a random value from a large database of possible values

Landscape Reducing the risk of identifying a data subject to a very small level through the application of a set of data transformation techniques such that the resulting data retains a very high analytics value.

Landscape Reducing the precision of a value to a more general one

Landscape The removal of records or values (cells) in the data

Landscape Randomly selecting a subset of records or patients from a data set

Landscape The motives and capacity of the data recipient to reidentify the data

Landscape The security and privacy practices that the data recipient has in place to manage the data received.

Anonymization Process

4 Choosing A Risk Threshold 29

Managing Re-identification Risk

Gordon Case Federal court case around an access request for information from the Canadian adverse drug event database Case between Health Canada and the CBC Judge ruled that the province of the affected individuals should not be released One of the main drivers was the ability to reidentify individuals from information in the adverse event record if it includes province Sets a precedent on identifiability of province

Our Analysis

Our Analysis It would have been possible to release province information if the date of death was reduced in precision A detailed analysis using the above riskbased methodology allows us to evaluate these tradeoffs, and find a solution that protects individual privacy but also facilitates the sharing of data

Implementation Framework

Contact kelemam@ehealthinformation.ca @kelemam www.ehealthinformation.ca