The De-identification of Personally Identifiable Information




Khaled El Emam (PhD)
www.privacyanalytics.ca
855.686.4781
info@privacyanalytics.ca
251 Laurier Avenue W, Suite 200, Ottawa, ON, Canada K1P 5J6

De-identification Works http://www.plosone.org/article/info%3adoi%2f10.1371%2fjournal.pone.0028071

Anonymization = Risk Management

Direct & Quasi-identifiers

Examples of direct identifiers: name, address, telephone number, fax number, MRN, health card number, health plan beneficiary number, VID, license plate number, email address, photograph, biometrics, SSN, SIN, device number, clinical trial record number.

Examples of quasi-identifiers: sex, date of birth or age, geographic locations (such as postal codes, census geography, information about proximity to known or unique landmarks), language spoken at home, ethnic origin, total years of schooling, marital status, criminal history, total income, visible minority status, profession, event dates, number of children, high-level diagnoses and procedures.

Anonymization Landscape

HIPAA Safe Harbor Method

Safe Harbor Direct Identifiers and Quasi-identifiers
1. Names
2. ZIP Codes (except first three)
3. All elements of dates (except year)
4. Telephone numbers
5. Fax numbers
6. Electronic mail addresses
7. Social security numbers
8. Medical record numbers
9. Health plan beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers, including license plate numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) address numbers
16. Biometric identifiers, including finger and voice prints
17. Full face photographic images and any comparable images
18. Any other unique identifying number, characteristic, or code
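As a rough illustration of what the list implies for a single record, the sketch below drops a few direct identifiers, keeps only the first three digits of a ZIP code, and keeps only the year of any date. The field names (`name`, `ssn`, `zip`, and so on) and the function `safe_harbor_sketch` are hypothetical, chosen for the example. It covers only a handful of the 18 items and does not implement the further conditions the regulation attaches to ZIP codes and ages, so it should not be read as a compliant implementation.

```python
from datetime import date

# Hypothetical field names for a few of the listed direct identifiers.
DIRECT_IDENTIFIER_FIELDS = {"name", "phone", "fax", "email", "ssn", "mrn"}


def generalize_zip(zip_code: str) -> str:
    """Keep only the first three digits of a ZIP code."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    return digits[:3] if len(digits) >= 3 else ""


def safe_harbor_sketch(record: dict) -> dict:
    """Return a copy of the record with the listed fields removed or coarsened."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop direct identifiers entirely
        if field == "zip":
            out["zip3"] = generalize_zip(value)
        elif isinstance(value, date):
            out[field + "_year"] = value.year  # keep only the year
        else:
            out[field] = value
    return out


record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "37232-1234",
    "admitted": date(2012, 3, 7),
    "diagnosis": "asthma",
}
cleaned = safe_harbor_sketch(record)
```

In this example `cleaned` retains only the three-digit ZIP prefix, the admission year, and the diagnosis.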

Expert Determination (Statistical) Method

A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:
I. Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and
II. Documents the methods and results of the analysis that justify such determination.

Spectrum of Identifiability
[Figure: cell size illustration, labeled "Cell Size 3: two matching indirect identifiers in three cells within a dataset"]

Spectrum of Identifiability
There is a range of operational precedents, based on situational context and mitigating controls.
[Figure: scale running from "Little De-identification" to "Significant De-identification", marked with the values 8, 10, 11, 5, 16, 3, 2, 20]
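The notion of cell size can be made concrete with a short sketch. It assumes, as one common reading rather than something stated on the slides, that a cell is the group of records sharing the same combination of quasi-identifier values, and that 1 divided by the smallest cell size is a worst-case estimate of the chance of matching a record to a person. The record fields and function names are hypothetical.

```python
from collections import Counter


def cell_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def smallest_cell(records, quasi_identifiers):
    """Size of the smallest cell (assumes at least one record)."""
    return min(cell_sizes(records, quasi_identifiers).values())


records = [
    {"sex": "F", "birth_year": 1980, "zip3": "372"},
    {"sex": "F", "birth_year": 1980, "zip3": "372"},
    {"sex": "F", "birth_year": 1980, "zip3": "372"},
    {"sex": "M", "birth_year": 1975, "zip3": "372"},
    {"sex": "M", "birth_year": 1975, "zip3": "372"},
]

k = smallest_cell(records, ["sex", "birth_year", "zip3"])
worst_case_risk = 1 / k  # assumed risk measure: 1 / smallest cell size
```

Under this reading, a larger minimum cell size means a lower worst-case risk and corresponds to a position further toward the significant de-identification end of the spectrum.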

Managing Re-identification Risk

De-identification Process

Set Risk Threshold: Based on the characteristics of the data recipient, the data, and precedents, a quantitative threshold is set. This is an iterative process. The mitigating controls in place can be strengthened to get a more forgiving threshold.

Measure Risk: Based on plausible attacks, appropriate metrics are selected and used to measure the actual re-identification risk from the data.

Apply Transformations: If the measured risk does not meet the threshold, specific transformations (such as generalization and suppression) are applied to reduce the risk.
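The three steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the method used by any particular tool: risk is taken to be 1 divided by the smallest cell size, generalization is limited to widening age bands, and suppression removes records left in cells that are too small. All function names, the `age` field, and the band widths are hypothetical.

```python
import math
from collections import Counter


def cell_counts(records, quasi_identifiers):
    """Count records per combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def max_risk(records, quasi_identifiers):
    """Assumed risk metric: 1 / smallest cell size (requires non-empty records)."""
    return 1 / min(cell_counts(records, quasi_identifiers).values())


def generalize_age(records, width):
    """Replace each exact age with a band of the given width, e.g. 23 -> '20-24'."""
    generalized = []
    for r in records:
        low = (r["age"] // width) * width
        generalized.append({**r, "age": f"{low}-{low + width - 1}"})
    return generalized


def suppress_small_cells(records, quasi_identifiers, k):
    """Drop records whose cell contains fewer than k records."""
    counts = cell_counts(records, quasi_identifiers)
    return [r for r in records
            if counts[tuple(r[q] for q in quasi_identifiers)] >= k]


def deidentify(records, quasi_identifiers, threshold, widths=(5, 10, 20)):
    """Measure risk, then generalize step by step until the threshold is met."""
    if max_risk(records, quasi_identifiers) <= threshold:
        return records
    candidate = records
    for width in widths:
        candidate = generalize_age(records, width)
        if max_risk(candidate, quasi_identifiers) <= threshold:
            return candidate
    # Generalization alone was not enough: suppress the remaining small cells.
    k = math.ceil(1 / threshold)
    return suppress_small_cells(candidate, quasi_identifiers, k)


patients = [{"age": a, "sex": "F"} for a in (21, 23, 24, 36, 38)]
released = deidentify(patients, ["age", "sex"], threshold=0.5)
```

Here the raw ages are all unique, so the measured risk exceeds the threshold; five-year bands bring the smallest cell to two records, which meets a threshold of 0.5 without any suppression.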

Automation

Enabling Post-marketing and Public Health Surveillance

Customer: Large EMR vendor

Challenge: The vendor wants to anonymize data on 535,595 patients from general practices. The longitudinal data needs to be used for ongoing and on-demand analytics.

Solution: PARAT CORE, with PARAT integrated in the ETL pipeline.

Why Privacy Analytics: De-identified data would allow:
1. Post-marketing surveillance of adverse events
2. Public health surveillance
3. Prescription pattern analysis
4. Health services analysis

Customer Profile: EMR vendor with more than 2664 clinics and 5850 physicians using the system in family clinics and walk-in clinics. The data set spans more than five years of all clinical, prescription, laboratory, scheduling and billing data.

GI Protocol
Two-arm protocol: GI events after taking NSAIDs, with and without a PPI

Chlamydia Protocol
Females 14-24 years old inclusive, tested and tested positive for Chlamydia in the previous 12 months

Contact
kelemam@privacyanalytics.ca
@kelemam
www.privacyanalytics.ca