Societal benefits vs. privacy: what distributed secure multi-party computation enable? Research ehelse 2015 21-22 April Oslo

Similar documents
The De-identification of Personally Identifiable Information

A Q&A with the Commissioner: Big Data and Privacy Health Research: Big Data, Health Research Yes! Personal Data No!

The Use of Patient Records (EHR) for Research

Degrees of De-identification of Clinical Research Data

De-Identification 101

Policy Brief: Protecting Privacy in Cloud-Based Genomic Research

HIPAA-Compliant Research Access to PHI

Data Privacy and Biomedicine Syllabus - Page 1 of 6

Secondary Use of Healthcare Data for Public Health. Leslie Lenert, MD, MS FACMI Director, National Center for Public Health Informatics

DATA MINING - 1DL360

Accelerating Clinical Trials Through Shared Access to Patient Records

Yale University Open Data Access (YODA) Project Procedures to Guide External Investigator Access to Clinical Trial Data Last Updated August 2015

Health Data Governance: Privacy, Monitoring and Research - Policy Brief

Protecting Personal Health Information in Research: Understanding the HIPAA Privacy Rule

What is Covered by HIPAA at VCU?

Winthrop-University Hospital

Legal Insight. Big Data Analytics Under HIPAA. Kevin Coy and Neil W. Hoffman, Ph.D. Applicability of HIPAA

The De-identification Maturity Model Authors: Khaled El Emam, PhD Waël Hassan, PhD

ENSURING ANONYMITY WHEN SHARING DATA. Dr. Khaled El Emam Electronic Health Information Laboratory & uottawa

Healthcare data analytics. Da-Wei Wang Institute of Information Science

Principles for Responsible Clinical Trial Data Sharing

Understanding De-identification, Limited Data Sets, Encryption and Data Masking under HIPAA/HITECH: Implementing Solutions and Tackling Challenges

Guidance on De-identification of Protected Health Information November 26, 2012.

Clinical Study Reports Approach to Protection of Personal Data

Technical Approaches for Protecting Privacy in the PCORnet Distributed Research Network V1.0

De-Identification of Health Data under HIPAA: Regulations and Recent Guidance" " "

HIPAA Compliance Strategies for Pharmaceutical Manufacturers,

IRB Application for Medical Records Review Request

De-Identification of Clinical Data

Anonymizing Unstructured Data to Enable Healthcare Analytics Chris Wright, Vice President Marketing, Privacy Analytics

Privacy-by-Design: Understanding Data Access Models for Secondary Data

Clinical Research from EHR data

1.2: DATA SHARING POLICY. PART OF THE OBI GOVERNANCE POLICY Available at:

Comments of the World Privacy Forum To: Office of Science and Technology Policy Re: Big Data Request for Information. Via to

How To Use An Electronic Health Record

De-identification Koans. ICTR Data Managers Darren Lacey January 15, 2013

Implementing Honest Broker System(s) in Academic Medical Centers: The Pittsburgh Experience

Privacy and Security within an Interoperable EHR

Ann Cavoukian, Ph.D.

Abstract. It s peace of mind knowing that we ve done everything that is possible to meet industry standards for de-identification. Dr.

IRB Month Investigator Meeting April 2014

Privacy Techniques for Big Data

What is Covered under the Privacy Rule? Protected Health Information (PHI)

BUSINESS ASSOCIATE AGREEMENT BETWEEN AND COMMISSION ON ACCREDITATION, AMERICAN PSYCHOLOGICAL ASSOCIATION

Considering De-Identification? Legacy Data. Kymberly Lee 16-Jul-2015

HIPAA-P06 Use and Disclosure of De-identified Data and Limited Data Sets

Testimony. before the. National Committee on Vital and Health Statistics Ad Hoc Workgroup for Secondary Uses of Health Data

NSF Workshop on Big Data Security and Privacy

A Commercial Approach to De-Identification Dan Wasserstrom, Founder and Chairman De-ID Data Corp, LLC

Table of Contents. Page 1

Global Alliance for Genomics & Health Data Sharing Lexicon

Dispelling the Myths Surrounding De-identification:

Big Data, Not Big Brother: Best Practices for Data Analytics Peter Leonard Gilbert + Tobin Lawyers

20 years of Telemedicine in Tromsø, Norway

How to De-identify Data. Xulei Shirley Liu Department of Biostatistics Vanderbilt University 03/07/2008

Data Driven Approaches to Prescription Medication Outcomes Analysis Using EMR

Secondary Uses of Data for Comparative Effectiveness Research

BUSINESS ASSOCIATE AGREEMENT

Privacy Impact Assessment: care.data

Privacy: Legal Aspects of Big Data and Information Security

Standardization of the Australian Medical Data Exchange Model. Michael Legg PhD

DATA MINING - 1DL105, 1DL025

HIPAA and Clinical Research

The Health Information Act. Use and Disclosure of Health Information for Research

Data Analytics in Health Care

Health Data De-Identification by Dr. Khaled El Emam

Transcription:

Privacy Societal benefits vs. privacy: what distributed secure multi-party computation enable? Research ehelse 2015 21-22 April Oslo Kassaye Yitbarek Yigzaw UiT The Arctic University of Norway Outline Background Introduction utility De-identification Secure multi-party computation 2 1

Background Electronic health data are being widely collected Administrative data, e.g. census, survey, socioeconomic, and registry Invaluable resource to improve healthcare systems effectiveness, efficiencies and quality of care Enormous benefits for individuals and society in general Jutte DP et al. Administrative Record Linkage as a Tool for Public Health Research. Annual Review of Public Health 2011 Geissbuhler A et al. Trustworthy of health data: A transnational perspective. International Journal of Medical Informatics 2013 3 opportunities Why are healthcare costs increasing? What are the comparative benefits and risks of prescription drugs? What is the evidence base for procedures? What explains variation in health care spending and use? How do environmental factors affect disease patterns? How can the health of minorities and special needs groups be improved? What does this mean for patients like me? Slide borrowed from Michael G. Kahn. Learning Health Systems From Concept to National Deployment. IEEE BHI2014 Conference 4 2

Introduction Individuals privacy concerns are the main challenge Healthcare institutions are also concerned about their own privacy 1 Most ethical and legal regulations allow data through: Informed consent Consent waiver by ethics committee (e.g. REK) (under certain conditions) de-identification Secure multi-party computation (SMC) 1 El Emam K, et al. Physician privacy concerns when disclosing patient data for public health purposes during a pandemic influenza outbreak. BMC Public Health 2011;11:454. 5 Increase privacy protection and data utility Privacy Research 6 3

utility utility is the value of a given data for research The data being available for research Analytical completeness is possible required analyses that can be done on the data Analytical validity is accuracy and generalizability of analyses results 7 Informed consent Everybody Systematic bias Consented subjects Potential subjects Decreased generalizability of analyses results to the general population Bohensky MA et al. Linkage: A powerful research tool with potential problems. BMC Health Services Research 2010 8 4

Informed consent cont Consent requires infeasible time and money for large public health studies Consent alone protects autonomy, but does not guarantees that released data will remain private 1 1 Taylor P. Personal genomes: when consent gets in the way. Nature 2008 9 De-identification de-identification methods remove or modify identifiers The aim is to prevent identity disclosure Often, there is probability of assigning correct identity (re-identification) to records The more quasi-identifiers are removed or modified Less probability of re-identification Less analytical completeness or validity 10 5

Probability of re-identification Growth of public data utility Probability of re-identification Acceptable probability of re-identification control on data recipient 4/27/2015 De-identification cont HIPAA safe harbor removes 18 identifiers Limited dataset removes 16 of the 18 identifiers, except dates and some geographical data Safe harbor has less analytical completeness, e.g. association between treatments and health outcomes 1 Limited dataset has more probability of reidentification 2 1 Nass SJ et al. Beyond the HIPAA Privacy Rule: Enhancing privacy, improving health through research. National Academies Press; 2009 2 Benitez K et al. Evaluating re-identification risks with respect to the HIPAA privacy rule. JAMIA 2010 11 De-identification cont Non-public µ µ µ Public Ohm P. Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization. Rochester, NY: Social Science Research Network; 2009 Emam KE at al. Anonymising and sharing individual patient data. BMJ 2015. Benitez K et al. Evaluating re-identification risks with respect to the HIPAA privacy rule. JAMIA 2010 12 6

De-identified data sharing Hospital Lab Third Party Centralized storage Distributed storage GP 13 Secure multi-party computation (SMC) Hospital Lab Secure multi-party computation emulate the third party GP Lindell Y, et al. Secure multiparty computation for privacy-preserving data mining. Journal of Privacy and Confidentiality 2009 14 7

Secure multi-party computation (SMC) SMC ensures that no more private information is revealed beyond the computation output Each data custodian has the capability of determining what/who compute on their data 15 Secure summation protocol Individual value remain private The summation of an institution s values is also private r 2 = r 1 + x 2 Id x 1 52 2 58 3 70 x 2 = 52+58+70 {r 2 } n 2 {r 1 } r 3 = r 2 + x 3 n 3 n 1 r 1 = r 0 + x 1 Id x Id x x 3 = 40+52+68 7 40 8 52 9 68 {r 3 } {r 0 } n c 4 45 5 65 6 70 x 1 = 45+65+70 r 0 = rand() SUM = (r 3 r 0 ) Andersen A, Yigzaw KY, Karlsen R. Privacy preserving health data processing. IEEE Healthcom, 2014 16 8

Example applications SMC protocol for disease surveillance 1 SMC protocol for logistic regression tested for detection of adverse drug events 2 1 El Emam K et al. A secure protocol for protecting the identity of providers when disclosing data for disease surveillance JAMIA 2011 2 El Emam K et al. A secure distributed logistic regression protocol for the detection of rare adverse drug events. JAMIA 2013 17 History The SMC concept is pioneered by Yao (1982) Most SMC protocols were designed to show feasibility During the last decade better SMC protocols and implementations started to appear SMC protocols vary with privacy guaranty, efficiency and scalability Strong privacy is often achieved using more complex techniques, which are less efficient and scalable Bogdanov D. Sharemind: programmable secure computations with practical applications. PhD Thesis. Tartu University, 2013. 18 9

Discussion SMC techniques do not modify or remove data attributes and do not have selection bias, therefore data utility is not affected More efficient and scalable techniques are being developed Protects the privacy of both individuals and health institutions Enable health institutions to maintain strong control over their private data These could encourage more individuals and institutions to allow data 19 Acknowledgement PhD supervisors (Johan Gustav Bellika, Anders Andersen, and Gunnar Hartvigsen) Meskerem Asfaw Hailemichael Tromsø Telemedicine Laboratory (TTL) UiT The Arctic University of Norway Norwegian Center for Telemedicine and Integrate Care (NST) 20 10

Kassaye Yitbarek Yigzaw PhD candidate UiT The Arctic University of Norway kassaye.y.yigzaw@uit.no http://www.panoramio.com/photo/10889343 21 11