Health Science Cyberinfrastructure:



Similar documents
patient-centered SCAlable National Network for Effectiveness Research

idash Infrastructure to Host Sensitive Data: HIPAA Cloud Storage and Compute

Preparing Electronic Health Records for Multi-Site CER Studies

Interoperability and Analytics February 29, 2016

From Fishing to Attracting Chicks

AHRQ S Perspective On Clinical Utility. Gurvaneet Randhawa, MD, MPH Medical Officer, Senior Advisor on Clinical Genomics and Personalized Medicine

How Can Institutions Foster OMICS Research While Protecting Patients?

Utility of Common Data Models for EHR and Registry Data Integration: Use in Automated Surveillance

Informatics Domain Task Force (idtf) CTSA PI Meeting 02/04/2015

BRISSkit: Biomedical Research Infrastructure Software Service kit

The Use of Patient Records (EHR) for Research

How To Use An Ehr For Research

NIH Genomic Data Sharing (GDS) Policy Guidance Memo #2 1

Societal benefits vs. privacy: what distributed secure multi-party computation enable? Research ehelse April Oslo

Research Into Care: Identifying Barriers and Gaps in Care. AAFP National Research Network Robert Graham Center Wilson D. Pace, MD

ENABLING DATA TRANSFER MANAGEMENT AND SHARING IN THE ERA OF GENOMIC MEDICINE. October 2013

Personalized Medicine: Humanity s Ultimate Big Data Challenge. Rob Fassett, MD Chief Medical Informatics Officer Oracle Health Sciences

An Easily Accessed Clinical Research Database from your Epic EMR

Precision Medicine Initiative: Building a Large U.S. Research Cohort NIH Workshop February 11 12, 2015 Porter Neuroscience Building, Bethesda, MD

The Future of the Electronic Health Record. Gerry Higgins, Ph.D., Johns Hopkins

Design and Implementation of an Automated Geocoding Infrastructure for the Duke Medicine Enterprise Data Warehouse

Predictive Analytics Drives Population Health Management

IMPLEMENTING BIG DATA IN TODAY S HEALTH CARE PRAXIS: A CONUNDRUM TO PATIENTS, CAREGIVERS AND OTHER STAKEHOLDERS - WHAT IS THE VALUE AND WHO PAYS

Integration of Genetic and Familial Data into. Electronic Medical Records and Healthcare Processes

Physician Champions David C. Kibbe, MD, & Daniel Mongiardo, MD FAQ Responses

From Research to Practice: New Models for Data-sharing and Collaboration to Improve Health and Healthcare

Vision for the Cohort and the Precision Medicine Initiative Francis S. Collins, M.D., Ph.D. Director, National Institutes of Health Precision

Workshop on Establishing a Central Resource of Data from Genome Sequencing Projects

Big Data and CancerLinQ

#Aim2Innovate. Share session insights and questions socially. Genomics, Precision Medicine and the EHR 6/17/2015

Find the signal in the noise

Center for. Clinical Informatics. Clinical Informatics. Mission Statement

An Essential Ingredient for a Successful ACO: The Clinical Knowledge Exchange

Public Health and the Learning Health Care System Lessons from Two Distributed Networks for Public Health

MediSapiens Ltd. Bio-IT solutions for improving cancer patient care. Because data is not knowledge. 19th of March 2015

Understanding Diagnosis Assignment from Billing Systems Relative to Electronic Health Records for Clinical Research Cohort Identification

Enhancing Functionality of EHRs for Genomic Research, Including E- Phenotying, Integrating Genomic Data, Transportable CDS, Privacy Threats

Health Information Exchange At Sutter Health Using. Steven Lane, MD, MPH EHR Ambulatory Physician Director

NIH s Genomic Data Sharing Policy

Balancing Big Data for Security, Collaboration and Performance

Big Data Analytics in Health Care

IEEE International Conference on Computing, Analytics and Security Trends CAST-2016 (19 21 December, 2016) Call for Paper

Using EHRs to extract information, query clinicians, and insert reports

HEALTHCARE DOES HADOOP: AN ACADEMIC MEDICAL CENTER S FIVE-YEAR JOURNEY. Charles Boicey, MS, RN-BC, CPHIMS Chief Innovation Officer Clearsense

Incorporating Research Into Sight (IRIS) Essentia Rural Health Institute Marshfield Clinic Penn State University

Electronic Medical Records and Genomics: Possibilities, Realities, Ethical Issues to Consider

Course Specification

Big Data R&D Initiative

ROLE OF THE RESEARCH COORDINATOR Study Start-up Best Practices

Smarter Research. Joseph M. Jasinski, Ph.D. Distinguished Engineer IBM Research

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012

Moffitt Cancer Center, M2Gen and ConvergeHEALTH Collaboration

HIPAA and Big Data Twenty Third National HIPAA Summit. March 17, 2015 Mitchell W. Granberg, Optum Chief Privacy Officer

Medical Informatics: A Fishing Story

BRISSkit: Biomedical Research Infrastructure Software Service kit. Jonathan Tedds. #brisskit #umfcloud

Electronic health records to study population health: opportunities and challenges

Secondary Uses of Data for Comparative Effectiveness Research

OME : Mobile Cognitive Healthcare for the Consumer. Michael Nova, M.D. Chief Innovation Officer Pathway Genomics

TRANSLATIONAL BIOINFORMATICS 101

i2b2 Clinical Research Chart

A leader in the development and application of information technology to prevent and treat disease.

Natural Language Processing in the EHR Lifecycle

Kaiser Permanente Comments on Health Information Technology, by James A. Ferguson

Meaningful Use. Goals and Principles

Len Bowes, MD, MS, Intermountain Healthcare Medical Informatics Jan *ARRA = American Recovery and Reinvestment Act

Medical Informatics An Overview Saudi Board For Community Medicine

Cloud-Based Big Data Analytics in Bioinformatics

Biobanks: DNA and Research

ADVANCING MEASUREMENT OF PATIENT- CENTERED OUTCOMES AND QUALITY METRICS WITH ELECTRONIC HEALTH RECORDS

ENVISIONING THE FUTURE OF DERMATOLOGY THROUGH THE LENS OF MEDICAL INFORMATICS

I n t e r S y S t e m S W h I t e P a P e r F O R H E A L T H C A R E IT E X E C U T I V E S. In accountable care

The registry of the future: Leveraging EHR and patient data to drive better outcomes

Transcription:

UC s Research Enterprise, Emerging Trends / Opportunities Required Cyber Infrastructures Faculty Perspective 3/23/15 Health Science Cyberinfrastructure: Privacy Protection and High-Performance Computing with Big Human Subjects Data Lucila Ohno-Machado, MD, PhD University of California San Diego The analyses upon which this publication is based were performed under Contract Number HHSM-500-2009-00046C sponsored by the Center for Medicare and Medicaid Services, Department of Health and Human Services.

Decision Support Tools Guidelines, Alert & Reminders Patient Interaction Data Collection Tools Clinical Data Warehouse Communication Strategies Consumer Health Informatics Medical Education Predictive Modeling Evaluation Methods Data De- Identification Privacy Technology Data Integration Genomics Proteomics Sensors Data Analysis Statistics Machine Learning Data Structuring Natural Language Processing Data Modeling

slide from Dr. Xiaoqian Jiang 3 Medicine in the Era of Big Data Biomedical science is moving towards datadriven approaches 60+ genetic variants are adopted in clinical practice Sequencing is getting cheaper Family history data is consider as "the best genetic test available

Big Data Announcement 4

Types of Health Sciences Big Data Several terabytes of data are being generated every day and samples are being collected Genomes Images Body sensors Environmental sensors Electronic Health Records Personal Health Records Pharmacy Social media

Precision Medicine Workshop at NIH EHR Group recommendations Patient participation: Blue button rights (Synch for Science) Patient portals (MyChart): patientreported outcomes, research match UC could build a large portion of the 1M people cohort Cutting edge research attracts patients Strengths in cancer, genomics, analytics, robotics, imaging EHR data already harmonized to national standards Point of care contact via EHR, mhealth, sensors Health gaming, cloud, privacy Biorepositories

Learning Healthcare System Secure HIPAA cloud storage & compute for big data Predictive analytics, machine learning Natural language processing (extract data from clinical notes, specialty reports) Human-computer interaction expertise Consent management system, biorepository Benchmarking Integration at the point of care 7

Integrating Different Types of Data Genotype genome transcription RNA transcriptome translation Protein proteome Phenotype physical exam, imaging, monitoring systems Physiology tests Metabolites laboratory

9

Specific Information Can Support Clinical Decisions This patient has DPYD deficiency 5-FU/Capecitabine toxicity is increased prescribe Raltitrexed instead

Precision Medicine Genotype-phenotype correlations Sensor data, mhealth Clinical Applications Pharmacogenomics Targeted prevention

Health Insurance Portability & Accountability Act HIPAA Security Risk of disclosure Liability Privacy Risk of re-identification Informed consent Presidential Commission for the Study of Bioethical Issues. Privacy and progress in whole genome sequencing. http://www.bioethics.gov covers genome sequencing, exome sequencing, genomewide SNV analysis, and data from large scale genomic studies 12 HIPAA De-identified data removal of 18 identifiers, such as dates, biometrics, names, etc. expert certification of low risk of re-identification Limited data sets have dates

13

What is the problem? Imagine a public data set of genomes about a certain disease (e.g., Alzheimer s) Can you find out if your neighbor has Alzheimer s? Same problem happens with clinical data, which can be re-identified to a target patient Dates of visits Combination of patient characteristics 05/6/2015 Supported by the NIH Grant U54 HL108460 to 14

Our Needs Supported by the NIH Grant U54 HL108460 to the University of California, San Diego 15 Share access to data and computation Train the new generation of data scientists Innovative software, platform, and infrastructure Privacy protection Algorithms Tools Infrastructure Policies Federatio n Develop. Platform Research Privacy Knowledg Knowledge e & Tools & Tools CI Policie s Sharing Consent Data Services Hosting Aggreg. Sensors Clinical Genomic Exec. WWW Apps

Ohno-Machado L. To Share or Not To Share: That Is Not the Question. Science Translational Medicine, 2012 4(165) Supported by the NIH Grant U54 HL108460 to the University of California, San Diego Big Human Subjects Data Cyberinfrastructure commons differential privacy homomorphic encryption secure multiparty computation

Shifting control Patient Interface Consent Management System Healthcare Institutions Do I wish to disclose data D to U? Yes Trusted broker Patient I Sharing Look-up Data use agreements Study registry User U requests Data D on individual I I can check that U looked at my data D 05/6/2015

informed CONsent for Clinical data Use for Research iconcur Patient I Consent Management System Sharing Look-up Registry Connecting Software Trusted Entity Clinical Data Warehouse Concierge or Automated Services Query Results User U Electronic Health Record 18 Healthcare Institution

Ohno-Machado L. To Share or Not To Share: That Is Not the Question. Science Translational Medicine, 2012 4(165) Supported by the NIH Grant U54 HL108460 to the University of California, San Diego Big Human Subjects Data Cyberinfrastructure commons differential privacy homomorphic encryption secure multiparty computation

Demand Resources On-demand Virtualized Elastic Resilient Compute And Storage Technology AUTOMATE Compute D nodes Memory Disk storage Networking Powered by VMware compute request, direct upload & download of proprietary data, tool, recipe Data Tools Recipes upload & download data middleware and HIPAA security developed by idash Safe HIPAA-compliant Annotated Data deposit box Environment 05/6/2015 HIPAA and non-public data Powere d by MIDAS public data, tools, recipes

HIPAA-Private Cloud 3 computation tiers 3 storage tiers 10GbE throughout Full redundancy RSA Two Factor Auth. Remote data replication 800+ cores 7TB+ RAM 600TB+ storage More planned: 600+ cores 4TB+ RAM 200TB+ storage

Ohno-Machado L. To Share or Not To Share: That Is Not the Question. Science Translational Medicine, 2012 4(165) Supported by the NIH Grant U54 HL108460 to the University of California, San Diego Big Human Subjects Data Cyberinfrastructure commons differential privacy homomorphic encryption secure multiparty computation

UC Clinical Data Warehouses Funded by the UC Office of the President to the NIH-funded CTSAs UL1TR000100 Integration of Clinical Data Warehouses from 5 University of California Medical Centers and affiliated institutions (>12 million patients) Aggregate results Individual-level patient data accessible according to data use agreements and, if required, IRB approval Objectives Monitor patient safety Improve outcomes Promote research

UC-Research exchange Researcher wants data about population Registries, other DBs UCSD (Epic) UC Irvine (Allscripts) Community Partners UC Davis (Epic) UCSF (Epic) Data Matching Function: Map D onto data dictionaries, use Data Warehouses derived from EHR Request for Data on Cohort Results on Harmonized Data Even if data are stored in the same Electronic Health Record system, they need to be harmonized

Big Healthcare Data, Big Clinical Sequencing Data User requests data for Quality Improvement or Research Trusted Broker(s) Identity & Trust Management Policy enforcement AHRQ R01HS19913 Clinical Data Network Diverse Healthcare Entities in 3 different states (federal, state, private) How many patients over 65 are on Warfarin or Dabigatran? What are the major and minor bleeding rates for patients on these drugs, adjusted for co-morbidities? Which CYP2C9 variants are most influential? Kim K, et al. Development of a Privacy and Security Policy Framework for a Multi-State Comparative Effectiveness Research Network. Medical Care 2013 Jiang X et al. Privacy Technology to Support Data Sharing for Comparative Effectiveness Research: A Systematic Review. Medical Care 2013 Kim KK et al. Data Governance Requirements for Distributed Clinical Research Networks: Triangulating Perspectives of Diverse Stakeholders. JAMIA 2014

Privacy-preserving distributed computing User requests data for Quality Improvement or Research Trusted Broker(s) Identity & Trust Management Policy enforcement Security Entity Meeker et al. A System to Build Distributed Multivariate Models and Manage Disparate Data Sharing Policies: Implementation in the Scalable National Network for Effectiveness Research. JAMIA (accepted) Kim K et al. Comparison of Consumers Views on Electronic Data Sharing for Healthcare and Research. JAMIA (accepted) AHRQ R01HS19913 / EDM forum Scalable National Network for Comparative Effectiveness Research

Privacy-preserving distributed computing User requests data for Quality Improvement or Research Diverse Healthcare Entities in 3 different states (federal, state, private) Trusted Broker(s) Identity & Trust Management Policy enforcement Security Entity AHRQ R01HS19913 / EDM forum Scalable National Network for Comparative Effectiveness Research Wu Y et al. Grid Binary LOgistic REgression (GLORE): Building Shared Models Without Sharing Data. JAMIA, 2012 Wang S et al. EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning. J Biomed Inf 2013 Jiang W et al.. WebGLORE: A Webservice for Grid Logistic Regression. Bioinformatics 2014 Wu Y et al. Grid Multi-Category Response Logistic Models. BMC Med Inform Dec Making 2015

pscanner patient-centered SCAlable National Network for Effectiveness Research 21 M people: 5 UCs, National VA (VINCI resource), 3 Federally Qualified Health Systems in LA (USC), RAND 3 Cohorts: Kawasaki Disease, Congestive Heart Failure, Obesity New Partners (~9 M people) Cedars-Sinai USC Keck / LA Children s San Mateo Med Ctr Intermountain Healthcare U Colorado U Washington-led WWAMI practice-based network SAFTiNet U Vermont FQHCs TCC: The Children s Clinic VM: Virtual Machine SFGH: San Francisco General Hospital OMOP: Observational Medical Outcomes Partnership

Acknowledgements Vineet Bafna Tyler Bath Jane Burns Michele Day Robert El-Kareh Claudiu Farcas Olivier Harismendy Chun-nan Hsu Zhanglong Ji Jihoon Kim Eric Levy Kevin Patrick Elena Martinez Greg Talavera Staal Vinterbo Shuang Wang Wei Wei Zia Agha Lori Daniels Jason Doctor Scott Duvall Fern Fitzhenry Pietro Galassetti Michael Hogarth Kathy Kim Cleo Maehara Michael Matheny Daniella Meeker Jonathan Nebeker Dena Rifkin Carl Stepnowski Howard Taras Adriana Tremoulet Mary Whooley UC-ReX Nick Anderson Lattice Armstead Doug Bell Lisa Dahm Leslie Yuan iconcur Elizabeth Bell Dexter Friedman Rita Germann-Kurtz Xiaoqian Jiang Ian Komenaka Paulina Paul Joe Ramsdell Richard Schwab Amy Sitapati CTRi Tony Chen Daniel Clark Jim Graczik Mike Kalichman Antonios Koures Narimene Lekmine Carol Johnson Gary Firestein PhenDISCO Son Doan Hyeon-eui Kim Ko-wei Lin Hua Xu UCSD Health System Wolf Dillmann Cleo Maehara Maryann Martone Hua Xu Cui Tao Todd Johnson Peter Rose Ricky Taira NIH U54HL108460 UL1TR000100 UH3HL108785 R01LM011392 R21LM012060 K99HG008175 T15LM011271 PCORI CDRN-1306-04819 AHRQ R01HS019913 UCOP

2008-2009 2010-2011 2012-2013 2014-2015 Clinical Data Research Data Applications Integration Personnel Systems Active Directory Electronic Health Record System Epic & Clarity HIPAA Healthcare Clinical Data Clinical Data Warehouse for Research Query Tools UC-ReX Explorer Privacy Technology UCSF Davis Irvine UCLA USC/LAC Recruitment Consent tools Custom Apps pscanner PCORI CDRN Cedars Sinai Epic CDDS New modules San Mateo Other Systems PACS, lab, etc Clinical Research Data RedCAP Velos Other DBs Scalable Network (Distributed Analytics Tools) idash HIPAA/ FISMA OVERCAST idash, CTRI, School of Medicine VA LA Clinics External data (patient reported data) UCSD s Journey idash HIPAA SHADE Images, human genomes, etc iconcur Text Mining / NLP Cancer Genomics Analytical Tools UC-ReX idash Cloud PCORI contracts Accrual for Clinical Trials FISMA De-ID Tools SCANNER K22, K99 pscanner BRIGHT PhenDISCO NLM Training Grant biocaddie