Balancing Big Data for Security, Collaboration and Performance




Sai Balu
Lineberger Cancer Center, UNC Chapel Hill
Oct 14, 2014

About UNC
Oldest Public University - 1793
Top 5 Public University. 46th worldwide
Clinical Translational Science Award - NC TraCS Institute
Carolina Data Warehouse - Hospital/Research
School of Medicine - 6th in NIH funding

About Lineberger
NCI Designated Comprehensive Cancer Center
Largest Research Entity at UNC - $190 million/year in external grants
300 Scientists, 1200 Staff across UNC Campus
250 Clinical Trials offered
NC Cancer Hospital: Clinical Home
University Cancer Research Fund - $25 million in 2007 and $42 million/year in 2014

About UNC Hospitals
Not-for-Profit Integrated Health Care
Teaching Mission
State of the Art Patient Care
EMR and Cancer Registry
WebCIS
Epic

About RENCI
A Leader in Cyber Technologies
Scientific Discoveries & Business Innovations
Medicine & Genomics
Environmental Sciences
Data Management Technologies: iRODS

Bioinformatics Core at Lineberger
Infrastructure for Data Management and Data Analysis
Integrated Data Analysis - Genomic & Clinical & public annotations
Supporting Instruments

Big Data
Velocity: the rate of data generation and the rate of change
Volume: the size of the data
Variety: the most underrepresented of the Vs, but not today!

TCGA
The Cancer Genome Atlas Project
Study Molecular Basis for Cancer
20+ tumor types studied
Expression, Copy Number, DNA/RNA, miRNA
UNC is the Gene Expression Center - Dr. Chuck Perou
10K samples processed

TCGA Analysis
Tumor Working Groups & Data Freezes
Exposure to Variety: types of data, security, sources, performance, sharing, analysis

UNC Cancer Survivorship
Goal: Enroll 10K Patients!
Collect biospecimens and medical records, and follow up with questionnaires

UNCseq
Genetic profiling of cancer patient specimens
Support Treatment Decisions
Target ~200 genes of potential clinical utility
All known druggable targets
Genes of interest confirmed by experts

Big Data - Variety!
1. Clinic Schedules
2. ICD codes
3. Consent Status
4. Tissue Banking and Annotations
5. Questionnaires - 2 different languages
6. EMR - Pathology as an example
7. Public data - Clinical Trials, Oncotator, Death Indexes
8. Ancillary Studies
9. Workflows
10. Metadata
11. Analysis - exome, survival, spatial
12. Instruments - robots, sequencing - Sequenom, SNP arrays

Big Data - Variety!
Variety of Sources: Epic, SAS-Health Outcome Analytics, Death Indexes
Variety of Security: Public Data to CLIA to FISMA compliance
Variety of Standards: +1 standards

SAS - HOA
Private partnership to create Cancer Data Mart
Patient Counts - 155,078
Pathology Report Types - 33
Pathology Report Datapoints - 21,347,023
Lab Tests - 387,495
Lab Test Observations - 34,168,986

Security, Collaboration and Performance
Balancing is an art
Institutional Policies
Develop Trust
Develop standard verification processes
Develop Training materials

Security
HIPAA Sensitive Data
FISMA Moderate Claims Data
Secure Medical Workspaces
Secure Cluster Computing

Performance
Sustained 15 Gb/s over the network for many hours - largest network traffic seen within UNC campus
Transferring to Data Coordination Centers - BitTorrent-style software
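To give a sense of scale for the sustained rate on the Performance slide, the arithmetic can be sketched as below. This is only an illustration: it assumes the figure means 15 gigabits per second and uses decimal units (1 TB = 10^12 bytes). The slide says only "many hours", so the durations in the loop are hypothetical, as is the function name.

```python
def terabytes_transferred(rate_gbps: float, hours: float) -> float:
    """Decimal terabytes moved at a sustained rate given in gigabits per second."""
    bits = rate_gbps * 1e9 * hours * 3600  # total bits over the period
    return bits / 8 / 1e12                 # bits -> bytes -> terabytes


# Hypothetical durations; the talk does not state how long the transfer ran.
for hours in (1, 6, 12):
    tb = terabytes_transferred(15, hours)
    print(f"{hours:>2} h at 15 Gb/s is about {tb:.2f} TB")
```

Under these assumptions, one hour at the sustained rate is about 6.75 TB, so a transfer lasting many hours would move tens of terabytes.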

Collaboration
Through Data Sharing
Without Duplication, with different ACLs
Bring Compute to Data
iRODS - A Possible Solution
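The idea of sharing without duplication can be sketched as one stored copy of a data object carrying its own access control list, so that different groups see the same bytes at different permission levels. This is a minimal sketch of the concept, not the iRODS API. The class names, group names, access levels and path are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical access levels, ordered from weakest to strongest."""
    NONE = 0
    READ = 1
    WRITE = 2
    OWN = 3


@dataclass
class DataObject:
    """A single stored copy of a dataset plus its per-group ACL."""
    path: str
    acl: dict = field(default_factory=dict)  # group name -> Access

    def grant(self, group: str, level: Access) -> None:
        self.acl[group] = level

    def allows(self, groups: set, needed: Access) -> bool:
        # The caller's effective level is the strongest one among their groups.
        best = max((self.acl.get(g, Access.NONE) for g in groups),
                   default=Access.NONE)
        return best >= needed


# One copy of the file, shared with two groups at different levels.
bam = DataObject("/hypothetical/project/sample_001.bam")
bam.grant("sequencing_core", Access.OWN)
bam.grant("analysts", Access.READ)

print(bam.allows({"analysts"}, Access.READ))          # analysts can read
print(bam.allows({"analysts"}, Access.WRITE))         # but cannot modify
print(bam.allows({"external_collab"}, Access.READ))   # unknown group: no access
```

Keeping the ACL on the object, rather than making a copy per group, means that changing a collaborator's access is a metadata change and involves no data transfer.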

Data Governance
Identify Stewards
Identify Custodians
Identify Users
Develop Policies
Create Workgroups

Acknowledgement
UNCseq Team
Health Registry Team
TCGA at UNC Team
DDN
Thank you!