Considering De-Identification? Legacy Data. Kymberly Lee 16-Jul-2015



Introduction

This presentation provides an overview of clinical data sharing, clinical data privacy, and clinical data transparency. It then discusses the nuances of, and experiences in, working with legacy data to complete the de-identification process so that data privacy is preserved while scientific importance is maintained.

Data Sharing, Privacy and Transparency

Clinical data sharing is the ability to share data.
Clinical data privacy encompasses privacy laws, HIPAA, etc.
Clinical data transparency determines the level of data de-identification needed to protect the patient's personal data.

Unique Approach to De-identification: Legacy Data versus SDTM/CDISC Data

Differences/Challenges: Legacy Data versus SDTM/CDISC
Legacy data are not required to be converted to be CDISC compliant. However, the de-identified data file should be structurally ready for SAS transport file conversion when completed.
One case to consider is where the raw data source has already been converted to be CDISC SDTM compliant while the analysis files are still legacy data.
This presentation demonstrates how to convert legacy data points only. Please refer to public documents on CDISC-compliant data.

Ensuring Consistency

Consistency/Traceability: shorten label/variable names.
Traceability: de-identified raw/source data and de-identified analysis data.
How to ensure traceability of the patient going from the de-identified raw/source data to the de-identified analysis data.
Ensure patient consistency between de-identified double-blind phase trial data and de-identified open-label or extension phase data.

Considerations to Simplify the Approach

If not using an industry tool, a viable approach is to create a separate de-identified linking file based on the unique patient/subject identification number, site/center number, demographic information, geographic location, and any other personal data points. Ensure patient/subject identification is completely re-randomized or scrambled.
Ways to achieve this:
1. New patient identifier
2. Site/center number (where applicable)
3. Age grouping
4. Race category re-randomized or scrambled
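The linking-file idea above can be sketched in Python. This is a minimal illustration under stated assumptions, not the presenter's actual process: the variable names (SUBJID, SITEID, AGE, DI_SUBJID, DI_SITEID, AGEGRP), the ten-year age bands, and the ID formats are all hypothetical. The 90+ top band mirrors the HIPAA Safe Harbor practice of aggregating ages over 89. Race category handling is not shown.

```python
import random

def age_group(age):
    """Map an exact age to a band (hypothetical 10-year bands, top-coded at 90+)."""
    if age >= 90:
        return "90+"
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"

def build_link_file(subjects, seed=None):
    """Return one link row per subject, mapping original IDs to scrambled ones.

    `subjects` is a list of dicts with the assumed keys SUBJID, SITEID, AGE.
    """
    rng = random.Random(seed)

    # Shuffle sequence numbers so the new IDs carry no trace of the
    # original ordering or numbering.
    new_subject_numbers = list(range(1, len(subjects) + 1))
    rng.shuffle(new_subject_numbers)

    # Every subject at the same original site gets the same new site code.
    sites = sorted({s["SITEID"] for s in subjects})
    new_site_numbers = list(range(1, len(sites) + 1))
    rng.shuffle(new_site_numbers)
    site_map = dict(zip(sites, new_site_numbers))

    link = []
    for subject, number in zip(subjects, new_subject_numbers):
        link.append({
            "SUBJID": subject["SUBJID"],  # original ID: stays in the link file only
            "DI_SUBJID": f"DI{number:05d}",
            "DI_SITEID": f"S{site_map[subject['SITEID']]:03d}",
            "AGEGRP": age_group(subject["AGE"]),
        })
    return link

# Made-up example subjects.
subjects = [
    {"SUBJID": "1001-0007", "SITEID": "1001", "AGE": 34},
    {"SUBJID": "1001-0012", "SITEID": "1001", "AGE": 91},
    {"SUBJID": "2040-0003", "SITEID": "2040", "AGE": 58},
]
link = build_link_file(subjects, seed=2015)
```

Because the link file still holds the original identifiers, it would normally be kept apart from anything that is delivered. The fixed seed is only there to make the example repeatable.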

Considerations, cont'd

Keeping patient-level information consistent across raw/source and analysis data files, as well as open/extension studies, is crucial.
Processes applied:
1. Merging datasets with de-identified patient-level data
2. Dropping original patient information
3. Renaming variables/labels
4. What to do with the linking file information
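A rough sketch of the first two processes (merge, then drop) is shown here in Python. It assumes a link file already exists as a list of rows; every variable name (SUBJID, SITEID, AGE, DI_SUBJID, DI_SITEID, AGEGRP, AETERM, TRT) is hypothetical. Renaming would be one more key-mapping step and is left out.

```python
def apply_link(records, link, id_var="SUBJID",
               drop_vars=("SITEID", "AGE"),
               carry_vars=("DI_SUBJID", "DI_SITEID", "AGEGRP")):
    """Merge link-file values onto each record and drop the original identifiers."""
    link_by_id = {row[id_var]: row for row in link}
    out = []
    for record in records:
        # A KeyError here means the subject is missing from the link file;
        # failing is safer than passing the record through unchanged.
        link_row = link_by_id[record[id_var]]
        kept = {k: v for k, v in record.items()
                if k != id_var and k not in drop_vars}
        for var in carry_vars:
            kept[var] = link_row[var]
        out.append(kept)
    return out

# Made-up link file and datasets.
link = [
    {"SUBJID": "1001-0007", "DI_SUBJID": "DI00002", "DI_SITEID": "S001", "AGEGRP": "30-39"},
    {"SUBJID": "2040-0003", "DI_SUBJID": "DI00001", "DI_SITEID": "S002", "AGEGRP": "50-59"},
]
raw_ae = [{"SUBJID": "1001-0007", "SITEID": "1001", "AETERM": "HEADACHE"}]
analysis_sl = [{"SUBJID": "1001-0007", "SITEID": "1001", "AGE": 34, "TRT": "A"}]

di_raw_ae = apply_link(raw_ae, link)
di_analysis_sl = apply_link(analysis_sl, link)
```

Running the same link file over both the raw and the analysis data is what keeps a patient's de-identified ID the same in each, which is the consistency point made on this slide.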

Domain Considerations: General Points

Date variables: why you should consider calculating a relative day variable within each domain (e.g., visit dates or assessment dates, time-to-event dates, adverse event dates, etc.).
Timing variables expected within the protocol design (e.g., labs, exposure, vitals, pharmacokinetics, etc.).
Different data or analysis domains require special attention to ensure the patient's identity is protected, especially when looking across data collections, medical coding, and types of assessments collected. Let's review a few.
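As an illustration of the relative day idea, the Python sketch below replaces calendar dates with day offsets from a per-patient reference date. The choice of reference date, the "no day 0" rule (borrowed from the SDTM study-day convention), the `_DY` suffix, and the variable names are assumptions made for the example, not something the slides specify.

```python
from datetime import date

def relative_day(event_date, reference_date):
    """Days from the reference date, with no day 0 (the reference date is day 1)."""
    delta = (event_date - reference_date).days
    return delta + 1 if delta >= 0 else delta

def replace_dates(records, reference_dates, date_vars, id_var="DI_SUBJID"):
    """Swap each calendar date for a relative day stored as <var>_DY."""
    out = []
    for record in records:
        reference = reference_dates[record[id_var]]
        # Copy everything except the calendar dates themselves.
        converted = {k: v for k, v in record.items() if k not in date_vars}
        for var in date_vars:
            value = record.get(var)
            converted[var + "_DY"] = (
                None if value is None else relative_day(value, reference)
            )
        out.append(converted)
    return out

# Made-up data; the reference date might be, for example, the first dose date.
reference_dates = {"DI00002": date(2015, 3, 10)}
adverse_events = [
    {"DI_SUBJID": "DI00002", "AETERM": "HEADACHE",
     "AESTDT": date(2015, 3, 12), "AEENDT": None},
    {"DI_SUBJID": "DI00002", "AETERM": "NAUSEA",
     "AESTDT": date(2015, 3, 8), "AEENDT": date(2015, 3, 10)},
]
di_events = replace_dates(adverse_events, reference_dates, ["AESTDT", "AEENDT"])
```

Intervals between events are preserved within a patient, so time-to-event analyses remain possible, while the actual calendar dates no longer appear in the delivered file.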

Data Considerations, cont'd: Special Attention Data Files

Adverse events, medical and disease history, concomitant medications/procedures, and prior and concomitant/subsequent therapy should be closely scrutinized.
What to do when older versions of medical dictionaries are used:
  Does the requester want to update the coding?
  Patient safety/scientific relevance
  Redaction
Laboratory and ECG data: which variables to de-identify and why?
Vital signs data (including weight and height): which variables and why? Should therapeutic area or analyses be considered?

Data Considerations, cont'd: Exposure Data and Other Analyses

Exposure data: does the data file contain other important patient-level identifying information? What variables should be considered? Should they be de-identified or removed?
Consider all therapeutic areas and data used for efficacy (e.g., tumor locations, genomic data, translational medicine, asthma equipment, cardiovascular digital equipment, ambulatory serial numbers, etc.).
Ensure that any data points that could reveal a patient's identity to a knowledgeable medical/technical employee are de-identified or removed.

Documentation

Upon completion of de-identifying the data, documentation is important for internal purposes as well as for the safety of the patient's personal data. The following is recommended:
1. Complete documentation explaining which variables were de-identified and to what level of de-identification (i.e., explain the process). See additional slides for options of recording.
2. Check data for conformance and traceability of variable/label names from raw/source data to analysis data.
3. Complete SAS transport files.
4. Provide to the requester as specified by the Data Sponsor.
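The conformance and traceability check in step 2 could be partly automated. The Python sketch below is one possible shape for such a check, assuming the datasets are lists of records and using hypothetical variable names (DI_SUBJID for the de-identified ID; SUBJID, SITEID, BIRTHDT as original variables that must not survive). It looks for two things only: original identifiers left behind, and analysis subjects that cannot be traced back to the raw data.

```python
def conformance_issues(raw, analysis, id_var="DI_SUBJID",
                       forbidden_vars=("SUBJID", "SITEID", "BIRTHDT")):
    """Return a list of human-readable issues; an empty list means none were found."""
    issues = []
    for name, records in (("raw", raw), ("analysis", analysis)):
        for index, record in enumerate(records):
            # Any original identifier still present is a leak.
            leaked = sorted(set(record) & set(forbidden_vars))
            if leaked:
                issues.append(
                    f"{name} record {index}: original variables present: "
                    f"{', '.join(leaked)}"
                )
            if id_var not in record:
                issues.append(f"{name} record {index}: missing {id_var}")

    # Traceability: every analysis subject should also exist in the raw data.
    raw_ids = {r[id_var] for r in raw if id_var in r}
    analysis_ids = {r[id_var] for r in analysis if id_var in r}
    for subject in sorted(analysis_ids - raw_ids):
        issues.append(f"analysis subject {subject} not found in raw data")
    return issues

# Made-up de-identified datasets with two deliberate problems.
raw = [{"DI_SUBJID": "DI00001", "AETERM": "HEADACHE"}]
analysis = [
    {"DI_SUBJID": "DI00001", "AGEGRP": "50-59"},
    {"DI_SUBJID": "DI00009", "AGEGRP": "30-39", "SITEID": "1001"},
]
issues = conformance_issues(raw, analysis)
```

The resulting list could feed the Data Issues section of the reviewer's guide. A check like this supplements a manual review and does not replace it, since it cannot spot identifying content inside free text or coded values.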

De-identification Reviewer's Guide: Contents

Introduction
  1.1 Purpose
  Acronyms
  Current Process Name, Version and Compliances
Protocol Description
  2.1 Protocol Number and Title
  2.2 Data Files Included in Individual Patient-Level Data Delivery
Subject Data Description
  3.1 Overview
  3.2 De-Identified Source Data Domains
    3.2.1 Data set name: Adverse Events
    3.2.2 Data set name: Demographics
    3.2.3 Data set name: Concomitant Medications
    3.2.4 Data set name: label of data set name
    ** Continue recording for all applicable data files **
  3.3 De-Identified Analysis Data Domains
    3.3.1 Data set name: Subject Level Data
    3.3.2 CM: Concomitant Medications
    3.3.3 Data set name: label of data set name
    3.3.4 Data set name: label of data set name
    3.3.5 Continue the process until all analysis files are recorded.
Data Conformance Summary
  4.1 Data Issues
  4.2 Data Issues Summary

Introduction

1.1 Purpose
This document provides context for the sole purpose of de-identification of Individual Patient-Level Data (IPLD), as agreed upon and specified by the Data Sponsor in the current process governance and the Clinical Data Sharing Agreement. In addition, this document summarizes the data points included and how the IPLD were de-identified in conformance with the original data findings and the written agreement.

Acronyms
Acronym    Translation
IPLD       Individual Patient-Level Data
HIPAA      Health Insurance Portability and Accountability Act

Current Process Name, Version and Compliances
Standard or Dictionary    Versions Used
Current process name      Version Final 0.0 / Month/Year

Protocol Description

2.1 Protocol Number and Title
Protocol Number:
Protocol Title:
Protocol Versions/Date:

2.2 Data Files Included in Individual Patient-Level Data Delivery
Raw/Oracle datasets? Yes
SDTM datasets? No
Analysis/ADaM datasets? Yes

Subject Data Description

3.1 Overview
Date of the Clinical Data Sharing Agreement: Date: DD-MM-YYYY
Were data de-identified as requested based on the Data Sharing Agreement? Yes
Specify the date of the original data files being used for this request: Date: DD-MM-YYYY
Were the Raw/Oracle/SDTM datasets used as sources for the analysis datasets? Yes
In what data format were the original data stored?
In what data format will the de-identified datasets be delivered?
How were the de-identified datasets transferred?


Conclusion

In conclusion, the following was covered for de-identifying legacy data:
1. Review the request and the therapeutic area requested.
2. Determine whether raw/source and/or analysis files will be needed.
3. Decide the domains requested and the types of data requested.
4. Review and ensure consistency and traceability of variable names and lengths within the data request.
5. Complete the link file so that patient-level de-identified data points can carry from raw/source data to analysis data and/or open-label extension studies.
6. Review the protocol and analysis timing variables.
7. Check all treatment assignments and randomizations.
8. Check whether the data can be re-identified by any medical employee/officer.
9. Check all available data points for personal data: de-identify or redact all such data (including comments).
10. Document all processes applied in the format provided by the Data Sponsor.
11. Validate the process thoroughly before delivery.

Contact Information and Presentation Disclosure

All contents of this presentation are the sole expressions and experiences of the presenter.

Contact information: Kymberly Lee, AstraZeneca, Gaithersburg Campus, (301) 398-0715

Confidentiality Notice
This file is private and may contain confidential and proprietary information. If you have received this file in error, please notify us and remove it from your system and note that you must not copy, distribute or take any action in reliance on it. Any unauthorized use or disclosure of the contents of this file is not permitted and may be unlawful.
AstraZeneca PLC, 2 Kingdom Street, London, W2 6BD, UK, T: +44 (0)20 7604 8000, F: +44 (0)20 7604 8151, www.astrazeneca.com