SDTM Validation: Methodologies and Tools


Bay Area CDISC Implementation Network Meeting
Friday, April 30th, 2010
Dan Shiu

Disclaimer
The ideas and examples presented here do NOT imply:
- They have been or will be implemented at Amgen
- They have not been or will not be implemented at Amgen
- Amgen agrees or disagrees with them
The ideas and examples presented here DO represent:
- My personal views
- My sweat and blood

Regulations, Guidance, and Expectations on SDTM Validation
- FDA 21 CFR Part 11: applies to computer systems (e.g. Base SAS) but not to the use/output of those systems (e.g. SAS programs/datasets)
- FDA Guidance for Industry, Study Data Specifications: for electronic submissions, data tabulation datasets should follow the SDTMIG
- FDA website, SDTM Validation Specifications: validation checks from FDA software tools (Janus)
- Data submitted to a regulatory agency is expected to be complete and accurate, regardless of the regulatory requirement

SDTM Validation Categories
SDTM Mapping Validation
- Raw Data -> Mapping Specifications/aCRF -> Programming -> SDTM Data
- Verify raw data is CORRECTLY and TRUTHFULLY converted to SDTM data
SDTM Compliance Checks
- Rules have been developed to ensure the software used by FDA (WebSDM by PhaseForward) can check and load the submitted SDTM data into their data warehouse (Janus)
- Each rule carries a degree of severity for non-compliance; in the worst case, non-compliance may result in refusal to file

SDTM Mapping Validation vs. Compliance Checks
SDTM Mapping Validation
- The QS domain is not intended for use in submitting diaries capturing routine study data
SDTM Compliance Checks
- Measurement, Test, or Examination values must have a consistent standard unit value (--STRESU) across all records in EG, LB, QS, VS
- Start Date/Time of Observation (--STDTC) must be less than or equal to End Date/Time of Observation (--ENDTC)
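The two data-level rules quoted above could be sketched roughly as follows. This is only an illustration, not how WebSDM or Janus implement them: the function names are hypothetical, records are assumed to be plain dictionaries, date/time values are assumed to be ISO 8601 strings without time zone offsets, partial dates are compared only at the precision both values share, and units are grouped by test code (the slide itself says only "across all records").

```python
from collections import defaultdict


def check_start_before_end(records, start_var, end_var):
    """Return records whose start date/time is later than the end date/time."""
    flagged = []
    for rec in records:
        start, end = rec.get(start_var, ""), rec.get(end_var, "")
        if not start or not end:
            continue  # nothing to compare when either value is missing
        # Assumption: compare only at the precision both values share,
        # so "2010-04" vs "2010-04-15" is compared as "2010-04" vs "2010-04".
        n = min(len(start), len(end))
        if start[:n] > end[:n]:
            flagged.append(rec)
    return flagged


def check_consistent_units(records, test_var, unit_var):
    """Return test codes that carry more than one non-missing standard unit."""
    units = defaultdict(set)
    for rec in records:
        if rec.get(unit_var):
            units[rec[test_var]].add(rec[unit_var])
    return {test: sorted(found) for test, found in units.items() if len(found) > 1}


# Made-up example records, for illustration only.
ae = [
    {"USUBJID": "001", "AESTDTC": "2010-04-30", "AEENDTC": "2010-04-28"},
    {"USUBJID": "002", "AESTDTC": "2010-04", "AEENDTC": "2010-04-15"},
]
vs = [
    {"VSTESTCD": "WEIGHT", "VSSTRESU": "kg"},
    {"VSTESTCD": "WEIGHT", "VSSTRESU": "LB"},
    {"VSTESTCD": "HEIGHT", "VSSTRESU": "cm"},
]
late_starts = check_start_before_end(ae, "AESTDTC", "AEENDTC")
mixed_units = check_consistent_units(vs, "VSTESTCD", "VSSTRESU")
```

Here only subject 001 is flagged for the date rule, and WEIGHT is reported as carrying two different standard units.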

SDTM Validation Methodologies
SDTM Mapping Validation
- Full Independent-programming
- Risk-based QC Process
- Characteristics-based QC Process
SDTM Compliance Checks
- WebSDM (v1.5/v2.6/v3.0)
- Janus (v1.0 Draft)
- Other SDTMIG custom checks

Full Independent-Programming
- Create SDTM mapping specifications/aCRFs
- Programmer creates production SDTM datasets based on the mapping specifications/aCRFs
- QC role creates QC SDTM datasets based on the same mapping specifications/aCRFs
- PROC COMPARE production vs. QC SDTM datasets
- Resolve discrepancies until production SDTM matches QC SDTM
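The comparison step is done with PROC COMPARE in the process described above. For readers outside SAS, the idea can be sketched as a keyed record-by-record comparison. This is a rough Python analogue, not a replacement for PROC COMPARE; the function name and the list-of-dictionaries dataset layout are assumptions made for illustration.

```python
def compare_datasets(production, qc, keys):
    """List differences between two datasets given as lists of dicts.

    Each difference is a tuple: (key values, variable, production value, QC value).
    A record present on one side only is reported with variable "<record>".
    """
    def index(rows):
        return {tuple(row[k] for k in keys): row for row in rows}

    prod_idx, qc_idx = index(production), index(qc)
    diffs = []
    for key in sorted(set(prod_idx) | set(qc_idx)):
        prod_row, qc_row = prod_idx.get(key), qc_idx.get(key)
        if prod_row is None or qc_row is None:
            diffs.append((key, "<record>", prod_row is not None, qc_row is not None))
            continue
        for var in sorted(set(prod_row) | set(qc_row)):
            if prod_row.get(var) != qc_row.get(var):
                diffs.append((key, var, prod_row.get(var), qc_row.get(var)))
    return diffs


# Made-up example data, for illustration only.
prod_dm = [
    {"USUBJID": "001", "AGE": 34, "SEX": "M"},
    {"USUBJID": "002", "AGE": 45, "SEX": "F"},
]
qc_dm = [
    {"USUBJID": "001", "AGE": 34, "SEX": "M"},
    {"USUBJID": "002", "AGE": 54, "SEX": "F"},
    {"USUBJID": "003", "AGE": 61, "SEX": "F"},
]
discrepancies = compare_datasets(prod_dm, qc_dm, keys=["USUBJID"])
```

An empty result corresponds to the "production matches QC" end state; anything else goes back to the discrepancy-resolution step.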

Issues with Full Independent-Programming
- Result is still dependent and biased
- Inconsistent QC process across products/studies/milestones
- QC not based on risk: more time is spent on less important/risky issues
- Double resources: programmers, code, datasets, documentation
- Inefficiency: delayed deliverables

Risk-based QC
- Not all uses of SDTM data are equally important
- Not all programming steps are equally error-prone
- Align QC efforts with the intended use of SDTM as well as the programming steps used to produce the data
- Spend most of your QC resources on data with the greatest business/quality risk!

Risk-based QC Concept

Risk Assessment Examples: Complexity
Programming Complexity
Low
- No pooling or merging of data
- No calculations or derivations
- Basic data steps and sorting
Medium
- Simple data merges
- Simple pre-processing of data, sub-setting, where/if clauses, retains, arrays, transposing
- Steps involving validated/standard macros
High
- Complex merging of data across various source data
- Complex derivation and calculation of data

Risk Assessment Examples: Intended Use
Intended Use of SDTM Data
Low
- Internal use only
- Not to be used for major business decisions
Medium
- Data/safety review
- Non-endpoint data
High
- Regulatory submission
- Primary analysis/final CSR
- Endpoint safety and efficacy data

Risk-based QC Method Examples
Method (description) | Responsibility | Time Needed
- Log Review (use automated log checking utility to detect potential errors) | Programmer, QC Role | Short
- Code Review (line-by-line review of code and log) | QC Role, designated group | Medium
- Requirements/Specifications Review (comparison of SDTM data with specifications/aCRF) | Programmer, QC Role, Statistician | Medium
- Spot Check Review (ad hoc programming/visual checks on SDTM/raw data) | QC Role, Statistician | Medium
- Independent Programming (programming to produce matching datasets) | QC Role | Long
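The "automated log checking utility" mentioned for Log Review could look something like the sketch below. It is a minimal illustration, not the utility used by the author: the pattern list is an assumed starting point covering common SAS log messages and is not complete, and the function name is hypothetical.

```python
import re

# Assumption: an illustrative starting list of SAS log messages worth flagging.
LOG_PATTERNS = [
    re.compile(r"^ERROR(\s+\d+-\d+)?:"),
    re.compile(r"^WARNING(\s+\d+-\d+)?:"),
    re.compile(r"^NOTE:.*\buninitialized\b"),
    re.compile(r"^NOTE:.*repeats of BY values"),
    re.compile(r"^NOTE:.*converted to (numeric|character) values"),
]


def scan_log(lines):
    """Return (line number, text) for every log line matching a pattern."""
    return [
        (lineno, line.rstrip())
        for lineno, line in enumerate(lines, start=1)
        if any(pattern.search(line) for pattern in LOG_PATTERNS)
    ]


# Made-up log excerpt, for illustration only.
sample_log = [
    "NOTE: There were 120 observations read from the data set RAW.DEMOG.",
    "NOTE: Variable age_raw is uninitialized.",
    "WARNING: Multiple lengths were specified for the variable USUBJID.",
    "NOTE: MERGE statement has more than one data set with repeats of BY values.",
    "ERROR 22-322: Syntax error, expecting one of the following: ;.",
]
findings = scan_log(sample_log)
```

In this excerpt every line except the first is flagged; the first is an ordinary informational note.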

Risk Matrix Examples
Rows: Complexity of Program. Columns: Intended Use (Business Risk/Impact of Error).
High complexity
- Low intended use: 1. Log Review, 2. Requirements/Specifications Review, 3. Code Review
- Medium intended use: 1. Log Review, 2. Requirements/Specifications Review, 3. Spot Check Review, 4. Code Review
- High intended use: 1. Log Review, 2. Requirements/Specifications Review, 3. Independent Programming
Medium complexity
- Low intended use: 1. Log Review, 2. Requirements/Specifications Review
- Medium intended use: 1. Log Review, 2. Requirements/Specifications Review, 3. Spot Check Review
- High intended use: 1. Log Review, 2. Requirements/Specifications Review, 3. Spot Check Review, 4. Code Review
Low complexity
- Low intended use: 1. Log Review, 2. Requirements/Specifications Review
- Medium intended use: 1. Log Review, 2. Requirements/Specifications Review, 3. Spot Check Review
- High intended use: 1. Log Review, 2. Requirements/Specifications Review, 3. Spot Check Review
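A risk matrix like this lends itself to a simple lookup from (complexity, intended use) to a list of QC methods. The sketch below is illustrative only: the cell contents follow one reading of the example matrix on this slide, and the names `RISK_MATRIX` and `qc_methods` are hypothetical.

```python
LOG = "Log Review"
SPEC = "Requirements/Specifications Review"
SPOT = "Spot Check Review"
CODE = "Code Review"
INDEP = "Independent Programming"

# Keys are (complexity of program, intended use).
# Assumption: cells follow one reading of the example matrix on the slide.
RISK_MATRIX = {
    ("High", "Low"): [LOG, SPEC, CODE],
    ("High", "Medium"): [LOG, SPEC, SPOT, CODE],
    ("High", "High"): [LOG, SPEC, INDEP],
    ("Medium", "Low"): [LOG, SPEC],
    ("Medium", "Medium"): [LOG, SPEC, SPOT],
    ("Medium", "High"): [LOG, SPEC, SPOT, CODE],
    ("Low", "Low"): [LOG, SPEC],
    ("Low", "Medium"): [LOG, SPEC, SPOT],
    ("Low", "High"): [LOG, SPEC, SPOT],
}


def qc_methods(complexity, intended_use):
    """Return the ordered QC methods for a program's risk assessment."""
    return RISK_MATRIX[(complexity, intended_use)]


plan = qc_methods("High", "High")
```

Keeping the matrix as data rather than as branching logic makes it easy to review and to adjust per organization.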

Characteristics-based QC
SDTM Mapping Validation: Raw Data -> Mapping Specifications/aCRF -> Programming -> SDTM Data
- Full Independent-programming
- Risk-based QC
Are these the best ways?

Characteristics-based QC Concept
"Grandma, what big eyes you have! Grandma, what big ears you have! Grandma, what big teeth you have!"
- Each data element has characteristics
- Characteristics describe a data element as a whole
- If all characteristics match, data elements match
- If all data elements match, raw data is CORRECTLY and TRUTHFULLY converted to SDTM

Data Element Examples
Data Element: a group of data, regardless of datasets, variables, records, or attributes, that together represent a precise meaning or semantics
CDISC SHARE Project: "The vision for CDISC SHARE is to build a global, accessible electronic library, which through advanced technology, enables precise and standardized data element definitions that can be used in applications and studies to improve biomedical research and its link with healthcare."
- Age Element: USUBJID, AGE
- Race Element: USUBJID, RACE, SUPPDM.QNAM="RACEOTH", QVAL
- AE Term Element: USUBJID, AETERM, AEDECOD
- SF36 Score Element: USUBJID, QSCAT="SF36", QSORRES, QSSTRESC, QSSTRESN, QSSTAT, QSREASND

Data Element Characteristics
Numeric characteristics
- Descriptive statistics: can be generated from PROC SUMMARY, PROC MEANS, PROC UNIVARIATE
- N, NMISS, MIN, MAX, MEAN, MODE, SUM, RANGE, VAR, STD, STDMEAN
- Coefficient of Variation, Skewness, Kurtosis
Character characteristics
- FREQ, NOBS, min/max length
- Checksum: e.g. odd parity bit (a simplified algorithm)
  - "Pain" = 01010000 01100001 01101001 01101110
  - Count the number of 1s: 14
  - To keep odd parity, set the parity bit to 1 (14 + 1 = 15 is odd): checksum = 1
- If all checksums match -> all character values match
- If statistics of all checksums match -> all character values match
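The odd parity example for "Pain" can be reproduced with a few lines of code. This is a small sketch of the simplified algorithm described on the slide, assuming ASCII encoding; the function names are hypothetical.

```python
def bit_string(text, encoding="ascii"):
    """Return the value as a string of bits, 8 bits per byte."""
    return "".join(f"{byte:08b}" for byte in text.encode(encoding))


def odd_parity_bit(text, encoding="ascii"):
    """Return the bit that makes the total number of 1s odd."""
    ones = bit_string(text, encoding).count("1")
    return 1 if ones % 2 == 0 else 0


bits = bit_string("Pain")
ones = bits.count("1")
checksum = odd_parity_bit("Pain")
```

For "Pain" this gives the 32-bit string shown on the slide, 14 ones, and a checksum of 1. A single parity bit only detects an odd number of changed bits, so an automated implementation would likely pair it with other characteristics (FREQ, lengths) or use a stronger checksum.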

Characteristics-based QC Examples
QC on Age Element
- From raw data: demog.age_raw
- From SDTM: DM.AGE
- Compare: N, MIN, MAX, MEAN, MODE, SUM, STD
QC on AE Term Element
- From raw data: adverse.subjectid, adverse.aevt, adverse.aept
- From SDTM: AE.USUBJID, AE.AETERM, AE.AEDECOD
- Compare: FREQ, NOBS, min/max length, checksum
QC on SF36 Score Element
- From raw data: sf36.subjectid, sf36.score_raw, sf36.cmt
- From SDTM: QS.USUBJID, QS.QSCAT="SF36", QS.QSORRES, QS.QSSTRESC, QS.QSSTRESN, QS.QSSTAT, QS.QSREASND
- Compare numeric: N, NMISS, MIN, MAX, MEAN, MODE, SUM, STD, RANGE
- Compare character: FREQ, NOBS, min/max length
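The Age element comparison could be sketched as below. The slides generate these statistics with SAS procedures; this Python version is only an illustration of the idea, with hypothetical function names, missing values represented as `None`, and the smallest value taken as MODE when several values tie.

```python
import statistics


def numeric_characteristics(values):
    """Summarize a numeric data element; None counts as missing."""
    present = [v for v in values if v is not None]
    return {
        "N": len(present),
        "NMISS": len(values) - len(present),
        "MIN": min(present, default=None),
        "MAX": max(present, default=None),
        "MEAN": statistics.mean(present) if present else None,
        # Assumption: with tied modes, report the smallest one.
        "MODE": min(statistics.multimode(present)) if present else None,
        "SUM": sum(present) if present else None,
        "STD": statistics.stdev(present) if len(present) > 1 else None,
    }


def compare_characteristics(raw_values, sdtm_values, tol=1e-9):
    """Return {statistic: (raw, sdtm)} for every characteristic that differs."""
    raw = numeric_characteristics(raw_values)
    sdtm = numeric_characteristics(sdtm_values)
    mismatches = {}
    for name, a in raw.items():
        b = sdtm[name]
        same = (a == b) if (a is None or b is None) else abs(a - b) <= tol
        if not same:
            mismatches[name] = (a, b)
    return mismatches


# Made-up values standing in for demog.age_raw and DM.AGE.
age_raw = [34, 45, 45, 61, None]
dm_age_ok = [45, 34, 61, 45, None]   # same values, different record order
dm_age_bad = [34, 45, 45, 16, None]  # 61 mistyped as 16
clean = compare_characteristics(age_raw, dm_age_ok)
problems = compare_characteristics(age_raw, dm_age_bad)
```

The first comparison returns no mismatches even though the record order differs; the second flags MIN, MAX, MEAN, SUM and STD while N and NMISS still agree.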

Characteristics-based QC Benefits
- Data element characteristics exist as soon as data is created/refreshed
- Characteristics-based QC extends risk-based QC in a more consistent way
- Characteristics-based QC can be applied to all end-to-end data conversion processes (e.g. raw to SDTM, SDTM to ADaM)
- Characteristics-based QC can be automated!

SDTM Compliance Checks Raw Data SDTM Mapping Validation

SDTM Validation and Loading at FDA
[Flow diagram]
- Sponsor: SDTM, define.xml, eCTD
- Electronic Submission
- FDA Electronic Document Room
- Data Validation and Loading: WebSDM Checks (pass), Janus Checks (pass)
- JANUS Data Repository
- FDA Review Tools: JMP, J-Review, WebSDM, etc.
- Review
- Communication / Refuse to File

WebSDM v3.0 Checks
- 154 rules based on SDTMIG 3.1.2
- Checks apply to data (classes, domains, variables, values) and metadata (define.xml, SDTM Terminology.xls)
- Severity (Low, Medium, High) is only an indicator of potential problems or anomalies in the data. There is no direct correlation between a severity value and an FDA decision about whether the data is acceptable for review or not.

Janus v1.0 (Draft) Checks
- 109 rules based on SDTMIG 3.1.1
- Overlap with WebSDM rules but with a different definition of the severity levels
Severity | Description
- High | The error is serious and will prevent the study data from being loaded successfully into the Janus repository. The SDTM study will not be loaded into the Janus repository.
- Medium | The error may impact the reviewability of the submission, but will not have an impact on loading the study data into the Janus repository. The SDTM study will be loaded into the Janus repository.
- Low | The error may or may not impact the reviewability or the integrity of the submission but will not have an impact on loading the study data into the Janus repository. The SDTM study will be loaded into the Janus repository.

WebSDM vs. Janus Severity WebSDM and Janus may assign different severity levels for the same rule

Custom SDTMIG Compliance Checks
WebSDM/Janus checks cannot cover all of the explicit/implicit rules in SDTMIG:
- 8/40/200 character limitation check
- USUBJID value must be unique for each subject across all trials in the submission
- IDVAR (variable), IDVARVAL (record) reference check against parent domain for CO
- ISO 8601 format check on Duration, Elapsed Time, and Interval values
- And many more
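Two of the custom checks listed above could be sketched as follows. This is a minimal illustration under stated assumptions: the 8/40/200 limits are read as variable name, variable label, and character value lengths; the duration pattern is a simplified subset of ISO 8601 (it does not cover intervals or decimal fractions on components other than seconds); and all function names and the metadata layout are hypothetical.

```python
import re

# Simplified ISO 8601 duration pattern (assumption: covers PnYnMnDTnHnMnS and PnW,
# optional leading minus sign, decimals on seconds only).
ISO_DURATION = re.compile(
    r"^-?P(?=\d|T\d)(\d+Y)?(\d+M)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
    r"|^-?P\d+W$"
)


def is_iso_duration(value):
    """Return True when the value matches the simplified duration pattern."""
    return bool(ISO_DURATION.match(value))


def check_lengths(variables):
    """Check variable name (8), label (40), and character value (200) lengths.

    `variables` is a list of dicts with keys "name", "label", and "values".
    """
    issues = []
    for var in variables:
        if len(var["name"]) > 8:
            issues.append((var["name"], "name longer than 8 characters"))
        if len(var["label"]) > 40:
            issues.append((var["name"], "label longer than 40 characters"))
        if any(isinstance(v, str) and len(v) > 200 for v in var["values"]):
            issues.append((var["name"], "value longer than 200 characters"))
    return issues


# Made-up metadata and values, for illustration only.
variables = [
    {"name": "AETERM", "label": "Reported Term for the Adverse Event", "values": ["Headache"]},
    {"name": "AECOMMENT", "label": "Comment", "values": ["x" * 201]},
]
length_issues = check_lengths(variables)
duration_results = {v: is_iso_duration(v) for v in ["P2Y", "PT30M", "-PT15M", "P2W", "P1DT", "2 days"]}
```

In this example AECOMMENT is flagged twice (a nine-character name and a 201-character value), and the last two duration values are rejected.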

Tools for SDTM Compliance Checks
Proprietary software:
- WebSDM from Phase Forward, etc.
Free software:
- OpenCDISC Validator
  - Direct download and installation on PC
  - Graphic user interface
  - Reporting in Excel, CSV, and HTML
- SAS Clinical Standards Toolkit
  - PC/UNIX installation support from IT
  - Interactive/Batch SAS programming interface
  - Reporting functions not provided but can be custom-built

SAS CST is a framework including:
- Directory structure
- Metadata: datasets, format catalog, XML, Excel
- Data: datasets, format catalog, XML, Excel
- Source code: SAS programs/macros

Tools Comparison
Installation
- OpenCDISC Validator: user direct download; PC/USB flash drive, tweak on UNIX
- SAS Clinical Standards Toolkit: IT/SAS administrator support; PC (9.1.3/9.2) and UNIX (9.2)
Interface
- OpenCDISC Validator: graphic user interface
- SAS Clinical Standards Toolkit: interactive/batch SAS programming interface
Supported Standards / Features
- OpenCDISC Validator: validate SDTMIG 3.1.1/3.1.2 based on WebSDM v3/Janus v1 draft; additional custom checks; CDISC-NCI Terminology; generate/validate define.xml based on CRTDDS v1
- SAS Clinical Standards Toolkit: validate SDTMIG 3.1.1 based on WebSDM v2.6/Janus v1 draft; additional custom checks; CDISC-NCI Terminology; generate/validate define.xml based on CRTDDS v1
Reporting
- OpenCDISC Validator: Excel/CSV/HTML reports; can only limit the number of occurrences per rule; WebSDM/Janus rule ID on website but not on reports; severity levels follow Janus
- SAS Clinical Standards Toolkit: results in SAS datasets; can limit the number of occurrences per rule/dataset/actual value; WebSDM/Janus ID in results; severity levels follow WebSDM/Janus

Tools Comparison (Cont'd)
Processing
- OpenCDISC Validator: real memory; checks on SAS transport XPT or other delimited text files
- SAS Clinical Standards Toolkit: disk and real memory; redundant processing steps; checks on SAS datasets
Performance
- OpenCDISC Validator: fair (hours) for small studies but potential memory crash for large studies
- SAS Clinical Standards Toolkit: to be improved (1+ day)
Maintenance
- OpenCDISC Validator: open XML code for configuration; open Java code on website; standard/custom metadata in XML/Excel
- SAS Clinical Standards Toolkit: open source SAS code/configuration; standard/custom metadata in SAS datasets
Flexibility
- OpenCDISC Validator: need XML/Java expertise for any customization/enhancement
- SAS Clinical Standards Toolkit: select/deselect rules to check in SAS code; build custom checks with SAS code; build graphic user interface in SAS/Excel
Documentation
- OpenCDISC Validator: website instructions
- SAS Clinical Standards Toolkit: installation instructions; IQ/OQ document; examples/exercises; User's Guide
Technical Support
- OpenCDISC Validator: website forum
- SAS Clinical Standards Toolkit: SAS technical support by phone/email/website

References and Contact
- FDA Guidance for Industry, Part 11, Electronic Records; Electronic Signatures: Scope and Application: http://www.fda.gov/downloads/drugs/guidancecomplianceregulatoryinformation/guidances/ucm072322.pdf
- FDA Guidance for Industry, Study Data Specifications (v1.5.1): http://www.fda.gov/downloads/drugs/developmentapprovalprocess/formssubmissionrequirements/electronicsubmissions/ucm199759.pdf
- WebSDM Checks: http://www.phaseforward.com/products/cdisc
- Janus Checks: http://www.fda.gov/forindustry/datastandards/studydatastandards/ucm155327.htm
- OpenCDISC Validator: http://www.opencdisc.org
- SAS Clinical Standards Toolkit: http://ftp.sas.com/techsup/download/hotfix/12clintlkt.html
Contact Information: dan.shiu@amgen.com