PharmaSUG 2014 - Paper PO01



Similar documents
Abstract. Introduction. System Requirement. GUI Design. Paper AD

Pharmaceutical Applications

Essential Project Management Reports in Clinical Development Nalin Tikoo, BioMarin Pharmaceutical Inc., Novato, CA

Graphing Made Easy for Project Management

Switching from PC SAS to SAS Enterprise Guide Zhengxin (Cindy) Yang, inventiv Health Clinical, Princeton, NJ

Building and Customizing a CDISC Compliance and Data Quality Application Wayne Zhong, Accretion Softworks, Chester Springs, PA

PharmaSUG 2016 Paper IB10

How To Write A Clinical Trial In Sas

Automate Data Integration Processes for Pharmaceutical Data Warehouse

USE CDISC SDTM AS A DATA MIDDLE-TIER TO STREAMLINE YOUR SAS INFRASTRUCTURE

The ADaM Solutions to Non-endpoints Analyses

StARScope: A Web-based SAS Prototype for Clinical Data Visualization

Create an Excel report using SAS : A comparison of the different techniques

PharmaSUG2010 HW06. Insights into ADaM. Matthew Becker, PharmaNet, Cary, NC, United States

ABSTRACT INTRODUCTION CLINICAL PROJECT TRACKER OF SAS TASKS. Paper PH

Creating a Windows Service using SAS 9 and VB.NET David Bosak, COMSYS, Kalamazoo, MI

ABSTRACT INTRODUCTION THE MAPPING FILE GENERAL INFORMATION

PharmaSUG 2014 Paper CC23. Need to Review or Deliver Outputs on a Rolling Basis? Just Apply the Filter! Tom Santopoli, Accenture, Berwyn, PA

Managing Data Issues Identified During Programming

We begin by defining a few user-supplied parameters, to make the code transferable between various projects.

SDTM, ADaM and define.xml with OpenCDISC Matt Becker, PharmaNet/i3, Cary, NC

Clinical Data Management (Process and practical guide) Dr Nguyen Thi My Huong WHO/RHR/RCP/SIS

Rationale and vision for E2E data standards: the need for a MDR

How to easily convert clinical data to CDISC SDTM

Clinical Trial Data Integration: The Strategy, Benefits, and Logistics of Integrating Across a Compound

SAS CLINICAL TRAINING

Applications Development

Let SAS Modify Your Excel File Nelson Lee, Genentech, South San Francisco, CA

Lessons Learned from the QC Process in Outsourcing Model Faye Yeh, Takeda Development Center Americas, Inc., Deerfield, IL

Improve your Clinical Data Management With Online Query Management System

Connecting to an Excel Workbook with ADO

SAS Enterprise Guide in Pharmaceutical Applications: Automated Analysis and Reporting Alex Dmitrienko, Ph.D., Eli Lilly and Company, Indianapolis, IN

A Macro to Create Data Definition Documents

VBA Microsoft Access 2007 Macros to Import Formats and Labels to SAS

Organization Profile. IT Services

PharmaSUG Paper DG06

Integrating SAS with JMP to Build an Interactive Application

How-To Guide. Crystal Report Demo. Copyright Topaz Systems Inc. All rights reserved.

Paper PO03. A Case of Online Data Processing and Statistical Analysis via SAS/IntrNet. Sijian Zhang University of Alabama at Birmingham

Exam Name: IBM InfoSphere MDM Server v9.0

id_prob_result_coredump_aix.ppt Page 1 of 15

August

Using Web Services to exchange information via XML

Get in Control! Configuration Management for SAS Projects John Quarantillo, Westat, Rockville, MD

Introduction to SAS Business Intelligence/Enterprise Guide Alex Dmitrienko, Ph.D., Eli Lilly and Company, Indianapolis, IN

Analysis Data Model (ADaM)

How To Use Sas With A Computer System Knowledge Management (Sas)

QlikView 11 Source Control Walkthrough

Needs, Providing Solutions

Automated distribution of SAS results Jacques Pagé, Les Services Conseils HARDY, Quebec, Qc

SAS Logic Coding Made Easy Revisit User-defined Function Songtao Jiang, Boston Scientific Corporation, Marlborough, MA

Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram

Post Processing Macro in Clinical Data Reporting Niraj J. Pandya

Qualification Process for Standard Scripts in the Open Source Repository with Cloud Services

Add an Audit Trail to your Access Database

IBM InfoSphere MDM Server v9.0. Version: Demo. Page <<1/11>>

A Microsoft Access Based System, Using SAS as a Background Number Cruncher David Kiasi, Applications Alternatives, Upper Marlboro, MD

SAS Credit Scoring for Banking 4.3

SAS Office Analytics: An Application In Practice

Implementation of SDTM in a pharma company with complete outsourcing strategy. Annamaria Muraro Helsinn Healthcare Lugano, Switzerland

Statistical Operations: The Other Half of Good Statistical Practice

Configuring a SAS Business Intelligence Client with the SAS Server to Support Multilingual Data Wei Zheng, SAS Institute Inc.

Experiences in Using Academic Data for BI Dashboard Development

Clinical Data Management Overview

How to build ADaM from SDTM: A real case study

Software Validation in Clinical Trial Reporting: Experiences from the Biostatistical & Data Sciences Department

[Jet-Magento Integration]

William E Benjamin Jr, Owl Computer Consultancy, LLC

ADaM Implications from the CDER Data Standards Common Issues and SDTM Amendment 1 Documents Sandra Minjoe, Octagon Research Solutions, Wayne, PA

Using Pharmacovigilance Reporting System to Generate Ad-hoc Reports

Choosing the Best Method to Create an Excel Report Romain Miralles, Clinovo, Sunnyvale, CA

MOVING THE CLINICAL ANALYTICAL ENVIRONMENT INTO THE CLOUD

CS170 Lab 11 Abstract Data Types & Objects

Oracle OLAP. Describing Data Validation Plug-in for Analytic Workspace Manager. Product Support

Figure 1. Example of an Excellent File Directory Structure for Storing SAS Code Which is Easy to Backup.

Program to solve first and second degree equations

PharmaSUG Paper HS01. CDASH Standards for Medical Device Trials: CRF Analysis. Parag Shiralkar eclinical Solutions, a Division of Eliassen Group

Purchase Agent Installation Guide

ProjectTrackIt: Automate your project using SAS

Workflow Solutions for Very Large Workspaces

Atlas.ti Basic Training Qualitative Research QR II, 2006/2007 K. Fritz

New Tricks for an Old Tool: Using Custom Formats for Data Validation and Program Efficiency

INF 111 / CSE 121. Homework 4: Subversion Due Tuesday, July 14, 2009

QualityView - a program database and validation documentation tool

Exporting Client Information

Section 1 Project Management, Project Communication/Process Design, Mgmt, Documentation, Definition & Scope /CRO-Sponsor Partnership

PISA 2003 MAIN STUDY DATA ENTRY MANUAL

Sanofi-Aventis Experience Submitting SDTM & Janus Compliant Datasets* SDTM Validation Tools - Needs and Requirements

Share Point Document Management For Sage 100 ERP

SAS Rule-Based Codebook Generation for Exploratory Data Analysis Ross Bettinger, Senior Analytical Consultant, Seattle, WA

PharmaSUG Paper DS07

REx: An Automated System for Extracting Clinical Trial Data from Oracle to SAS

Using SAS Data Integration Studio to Convert Clinical Trials Data to the CDISC SDTM Standard Barry R. Cohen, Octagon Research Solutions, Wayne, PA

Permuted-block randomization with varying block sizes using SAS Proc Plan Lei Li, RTI International, RTP, North Carolina

Paper Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation

Transcription:

ABSTRACT PharmaSUG 2014 - Paper PO01 A User Friendly Tool to Facilitate the Data Integration Process Yang Wang, Seattle Genetics, Inc., Bothell, WA Boxun Zhang, Seattle Genetics, Inc., Bothell, WA For any clinical program, meta-analysis, including Development Safety Update Report (DSUR), Integrated Summary of Safety (ISS), and Integrated Summary of Efficacy (ISE), integrating data is a challenging process and is often labor-intensive and error-prone. The challenges arise primarily due to the following reasons: (1) different study designs and data standards, (2) evolving industry standards including CDISC, MedDRA and WHODrug, and (3) different data collection methods across vendors. In order to ensure all the variables can be integrated for analysis, we built a user-friendly tool to dynamically display the differences in variable attributes and values across multiple clinical studies. This paper, with examples and programming components, shows how this tool can be used to facilitate data integration by identifying and resolving issues resulting from variable inconsistencies. INTRODUCTION Data integration is often needed in clinical trial studies. Integration across studies is usually done to pool variables for meta-analysis, while integration within a study is usually done to merge data (such as when demographic data is merged with labs and endpoint results). In this paper, we focus on pooling data from multiple studies for metaanalysis. Meta-analysis usually focuses on efficacy and safety data. Integrated safety data is used to generate regulatory reports such as DSUR, Periodic Benefit-Risk Evaluation Report (PBRER), etc. Integrated efficacy data is used to generate ISE reports, which typically involves multiple subgroup analyses. The integration process involves four elements as illustrated in Figure 1: Select data sources, compare data, resolve issues, and analyze. After datasets are integrated, new datasets can be added to the pooled datasets, and so a new round of integration starts. Selecting Data Sources Data Comparision across Studies Meta-analysis Issue Resolution Figure 1. Four Elements in Data Integration Process Clinical trial data are collected from multiple sources at different times. ecrf data are collected at investigator sites and entered by study coordinators whereas lab data are sent from different vendors. Data collected at different times may have different standards and conventions. Before multiple studies are integrated, the study design, data sources, and the data collected have to be fully understood, to determine whether each study can be integrated. Once the studies that will be integrated and analyzed are identified, critical variables for meta-analysis will be identified and compared. The key for data integration is to ensure pooled variables have the same definitions and attributes. Data attributes include variable length, variable type, label, and format. Data values also include units and coding. By checking the variable labels, variable values or ranges, we can usually conclude if the variable has the same definition. For example, if two variables have the same label of gender with values M and F, it is very likely these two variables are defined in the same way and can be stacked together using SAS set statement for meta-analysis. If a variable with different attributes or definitions is identified, it will be processed and harmonized before it can be 1

pooled with other studies to perform meta-analysis. We developed a simple tool that can facilitate the integration process using SAS and Visual Basic for Applications (VBA). This tool focuses on the data comparison element of the integration process. It facilitates easy comparison of variables across multiple studies and systematically identifies the variables that cannot be stacked simplifying variable standardization. PROCESS FLOW FOR THE TOOL DEVELOPMENT The following steps describe the process of the tool development as illustrated in Figure 2: 1. Set up a source information page that describes the locations of the data (we use an Excel sheet to store library names and the path/name of the source SAS Data files.) 2. Set up an interface in Excel using VBA to call SAS programs to generate outputs for data comparison. 3. Use the Excel interface to select datasets and variables of interest. 4. Display the data comparison results in Excel with color coded output to make the result easy to read. The attributes and values of each variable from multiple studies are displayed in separate Excel sheets. Any discrepancy from any study for each variable will be displayed in red. User can select specific studies and variables of interest. Data source Generate reports Select variables of interest Display Outputs Figure 2. Process flow for the tool development Data sources are SAS datasets. Reports are generated in SAS and displayed in Excel. Excel is also used as an interface for data source selection and output display. This is done by SAS and VBA. The SAS code is called from VBA scripts. The connection code is shown below. '--> SAS IOM Workspace Connection Declarations Public obwsm As New SASWorkspaceManager.WorkspaceManager Public obws As SAS.Workspace Public obwsflag As Integer Sub fetchfromsas() '--> ADO Database Connection Declarations Dim obconn As New ADODB.Connection Dim obrs As New ADODB.Recordset Dim errorstring As String Dim SAScode As String '--> Create a Local SAS Workspace (One-Time Processing) If obwsflag = 0 Then Set obws = obwsm.workspaces.createworkspacebyserver("", VisibilityProcess, Nothing, "", "", errorstring) obwsflag = 1 End If '--> Set Path Vars spath = ActiveWorkbook.path SAS_DataPath = VBAProject.UserForm1.Text_Browse.Value '--> Call SAS code which is stored in string and declare variables obws.languageservice.submit SAScode SASlog = obws.languageservice.flushlog(50000) 2

'--> First set a string which contains the path to the file you want to create. '--> This example creates one and stores it in the root directory MyFile = spath & "/SASlog.log" '--> Set and open file for output fnum = FreeFile Open MyFile For Output As fnum Print #fnum, SASlog Close #fnum PROCESS FLOW OF HOW TO USE THE TOOL DATA SELECTION Data selection interface is an Excel file. It has two steps: 1. User puts data source paths in the interface and all the datasets that exist in the source paths will be compared. 2. In the results page, the user will select the studies, datasets, and variables of interest for display purpose. Figure 3 and Figure 4 show the data selection interfaces in Excel. Figure 3. Excel interface that will be used to select studies and data sources. 3

Figure 4. Excel interface to select studies and variables of interest. RESULTS DISPLAY Data comparison includes two aspects: attributes and values. They are displayed on two Excel sheets illustrated in Figure 5 and Figure 6. Figure 5. Variable attributes are displayed in an Excel sheet. The red text brings attention to the difference in variable length across studies for AEACN and the difference in variable type across studies for SAFFL. 4

Figure 6. Variable values are displayed in an Excel sheet. Variable TRT1P has values of VI in study 01 and 02, 1.0mg/kg and missing values in study 03, and 2.0 mg/kg in study 04. Variable SAFFL is consistently set to N or Y in all studies. Variable ONSTUDY is not available in study 04, has missing values in study 03, 1 or missing values in study 02, and 1 or 0 in study 01. EXAMPLES OF HOW THE RESULTS FROM THIS TOOL HELP DATA INTEGRATION VARIABLE LABEL AND LENGTH DIFFERENCES When variables with the same name but different lengths are stacked by SAS, the following warning message is generated: WARNING: Multiple lengths were specified for the variable SAFFL by input data set(s).this may cause truncation of data. When multiple studies are involved, without a tool, each study must be checked before the variable with the different length can be identified. With this integration tool, we can easily identify all the studies that contain variables with different lengths and modify them before stacking them together. The example below shows variable DTHAE has a different length in studies 03 and 04. It is noticeable that variable labels are different as well. This variable is defined differently among studies. Use of this tool will allow these discrepancies to be rectified prior to stacking the data. Figure 7. Example of Variable Label and Length Differences VARIABLE TYPE DIFFERENCES When variables with different data types are stacked, an error message is generated and data cannot be pooled. With this tool, it is very easy to identify the studies that contain the discrepant variables and harmonize the data types prior to integration. An example can be found in Figure 5 where variable name SAFFL has a different data type in study 04. VARIABLES WITH DIFFERENT DEFINITIONS AND VALUES Variables with different values from different studies usually indicate those variables that are defined differently in each study. These variables need to be identified and issues have to be resolved before data can be analyzed. The example below shows variable DTHAE is defined differently in study 03 when compared with studies 01 and 02. Additionally, this variable is not available in study 04. Similar observations can be found in variable DTHREL. 5

Figure 8. Variable value and variable definitions are different. CONCLUSION This paper introduced a user-friendly tool that compares and summarizes datasets in a simple output so the data can be harmonized prior to integration. This tool is powerful because an unlimited number of studies and variables can be compared. Yet the tool is also flexible and easy to set up. All the sources are set up in an Excel interface, so it is very generic, easy to maintain and can be used in any data source structure folders. It identifies the variables that differ in attributes and values, to ensure the integrated datasets variables have the same definitions. We recommend using this tool for any integration of data for meta-analysis, or to QC already integrated data before final analysis. ACKNOWLEDGMENTS We thank Norm Fox, Shawn Hopkins for their assistance of VBA scripts, Rajeev Karanam and Shefalica Chand for their valuable suggestions and comments. CONTACT INFORMATION Your comments and questions are valued and encouraged. Contact the author at: Name: Yang Wang Enterprise: Seattle Genetics, Inc. Address: 21823 30 th Drive Southeast City, State ZIP: Bothell, WA 98021 E-mail: yawang@seagen.com Name: Boxun Zhang Enterprise: Seattle Genetics, Inc. Address: 21823 30 th Drive Southeast City, State ZIP: Bothell, WA 98021 E-mail: bzhang@seagen.com SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. indicates USA registration. Other brand and product names are trademarks of their respective companies. 6