REx: An Automated System for Extracting Clinical Trial Data from Oracle to SAS




Edward McCaney, Centocor Inc., Malvern, PA
Gail Stoner, Centocor Inc., Malvern, PA
Anthony Malinowski, Centocor Inc., Malvern, PA
Robert Karvois, Centocor Inc., Malvern, PA

ABSTRACT

A primary need in the reporting of clinical trial data is the ability to periodically extract clean data from a clinical database management system. This paper presents the methodology currently in use by Centocor. REx (Reporting Effort Extraction) is a Microsoft Access 97-based application that extracts clinical trial data from an Oracle database using SAS/ACCESS and creates permanent SAS data sets. REx uses SAS macros to perform additional processing based on modifications (actions) requested by users. REx provides an interface to define actions, create and run extract code, archive extractions, and run reports. Additionally, the system has the ability to obtain metadata, update it with user-defined actions, and pass it to downstream processes.

INTRODUCTION

When Centocor switched its clinical data management system (CDMS), the need arose to design a new process to extract clinical data from an Oracle database into SAS data sets for analysis and reporting. The new extraction system, REx, is a component of a redesigned analysis and reporting infrastructure.

Previously, SAS data sets and format catalogs were pushed by data management as straight dumps of the database, including standard fields that did not pertain to the trial to be analyzed. Data set documentation was provided separately in paper format and could easily become outdated. From the snapshots of the clinical data, programmers created SAS analysis data sets for reporting and submission. Analysis data set definitions were stored in Word documents, a method that made it difficult to maintain standards across studies. Data definition documentation was generated from SAS PROC CONTENTS and required a substantial manual effort to comply with FDA electronic submission guidelines. All systems were VAX-based legacy hardware and software that received limited support from the company's IT department.

The new system (Figure 1) runs on a Windows NT platform and enables users (programmers) to pull and snapshot the data model (metadata), clinical data, code lists and dictionaries from the CDMS. It also allows the user to define straightforward additions, modifications and deletions to the extracted data sets. The goal of these modifications or actions is to enable consistent creation of standard reporting variables, to perform dictionary merges, and to perform other processing to facilitate analysis and review. All actions are reflected in an updated data model that serves as the foundation for an electronic submission. The clinical data, format catalog, dictionary, and data model are then passed to another system for analysis data set definition.

Figure 1 REx Process Overview

ACCESS DATABASE

The information used by REx is accessed through MS Access tables. REx consists of both internal and external tables. The external tables are part of a larger Access database that is referred to as the Data Model, which holds the metadata describing the clinical data. An example of a metadata table is a list of data sets that are included in a particular study. For instance, the metadata (data set list) is used to populate the list boxes used for screen entry. The Data Model is used throughout the REx process.

A second table, probably the more pivotal, is an internal table: the Action table. The Action table records all user input to REx and is used to create the extract code that ultimately creates the SAS data sets. When the extract code is created, the action variables are assembled to call the SAS macros. REx acts as a code generator using the Action table data as the information source.

Once REx applies the actions to the extracted data, the Data Model must be updated to reflect the changes. For instance, if REx drops a variable from a SAS data set, the variable also has to be dropped from the Data Model. REx performs this step automatically after the programmer has extracted the clinical data. The user is able to produce a post-REx Data Model reflecting the applied actions. This is accomplished systematically by merging the pre-REx Data Model with the Action table.

ORACLE DATABASE

As mentioned in the last section, the Data Model and the support tables for REx are stored in MS Access. However, these tables do not contain any clinical data. The clinical data that is used to create permanent data sets is stored in an Oracle database. While REx's primary goal is to extract clinical data, REx never actually accesses the Oracle database. Instead, it produces extract programs that are run in batch mode on a server. The extract programs contain code that references the SAS views (discussed in the next section), which access the Oracle database.
The data accessed through these SAS views is then manipulated and saved permanently as SAS data sets.

SAS VIEWS

REx accesses the clinical data, dictionaries, and code lists (formats) contained in Oracle via permanent SAS views stored in a production view directory. These views are defined in SAS/ACCESS scripts (automatically generated by the CDMS) which contain PROC SQL modules for each data set. When the view scripts are run (external to REx), the PROC SQL code connects to the Oracle database and creates the views by selecting the columns defined in the data model. Variable labeling and formatting are also defined by the view according to the data model. A sample PROC SQL script is shown below for a demographics (DEMOG) data set (Figure 2).

    /* Module: DEMOG */
    PROC SQL;
      CONNECT TO ORACLE(USER=XXXXXXXX PASSWORD=XXXXXXXX PATH='XXXXX');
      CREATE VIEW EXAMPLE.DEMOG as
        SELECT REC_ID   label = 'Record Identifier',
               PNO      label = 'Protocol Number',
               CNO      label = 'Center ID',
               PATNO    label = 'Patient Number',
               EVENT_ID label = 'Event Identifier',
               DOB_DT   label = 'Date of birth' Format=DATE9.,
               SEX      label = 'Sex'           Format=SEX.,
               RACE     label = 'Race'          Format=RACE.
        FROM CONNECTION TO ORACLE
          /* The date arithmetic converts the Oracle date into a SAS date
             value (days since 01JAN1960). */
          (SELECT to_char(rec_id), PNO, CNO, PATNO, EVENT_ID,
                  trunc(c_dob_dt) - to_date('01-01-1960', 'DD-MM-YYYY'),
                  c_sex, c_race
           FROM TESTPROT.TEST_DEMOG
           ORDER BY CNO, PATNO, EVENT_ID, REC_ID
          ) AS S(REC_ID, PNO, CNO, PATNO, EVENT_ID, DOB_DT, SEX, RACE);
      DISCONNECT FROM ORACLE;
    QUIT;

Figure 2 View Creation Script

ACTION ENTRY

Actions are user-specified, parameter-driven SAS macro modules invoked by the system to create SAS code that is applied to a selected variable. The actions are used to prepare the data so that it is easier to work with. REx actions can be applied either to a single variable contained within one data set or to a single variable contained in multiple data sets. SAS macro parameters are specified through the action entry screens and stored in the Action table (Figure 3).
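As an illustration of how stored parameters can become macro calls, the sketch below holds a few actions in a SAS data set and assembles one call per row. REx itself performs this step inside Access; the table layout and the names work.action, dataset, variable, action, parm and macro_call are assumptions made for this example, not the actual REx schema.

```sas
/* Hypothetical Action table: column names are assumptions. */
data work.action;
  infile datalines dsd truncover;
  length dataset $32 variable $32 action $32 parm $40;
  input dataset variable action parm;
  datalines;
DEMOG,REC_ID,DROP_VAR,
DEMOG,,PATID,
DEMOG,EVENT_ID,RENAME,VISIT
DEMOG,SEX,RESET,SEX.
;
run;

/* Assemble one macro call per action row. CATX skips blank
   arguments, so actions with fewer parameters get shorter calls. */
data work.calls;
  set work.action;
  length macro_call $200;
  macro_call = cats('%', action, '(',
                    catx(',', dataset, variable, parm), ');');
run;

proc print data = work.calls noobs;
  var macro_call;
run;
```

The first row yields the text %DROP_VAR(DEMOG,REC_ID); and the second yields %PATID(DEMOG); which is the form of the calls in the extract program (Figure 6).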

Figure 3 Action Entry Variable Level Screen

Variable level actions are applied to a single variable within a data set. A data set, variable, and action are selected from the entry screen. If additional information is required to complete the action, an action-specific prompt (see the example in Figure 4) is displayed, requesting the user to enter the required parameter information. The standard parameters created are:

parameter 1 - the data set on which the action will be performed
parameter 2 - the variable
parameter 3 - the information specific to the action

Figure 4 Action Entry Variable Level Screen Showing Action Specific Prompt

An example of a variable level action is changing a coded value to a decoded value. In REx, this is known as a RESET action. The RESET action requires:

parameter 1 - DEMOG (data set name)
parameter 2 - SEX (variable to be processed)
parameter 3 - $SEX. (the SAS format)

The user selects the data set, variable, and action from the entry screen. The RESET action triggers the display of an additional prompt where the user specifies the format used to decode the variable.

The actions are implemented via SAS macros. Figure 5 shows the SAS macro code for the RESET action.

    /* Example call: %RESET(DEMOG, SEX, $SEX.); */
    %MACRO RESET(DATASET,VARIABLE,FORMAT);

      /* Upcase the parameter VARIABLE */
      %let VARIABLE = %upcase(&VARIABLE);

      /* Obtain the label from the original variable. */
      Proc contents data = &DATASET
                    out = _TLABEL (keep = NAME LABEL) noprint;
      run;

      /* Create a macro variable containing the original variable label.
         NAME is upcased so that mixed-case variable names still match. */
      Data _TLABEL;
        Set _TLABEL (where = (upcase(NAME) = "&VARIABLE"));
        Call symput("m_label", trim(LABEL));
      run;

      /* Use the format to create the new value for the variable.
         Attach the label from the original variable. */
      Data &DATASET (drop = TEMPVAR);
        Set &DATASET (rename = (&VARIABLE = TEMPVAR));
        &VARIABLE = put(TEMPVAR, &FORMAT);
        label &VARIABLE = "&M_LABEL";
        /* A missing numeric value is written as '.', so blank it out. */
        If left(&VARIABLE) = '.' then &VARIABLE = ' ';
      run;

      /* Get rid of the work data set. */
      Proc Datasets nolist;
        delete _TLABEL;
      quit;

    %MEND RESET;

Figure 5 %RESET Action Macro

Data set level action entry is very similar to variable level entry; the difference is the number of data sets to which the action is applied. This screen allows the user to efficiently specify the same action on many data sets. The user first selects multiple data sets from the data set column, then the variable, and then the action to be applied. Again, if additional information is needed to complete the action, an action-specific screen appears. An individual action is created in the Action table for each data set. For example, to RESET a variable contained in multiple data sets, the user selects all of the data sets, the variable, and the action RESET. A prompt then requests the user to enter the format. Variables that are not selected for action processing are passed through without an action being added to the Action table.
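Only the %RESET macro is reproduced in this paper. The other actions named in the extract program (Figure 6), such as %DROP_VAR and %RENAME, presumably follow the same parameter pattern. The sketch below is a guess at what two of them might look like; the macro bodies and the parameter name NEWNAME are assumptions, not the production REx code.

```sas
/* Hypothetical sketch: drop one variable from a WORK data set. */
%macro DROP_VAR(DATASET, VARIABLE);
  data &DATASET;
    set &DATASET (drop = &VARIABLE);
  run;
%mend DROP_VAR;

/* Hypothetical sketch: rename a variable in place. PROC DATASETS
   changes only the data set header, so the label and format of
   the variable are kept. */
%macro RENAME(DATASET, VARIABLE, NEWNAME);
  proc datasets library = work nolist;
    modify &DATASET;
    rename &VARIABLE = &NEWNAME;
  quit;
%mend RENAME;

/* Small test data set to try the two macros on. */
data DEMOG;
  REC_ID = 1; EVENT_ID = 10; SEX = 1;
run;

%DROP_VAR(DEMOG, REC_ID);
%RENAME(DEMOG, EVENT_ID, VISIT);

proc contents data = DEMOG;
run;
```

Because each macro takes the data set as its first parameter and the variable as its second, a code generator can emit calls to any of them without knowing what the macro does internally.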

CREATING EXTRACT CODE

REx creates a single program for each data set. The code that is created by REx includes all the libname statements necessary to execute the program. The programs include any actions that the user has specified in the Action Entry screens (Figures 3 and 4). When creating the extract code, the actions and the parameters are assembled so they can call an existing SAS macro module. The following example (Figure 6) displays the DEMOG extract program that was created by REx using the information stored in the Action table.

    %macro REx_DEMOG(Raw_Data=,Rex_Data=);

      * Allocate libraries;
      Libname DS 'data library';
      Libname DEMOG 'SAS view library';
      Libname Raw_Data "&Raw_Data";
      Libname Rex_Data "&Rex_Data";
      Libname Library 'SAS format library';

      * Perform REx actions;
      Data DEMOG;
        set DEMOG.DEMOG;
      run;
      %DROP_VAR(DEMOG,REC_ID);
      %PATID(DEMOG);
      %RENAME(DEMOG,EVENT_ID,VISIT);
      %RESET(DEMOG,SEX,SEX.);
      %RESET(DEMOG,RACE,RACE.);
      %UNIQUE_P(DEMOG);

      * Create archived copy of the pre-rex data;
      Data Raw_Data.DEMOG;
        set DEMOG.DEMOG;
      run;

      * Create permanent copy of post-rex data for analysis;
      Data DS.DEMOG;
        set DEMOG;
      run;

      * Create archived date/time stamped copy of post-rex data for audit purposes;
      Data Rex_Data.DEMOG;
        set DEMOG;
      run;

    %mend REx_DEMOG;

Figure 6 Extract Code (REx_DEMOG)

Notice that this SAS macro accepts two parameters. These parameters pass the archive directory names (which contain the date/time of the REx extraction) to the macro. Every time REx extracts data from the Oracle database, a new archive directory is created, giving the user an audit trail of REx activities.

RUNNING EXTRACT CODE

Once created, the extract code can be used by the REx system to create the SAS data sets. The user selects the data sets to be extracted from the Create Data Sets screen (Figure 7).

Figure 7 Reporting Effort Extraction Screen

The user then clicks the Run Program button and REx creates a macro call to the extract code, as seen below in Figure 8.

    %REx_demog (raw_data= C:\prod\archive\01oct20011205\data,
                rex_data= C:\prod\archive\01oct20011205\rex);

Figure 8 Extract Code Macro Call

This program contains the extract code macro call that passes the values of the archive parameters to the extract code macro (i.e. REx_DEMOG, see Figure 6). These parameters are the names of the archive directories that have been created by REx. A new directory and call are created every time REx is run. Once this file has been created, it is sent to the server to be executed.
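The archive directory name in Figure 8 (01oct20011205) appears to combine the extraction date with the time in hours and minutes. REx builds this name itself; as a sketch only, the same kind of stamp could be produced in SAS as follows. The macro variable names stamp and archive are assumptions, and the code only builds the names without creating any directory.

```sas
/* Build a date/time stamp such as 01oct20011205 from the current
   clock: DATE9. date in lower case, then hour and minute with
   leading zeros. */
data _null_;
  now   = datetime();
  stamp = cats(lowcase(put(datepart(now), date9.)),
               put(hour(now), z2.),
               put(minute(now), z2.));
  call symputx('stamp', stamp);
run;

/* Hypothetical archive root taken from the paths in Figure 8. */
%let archive = C:\prod\archive\&stamp;

%put NOTE: raw_data=&archive\data;
%put NOTE: rex_data=&archive\rex;
```

Because the stamp includes the minute, two extractions run at different times land in different folders, which is what makes the archive usable as an audit trail.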

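After an extraction, the Data Model has to reflect the applied actions: a dropped variable disappears from the metadata and a renamed variable appears under its new name. The sketch below shows one way such an update could be expressed, assuming at most one action per variable and handling only DROP_VAR and RENAME. The table and column names (model_pre, model_post, var_label and the Action table layout) are hypothetical; the real Data Model lives in Access and is certainly richer than this.

```sas
/* Hypothetical pre-REx Data Model: one row per data set and variable. */
data work.model_pre;
  infile datalines dsd truncover;
  length dataset $32 variable $32 var_label $40;
  input dataset variable var_label;
  datalines;
DEMOG,REC_ID,Record Identifier
DEMOG,EVENT_ID,Event Identifier
DEMOG,SEX,Sex
;
run;

/* Hypothetical Action table rows that change the metadata. */
data work.action;
  infile datalines dsd truncover;
  length dataset $32 variable $32 action $32 parm $40;
  input dataset variable action parm;
  datalines;
DEMOG,REC_ID,DROP_VAR,
DEMOG,EVENT_ID,RENAME,VISIT
;
run;

/* Join the model to the actions: renamed variables take the new
   name, dropped variables are filtered out, and variables with no
   action pass through unchanged. */
proc sql;
  create table work.model_post as
  select m.dataset,
         case when a.action = 'RENAME' then a.parm
              else m.variable
         end as variable length = 32,
         m.var_label
  from work.model_pre as m
       left join work.action as a
         on  m.dataset  = a.dataset
         and m.variable = a.variable
  where a.action is null or a.action ne 'DROP_VAR'
  order by 1, 2;
quit;
```

With the sample rows above, the post-REx model keeps SEX unchanged, lists EVENT_ID under the name VISIT, and no longer contains REC_ID.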
OUTPUT

After running the REx extraction code, SAS data sets containing clinical and dictionary data, and a SAS format catalog, are produced. These data sets reflect a snapshot of the raw clinical data contained in the CDMS with the REx actions applied. The clinical data is written to a production data directory. The dictionaries and formats are written to a production formats directory. These data sets are now available for use in the creation of analysis data sets.

REx also creates archived copies of the extracted data. Copies of both the pre-REx (raw clinical data prior to application of REx actions) and post-REx data sets are archived in a date/time stamped folder in a production archive directory. These archived snapshots provide an audit trail of REx activities.

In addition to generating the SAS data sets, the clinical data model is updated to reflect the effects of the REx actions. This updated data model is used by a downstream application in generating analysis data set definitions.

REx also provides a reporting function. These reports reflect the contents of the Action table for a particular reporting effort. Users can generate reports of REx actions by data set or by action type. These reports are used to ensure that the desired actions and parameters have been defined for a data extraction.

CONCLUSION

The new system has been successfully designed and implemented to run on Windows NT. It has grown from a single-user, single-project prototype to a production system that supports multiple users across multiple projects. It has fulfilled requirements for pulling data and performing user-defined actions as well as updating the data model to reflect those actions. The REx process of specifying actions and the use of the macros to execute those requests has increased the consistency of variable definition across projects and brought uniformity to the code used to create these variables. The ready availability of the data model reduces the documentation burden and allows programmers and statisticians to focus on the definition, coding and quality control of more critical analysis variables.

In summary, the implementation of the REx system has significantly enhanced the programming group's ability to efficiently perform downstream processing. This includes generation of analysis data sets, maintenance of standard variable definitions and automatic generation of data definition documentation for electronic submission to the FDA.

ACKNOWLEDGEMENTS

SAS and all other SAS Institute, Inc. product or service names are registered trademarks or trademarks of SAS Institute, Inc. in the United States and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

CONTACT INFORMATION

Edward McCaney mccaneye@centocor.com
Gail Stoner stoner@centocor.com
Anthony Malinowski malinowskia@centocor.com
Robert Karvois karvoisr@centocor.com