Building an Integrated Clinical Trial Data Management System With SAS Using OLE Automation and ODBC Technology




Alexandre Peregoudov
Reproductive Health and Research, World Health Organization, Geneva, Switzerland

Introduction

Handling clinical data differs in many ways from handling other kinds of data. Guidelines for Good Clinical Practice (GCP) impose strict requirements on the computerized systems and procedures used in clinical trials. The SAS System offers many of the characteristics that are essential for clinical trial data management: a powerful database engine, advanced statistical analysis tools, and elaborate reporting and data presentation procedures. It is also worth noting that the SAS self-documented database format is accepted by many national drug regulatory agencies as a standard for Computer Assisted New Drug Applications (CANDA).

New features and enhancements in the Nashville Release of the SAS System improve its integration with the Microsoft Windows platform and its interoperability with other Windows-compatible applications. The OLE automation and ODBC support provided by the SAS System pave the way for integration with other OLE automation compliant Windows applications, so that the best features of each product can be used. This paper describes how different program components are being integrated by an OLE automation controller, through Visual Basic programming, into a Clinical Trial Data Management System (CTDMS 2000) with the SAS System at its core.

System Overview

CTDMS 2000 is being developed by the Department of Reproductive Health and Research (RHR), WHO, in response to the need for centralized data management of multi-centre clinical trials. The main objectives are GCP compliance, lower development and maintenance costs, portability, and provision for new technologies such as automatic data capture with optical character recognition and WEB-based data entry.
RHR has a world-wide network of trial sites. More than 100 institutions on five continents participate in the collaborative research co-ordinated by the RHR secretariat in Geneva. These institutions are located in both developed and developing countries. Some have advanced telecommunication and informatics infrastructure and adequately trained staff at their disposal. Others are far removed from recent technological developments and still rely on traditional post, telephone and fax as their only means of communication with outside institutions. These conditions impose a number of severe limitations on system design and implementation.

CTDMS 2000 should be able to receive the clinical trial data, the case report forms (CRFs), from the sites both as paper copies posted to the centralized data management unit and remotely through WEB-based data entry. The data should be processed in a uniform way regardless of the CRF input mode, and results should be communicated back to the sites in exactly the same manner.

SAS/AF Frame objects, such as Data Table and Data Form, facilitate the speedy development of data entry applications with a graphical user interface to the data. In the given setting, however, the SAS/AF solution seems to be neither cost-effective nor technically feasible. First, it is too expensive to provide, from the limited RHR budget, a SAS license to all the centres that are technologically ready to implement this option. Second, it does not entirely solve the data acquisition problems, especially for centres in developing countries, and does not permit flexible multi-facet data entry.

Apart from that, paper CRFs are still in use in 100% of the clinical trials conducted by RHR (and by many other sponsors). It is unlikely that there will be a migration to completely electronic data processing within the next few years: technological levels vary too much from one trial site to another because of the wide geographic representation and differences in economic status. Therefore, CRFs have to be designed, translated into various local languages, printed in multiple copies and distributed amongst the trial sites. As the CRF is part and parcel of the clinical trial protocol, it appears logical to integrate the form design tool into the clinical trial data management system. Most form design software products not only create a graphical environment for CRF development but also provide means for direct data entry through electronic forms. This is GCP compliant (if the data is protected by electronic signature) and certainly the least expensive solution, but it needs a reliable Internet connection and qualified technical support staff at the trial site.
The paper CRFs, which account for the entire current data flow, will be processed by OCR (optical character recognition) software. Preliminary testing showed that the OCR engines are capable of reading form images and capturing data from both machine-printed and hand-written forms, including text, check marks, bar codes, labels and other fields, with a very high degree of precision. The advantages of automatic data capture by OCR are numerous. OCR not only replaces manual data entry, the bottleneck of any data collection application, with fast, reliable and cost-effective procedures; it also provides the means to build electronic archives of the form images and link them with the extracted data, thus ensuring an effective mechanism for the audit trail required by GCP.

Therefore, the solution adopted for CTDMS 2000 reinforces the SAS System with two OLE automation compatible software tools (see Fig. 1). CRF design and the related electronic data entry are being implemented with JetForm (JF). Paper CRF processing and automatic data capture from the forms are handled by Eyes and Hands for Forms (EHF), the OCR software from ReadSoft. Both packages are Windows-compatible applications and allow for full integration with the SAS System under the Windows NT platform. The core of the data management procedures (maintenance and update of the clinical trial databases, validation of transaction records, generation of data quality reports) is being implemented with SAS Base software using its macro and formatting facilities.

[Figure: block diagram. Labels: Trial Site, Trial WEB Site, CRF Processing with EHF, CRF Processing with JF, Transaction database (two), OLE Automation Controller (two), Quality Reporting, Data Query, Database Update, Data Validation, Data View, Data Dictionary, Main Database, Validation Dictionary, CRF Definition with EHF, CRF Design with JF.]

Fig. 1. CTDMS 2000 Overview

Provision is being made to fully utilize the new features of the SAS System that are now available in Version 8.

Form Design and Definition

CTDMS 2000 exposes the full functionality of the JetForm (Design & Filler) graphical development tools for form design and data entry. The JF Design module has been used for RHR clinical trials for many years, initially as a mainframe application and then under Windows. A fragment of a CRF designed with this tool is shown in Fig. 2. JF has proved to be an efficient, flexible and powerful graphical design tool, with many more attractive features in the Windows environment, such as WEB-based data entry, a forms dictionary and ODBC support.

The outcome of the form design system is the CRF in an electronic format. Being an integral part of the clinical trial protocol, the CRFs are distributed amongst the trial sites in hard copy and, if necessary, electronically.

Before a form can be processed by the OCR engine for automatic data capture, it has to be defined with EHF. The definition procedure works with the form image (TIF format) produced by scanning the paper copy. This is an advantage of the EHF software: it does not depend on any particular design tool. EHF deals with forms of any type and origin, even those drawn by hand, provided that the form design complies with a few minor requirements imposed by the software.

Data Dictionary

The form design and definition process results in a primary data dictionary being created during these phases. Whether the dictionary is JF-specific or EHF-specific, it includes such key attributes as variable name, type and format. These dictionaries are exported and then translated into the SAS System format through Visual Basic programming. At this stage the primary dictionary is extended with a few control variables that identify the transaction batch, the user, and the date and time of entry and validation, and with a few data attributes that supply SAS-compatible variable labels and print formats.
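The extension of a primary dictionary described above can be illustrated with a small sketch. It is written in Python rather than Visual Basic or SAS purely for illustration, and it is not the CTDMS 2000 implementation. The control variable names (BATCH, USERID, ENTRY_DT, VALID_DT), the formats and the `extend_dictionary` helper are all assumptions; the paper states only what the control variables identify.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Variable:
    name: str              # variable name taken from the form tool
    vtype: str             # "num" or "char"
    fmt: str               # format taken from the form tool
    label: str = ""        # SAS-style variable label, supplied at translation
    control: bool = False  # True for bookkeeping variables added by the system

# Hypothetical control variables: the paper gives their purpose, not their names.
CONTROL_VARIABLES = [
    Variable("BATCH", "num", "8.", "Transaction batch", control=True),
    Variable("USERID", "char", "$8.", "User", control=True),
    Variable("ENTRY_DT", "num", "DATETIME16.", "Date/time of entry", control=True),
    Variable("VALID_DT", "num", "DATETIME16.", "Date/time of validation", control=True),
]

def extend_dictionary(primary, labels):
    """Return the primary dictionary with labels filled in and control variables appended."""
    clash = {v.name for v in primary} & {c.name for c in CONTROL_VARIABLES}
    if clash:
        raise ValueError(f"form variables clash with control variables: {sorted(clash)}")
    # Fall back to the variable name when no label has been supplied.
    labelled = [replace(v, label=labels.get(v.name, v.name)) for v in primary]
    return labelled + CONTROL_VARIABLES

# Illustrative primary dictionary; the formats shown are assumed.
primary = [Variable("ESQ04", "num", "1."), Variable("ESQ13", "num", "1.")]
extended = extend_dictionary(
    primary, {"ESQ04": "Pregnancy test", "ESQ13": "Fetal heart beats"}
)
```

Keeping the control variables in one list means every transaction database generated from a dictionary carries the same bookkeeping columns.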
The data dictionary is stored within CTDMS 2000 as a SAS data set. This allows the related SAS transaction databases to be generated automatically, which reduces the overall application set-up time. Most of the data dictionary components may be modified if and when necessary. When both JF and EHF primary dictionaries are created, the data dictionary module checks whether they are consistent with each other and makes sure that both are translated into the same data dictionary.

Transaction data dictionaries serve as a source for the design of all other databases, such as the main database and the analysis database. This is done merely by selecting the required variables from the different dictionaries and writing the SAS code to compile the data.
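As a rough illustration of the consistency check between the JF and EHF primary dictionaries, and of deriving a transaction database layout from a dictionary, consider the following Python sketch. The dictionary representation (a mapping from variable name to type and format) and both function names are assumptions made for this example; the paper does not describe how the module performs the comparison.

```python
def compare_dictionaries(jf, ehf):
    """Compare two primary dictionaries given as {name: (type, format)}.

    Returns a list of discrepancies; an empty list means both tools
    describe the same variables in the same way.
    """
    problems = []
    for name in sorted(set(jf) | set(ehf)):
        if name not in jf:
            problems.append(f"{name}: missing from JF dictionary")
        elif name not in ehf:
            problems.append(f"{name}: missing from EHF dictionary")
        elif jf[name] != ehf[name]:
            problems.append(f"{name}: JF has {jf[name]}, EHF has {ehf[name]}")
    return problems

def transaction_layout(dictionary):
    """Derive the column list of a transaction table from a dictionary."""
    return [(name, vtype, fmt) for name, (vtype, fmt) in sorted(dictionary.items())]

# Illustrative dictionaries; the types and formats are assumed.
jf = {"ESQ04": ("num", "1."), "ESQ13": ("num", "1.")}
ehf = {"ESQ04": ("num", "1."), "ESQ13": ("char", "$1.")}
issues = compare_dictionaries(jf, ehf)  # one discrepancy, for ESQ13
layout = transaction_layout(jf)
```

A transaction database would only be generated once `compare_dictionaries` returns an empty list, so that both data entry modes write records with the same structure.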

Multi-Facet Data Entry

CTDMS 2000 provides two data entry modes: paper CRFs processed centrally by the OCR engine with automatic data capture (EHF tool), and remote WEB-based entry through the electronic CRF format (JF tool). In the case of OCR reading, data manager intervention is required only at the data verification stage. By comparing the CRF image on the computer screen with the extracted information (see Fig. 2), the data manager decides immediately whether to accept or correct the data or to generate a data query to the investigator at the trial site.

Fig. 2. On-screen data verification after OCR reading by EHF

Whatever the input mode of a given CRF, CTDMS 2000 generates a unique form-specific SAS transaction database. The transaction database is filled directly by EHF or JF through the SAS ODBC driver. As a back-up option, a flat ASCII transaction file can also be exported from either data entry mode.

Database Update

The transaction database has a very short life span. It is used as a bridge between the data entry module and the main database. Once the transaction database is completed (and, optionally, validated and corrected), the main database is automatically updated. Database update in the SAS System is straightforward Data step processing that requires just a few basic SAS statements. It becomes slightly more complicated because CTDMS 2000 must control the procedure: preventing an update with transaction records that already exist in the main database, determining the number of input and output records, and reporting a summary of the update processing to the user for logging.

Data Quality Control

A great deal of effort needs to be put into quality control when managing clinical trial data. Ideally, computer records should mirror the original data from the CRFs 100%. Quite often, standard operating procedures applied by the pharmaceutical industry require complete checks of a certain percentage of electronic records against the CRFs. Collected data should be checked for completeness, consistency and correctness. Any detected error should be annotated and, if necessary, referred to the investigator in charge as a data query. When resolved, the data query should be translated into a formal data correction and applied to the trial database. Depending on the complexity of the trial and the amount of information to be collected, data quality checks may be very numerous. The more thorough the data quality control, the more reliable the data will be.

Two types of quality checks are established in CTDMS 2000. Both require SAS Base features only. A range check (RChk) addresses a single item of information (i.e. a single SAS variable) and verifies whether the value falls within the specified range (continuous variable) or belongs to the specified list (discrete variable). A cross check (XChk) involves a number of information items (i.e. a number of SAS variables) that may belong to different databases. Normally, an XChk is defined through the SAS language as a logical condition to be satisfied.

Validation Dictionary

Every variable in the SAS transaction database may be linked by CTDMS 2000 with a range check. The validation dictionary module takes the check definition as a valid range or as a valid list of values that the variable may be assigned. The module then transforms this definition into a SAS user-written format. The RChk format is applied at the database validation step only. When executed by the function put(variable,rchk-format), any value of the given variable is translated into one of the RChk-format keywords. The main one is the standard keyword that stands for the valid case. The others signal a range error. When defining an RChk, the user may assign specific keywords to the different range errors depending on their severity. The collection of validation formats defined for the whole database constitutes the RChk part of the validation dictionary. Physically, the range checks in source form are maintained as a SAS data set that may be updated as necessary. After compilation they are stored in a SAS catalogue.

A cross check (XChk) definition consists of five components: the list of related SAS variables, the SAS code that sets the logical condition the variables should meet, the descriptive text and the short message associated with the XChk, and the clause controlling when the check should be activated (always, if and only if all the relevant variables are valid, not active, etc.). The collection of cross checks defined for the whole database constitutes the XChk part of the validation dictionary. The validation dictionary module verifies all the XChk components provided by the user and stores them as a SAS data set that may be updated as necessary. The XChk descriptive text may include a macro variable that will be assigned a value after execution of the XChk code by the SAS System. By definition, cross checks may be either limited to the transaction database or applicable to the main database. The XChk clause provides a flexible way to temporarily switch off those global cross checks that are not applicable to a particular transaction database.

Validation Procedures

The burden of the final data validation and quality control remains with SAS Base. Though simple range checks can be set up in both data entry modes, thorough final validation of the transaction and main databases is essential to justify corrections to the source data as well as for documentation purposes.
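The range check mechanism described under Validation Dictionary, in which a user-written format translates each value into a keyword, can be mimicked outside SAS to make the idea concrete. The Python sketch below is only an analogy for put(variable,rchk-format). The keyword names (OK, WARNING, ERROR, MISSING), the `make_rchk` helper and the example ranges and codes are assumptions chosen for illustration, not values from the paper.

```python
VALID = "OK"  # hypothetical name for the standard keyword that marks a valid value

def make_rchk(ranges=(), values=(), other="ERROR", missing="MISSING"):
    """Build a callable that plays the role of put(variable, rchk-format).

    ranges: ordered (low, high, keyword) triples; the first match wins.
    values: mapping of discrete codes to keywords.
    other:  keyword returned when nothing matches (a range error).
    """
    ranges = tuple(ranges)
    values = dict(values)

    def rchk(value):
        if value is None:
            return missing
        if value in values:
            return values[value]
        for low, high, keyword in ranges:
            if low <= value <= high:
                return keyword
        return other

    return rchk

# Continuous variable: an inner band is valid, a wider band is a milder error.
age = make_rchk(ranges=[(18, 45, VALID), (15, 50, "WARNING")])
# Discrete variable: only the listed codes are valid.
yes_no = make_rchk(values={1: VALID, 2: VALID})

keywords = [age(30), age(16), age(60), age(None), yes_no(9)]
```

Because each check returns a keyword rather than a plain true or false, different range errors can carry different severities, which is the property the paper attributes to the RChk formats.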
The SAS formatting features and macro facilities turn out to be extremely powerful and flexible for data quality checks and reporting. The validation processing in CTDMS 2000 is a 100% SAS Data step application. It employs Data step and macro functions together with a few procedures available in SAS Base. The executable code for range and cross checks is defined as SAS macros. When initialised, they generate a series of global macro variables and statements that make the SAS Base system apply the validation dictionary. In effect, the validation dictionary components are copied into the macro variables. Most of the processing during the validation step takes place in the macro environment. It results in the validation report, a SAS data set that brings together all the violations of RChk and XChk rules that occurred for every database record.
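To show how range checks, cross checks and the activation clause might combine into a single validation report, here is a minimal Python sketch. It is an analogy only: CTDMS 2000 does this with SAS macros and Data steps, whose code the paper does not show. The record layout, the `when` values ("always", "valid", "off"), the field names dose1, dose2 and heart, and the valid codes for heart are assumptions. The 10-18 hour rule and its message follow the sample check xcheck 205 in the paper's validation report.

```python
from datetime import datetime
from string import Template

def validate(records, range_checks, cross_checks):
    """Apply range and cross checks to every record and collect the violations."""
    report = []
    for rec in records:
        # Range checks return a keyword; anything other than "OK" is a violation.
        status = {var: chk(rec.get(var)) for var, chk in range_checks.items()}
        for var, keyword in status.items():
            if keyword != "OK":
                report.append({"case": rec["case"], "check": var,
                               "value": rec.get(var), "message": keyword})
        for xc in cross_checks:
            if xc["when"] == "off":
                continue  # check switched off for this database
            if xc["when"] == "valid" and any(
                    status.get(v, "OK") != "OK" for v in xc["vars"]):
                continue  # run only when all related variables are valid
            ok, detail = xc["test"](rec)
            if not ok:
                # Fill placeholders in the descriptive text, much as a macro
                # variable is resolved after the check code has run.
                text = Template(xc["text"]).safe_substitute(detail)
                report.append({"case": rec["case"], "check": xc["id"],
                               "value": None, "message": text})
    return report

def dose_delay(rec):
    hours = (rec["dose2"] - rec["dose1"]).total_seconds() / 3600
    return 10 <= hours <= 18, {"delay": f"{hours:g}"}

cross_checks = [{
    "id": "xcheck 205", "vars": ["dose1", "dose2"], "when": "valid",
    "test": dose_delay,
    "text": "Second dose should be taken 10-18 hours after the first one. "
            "Actual delay is $delay hours",
}]
range_checks = {"heart": lambda v: "OK" if v in (1, 2) else "Invalid value"}
records = [{"case": "0110-Z", "heart": 1,
            "dose1": datetime(2000, 1, 13, 12, 40),   # illustrative dates
            "dose2": datetime(2000, 1, 13, 21, 40)}]
report = validate(records, range_checks, cross_checks)
```

Each row of `report` corresponds to one violation, so the result has the same shape as the validation data set described above: one entry per failed check per record.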

Validation Reporting

The goal of clinical trial data validation is to identify incomplete, inconsistent and incorrect data and to report it to the data manager and investigators for prompt corrective action. The validation report should be as detailed as possible. Any relevant information from the data and validation dictionaries, as well as from the database, may help in taking quick and appropriate decisions about corrections. The ODS formatting and graphical options in the Nashville Release make it possible to produce very attractive validation reports (see Fig. 3). More importantly, the report can be directed through ODS to the HTML destination. When the CTDMS 2000 WEB site is set up (future plans), the validation report will be immediately available world-wide for reference by all the investigators involved in the trial.

RHR Project 97902 (PH)
Mifepristone and Two Regimens of Levonorgestrel in Emergency Contraception
Validation Report for: Centre 16 - Ulaanbaatar
Date: 18 Apr 2000 16:31

Columns: Case No | Check Id | Reference/Label | Value | Message

Case No 0028-P
  Check Ids and labels: ESQ08D Date of ultrasound exam (DD); ESQ08M Date of ultrasound exam (MM); ESQ08Y Date of ultrasound exam (YYYY); ESQ13 Fetal heart beats
  Value: 9
  Message: Invalid value

  Check Id: xcheck 129 (Pregnancy)
  Message: Suspected pregnancy. Please verify the data and check pregnancy report for completeness
  References and labels: ESQ04 Pregnancy test; ESQ08 Date of ultrasound exam; ESQ09 Duration of pregnancy; ESQ10 Date of conception; ESQ11A Amniotic sac (A); ESQ11B Amniotic sac (B); ESQ12 Crown-rump length; ESQ13 Fetal heart beats; ESQ14 Carry pregnancy to term
  Values (in report order): 2. 35 04AUG18 9 1

Case No 0110-Z
  Check Id: xcheck 205 (Protocol violation)
  Message: Second dose should be taken 10-18 hours after the first one. Actual delay is 9 hours
  AMQ23A  Date 1-st dose taken  13JAN19
  AMQ23B  Time 1-st dose taken  12:40
  F1Q02A  Date 2-nd dose taken  13JAN19
  F1Q02B  Time 2-nd dose taken  21:40

Fig. 3. Validation report in HTML format

Data Query Resolution

The validation data set produced by CTDMS 2000 contains all the information items needed to generate query resolution electronic forms automatically (within Visual Basic). After a data query is answered by the investigator or resolved by the data manager, the correct data values are entered through these forms. From these entries, the query resolution module creates SAS data correction statements of the form: if Rec_id then variable = correct_value. These statements apply the modifications to the database. It would be natural to integrate data query entry boxes directly into the HTML-based validation report, next to the Value column (see Fig. 3). If this proves feasible, data query resolution could be done directly by the investigators via the Internet.

Further Developments

Some components of CTDMS 2000 have already been programmed and tested. The validation processing is being used for the data management of on-going RHR clinical trials. The Nashville Release (distributed in Switzerland from April 2000) brought more power, substantially improving the SAS System and its integration with other OLE automation compliant Windows applications. Most of the effort now being made is intended to incorporate the new enhancements of the SAS System into the existing CTDMS 2000 version and to explore the new features available in the Output Delivery System.
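Returning to the Data Query Resolution step, the way resolved queries turn into correction statements can be sketched as follows. This Python example is an illustration under stated assumptions, not the module's actual code. The key variable name REC_ID, the statement layout produced by `correction_statement`, the audit log and the corrected value are all hypothetical; the paper gives only the schematic form "if Rec_id then variable = correct_value".

```python
def correction_statement(rec_id, variable, value, key="REC_ID"):
    """Render one correction in the schematic form quoted in the paper."""
    def literal(x):
        # Quote character values; write numbers as they are.
        return f"'{x}'" if isinstance(x, str) else str(x)
    return f"if {key} = {literal(rec_id)} then {variable} = {literal(value)};"

def apply_corrections(database, corrections):
    """Apply (rec_id, variable, value) corrections and return an audit log.

    database maps a record id to a dict of variable values. Each log entry
    keeps the old and new value so that every change stays traceable.
    """
    log = []
    for rec_id, variable, value in corrections:
        record = database[rec_id]
        if variable not in record:
            raise KeyError(f"{variable} is not a variable of record {rec_id}")
        log.append((rec_id, variable, record[variable], value))
        record[variable] = value
    return log

# Illustrative data: the corrected value 1 is made up for this example.
database = {"0028-P": {"ESQ13": 9}}
corrections = [("0028-P", "ESQ13", 1)]
statements = [correction_statement(*c) for c in corrections]
log = apply_corrections(database, corrections)
```

Generating the statements from form entries, rather than typing them by hand, keeps each correction tied to the query that produced it.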