Paper PO12 Pharmaceutical Programming: From CRFs to Tables, Listings and Graphs, a process overview with real world examples




Mark Penniston, Omnicare Clinical Research, King of Prussia, PA
Shia Thomas, Omnicare Clinical Research, King of Prussia, PA

ABSTRACT

SAS is the de facto standard programming language for statistical analysis in the pharmaceutical industry. Its main use is generating tables, listings and graphs according to the rules and instructions in the statistical analysis plan, from data stored in SAS datasets that are usually derived from a clinical data management database. The data are collected on a case report form (CRF) or in an electronic data capture (EDC) system, processed through a database so that queries can be resolved against the source documents at the site, and then sent to the statisticians and SAS programmers for analysis.

INTRODUCTION

The purpose of this paper is to provide an overview of the table and listing generation process as it applies in the SAS pharmaceutical programming arena. It is not presented as the only method for table generation. It is an attempt to show the fundamental data flow, from data capture to presentation, and the ways SAS is used in that presentation.

Clinical trials have many documents, two of which are:

- A protocol, which describes the purpose of the clinical trial. It presents a hypothesis about the action of a particular drug, biologic agent or device and describes a test of that hypothesis.
- A Case Report Form (CRF), which is a series of forms completed at the location of the clinical trial (typically an investigator's site) to record information for a particular person in the trial.

For the purpose of this paper, assume the protocol describes a randomized trial in which patients are enrolled in equal numbers into either a compound called Treatment X or Placebo (a sugar pill), and that the hypothesis to be tested is that patients can be enrolled into this trial equally.
Figure 1 presents one page of a CRF. The page collects demographic data, also called patient characteristic data. Information like this is collected for every person enrolled in a clinical trial to determine the homogeneity of the patient (subject) population. A person at the investigational site completes this page of the CRF. The data are then entered into a database to create an electronic version of the paper information.

A Statistical Analysis Plan (SAP) is a document describing the planned analysis that will be performed on the electronic CRF data. The following is some sample SAP text:

    The purpose of this study is to compare study drug X with placebo in demographic information for baseline testing. Subjects will be enrolled in a 1:1 ratio in this 2 arm open-label trial to see what baseline effects, if any, occur. Descriptive statistics will be presented for all parameters collected with no inferential analysis being performed. Statistics for continuous parameters (age) will be presented by N, mean, median, minimum and maximum values. Age will be calculated from the difference of the study randomization date and the date of birth. Categorical parameters (gender, ethnicity) will have groupings presented as counts. All information collected will be listed.

As the SAP text is written, it is very common for the statistician to create mock data displays: tables and listings that demonstrate how the analysis described in the SAP will be presented. The mocks describe the layout of the data in the listings and the statistics presented in the tables. Figures 2 and 3 show a mock table and a mock listing based on the CRF data to be collected and the sample SAP text above. In pharmaceutical SAS programming, a listing supporting a table is almost always produced. One listing can support many tables.

Figure 2: Sample Mock Table

                      Mock Table 1
              (Intent-to-Treat Population)

                  Treatment X     Placebo         Total
Age[1] (yrs)
  n               n               n               n
  Mean            x.x             x.x             x.x
  Median          x.x             x.x             x.x
  Min, Max        x.x, x.x        x.x, x.x        x.x, x.x
Sex
  Male            n (%)           n (%)           n (%)
  Female          n (%)           n (%)           n (%)
Race
  African         n (%)           n (%)           n (%)
  Asian           n (%)           n (%)           n (%)
  Caucasian       n (%)           n (%)           n (%)
  Hispanic        n (%)           n (%)           n (%)
  Other           n (%)           n (%)           n (%)

Percentages are based on the total number of subjects in each treatment group.
[1] Based on date of collection.
Figure 3: Sample Listing Mock

                      Mock Listing 1
                 Intent-to-Treat Subjects

               Site/Subject   Date of     Age
Treatment      Number         Birth       (yrs)   Gender   Ethnic Origin
Treatment X    0001/0001      DDMMMYYYY   23      Female   Caucasian
Placebo        0002/0064      DDMMMYYYY   37      Male     Hispanic
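The age rule in the sample SAP text (age from the randomization date and the date of birth) could be programmed along the following lines. This is only a sketch: the variable randdt and the two randomization dates are hypothetical, since the sample dataset in this paper carries a precomputed age (dmaged) and no randomization date. The dataset name age_demo is also illustrative.

```sas
/* Sketch only: randdt (randomization date) is a hypothetical variable. */
/* The dates of birth are taken from the sample data; the randomization */
/* dates are made up for illustration.                                  */
data age_demo;
    input dmdob :$8. randdt :date9.;
    format dob_d randdt date9.;

    /* Convert the character date of birth (YYYYMMDD) to a SAS date. */
    dob_d = input(dmdob, yymmdd8.);

    /* Age in completed years: count the month boundaries crossed,   */
    /* subtract one month when the day of the month of the birthday  */
    /* has not yet been reached, then convert months to whole years. */
    age = floor((intck('month', dob_d, randdt)
                 - (day(randdt) < day(dob_d))) / 12);
    datalines;
19530723 30SEP2004
19810828 01AUG2004
;
run;

proc print data=age_demo noobs;
run;
```

With these illustrative dates the first subject is 51 and the second is 22, because the second subject's birthday (28 August) falls after the assumed randomization date. Storing the calculation in one place, rather than repeating it in each table and listing program, is the approach the derived dataset takes later in this paper.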

Now we have the protocol, CRF, SAP and the mocks. The next item to consider is the database into which the information captured on the CRF is placed. Using a data entry database package we can get our data into a SAS dataset. When we run a PROC CONTENTS on this data we find the following variables:

-----Alphabetic List of Variables and Attributes-----

 #  Variable  Type  Len  Pos  Label
----------------------------------------------------------
 6  dmaged    Num     3  151  Age (Calculated)
 4  dmdob     Char    8   19  Date of Birth
 5  dmdobd    Num     8    0
 8  dmeth     Char   10   33  Ethnicity
 9  dmethsp   Char  100   43  Ethnicity Specify
 7  dmgndr    Char    6   27  Gender
 3  dminit    Char    3   16  Initials
10  dtrt      Char    8  143  Treatment Group
 1  siteno    Char    4    8  Site Number
 2  subjid    Char    4   12  Subject Identifier

Viewing the dataset in the SAS viewer with labels turned off shows the following information:

siteno  subjid  dminit  dmdob     dmdobd  dmaged  dmgndr  dmeth      dmethsp          dtrt
0001    0009    MTW     19530723          51      Male    Caucasian                   placebo
0001    0025    SST     19810828          23      Female  Asian                       x
0003    0023    SIN     19590521          45      Male    Asian
0004    0047    NAP     19650312          39      Male    Caucasian                   x
0004    0057    QAA     19841218          20      Female  Other      Angloindian
0006    0008    TSC     19721001          32      Female  Asian                       x
0008    0040    ECN     19571109          47      Female  African                     placebo
0012    0003    SAV     19580527          46      Male    Other      American Indian  placebo
0021    0065    TTM     19480531          56      Male    Hispanic                    placebo
0033    0005    ADC     19100210          94      Female  Hispanic                    x

Many times the CRF will be annotated with the SAS variable names to aid programming. As a next step, the programmer can annotate the mock tables and listings with the SAS variables that will be used to present each part of the data. Mock annotation provides the following benefits:

- It tells other people which variables are being presented.
- It gives the programmer a tool to state which derived (calculated) variables will need to be presented.
- It records a plan of action before any SAS code is written.
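The dataset inspection described above (a PROC CONTENTS plus a look at the raw values) might be run as in the sketch below. It assumes the libref DATA is already assigned to the study data library and that DATA.TESTDEMO is the raw demographics dataset, which is the name read by the derived dataset program later in this paper. The PROC FREQ step is an addition for illustration, not something the paper shows.

```sas
/* Assumes the libref DATA is already assigned, for example with a     */
/* LIBNAME statement pointing at the study data library.               */

/* Variable names, types, lengths and labels. */
proc contents data=data.testdemo;
run;

/* Raw values with variable names as column headers (no LABEL option) */
/* and without observation numbers.                                   */
proc print data=data.testdemo noobs;
run;

/* Illustrative extra step: frequency of each coded value, including  */
/* missing, to spot blank or unexpected codes before annotating.      */
proc freq data=data.testdemo;
    tables dtrt dmgndr dmeth / missing;
run;
```

Running the frequency step on the sample data would show, for example, that two subjects have a blank treatment group, which matters when the intent-to-treat flag is derived.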

Figures 4 and 5 show the annotated mocks for the study. Annotations are shown in braces.

Figure 4: Annotated Mock Table

                      Mock Table 1
              (Intent-to-Treat Population)  {DERIVED.itt=1}
                        {DERIVED}

                            {DERIVED.trt_d}
                            Treatment X     Placebo         Total
Age[1] (yrs)  {dmaged}
  n                         n               n               n
  Mean                      x.x             x.x             x.x
  Median                    x.x             x.x             x.x
  Min, Max                  x.x, x.x        x.x, x.x        x.x, x.x
Sex  {sex_d}
  Male                      n (%)           n (%)           n (%)
  Female                    n (%)           n (%)           n (%)
Race  {ethn_d}
  African                   n (%)           n (%)           n (%)
  Asian                     n (%)           n (%)           n (%)
  Caucasian                 n (%)           n (%)           n (%)
  Hispanic                  n (%)           n (%)           n (%)
  Other                     n (%)           n (%)           n (%)

Percentages are based on the total number of subjects in each treatment group.
[1] Based on date of collection.

Figure 5: Annotated Mock Listing

                      Mock Listing 1
                 Intent-to-Treat Subjects  {DERIVED.itt=1}

               Site/Subject   Date of     Age
Treatment      Number         Birth       (yrs)      Gender     Ethnic Origin
{trt_d}        {sitesubj}     {dob_d}     {dmaged}   {dmgndr}   {dmeth}
Treatment X    0001/0001      DDMMMYYYY   23         Female     Caucasian
Placebo        0002/0064      DDMMMYYYY   37         Male       Hispanic

Collectively we now have the following:

- A protocol
- A CRF
- A database with data
- A Statistical Analysis Plan (SAP) with mocks
- Annotated mocks

With this information, programming can now begin. It is important to try to obtain (or create) as many of these documents as possible while programming. This gives the programmer all the information needed to generate the tables and listings correctly the first time.

The pharmaceutical industry is a regulated industry. As such, a programmer should always be able to describe the methodology and documentation behind any summarized information. One approach is for programmers to store their calculated fields in a dataset before generating tables and listings. These datasets are called derived (as in derived from raw) and allow others to see the calculations prior to their display in the output files (tables and listings). It is easier to store an age calculation once in a dataset than to duplicate it in every program that produces a table or listing. The following program creates a derived dataset called DERIVED.
*******************************************;
* Title:   Derived Dataset for Presentation
* Program: derived.sas
* Author:  Shia Thomas
* Date:    September 30, 2004
********************************************;

*Creating the derived dataset from the raw dataset.;
data data.derived;
    set data.testdemo;

    *Creating the intent to treat population.;
    if dtrt='x' or dtrt='placebo' then itt=1;
    else itt=0;

    *Creating a variable for concatenating site number and subject number.;
    length sitesubj $10;
    sitesubj = trim(left(siteno)) || '/' || trim(left(subjid));

    *Creating the derived variable for the treatments.;
    if dtrt='x' then trt_d=1;
    else if dtrt='placebo' then trt_d=2;
    else trt_d=.;

    *Creating the derived variable for sex. LOWCASE makes the comparison
     independent of case because the raw values are stored in mixed case.;
    if lowcase(dmgndr)='male' then sex_d=1;
    else if lowcase(dmgndr)='female' then sex_d=2;
    else sex_d=.;

    *Creating the intent to treat male population.;
    if sex_d=1 and itt=1 then mitt=1;
    else mitt=0;

    *Creating the derived variables for race.;
    if lowcase(dmeth)='african' then ethn_d=1;
    else if lowcase(dmeth)='asian' then ethn_d=2;
    else if lowcase(dmeth)='caucasian' then ethn_d=3;
    else if lowcase(dmeth)='hispanic' then ethn_d=4;
    else ethn_d=5;

    *Formatting the date variable.;
    format dob_d date9.;
    dob_d=input(dmdob, yymmdd8.);
run;

The PROC CONTENTS output and the SAS viewer display of the derived dataset, based on the mock annotations and the SAS program above, are as follows:

----Alphabetic List of Variables and Attributes-----

 #  Variable  Type  Len  Pos  Format  Label
---------------------------------------------------------------------------------
 6  dmaged    Num     3  209          Age (Calculated)
 4  dmdob     Char    8   67          Date of Birth
 5  dmdobd    Num     8    0
 8  dmeth     Char   10   81          Ethnicity
 9  dmethsp   Char  100   91          Ethnicity Specify
 7  dmgndr    Char    6   75          Gender
 3  dminit    Char    3   64          Initials
17  dob_d     Num     8   48  DATE9.  Date of Birth
10  dtrt      Char    8  191          Treatment Group
16  ethn_d    Num     8   40          Ethnicity
11  itt       Num     8    8          Intent to Treat Population
15  mitt      Num     8   32          Male Intent to Treat Population
14  sex_d     Num     8   24          Gender
 1  siteno    Char    4   56          Site Number
12  sitesubj  Char   10  199          Site and Subject Number
 2  subjid    Char    4   60          Subject Identifier
13  trt_d     Num     8   16          Treatment Group

siteno  subjid  dminit  dmdob     dmdobd  dmaged  dmgndr  dmeth      dmethsp          dtrt     itt  trt_d  sex_d  mitt  ethn_d  dob_d
0001    0009    MTW     19530723          51      Male    Caucasian                   placebo  1    2      1      1     3       7/23/1953
0001    0025    SST     19810828          23      Female  Asian                       x        1    1      2      0     2       8/28/1981
0003    0023    SIN     19590521          45      Male    Asian                                0           1      0     2       5/21/1959
0004    0047    NAP     19650312          39      Male    Caucasian                   x        1    1      1      1     3       3/12/1965
0004    0057    QAA     19841218          20      Female  Other      Angloindian               0           2      0     5       12/18/1984
0006    0008    TSC     19721001          32      Female  Asian                       x        1    1      2      0     2       10/1/1972
0008    0040    ECN     19571109          47      Female  African                     placebo  1    2      2      0     1       11/9/1957
0012    0003    SAV     19580527          46      Male    Other      American Indian  placebo  1    2      1      1     5       5/27/1958
0021    0065    TTM     19480531          56      Male    Hispanic                    placebo  1    2      1      1     4       5/31/1948
0033    0005    ADC     19100210          94      Female  Hispanic                    x        1    1      2      0     4       2/10/1910
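Because derived variables feed every table and listing, it can help to cross-check each one against the raw variable it came from before moving on. The sketch below is one common way to do that and is not taken from the original programs. It assumes the libref DATA is assigned and DATA.DERIVED exists as described above.

```sas
/* Cross-tabulate each raw variable against its derived counterpart.  */
/* LIST prints one line per combination; MISSING keeps blank raw      */
/* values and missing derived values visible as their own rows.       */
proc freq data=data.derived;
    tables dtrt   * trt_d
           dmgndr * sex_d
           dmeth  * ethn_d
           itt * sex_d * mitt / list missing;
run;
```

Each raw value should map to exactly one derived code. On the sample data, for instance, the two subjects with a blank dtrt should appear with a missing trt_d, and a row pairing a non-blank raw value with a missing derived code would point to a problem in the derivation.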

From the derived dataset one can now write code to produce the table and listing. The following shows the final output from these programs. The output can be created through many of SAS's procedures or through a DATA _NULL_ step.

Figure 6: Table Output as programmed in SAS

                        Table 1
              (Intent-to-Treat Population)

                  Treatment X     Placebo         Total
                  (N=4)           (N=4)           (N=8)
Age[1] (yrs)
  n               4               4               8
  Mean            47.0            50.0            48.5
  Median          35.5            49.0            46.5
  Min, Max        23, 94          46, 56          23, 94
Sex
  Male            1 (25%)         3 (75%)         4 (50%)
  Female          3 (75%)         1 (25%)         4 (50%)
Race
  African         -               1 (25%)         1 (12.5%)
  Asian           2 (50%)         -               2 (25.0%)
  Caucasian       1 (25%)         1 (25%)         2 (25.0%)
  Hispanic        1 (25%)         1 (25%)         2 (25.0%)
  Other           -               1 (25%)         1 (12.5%)

Percentages are based on the total number of subjects in each treatment group.
[1] Based on date of collection.

Figure 7: Listing Output as programmed in SAS

                        Listing 1
                 Intent-to-Treat Subjects

               Site/Subject   Date of     Age
Treatment      Number         Birth       (yrs)   Gender   Ethnic Origin
Treatment X    0001/0025      28AUG1981   23      Female   Asian
               0004/0047      12MAR1965   39      Male     Caucasian
               0006/0008      01OCT1972   32      Female   Asian
               0033/0005      10FEB1910   94      Female   Hispanic
Placebo        0001/0009      23JUL1953   51      Male     Caucasian
               0008/0040      09NOV1957   47      Female   African
               0012/0003      27MAY1958   46      Male     Other: American Indian
               0021/0065      31MAY1948   56      Male     Hispanic
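The paper does not show the table and listing programs themselves. The sketch below is one possible way to obtain the same statistics and listing rows with standard procedures. It is not the authors' code and does not reproduce the exact layout of Figures 6 and 7. It assumes the libref DATA is assigned and DATA.DERIVED exists as described above. The format names (trtf, sexf, ethnf), the work dataset itt_list and the 'Specify' column header are illustrative choices.

```sas
/* Decode the derived numeric codes; code values follow derived.sas. */
proc format;
    value trtf  1 = 'Treatment X' 2 = 'Placebo';
    value sexf  1 = 'Male'        2 = 'Female';
    value ethnf 1 = 'African' 2 = 'Asian' 3 = 'Caucasian'
                4 = 'Hispanic' 5 = 'Other';
run;

title1 'Table 1 (Intent-to-Treat Population)';

/* Age: n, mean, median, minimum and maximum by treatment group.     */
/* PRINTALLTYPES also prints the overall row used for Total.         */
proc means data=data.derived n mean median min max maxdec=1 printalltypes;
    where itt = 1;
    class trt_d;
    var dmaged;
    format trt_d trtf.;
run;

/* Sex and race: counts with column percentages, so each percentage  */
/* uses the treatment group total as its denominator.                */
proc freq data=data.derived;
    where itt = 1;
    tables (sex_d ethn_d) * trt_d / norow nopercent;
    format trt_d trtf. sex_d sexf. ethn_d ethnf.;
run;

/* Listing: intent-to-treat subjects ordered by treatment and subject. */
proc sort data=data.derived out=itt_list;
    where itt = 1;
    by trt_d sitesubj;
run;

title1 'Listing 1';
title2 'Intent-to-Treat Subjects';

/* SPLIT='|' is set so the slash in the header prints literally       */
/* (the default split character in PROC REPORT is the slash).         */
proc report data=itt_list nowd split='|';
    column trt_d sitesubj dob_d dmaged dmgndr dmeth dmethsp;
    define trt_d    / order order=internal format=trtf. 'Treatment';
    define sitesubj / order 'Site/Subject|Number';
    define dob_d    / display format=date9. 'Date of|Birth';
    define dmaged   / display 'Age|(yrs)';
    define dmgndr   / display 'Gender';
    define dmeth    / display 'Ethnic Origin';
    define dmethsp  / display 'Specify';
run;

title;
```

A production table program would typically capture these statistics in output datasets and arrange them into the mock layout with PROC REPORT or a DATA _NULL_ step. Combining dmeth and dmethsp into a single "Other: ..." value, as Figure 7 does, would be another derived variable to add to the annotated mock.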

CONCLUSION

SAS programming of tables and listings in the pharmaceutical industry is a stepwise process, always dependent on previous documents and descriptions of what is to be produced. Many companies have different processes and documents in addition to those described in this paper. It is important to understand the processes that are specific to a given company. In general, the flow of rules and data can be described as in Figure 8, with each step dependent on the previous one. When the steps are not followed, there is the potential for mistakes.

Figure 8: Flow of rules and data

    Protocol and CRF
    SAP/Mocks
    Database
    Annotated CRF
    Annotated Mocks
    Derived Datasets
    Programming Rules
    Tables and Listings

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the authors at:

Mark Penniston
Shia Thomas
Omnicare Clinical Research
630 Allendale Road
King of Prussia, PA 19406
Work Phone: 484 679 2436
Fax: 484 679 2509
Email: mark.penniston@omnicarecr.com
       shia.thomas@omnicarecr.com
Web: www.omnicarecr.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.