Louise Hadden, Abt Associates Inc., Cambridge, MA
PROC SURVEYSELECT: A Simply Serpentine Solution for Complex Sample Designs

Louise Hadden, Abt Associates Inc., Cambridge, MA

ABSTRACT

SAS programmers are frequently called upon to draw a statistically defensible sample for surveys. Many of us have become adept at various data step sampling techniques over the years. However, SAS's relatively new suite of SURVEY procedures has made our lives much easier. Stratification? No problem. Systematic random sampling? No problem. Proportional sampling? No problem. All in the same design? Yes! This paper demonstrates the use of PROC SURVEYSELECT to draw a valid, stratified, random, proportional sample. The examples presented were run on a mainframe computer (OS/390) running SAS V8.2, and on a PC (WIN2K PRO) using SAS.

INTRODUCTION

I began using PROC SURVEYSELECT when an analyst requested a sample that seemed impossible to achieve no matter how many gymnastic routines were performed in the data step. The data resided on a mainframe computer running SAS V8.2 on OS/390, and there were approximately 5 million records on the file. Five million records is a mere pittance in these days of data warehousing, but the processing time in dealing with the file was non-trivial.

The particular task was to draw a sample of Medicare Drug Card Beneficiaries. Twenty-seven different drug cards were pre-selected, and then stratified by a subsidy indicator (general cardholders vs. those receiving transitional assistance). Each of the 54 strata (card by subsidy) was to have 600 potential respondents randomly selected from the universe. While 54 strata are not particularly easy to sample within a data step, it is not impossible either, and that was my initial approach. But then the analyst said, "Oh, and can you make the sample within each stratum proportional to the actual proportion of aged and disabled within the stratum?"
This increased the number of strata to 108, and required the calculation of a share for each stratum in a separate data set that had to be merged onto the master file. It was still possible to do within a data step, but not pretty with a file of 5 million records that was not sorted or indexed by the stratum id. The straw that broke the camel's back came when the analyst said, "And can you make SURE that we have complete representation in the sample of all possible values of just a few more variables?" This would have increased the number of strata exponentially. Try as I might, I couldn't think of a way to sort the data set efficiently to achieve this particular end. I researched serpentine sorts, and found that PROC SURVEYSELECT uses serpentine sorts as part of the selection process if the programmer specifies control variable(s). This made PROC SURVEYSELECT the ideal choice for my sampling problem.

SNAKING YOUR WAY THROUGH SAMPLE SELECTION

PROC SURVEYSELECT DATA= METHOD= SEED= SAMPSIZE= OUT= OUTSORT=

The PROC SURVEYSELECT statement itself accomplishes many of the goals I set out to achieve. Aside from the customary DATA= option, there are a number of other important items. The first of these is METHOD=. PROC SURVEYSELECT allows you to perform simple random sampling, unrestricted random sampling (with replacement), systematic random sampling, sequential random sampling, and a number of methods for PPS (probability proportional to size) sampling. Just reading about some of the PPS methods made my head ache, so I opted for the systematic random sampling method, or METHOD=SYS.

    PROC SURVEYSELECT DATA=TEMP METHOD=SYS

Some of the procedure statement options are dependent on the sampling method. I will only discuss the options I used for my systematic random sample, but invite you to delve further into the mysteries of PROC SURVEYSELECT by reading Chapter 72 of the online SAS 9.1 documentation! PROC SURVEYSELECT makes drawing a stratified sample a piece of cake, for example. Since I needed to kludge a couple of different sampling methods together, I couldn't take advantage of that particular sampling method.

Since I wanted to be able to exactly replicate my sample if I had to rerun it, I used the SEED= option. This allows you to specify the initial seed for random number generation. Rerunning the same program on the same (unsorted) data will replicate your sample if you use a seed.

    SEED=

The SAMPSIZE= option allowed me to feed in the desired N to select from each stratum. PROC SURVEYSELECT expects that you will have at least the number requested for each stratum within your sampling frame, AND it expects that your strata identifier is sequential when feeding your desired Ns in this way (i.e., you must sort your frame by the strata identifier prior to sampling). You can also give a single value (SAMPSIZE=n, or its alias N=n) to request the same N from every stratum, use SAMPRATE= to specify a sampling rate, or feed in a file containing the stratum variable (sorted sequentially, of course!) and the desired Ns, to name a few other options. For the purposes of illustration, I am showing the SAMPSIZE=( ) option in all its glory. In a later paragraph I will show how I derived these numbers, which I would have had to do for data step sampling as well.

    SAMPSIZE=
    ( )

PROC SURVEYSELECT allows you to specify your OUT and OUTSORT data sets.
The OUT data set will contain your sample information, stratum variable, ID variable(s), and CONTROL variable(s). If you specify CONTROL variables, the input data set is sorted by them with whatever sort method is in effect (SERPENTINE, NESTED, etc.), and by default the sorted version replaces the input data set. This may not be desirable for very large files that you may want to remerge with later, so if you want to keep the original order of the input data set, specify OUTSORT= to hold the (control) sorted data set.

    OUT=&OUT1;

STRATA

The STRATA statement allows you to specify your stratifying variable. In my case, I constructed a variable which was a combination of a drug card identifier (27), an aged/disabled dichotomous variable, and a transitional assistance/general dichotomous variable. I had 108 strata in all. It's much easier to use existing variables as strata variables, but I needed to mix proportional (aged/disabled) and non-proportional (600 each from transitional assistance and general) sampling. In my case, the input file had to be sorted by strata in ascending order because of the particular sampling method I was using.

    STRATA STRATUM;

CONTROL

The CONTROL statement is where you specify additional variables (other than strata) to sort by when performing sampling. The default sort is hierarchical serpentine sorting. This was key for me, as my project director wanted to ensure adequate representation of different age cohorts, genders and races in the sample. You can also specify SORT=NEST on the PROC SURVEYSELECT statement if you do not wish to use the default serpentine sort.

    CONTROL AGE_COHORT GENDER RACE;

ID

The ID statement allows you to specify variables from the input file or sampling frame to carry into the output file. By default, ALL variables in the input file are carried to the output file. At the very least, an identifier that allows you to merge back to the sampling frame is a good idea. Any strata or control variables are included automatically, as are the selection probabilities, sampling weights and similar variables created by the procedure.

    ID ABTID STRATUM AGED RACE GENDER TRANS05 AGE_COHORT
       ETHNCTY SUB5DR05;

    NOTE: THE DATA SET OUTSAMP.SURVEY27 HAS OBSERVATIONS AND 11 VARIABLES.
    NOTE: THE PROCEDURE SURVEYSELECT PRINTED PAGE 1.
    NOTE: THE PROCEDURE SURVEYSELECT USED CPU SECONDS AND 5418K.

As you can see below, PROC SURVEYSELECT provides you with the relevant information on your sampling routine in a convenient one-page format. In addition, it is a good idea to print a few records of your output file and take a look at the created variables such as SAMPLINGWEIGHT and SELECTIONPROB.

    DRUGCARD: OUTPUT NATIONAL SAMPLE ROUND 2          14:27 MONDAY, FEBRUARY 28,
    BENEFICIARY EXTRACT FILE

    THE SURVEYSELECT PROCEDURE

    SELECTION METHOD      SYSTEMATIC RANDOM SAMPLING
    STRATA VARIABLE       STRATUM
    CONTROL VARIABLES     AGE_COHORT GENDER RACE
    CONTROL SORTING       SERPENTINE

    INPUT DATA SET        TEMP
    RANDOM NUMBER SEED
    NUMBER OF STRATA      108
    TOTAL SAMPLE SIZE
    OUTPUT DATA SET       SURVEY27

    TE00.#EMPDDC.LIB.DCARDLIB(EEVS63) -- 28FEB05
    DRUGCARD: OUTPUT NATIONAL SAMPLE ROUND 2          14:27 MONDAY, FEBRUARY 28,
    BENEFICIARY EXTRACT FILE

                                          Selection    Sampling
    OBS    STRATUM    ABTID    SUB5DR05        Prob      Weight
    (rows for the first sample records follow in the original listing)
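
Pulling the statements discussed in this section together, a self-contained sketch might look like the following. It is only an illustration: the frame is synthetic, and the data set names (FRAME, SAMPLE, FRAME_SORTED), the seed and the four stratum sample sizes are assumptions made for the example, not the values used for the drug card sample.

```sas
/* Synthetic sampling frame (hypothetical): 2000 records in 4 strata */
data frame;
  call streaminit(2005);
  do abtid = 1 to 2000;
    stratum    = 1 + mod(abtid, 4);            /* strata 1-4, 500 records each */
    age_cohort = ceil(rand('uniform') * 3);
    gender     = ceil(rand('uniform') * 2);
    race       = ceil(rand('uniform') * 4);
    output;
  end;
run;

/* The frame must be in ascending stratum order before sampling */
proc sort data=frame;
  by stratum;
run;

/* Systematic sample; SAMPSIZE= lists one N per stratum, in stratum order. */
/* OUTSORT= keeps the control-sorted copy separate from the input data set. */
proc surveyselect data=frame method=sys seed=20050228
                  sampsize=(30 20 25 25)
                  out=sample outsort=frame_sorted;
  strata stratum;
  control age_cohort gender race;              /* serpentine sort by default */
  id abtid;
run;

/* Look at a few sampled records and the variables the procedure created */
proc print data=sample(obs=10);
run;
```

Because the seed is fixed and the frame is rebuilt the same way each time, rerunning this sketch should reproduce the same selections.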
NS TO GET

In this case, creating an input file (from which to create the list used in the SAMPSIZE= option above) was fairly complex. For a simple proportional sample using data step sampling, you can simply use PROC FREQ on your stratum variable(s), output the percents, divide the percents by 100, and apply them to the total desired number after sorting by your stratum variable(s) and a random number. (OR, it's even easier using one of PROC SURVEYSELECT's proportional sampling methods!)

My project officers wanted to select 600 cases from each drug card and transitional assistance / general combination (27 card ids by the dichotomous variable for TA / general = 54 strata). Then they wanted the 600 cases within each stratum to proportionally represent the numbers of aged versus disabled enrollees. I wrote a macro (iterated 54 times) which performed a frequency on the aged / disabled dichotomous variable for each stratum, outputting the percents, dividing by 100, and multiplying by 600 to get the Ns to sample for each stratum (now 108). Then I set the 108 lines together sequentially and created a stratum variable using _n_. Naturally it is important that this stratum variable match the one in your sampling frame! You can use the file created this way as an input file to PROC SURVEYSELECT, or create a macro list from it.

NOTE: I was lucky that the sampling frame was large enough that I did not have difficulty achieving exactly 600 per stratum. This won't always be the case, either in PROC SURVEYSELECT or with data step sampling. It is important to carefully review your output samples!

MORE REAL LIFE EXAMPLES

Although I began using PROC SURVEYSELECT to process a very large file on the mainframe, I found it so easy to use and so versatile that I began to use it for other applications. Three additional samples are presented below. The first is a sample replacement draw for the original use (very large file on the mainframe).
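
As an aside on the NS TO GET step described above, the per-stratum counts could be derived along the following lines. This is a sketch under stated assumptions, not the macro used for the drug card sample: it uses a BY group in PROC FREQ instead of 54 macro iterations, the frame is synthetic with only three card-by-subsidy cells, and the names CARD_TA, AGED, SHARES and NTOGET are hypothetical. It also assumes that a SAMPSIZE= data set supplies its counts in a variable named _NSIZE_.

```sas
/* Synthetic frame (hypothetical): 3 card-by-subsidy cells, aged = 0/1 */
data frame;
  call streaminit(54);
  do id = 1 to 9000;
    card_ta = 1 + mod(id, 3);
    aged    = (rand('uniform') < 0.7);
    output;
  end;
run;

proc sort data=frame;
  by card_ta aged;
run;

/* Share of aged vs. disabled within each cell; OUT= holds COUNT and PERCENT */
proc freq data=frame noprint;
  by card_ta;
  tables aged / out=shares;
run;

/* Turn the percents into Ns out of 600 and number the strata with _n_. */
/* Rounding can leave a cell's total one away from 600, so check the sums. */
data ntoget;
  set shares;
  stratum = _n_;
  _nsize_ = round(percent / 100 * 600);
run;

/* Put the same stratum number on the frame so the two files match */
data frame;
  merge frame ntoget(keep=card_ta aged stratum);
  by card_ta aged;
run;

/* Feed the Ns to the procedure as a data set rather than a typed list */
proc surveyselect data=frame method=sys seed=12345
                  sampsize=ntoget out=sample;
  strata stratum;
run;
```

Passing the counts as a data set avoids retyping a long SAMPSIZE=( ) list, at the cost of keeping NTOGET and the frame in the same stratum order.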
NOTE: Had I known a little more about PROC SURVEYSELECT, I could have set up sample replacement within the original program!

The second and third examples are for much smaller applications on the PC, both for the same use (the analysts changed their minds multiple times). The purpose of these samples is to demonstrate the great utility, versatility and ease of use of this procedure. You will notice a distinct difference in the amount of information SAS gives you in the logs between the two versions used here (8.2 on the mainframe for Sample 1, and the PC version for Samples 2 and 3). I'm looking forward to seeing what happens when I start using PROC SURVEYSELECT with the release I recently received!

SAMPLE 1

    PROC SURVEYSELECT DATA=TEMP4 METHOD=SYS
         SEED= SAMPSIZE=( )
         OUT=&OUT1;
      STRATA NEWSTRAT;
      CONTROL AGE_COHORT SEX RACE;
      ID ABTID STRATVAR AGED_DIS RACE SEX TRANSGEN AGE_COHORT
         ETHNCTY CARDNUM MCRSTA BENEADR: BENE_ST STATE BENECITY
         BENEFNAM BENEMI BENELNAM HIC NEWSTRAT
         ZIPCODE;
    RUN;

    NOTE: THE DATA SET OUT1.SURVEY27 HAS 192 OBSERVATIONS AND 24 VARIABLES.
    NOTE: THE PROCEDURE SURVEYSELECT PRINTED PAGE 9.
    NOTE: THE PROCEDURE SURVEYSELECT USED CPU SECONDS AND 6042K.

    THE SURVEYSELECT PROCEDURE

    SELECTION METHOD      SYSTEMATIC RANDOM SAMPLING
    STRATA VARIABLE       NEWSTRAT
    CONTROL VARIABLES     AGE_COHORT SEX RACE
    CONTROL SORTING       SERPENTINE

    INPUT DATA SET        TEMP4
    RANDOM NUMBER SEED
    NUMBER OF STRATA      68
    TOTAL SAMPLE SIZE     192
    OUTPUT DATA SET       SURVEY27

Variables created by PROC SURVEYSELECT:

    SamplingWeight     SAMPLING WEIGHT
    SelectionProb      PROBABILITY OF SELECTION

SAMPLE 2

    NOTE: There were 7600 observations read from the data set WORK.UNIVERSE.
          WHERE eligtosamp=1;
    NOTE: The data set WORK.TOBESAMPLED has 7600 observations and 80 variables.
    NOTE: PROCEDURE SORT used (Total process time):
          real time 0.61 seconds
          cpu time 0.04 seconds

    proc surveyselect data=tobesampled method=sys
         seed= sampsize=( )
         out=lib.sample01;
      strata sampcat;
      control census_region rural;
      id provider;
    run;

    NOTE: The CONTROL sorted data set replaces the DATA= input data set by default.
          To store the sorted data in an output data set, use the OUTSORT= option.
    NOTE: The data set LIB.SAMPLE01 has 1200 observations and 6 variables.
    NOTE: The PROCEDURE SURVEYSELECT printed page 6.
    NOTE: PROCEDURE SURVEYSELECT used (Total process time):
          real time 1.53 seconds
          cpu time 0.12 seconds

    OASIS-T01: PREPARE POS0412G FOR SAMPLING
    CREATE SAMPLE FOR OASIS TO1

    The SURVEYSELECT Procedure

    Selection Method      Systematic Random Sampling
    Strata Variable       sampcat
    Control Variables     census_region rural
    Control Sorting       Serpentine

    Input Data Set        TOBESAMPLED
    Random Number Seed
    Number of Strata      4
    Total Sample Size     1200
    Output Data Set       SAMPLE01
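
A drawn sample like this one can also be checked against its frame inside SAS by cross-tabulating the strata and control variables against a selection flag. The sketch below assumes a synthetic universe that reuses the Sample 2 variable names; the data set names UNIVERSE, SAMPLE and CHECK, the seed and the per-stratum sizes are hypothetical.

```sas
/* Synthetic universe (hypothetical): 4 size categories, 4 regions, rural flag */
data universe;
  call streaminit(412);
  do provider = 1 to 4000;
    sampcat       = 1 + mod(provider, 4);
    census_region = ceil(rand('uniform') * 4);
    rural         = (rand('uniform') < 0.3);
    output;
  end;
run;

proc sort data=universe;
  by sampcat;
run;

proc surveyselect data=universe method=sys seed=412
                  sampsize=(50 50 50 50) out=sample noprint;
  strata sampcat;
  control census_region rural;
  id provider;
run;

/* Flag the selected providers on the frame */
proc sort data=sample;   by provider; run;
proc sort data=universe; by provider; run;

data check;
  merge universe sample(in=insample keep=provider);
  by provider;
  selected = insample;               /* 1 = drawn into the sample, 0 = not */
run;

/* One row per stratum-by-control combination, split by selection status */
proc freq data=check;
  tables sampcat*census_region*rural*selected / list nocum;
run;
```

A combination of control variables that appears only with SELECTED=0 in the listing is one the sample missed within that stratum.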
Below is a screenshot of a spreadsheet analyzing the sampling frame or universe against the drawn sample. Note the effect of the serpentine sort on the control variables. The stratum variable was size category, while census region and urban/rural were control variables. Unlike a nested sort, which would have yielded more proportional numbers, the serpentine sort simply ensured that all bases (combinations of control variables) were covered within each stratum. This is a very important distinction to understand. If you need to have a representative sample, you should use a nested sort or a different sampling method within PROC SURVEYSELECT. If you need, as I did, to have a sample in which all populations (as defined by strata and control variables) have a chance of being selected, then the serpentine sort is the way to go!

SAMPLE 3

Following the draw of the sample above (in Sample 2) there was a complication (other than my own project directors changing their minds several times regarding sample frames and sizes!). Another project needed to draw a sample from the same universe. Our sample as drawn would have made it impossible for the other project to obtain a sample using their stratum of state. We were able to reconfigure our sample in a manner similar to the method used for the drug card sample above, using a combination of state and size categories as the stratum variable instead of size category alone. Our end result was similar, but left the other project enough potential sample in their strata to obtain an adequate sample.

    proc surveyselect data=tosamp method=sys outsort=sortsamp
         seed= sampsize=( /* Ntoget */
         )
         out=lib.sample02;
      strata stratum;
      control rural;
      id provider;
    run;

    OASIS-T01: PREPARE POS0412G FOR SAMPLING
    CREATE SAMPLE FOR OASIS TO1 - ROUND 3

    The SURVEYSELECT Procedure

    Selection Method      Systematic Random Sampling
    Strata Variable       stratum
    Control Variable      rural

    Input Data Set        TOSAMP
    Sorted Data Set       SORTSAMP
    Random Number Seed
    Number of Strata      147
    Total Sample Size     1200
    Output Data Set       SAMPLE02

CONCLUSION

PROC SURVEYSELECT is an extremely powerful and versatile tool for selecting both simple and complex sample designs. The procedure allows for statistically defensible probability-based random sampling via a number of different methods, including equal probability sampling and PPS (probability proportional to size) sampling. The examples I have shown are a drop in the bucket compared to the vast capability of PROC SURVEYSELECT. Paired with the robust survey analysis procedures (SURVEYLOGISTIC, SURVEYMEANS, etc.) not covered in this paper, SAS provides us with one-stop shopping in the area of survey implementation and analysis, making it a clear choice for both SAS programmers and sampling statisticians.

REFERENCES

SAS Online Documentation (SAS V9.1)

ACKNOWLEDGMENTS

K.P. Srinath of Abt Associates Inc. has been my guide and mentor in the world of statistical sampling and analysis. SAS Technical Support and R&D have been incredibly helpful.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

    Louise Hadden
    Abt Associates Inc.
    55 Wheeler St.
    Cambridge, MA
    Work Phone:
    Fax:

KEYWORDS

SAS; PROC SURVEYSELECT; SAMPLING; RANDOM; PROPORTIONAL; SERPENTINE; SYSTEMATIC
The East End Customer Service Centre. - The Views of Users 2009
The East End Customer Service Centre - The Views of Users This Research was Designed, Undertaken and Completed by: For further information please contact Lisa Grabham on: (0191) 2773487 Email: [email protected]
Welcome back to EDFR 6700. I m Jeff Oescher, and I ll be discussing quantitative research design with you for the next several lessons.
Welcome back to EDFR 6700. I m Jeff Oescher, and I ll be discussing quantitative research design with you for the next several lessons. I ll follow the text somewhat loosely, discussing some chapters out
Who can benefit from charities?
1 of 8 A summary of how to avoid discrimination under the Equality Act 2010 when defining who can benefit from a charity A. About the Equality Act and the charities exemption A1. Introduction All charities
C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods
C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods Overview 1 Determining Data Relationships 1 Understanding the Methods for Combining SAS Data Sets 3
The Query Builder: The Swiss Army Knife of SAS Enterprise Guide
Paper 1557-2014 The Query Builder: The Swiss Army Knife of SAS Enterprise Guide ABSTRACT Jennifer First-Kluge and Steven First, Systems Seminar Consultants, Inc. The SAS Enterprise Guide Query Builder
Inform Racing User Guide.
Inform Racing User Guide. Speed Ratings Race Card Here the main Inform Racing race card provides all relevant speed ratings plus draw data, VDW ratings, run style information, links to form guides, advanced
Outcomes Assessment for School and Program Effectiveness: Linking Planning and Evaluation to Mission, Goals and Objectives
Outcomes Assessment for School and Program Effectiveness: Linking Planning and Evaluation to Mission, Goals and Objectives The Council on Education for Public Health (CEPH) views the planning and evaluation
AP STATISTICS 2010 SCORING GUIDELINES
2010 SCORING GUIDELINES Question 4 Intent of Question The primary goals of this question were to (1) assess students ability to calculate an expected value and a standard deviation; (2) recognize the applicability
Sampling: What is it? Quantitative Research Methods ENGL 5377 Spring 2007
Sampling: What is it? Quantitative Research Methods ENGL 5377 Spring 2007 Bobbie Latham March 8, 2007 Introduction In any research conducted, people, places, and things are studied. The opportunity to
ThreatSpike Dome: A New Approach To Security Monitoring
ThreatSpike Dome: A New Approach To Security Monitoring 2015 ThreatSpike Labs Limited The problem with SIEM Hacking, insider and advanced persistent threats can be difficult to detect with existing product
Lab 11. Simulations. The Concept
Lab 11 Simulations In this lab you ll learn how to create simulations to provide approximate answers to probability questions. We ll make use of a particular kind of structure, called a box model, that
PharmaSUG 2013 - Paper MS05
PharmaSUG 2013 - Paper MS05 Be a Dead Cert for a SAS Cert How to prepare for the most important SAS Certifications in the Pharmaceutical Industry Hannes Engberg Raeder, inventiv Health Clinical, Germany
Memo. Open Source Development and Documentation Project English 420. instructor name taken out students names taken out OSDDP Proposal.
Memo Date: 11/3/2005 To: From: RE: instructor name taken out students names taken out OSDDP Proposal Description: The Wikipedia encyclopedia was introduced in 2001. It is a free encyclopedia that anyone
Introduction to Sampling. Dr. Safaa R. Amer. Overview. for Non-Statisticians. Part II. Part I. Sample Size. Introduction.
Introduction to Sampling for Non-Statisticians Dr. Safaa R. Amer Overview Part I Part II Introduction Census or Sample Sampling Frame Probability or non-probability sample Sampling with or without replacement
2015 Medicare CAHPS At-A-Glance Report
2015 Medicare CAHPS At-A-Glance Report Advantage by Bridgeway Health Solutions CMS MA PD Contract: H5590 Project Number(s): 30103743 Current data as of: 07/01/2015 1965 Evergreen Boulevard Suite 100, Duluth,
Teaching & Learning Plans. Plan 1: Introduction to Probability. Junior Certificate Syllabus Leaving Certificate Syllabus
Teaching & Learning Plans Plan 1: Introduction to Probability Junior Certificate Syllabus Leaving Certificate Syllabus The Teaching & Learning Plans are structured as follows: Aims outline what the lesson,
Survey Research: Choice of Instrument, Sample. Lynda Burton, ScD Johns Hopkins University
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
Introduction... 3. Qualitative Data Collection Methods... 7 In depth interviews... 7 Observation methods... 8 Document review... 8 Focus groups...
1 Table of Contents Introduction... 3 Quantitative Data Collection Methods... 4 Interviews... 4 Telephone interviews... 5 Face to face interviews... 5 Computer Assisted Personal Interviewing (CAPI)...
Labels, Labels, and More Labels Stephanie R. Thompson, Rochester Institute of Technology, Rochester, NY
Paper FF-007 Labels, Labels, and More Labels Stephanie R. Thompson, Rochester Institute of Technology, Rochester, NY ABSTRACT SAS datasets include labels as optional variable attributes in the descriptor
Technical Information
Technical Information Trials The questions for Progress Test in English (PTE) were developed by English subject experts at the National Foundation for Educational Research. For each test level of the paper
Excel Formatting: Best Practices in Financial Models
Excel Formatting: Best Practices in Financial Models Properly formatting your Excel models is important because it makes it easier for others to read and understand your analysis and for you to read and
Study Designs. Simon Day, PhD Johns Hopkins University
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
Technical Note. Consumer Confidence Survey Technical Note February 2011. Introduction and Background
Technical Note Introduction and Background Consumer Confidence Index (CCI) is a barometer of the health of the U.S. economy from the perspective of the consumer. The index is based on consumers perceptions
Assessing Research Protocols: Primary Data Collection By: Maude Laberge, PhD
Assessing Research Protocols: Primary Data Collection By: Maude Laberge, PhD Definition Data collection refers to the process in which researchers prepare and collect data required. The data can be gathered
Permuted-block randomization with varying block sizes using SAS Proc Plan Lei Li, RTI International, RTP, North Carolina
Paper PO-21 Permuted-block randomization with varying block sizes using SAS Proc Plan Lei Li, RTI International, RTP, North Carolina ABSTRACT Permuted-block randomization with varying block sizes using
Using games to support. Win-Win Math Games. by Marilyn Burns
4 Win-Win Math Games by Marilyn Burns photos: bob adler Games can motivate students, capture their interest, and are a great way to get in that paperand-pencil practice. Using games to support students
EDITED TRANSCRIPTION OF TESTIMONY Interim Committee Training for Chairs and Vice Chairs Monday, September 26, 2011
EDITED TRANSCRIPTION OF TESTIMONY Interim Committee Training for Chairs and Vice Chairs Monday, September 26, 2011 Following is an edited transcript of the questions asked and answers given at the Interim
Global Food Security Programme A survey of public attitudes
Global Food Security Programme A survey of public attitudes Contents 1. Executive Summary... 2 2. Introduction... 4 3. Results... 6 4. Appendix Demographics... 17 5. Appendix Sampling and weighting...
