Interactive introduction to SAS

Similar documents
SAS: A Mini-Manual for ECO 351 by Andrew C. Brod

Data Analysis Stata (version 13)

EXST SAS Lab Lab #4: Data input and dataset modifications

Getting Started with the SAS System Point and Click Approach

Introduction to SAS on Windows

4 Other useful features on the course web page. 5 Accessing SAS

Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY

Getting started with the Stata

2: Entering Data. Open SPSS and follow along as your read this description.

KEYBOARD SHORTCUTS. Note: Keyboard shortcuts may be different for the same icon depending upon the SAP screen you are in.

This can be useful to temporarily deactivate programming segments without actually deleting the statements.

From The Little SAS Book, Fifth Edition. Full book available for purchase here.

Appendix K Introduction to Microsoft Visual C++ 6.0

3 IDE (Integrated Development Environment)

SAS Analyst for Windows Tutorial

How to Use the H-ITT Analyzer Version 2.4.4

Preparing your data for analysis using SAS. Landon Sego 24 April 2003 Department of Statistics UW-Madison

Data Cleaning 101. Ronald Cody, Ed.D., Robert Wood Johnson Medical School, Piscataway, NJ. Variable Name. Valid Values. Type

AN INTRODUCTION TO MACRO VARIABLES AND MACRO PROGRAMS Mike S. Zdeb, New York State Department of Health

BIGPOND ONLINE STORAGE USER GUIDE Issue August 2005

File Management With Windows Explorer

Getting Started with R and RStudio 1

Getting Started With SPSS

The Reprographics Online E-Copy Process is used to submit print jobs to the RUSD Reprographics department.

USING PROCEDURES TO CREATE SAS DATA SETS... ILLUSTRATED WITH AGE ADJUSTING OF DEATH RATES 1

Descriptive Statistics Categorical Variables

MEDIAplus administration interface

SPSS 12 Data Analysis Basics Linda E. Lucek, Ed.D

Paper FF-014. Tips for Moving to SAS Enterprise Guide on Unix Patricia Hettinger, Consultant, Oak Brook, IL

Creating tables of contents and figures in Word 2013

Basic Statistical and Modeling Procedures Using SAS

Appendix III: SPSS Preliminary

Introduction to SPSS 16.0

edgebooks Quick Start Guide 4

Basics of STATA. 1 Data les. 2 Loading data into STATA

TxEIS on Internet Explorer 7

Hands-on Practice. Hands-on Practice. Learning Topics

WHO STEPS Surveillance Support Materials. STEPS Epi Info Training Guide

Using SANDBOXIE to Safely Browse the Internet (verified with ver 4.20) Jim McKnight Sandboxie.lwp revised

Before You Begin... 2 Running SAS in Batch Mode... 2 Printing the Output of Your Program... 3 SAS Statements and Syntax... 3

Windows XP Managing Your Files

An Introduction to Box.com

Macintosh: Microsoft Outlook Web Access

How to test and debug an ASP.NET application

EASA Airworthiness Directives publishing tool 2008 EASA

INTRODUCTION to ESRI ARCGIS For Visualization, CPSC 178

January 26, 2009 The Faculty Center for Teaching and Learning

Quick Start to Data Analysis with SAS Table of Contents. Chapter 1 Introduction 1. Chapter 2 SAS Programming Concepts 7

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

1. Right click using your mouse on the desktop and select New Shortcut.

B) Mean Function: This function returns the arithmetic mean (average) and ignores the missing value. E.G: Var=MEAN (var1, var2, var3 varn);

Information Technology. Introduction to Vista

Data management and SAS Programming Language EPID576D

I Didn t Know SAS Enterprise Guide Could Do That!

Using FTP, Views, and PROC SUMM4RY to analyse large databases.

Working with Windows Handout

EHR: Scavenger Hunt I

Today s Topics... Intro to Computer Science (cont.) Python exercise. Reading Assignment. Ch 1: 1.5 (through 1.5.2) Lecture Notes CPSC 121 (Fall 2011)

PharmaSUG Paper QT26

1 Checking Values of Character Variables

CSC 120: Computer Science for the Sciences (R section)

Microsoft Outlook Introduction

Stata Tutorial. 1 Introduction. 1.1 Stata Windows and toolbar. Econometrics676, Spring2008

Integrated Accounting System for Mac OS X

GYM PLANNER. User Guide. Copyright Powerzone. All Rights Reserved. Software & User Guide produced by Sharp Horizon.

Entourage - an Introduction to

Using the SAS Enterprise Guide (Version 4.2)

OVERVIEW OF R SOFTWARE AND PRACTICAL EXERCISE

REUTERS/TIM WIMBORNE SCHOLARONE MANUSCRIPTS COGNOS REPORTS

Downloading <Jumping PRO> from Page 2

A Short Introduction to Eviews

Macros in Word & Excel

Welcome to the topic on Master Data and Documents.

After you complete the survey, compare what you saw on the survey to the actual questions listed below:

Directions for Frequency Tables, Histograms, and Frequency Bar Charts

Writing Packages: A New Way to Distribute and Use SAS/IML Programs

CentralMass DataCommon

Introduction to proc glm

LEARNING LAB WORKBOOK

Reducing the size of your Exchange mailbox

SPSS: Getting Started. For Windows

Welcome to GAMS 1. Jesper Jensen TECA TRAINING ApS This version: September 2006

Connecting to LUA s webmail

Setting up a site directly to the H-drive in Dreamweaver CS4

From the list of Cooperative Extension applications, choose Contacts Extension Contact Management System.

Android Programming Family Fun Day using AppInventor

SPSS Workbook 1 Data Entry : Questionnaire Data

SNM Content-Editor. Magento Extension. Magento - Extension. Professional Edition (V1.2.1) Browse & Edit. Trust only what you really sees.

Cal Answers Analysis Training Part I. Creating Analyses in OBIEE

INTRODUCTORY LAB: DOING STATISTICS WITH SPSS 21

Umbraco v4 Editors Manual

MyOra 3.0. User Guide. SQL Tool for Oracle. Jayam Systems, LLC

Software Update for Clinical Audit and/or Contract Manager in Wales

Pcounter Web Administrator User Guide - v Pcounter Web Administrator User Guide Version 1.0

Simply Accounting Intelligence Tips and Tricks Booklet Vol. 1

SPSS Introduction. Yi Li

Online Digitizing and Editing of GIS Layers (On-Screen or Head s Up Digitizing)

NT Event Log. CHAPTER 8 Enhancements for SAS Users under Windows NT

Data Preparation in SPSS

How To Use Spss

Transcription:

Interactive introduction to SAS Thursday 8 March 2007 PhD course in Epidemiology Spring 2007 Typical use of SAS for statistical analysis You have data in some format Get the data into SAS Look at the data using SAS Perhaps transform or select part of the data, i.e. making the data ready for statistical analysis Use the appropriate SAS statistical procedure, called a "proc", e.g. proc ttest for t-test Get your results out Check that SAS did what you asked for (SAS does some "logging" of your instructions) Interpret the SAS output 1

Basic structure of the SAS system SAS Core Database system: SAS-data set Programming language: SAS language Base SAS (50 different proc s) Data manipulation: so-called "data-step", proc sort Showing data: proc contents, proc print, proc format Basic statistics: PROC FREQ, proc means, proc univariate Special modules SAS/STAT: PROC GENMOD, proc glm,... (more than 65 proc s) SAS/GRAPH: proc gplot (23 proc s) other modules: SAS/ASSIST, QC, ETS, FSP, IML,... 2 SAS-data sets Observations variables : sex age weight name 1 8 25 John 2. 17 Anna 2 13 48 Maria etc Variable names: sex age weight name (OBS: not æøå) Variable types: Numeric or Character MISSING NUMERICAL VALUE is. (a dot) MISSING CHARACTER VALUE is "" (an empty string) Data set also have a header, with information about when it was created, no. of observations, no. of variables, variable names, etc. 3

SAS Analyst Known from Basic Statistics PhD course Add-on to SAS ( application ) Menu/form-oriented Writes and runs programs for you BUT: Does not cover everything especially not PROC GENMOD Tedious to use in the long run... Reproduction and documentation difficult 4 SAS Display Manager Framework for program development and data handling EDITOR window: Writing SAS programs LOG window: "Logging", i.e. what happen when and did it work? OUTPUT window: Results from proc s SAS Explorer: Navigating the SAS system as you have seen Results: Structured output, designed for helping the user navigating in large amount of output 5

Keys Shortcut keys for navigating the SAS system, press F9 to see. Important ones are: F5: Editor F6: Log F7: Output F3: "Submit", i.e. run the SAS program in the Editor Close down the Keys window using either your mouse or the standard Windows "Ctrl+F4". Press F5 to go to the Editor. NB: If you by accident closed the Editor you can get another one using the menu View Enhanced Editor (the View menu also have a "program editor" which is an older less featured version of the editor). 6 Editor Here you write programs and "submit" them to SAS Standard free text editing SAS is case-insentitive (SEX, sex, Sex all refer to the same) Load and save editor contents (called.sas files, e.g. henrik.sas) Line numbers, Colours, indentation, and separators 7

Programs: DATA steps and PROC steps Roughly speaking, SAS programs consist of two kinds of steps (= blocks of instructions): DATA steps define datasets. E.g. by reading raw data from a text file, computing transformed variables, selecting cases, etc. PROC steps contain standard procedures that operate on datasets. You can t, e.g., transform variables in a PROC step. Normal arrangement of a SAS program is to put DATA steps at the beginning, but they can occur intermixed There are a a few SAS statements in addition to the DATA and PROC steps. They typically set up definitions for later use: FILENAME, OPTIONS, TITLE are some examples. 8 Getting some data into SAS - a data step There are several ways of getting data into SAS. The one we will use is reading data from the web. Have a look (using Internet Explorer) at the Framingham data on the homepage of the course: framing.txt. To read these data into SAS you need to type the following SAS program in the Editor lots of semicolons: data framing; filename framfile url "http://www.biostat.ku.dk/~pka/epif07/framing.txt"; infile framfile firstobs=2; input id sex age frw sbp sbp10 dbp chol cig chd yrschd death yrsdth cause; /* Define variable chdny as indicator for CHD during follow-up */ if chd = 1 then chdny =.; /* prevalent case - set to missing */ if chd = 0 then chdny = 0; if chd > 1 then chdny = 1; The SAS program can be found on the homepage of the course http://www.biostat.ku.dk/~pka/epif07. Download the program to your P-drive and open the file in the program editor. 9

A SAS dataset called framing is generated on the basis of the observations in the text file framing.txt. The option firstobs=2 tells SAS only to read data from line 2 in the file framing.txt. This is because the first line holds the variable names, which SAS cannot read. The if-statements are an example of how to generate (recode) a new variable (chdny) from an old one (chd). Press F3 or hit the "running man" to submit the SAS program and look in the log window. 10 Looking at the data - some proc steps proc contents data=framing; proc print data=framing; proc print data=framing; var id chd chdny; 11

If the data file was on your own computer (e.g. c:\framing.txt) we would have written: data framing; infile "c:\framing.txt" firstobs=2; input id sex age frw sbp sbp10 dbp chol cig chd yrschd death yrsdth cause; /* Define variable chdny as indicator for CHD during follow-up */ if chd = 1 then chdny =.; /* prevalent case - set to missing */ if chd = 0 then chdny = 0; if chd > 1 then chdny = 1; 12 Log window In the Editor, add the option position, but you should misspell it as potition proc contents data=framing potition; Well, something happend, inspite of the error. Try to look in the log window: Press F6 or use your mouse. SAS does some guessing - very helpful. The option position will add a list where the variables are listed in order of their "position" in the data set. Now try to write politi instead. Now we get an error - look in the log. INFORMATIONS IN THE LOG WINDOW IS IMPORTANT TO CONSULT EVEN IF YOU BELIEVE EVERYTHING IS OKAY... 13

Results window Results window got folders added like Print: The SAS system Contents: The SAS system Double-click the nodes inside these to see its contents 14 Saving a SAS program, Output window, Log window To save what you have been producing in the Editor, you act as in e.g. Word: File Save as... You save the Output and Log window in the same manner. File extensions are: *.sas Program files *.log Log files *.lst Output files 15

Elements of a data step data framing; infile "c:\framing.txt" firstobs=2; input id sex age frw sbp sbp10 dbp chol cig chd yrschd death yrsdth cause; /* Define variable chdny as indicator for CHD during follow-up */ if chd = 1 then chdny =.; /* prevalent case - set to missing */ if chd = 0 then chdny = 0; if chd > 1 then chdny = 1; logchol=log(chol); /* natural logarithm */ expage=exp(age); /* exponential function */ ageny=log(exp(age)); silly=(log(age)+log10(age))/exp(chol); /* Making a new data set called new from framing, and selecting only males more than 50 years of age */ data new; set framing; if sex=1 and age>50; 16 Operators Arithmetic + - * / ** (to the power of) Relations: = < > <= >= <> (Not equal) eq lt gt le ge ne (alternative forms) Logical &! and or not 17

Comments Two ways of making comments in SAS programs: /* Comments */ * Comments ; Example: /* Here I make a scatter plot */ proc gplot data=framing; plot chol * age; quit; * Here I make another scatter plot; proc gplot data=framing; plot sbp * age; quit; 18 PROC MEANS and PROC UNIVARIATE options pagesize=100; proc means data=framing; var age; proc means data=framing; class sex; var age; proc univariate data=framing ; var age; 19

PROC FREQ proc freq data=framing; table sex; proc freq data=framing; table sex*chdny; proc freq data=framing; table sex*chdny /?; title "Risk of CHD among males and females"; proc freq data=framing; table sex*chdny / relrisk chisq nocol nopercent; 20 The help system Via the small book icon on the toolbar or F1 but does not work here Usually, you ll need to use the index You generally need to know roughly what you are looking for. It is not a textbook. SAS OnlineDoc which contains material that used to be found in large manuals (you can still buy those): http://support.sas.com/onlinedoc/913/docmainpage.jsp "The Little SAS Book" http://support.sas.com/publishing/bbu/companion_site/56649.html 21

How to get organized Interactive runs are convenient, but also dangerous Remember to save and keep program code, at least for important runs. Collect the fragments into coherent.sas files that can be run from scratch. Test your programs in a freshly started SAS session. Save the log files. They can be as important as the output. Organize your data sets in one disk folder e.g. sasdata and programs, output, and logs in another disk folder e.g. saspgm Learning by doing. 22