SAS Lesson 2: More Ways to Input Data




In the previous lesson, the following statements were used to create a dataset.

    DATA oranges;
    INPUT state $ 1-10 early 12-14 late 16-18;
    DATALINES;
    Florida    130  90
    California  37  26
    Texas      1.3 .15
    Arizona    .65 .85
    ;

This is an example of column input, in which the columns that hold each variable are given explicitly to SAS. It may be impractical to follow this example for all datasets. SAS has facilities for reading data stored in many different ways; some of these facilities are described below.

Reading data from a text file

Suppose that the data above are stored in a text file in a folder on a drive. Blank spaces, not tabs, are used to separate the values. Also, the file is a simple ASCII text file: it contains no hidden codes of the kind used by word processors (such as margins and font sizes) or spreadsheets (such as formulas and graphs). The following lines of code could be used to read the dataset:

    DATA oranges;
    INFILE 'D:\yields\oranges.dat' FIRSTOBS=3 OBS=6;
    INPUT state $ 1-10 early 12-14 late 16-18;

The INFILE statement replaces DATALINES. INFILE gives the location of the external text file, including the drive name and any subdirectories. The FIRSTOBS option tells SAS to skip the first two lines of the file and to begin reading data on Line 3. The OBS option tells SAS that Line 6 is the last line that contains legitimate data. If the text file were edited by removing the top two lines and the bottom two lines, so that it contained only data, then the following INFILE statement would be sufficient.

    INFILE 'D:\yields\oranges.dat';

Reading data from the Internet

SAS, like R, offers the capability to read data from files available on the Internet. For example, the eggs dataset contains data on the yearly average number of eggs produced by female king crabs near Kodiak Island, Alaska. The following statements could be used to create a SAS dataset.

    FILENAME kodiak url 'http://lib.stat.cmu.edu:80/crab/eggs';
    DATA eggdata;
    INFILE kodiak;
    INPUT year 1-2 numeggs 4-9;

In these statements, kodiak is a nickname (a fileref) that SAS uses to refer to the longer Internet address. The statements create a dataset called eggdata which contains two variables, year and numeggs.

SAS can also read data from FTP sites, but you must supply the appropriate information about the FTP site address, subdirectories, user names, and passwords. This example shows how to obtain the eggs dataset by anonymous FTP.

    FILENAME kodiak ftp 'eggs' cd='/crab/' user='anonymous' pass='guest'
        host='lib.stat.cmu.edu';
    DATA eggdata;
    INFILE kodiak;
    INPUT year 1-2 numeggs 4-9;

Of course, it would be easy to find the eggs dataset in a Web browser, save it as a text file on your computer, and use the methods for reading data from a text file. However, reading directly from the Internet is useful when data files are very large. It is also convenient when data files are continually updated; examples include stock market data, weather data, and batting averages.

Creating and reading permanent SAS datasets

In all of the previous examples, the SAS datasets that were created were temporary. They are stored in the temporary WORK library and can be used throughout the SAS session, but they are deleted when the SAS session ends. Permanent SAS datasets can also be created; these are stored on a disk and can be recalled easily in future SAS sessions. Permanent SAS datasets are convenient when the amount of data is large. Also, if you have to go through several steps to create a SAS dataset, you only need to do those steps once if you create a permanent dataset.

A LIBNAME statement is needed to create a permanent dataset or to read one that has already been created. The LIBNAME statement assigns a surrogate name to the location on a disk where the permanent dataset is or will be stored. For example, consider the following statement:

    LIBNAME college 'D:\';

This prepares SAS to look in Drive D for permanent datasets. The name college is called a libref (library reference).
The names that you can supply for librefs follow the same rules as dataset and variable names, except that a libref can be at most eight characters long. You should not use the names LIBRARY, WORK, USER, or anything starting with the three letters SAS, since these are reserved for special uses within SAS. For example, the following statements create a permanent SAS dataset.

    DATA original;
    INPUT dept $ 1-8 count 10-13 class $ 15-21;
    DATALINES;
    FineArts  449 day
    Science  1411 day
    Music     259 evening
    Language  759 day
    ;

    DATA college.enrolled;
    SET original;
    IF class='evening' THEN DELETE;
    PROC PRINT;
    RUN;

This creates the new file ENROLLED.SD on your drive. This is the new permanent SAS dataset. Only SAS will be able to interpret this file; you will not be able to see its contents by using a word processor or spreadsheet program.

The DATA and SET statements work together as follows: "DATA name1; SET name2;" creates the dataset name1 from the dataset name2. The "IF ... THEN DELETE;" statement removes every observation that satisfies the condition. We will spend more time on these statements in Lesson 5. Thus, the dataset college.enrolled has only the three observations with class='day'.

In order to use the dataset that was just created, you must refer to it by its full name, college.enrolled. To retrieve this dataset from the disk:

    DATA tempenrl;
    SET college.enrolled;
    PROC PRINT DATA=tempenrl;
    RUN;

Delimited files

With list input, blank spaces are delimiters, or special characters used to separate the values of variables in a line. SAS can also interpret other characters as delimiters. For example, suppose that a dataset of student grades is stored in the text file A:\GRADES.TXT as follows:

    Ann/84/90/A-/0
    Bill/78/84/B/0
    Cathy/95/89/A/1
    David/84/88/B+/1

Then, the following statements could be used to create the GRADES dataset.

    DATA grades;
    INFILE 'a:\grades.txt' delimiter='/';
    INPUT name $ quiz test project $ absences;

The option name DLM= can be used in place of DELIMITER=. This option is used if a keyboard character, such as a comma, slash, or asterisk, separates values in a line. Of course, the character used to separate variables should not appear within a data value. For example, in a comma-delimited file, the number 125,000 would have to be written as 125000; otherwise, SAS would try to break it apart into two variables with values 125 and 000.

Tabs are an exception to the use of the DELIMITER option. In a tab-delimited file, the EXPANDTABS option replaces the DELIMITER option.
In the example, if tabs had been used in place of slashes, the proper statement for reading the data would be:

    INFILE 'a:\grades.txt' expandtabs;
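
As a self-contained sketch of the DELIMITER option, the step below reads the same slash-delimited records from in-stream data instead of a:\grades.txt, so it can be run without the file. Embedding the records with DATALINES is an assumption made only for illustration; the INFILE DATALINES statement is what lets INFILE options such as DLM= apply to in-stream records.

```sas
/* Sketch: read slash-delimited records from in-stream data. */
/* The records are embedded here only so the step runs without a file. */
DATA grades;
    /* INFILE DATALINES applies INFILE options to the in-stream records */
    INFILE DATALINES DLM='/';
    /* name and project are character variables; the rest are numeric */
    INPUT name $ quiz test project $ absences;
    DATALINES;
Ann/84/90/A-/0
Bill/78/84/B/0
Cathy/95/89/A/1
David/84/88/B+/1
;
RUN;

/* List the four observations to check that each value landed in the right variable */
PROC PRINT DATA=grades;
RUN;
```

For tab-delimited files, an alternative to EXPANDTABS that is often used is to name the tab character directly, as in DLM='09'x, where '09'x is the hexadecimal code for a tab on ASCII systems.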

Column pointers

Suppose that the dataset of student grades is stored in D:\GRADES.TXT as follows:

    Ann   84 90 A- 0
    Bill  78 84 B  0
    Cathy 95 89 A  1
    David 84 88 B+ 1

This file would be easy to read with column input and even easier with list input. However, suppose that you only needed the students' names and project grades. You can use column pointers to skip over undesired data. Column pointers use the @ symbol to tell SAS to begin reading data at a specified column. In this example, we need to know that the names start in Column 1 and the project grades start in Column 13. The following statements could be used.

    DATA grades;
    INFILE 'D:\grades.txt';
    INPUT @1 name $ @13 project $;

The quiz grade, test grade, and absences do not appear in this dataset, because the column pointers skipped over them.

Mixed input

You may occasionally find it necessary or convenient to use a combination of input techniques for a particular dataset. For example, suppose that the dataset of grades appears as follows:

    Ann       84 90 A- 0
    Bill      78 84 B 0
    Catherine 95 89 A 1
    David     84 88 B+ 1

Recall that list input can be used only when character variables have 8 or fewer characters, with no blanks. Catherine has 9 letters, so simple list input cannot be used. Column input alone is also awkward, because the absences are not neatly aligned in a column, while the last four variables would be easy to read with list input. The following statements could be used:

    DATA grades;
    INFILE 'A:\grades.txt';
    INPUT name $ 1-9 quiz test project $ absences;

It is also possible to use list input, column input, and column pointers simultaneously. For example, if you only needed the name, project grade, and absences, you could use the following INPUT statement:

    INPUT name $ 1-9 @17 project $ absences;
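
The combined INPUT statement above can be tried without an external file. The following is a minimal sketch that assumes the records are laid out with the name in Columns 1-9 and the project grade starting in Column 17; the dataset name projects and the use of in-stream data are illustrative choices, not part of the lesson.

```sas
/* Sketch: column input, a column pointer, and list input in one INPUT statement. */
/* Assumed layout: name in columns 1-9, project grade starting in column 17.     */
DATA projects;
    /* name is read by column input; @17 jumps past quiz and test;          */
    /* project and absences are then read by list input from that position  */
    INPUT name $ 1-9 @17 project $ absences;
    DATALINES;
Ann       84 90 A- 0
Bill      78 84 B 0
Catherine 95 89 A 1
David     84 88 B+ 1
;
RUN;

/* The listing should show only name, project, and absences */
PROC PRINT DATA=projects;
RUN;
```

Because name is read from fixed columns, the nine-letter name Catherine is stored in full, and because project and absences are read with list input, the unaligned absences column causes no trouble.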

Line pointers

So far, all of the data for each observation have appeared in one line. You may occasionally encounter data in which the variables for one observation appear in two or more consecutive lines, as shown below:

    Ann
    84 90 A- 0
    Bill
    78 84 B 0
    Cathy
    95 89 A 1
    David
    84 88 B+ 1

You may need to use a line pointer to read such data. A line pointer is like a column pointer, except that it specifies the line on which SAS should begin reading the data. A slash (/) tells SAS to skip to the next line, and #n tells SAS to go to line n of an observation's data to resume reading. If the data above were stored in D:\GRADES.TXT, a dataset could be created in SAS as follows:

    DATA grades;
    INFILE 'D:\grades.txt';
    INPUT name $ / quiz test project $ absences;

Equivalently, you could use the following INPUT statement:

    INPUT name $ #2 quiz test project $ absences;

Since the data consist of simple characters and numbers, the following INPUT statement could also be used. Notice that there are no line pointers.

    INPUT name $ quiz test project $ absences;

SAS will automatically go to the next line of data to complete the set of variables listed in the INPUT statement. However, if the data are irregular (missing values, blanks in character variables, etc.), then line pointers may be necessary.

Multiple observations on one line

This topic was discussed in Lesson 1, using the @@ symbol.
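
The last two topics can be compared side by side. The sketch below uses in-stream data so that it runs without D:\GRADES.TXT; the dataset names grades and quizzes, and the shortened one-line records in the second step, are assumptions made for illustration.

```sas
/* Sketch 1: each observation spans two lines, so a slash moves to the next line. */
DATA grades;
    INPUT name $ / quiz test project $ absences;
    DATALINES;
Ann
84 90 A- 0
Bill
78 84 B 0
Cathy
95 89 A 1
David
84 88 B+ 1
;
RUN;

/* Sketch 2: several observations share one line, so a trailing @@ holds the line. */
/* Only name and quiz are kept here to keep the record short.                     */
DATA quizzes;
    INPUT name $ quiz @@;
    DATALINES;
Ann 84 Bill 78 Cathy 95 David 84
;
RUN;

/* Both datasets should contain four observations */
PROC PRINT DATA=grades;
RUN;
PROC PRINT DATA=quizzes;
RUN;
```

In the first step one execution of INPUT consumes two lines; in the second, one line feeds four executions of INPUT.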