Using the SQL Procedure



Similar documents
Chapter 1 Overview of the SQL Procedure

Simple Rules to Remember When Working with Indexes Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California

AN INTRODUCTION TO THE SQL PROCEDURE Chris Yindra, C. Y. Associates

SQL. Short introduction

Using DATA Step MERGE and PROC SQL JOIN to Combine SAS Datasets Dalia C. Kahane, Westat, Rockville, MD

Demystifying PROC SQL Join Algorithms Kirk Paul Lafler, Software Intelligence Corporation

A basic create statement for a simple student table would look like the following.

Introduction to Microsoft Jet SQL

Financial Data Access with SQL, Excel & VBA

Effective Use of SQL in SAS Programming

Information Systems SQL. Nikolaj Popov

Oracle SQL. Course Summary. Duration. Objectives

A Brief Introduction to MySQL

Performing Queries Using PROC SQL (1)

P_Id LastName FirstName Address City 1 Kumari Mounitha VPura Bangalore 2 Kumar Pranav Yelhanka Bangalore 3 Gubbi Sharan Hebbal Tumkur

SAS Programming Tips, Tricks, and Techniques

SQL - QUICK GUIDE. Allows users to access data in relational database management systems.

CHAPTER 1 Overview of SAS/ACCESS Interface to Relational Databases

Introduction to Proc SQL Steven First, Systems Seminar Consultants, Madison, WI

CSI 2132 Lab 3. Outline 09/02/2012. More on SQL. Destroying and Altering Relations. Exercise: DROP TABLE ALTER TABLE SELECT

Database Query 1: SQL Basics

Learning MySQL! Angola Africa SELECT name, gdp/population FROM world WHERE area > !

Oracle Database: SQL and PL/SQL Fundamentals NEW

Using Multiple Operations. Implementing Table Operations Using Structured Query Language (SQL)

DATA Step versus PROC SQL Programming Techniques

Oracle Database: SQL and PL/SQL Fundamentals

3.GETTING STARTED WITH ORACLE8i

Katie Minten Ronk, Steve First, David Beam Systems Seminar Consultants, Inc., Madison, WI

Access Queries (Office 2003)

9.1 SAS. SQL Query Window. User s Guide

Retrieving Data Using the SQL SELECT Statement. Copyright 2006, Oracle. All rights reserved.

Unit 10: Microsoft Access Queries

Each Ad Hoc query is created by making requests of the data set, or View, using the following format.

Oracle Database 12c: Introduction to SQL Ed 1.1

Using AND in a Query: Step 1: Open Query Design

How To Create A Table In Sql (Ahem)

Database Administration with MySQL

Welcome to the topic on queries in SAP Business One.

Conditional Processing Using the Case Expression in PROC SQL Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California

The Query Builder: The Swiss Army Knife of SAS Enterprise Guide

Introduction to Using PROC SQL Thomas J. Winn Jr., Texas State Comptroller s Office, Austin, Texas

David M. Kroenke and David J. Auer Database Processing: Fundamentals, Design and Implementation

A table is a collection of related data entries and it consists of columns and rows.

Oracle Database: SQL and PL/SQL Fundamentals

Database: SQL, MySQL

SQL Simple Queries. Chapter 3.1 V3.0. Napier University Dr Gordon Russell

Structured Query Language. Telemark University College Department of Electrical Engineering, Information Technology and Cybernetics

History of SQL. Relational Database Languages. Tuple relational calculus ALPHA (Codd, 1970s) QUEL (based on ALPHA) Datalog (rule-based, like PROLOG)

Using Ad-Hoc Reporting

Using SAS With a SQL Server Database. M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC

Relational Database: Additional Operations on Relations; SQL

Alternatives to Merging SAS Data Sets But Be Careful

CSC 443 Data Base Management Systems. Basic SQL

Five Little Known, But Highly Valuable, PROC SQL Programming Techniques. a presentation by Kirk Paul Lafler

Efficient Techniques and Tips in Handling Large Datasets Shilong Kuang, Kelley Blue Book Inc., Irvine, CA

Structured Query Language (SQL)

Oracle Database: Introduction to SQL

Oracle Database 10g: Introduction to SQL

Lab 9 Access PreLab Copy the prelab folder, Lab09 PreLab9_Access_intro

Microsoft Access 3: Understanding and Creating Queries

Lab # 5. Retreiving Data from Multiple Tables. Eng. Alaa O Shama

Creating HTML Output with Output Delivery System

Databases in Microsoft Access David M. Marcovitz, Ph.D.

SQL Injection. SQL Injection. CSCI 4971 Secure Software Principles. Rensselaer Polytechnic Institute. Spring

Apache Cassandra Query Language (CQL)

Introduction to Proc SQL Katie Minten Ronk, Systems Seminar Consultants, Madison, WI

2874CD1EssentialSQL.qxd 6/25/01 3:06 PM Page 1 Essential SQL Copyright 2001 SYBEX, Inc., Alameda, CA

Managing Tables in Microsoft SQL Server using SAS

Chapter 5. Microsoft Access

Introduction to SQL and database objects

CS 2316 Data Manipulation for Engineers

Click to create a query in Design View. and click the Query Design button in the Queries group to create a new table in Design View.

Oracle Database 10g Express

Oracle Database: SQL and PL/SQL Fundamentals NEW

Oracle Database: Introduction to SQL

SQL Server Table Design - Best Practices

Guide to Performance and Tuning: Query Performance and Sampled Selectivity

Paper An Introduction to SAS PROC SQL Timothy J Harrington, Venturi Partners Consulting, Waukegan, Illinois

Using SAS Views and SQL Views Lynn Palmer, State of California, Richmond, CA

6 CHAPTER. Relational Database Management Systems and SQL Chapter Objectives In this chapter you will learn the following:

Microsoft Access 2010

SQL Injection. The ability to inject SQL commands into the database engine through an existing application

MS Access Lab 2. Topic: Tables

Exploring DATA Step Merges and PROC SQL Joins

Outline. SAS-seminar Proc SQL, the pass-through facility. What is SQL? What is a database? What is Proc SQL? What is SQL and what is a database

Utilizing Clinical SAS Report Templates Sunil Kumar Gupta Gupta Programming, Thousand Oaks, CA

Darshan Institute of Engineering & Technology PL_SQL

Programming with SQL

SQL Basics. Introduction to Standard Query Language

14 Triggers / Embedded SQL

ODBC Chapter,First Edition

Introduction to SQL and SQL in R. LISA Short Courses Xinran Hu

ORACLE 10g Lab Guide

Managing Objects with Data Dictionary Views. Copyright 2006, Oracle. All rights reserved.

The prerelease version of SQL was called SEQUEL (for Structured English Query Language), and some people still pronounce SQL as sequel.

Database Applications Microsoft Access

DBF Chapter. Note to UNIX and OS/390 Users. Import/Export Facility CHAPTER 7

Chapter 22 Database: SQL, MySQL,

Choosing a Data Model for Your Database

Transcription:

Using the SQL Procedure Kirk Paul Lafler Software Intelligence Corporation Abstract The SQL procedure follows most of the guidelines established by the American National Standards Institute (ANSI). In its current form, PROC SQL provides many features available in the DATA step and the MEANS, PRINT, and SORT procedures. An advantage of using PROC SQL is that it can often result in fewer and shorter statements than using existing DATA step and procedure methods. This hands-on workshop illustrates numerous examples of how PROC SQL and its many statements can be used. In particular, participants will learn how to create and modify tables and views, retrieve data using the SELEcr statement, and perform efficient queries against SAS System data sets. Brief History of Structured Query Language (SQL) Structured Query Language (SQL) has an interesting history: It is a universal language that was developed to easily access data that is stored in relational databases or tables. A relation is represented as a two-dimensional table consisting of rows and columns. A series of guidelines were developed by Dr. E. F. Codd, an ffim mathematician, through the use of relational mathematics. SQL evolved over time to access data regardless of the application software (e.g., COBOL, PI./l, etc.) being used. Basic SQL Concepts SQL boasts the ability to define, manipulate, and control relational databases or tables as well as providing easy user access. The concept behind SQL is that the user does not have to specify physical attributes about the data such as data structure, location, and/or data type. The user will concentrate on what data should be selected, but not how to select them. It also provides methods of making changes to tables without the need to change application programs. Co,nsequently, data independence is one of the design goals of the relational model. The SQL Procedure in the Base SAS System PROC SQL has the following features: Can run interactively as well as batch (noninteractive). Uses the Structured Query Language to create, modify, and retrieve data from tables. Can be augmented through the use of Global statements such as TITLE and FOOTNOTE. Accesses tables via a two-level name where the first level is the Ubref and the second level name is the name of the table. 430

Terminology The following terminology is provided to help relate SQL and SAS System terms and concepts. The same as a variable in the SAS Acts as the alias or nickname. Points to a SAS data IlDf,iifV. A database system that forms relationships between data items. The same as an observation in the SAS A highly standardized high-level language used in relational database Q :. (1 systems to create and alter objects within a database. The same as a SAS System data set. Contains a definition or of data stored in another location. Introduction To retrieve and display data, the SELECT statement is used. To display data in a specific order, list the columns (variables) in the order desired on the SELECT statement. To sum and display data in groups, SQL uses the GROUP BY clause. The GROUP BY clause is used when a summary function is specified in the query. The resulting output is grouped by the column specified in the GROUP BY clause. To arrange results in a particular order (Ascending or Descending), SQL uses the ORDER BY clause. One or more columns can be selected for sorting. The default sort order is ascending (lowest to highest). To order data from highest to lowest specify the DESC (Descending) option following the column-name in the ORDER BY clause. Retrieve and Display Data To extract and retrieve data, we will use the SQL Procedure's SELECT statement. You will see in the following examples that the desired column(s) or variable(s) are displayed in the order indicated in the SELECT statement. Exercise 1: SELECT SSN, SEX FROM libref.patients; In Exercise 1, the columns SSN (Social Security Number) and SEX from the PATIENTS data set are selected and displayed. 431

Output: SSN 123456789 043667543 153675389 932456132 SEX M F M M Exercise 2: SELECf SSN, LASTNAME FROM libref.patients; In exercise 2, the columns SSN and Patient's Last Name from the PATIENTS data set are selected and displayed. Output: SSN 123456789 043667543 153675389 932456132 LASTNAME Smith Jones Cranberry Henderson Sum and Display Data in Groups To sum and display data in groups, we will use the SQL statement GROUP BY. In the next example the GROUP BY clause is used when a summary function is used in a query. The column being summed is WEIGHT (Patient's Weight) with the results of the query being grouped by SEX (Patient's Gender). Exercise 3: SELECf SEX, SUM(WEIGHT) AS TOTWEIGH FROM libref.patients f* Shortness of Breath */ WHERE SYMPTOM='lO' GROUP BY SEX; 432

We are requesting the SQL procedure to perform several operations in exercise 3. Let's first examine the significance of the WHERE clause. It tells the SQL processor to extract only those rows (records) that contain a value of '10' (Shortness of Breath) in the SYMPTOM column. Rows not meeting this criteria are automatically excluded from the query. Then, the columns SEX (gender) and WEIGHT are selected from the PATIENTS data set. Next, the SQL processor groups rows (in ascending order) by the value found in the column SEX and totals each patient's weight storing the results in TOTWEIGH. Finally, the columns SEX and TOTWEIGH from the PATIENTS data set are selected and displayed. Output: SEX F M TOTWEIGH 1800 3155 Arrange Results in Ascending Order To arrange results in ascending order, we will direct SQL by using the ORDER BY clause. One or more columns can be selected for sorting. One or more columns can be ordered in either ascending and/or descending order. The default sort order is ascending (lowest to highest). If you want to override the default order (arrange in descending order), you need to specify DESC following the column-name that is specified. Exercise 4: SELECT LASTNAME, EDUC FROM libref.patients ORDER BY EDUC; In exercise 4, all rows are first arranged in ascending order by the column EDUC (patient's years of education). Then the columns LASTNAME and EDUC from the PATIENTS data set are selected and displayed. Output: LASTNAME EDUC Candle Robertson Cranberry 10 12 13. 433

Syntax Requirements The Structured Query Language (SQL) is directed by the statements, options, and components within the SQL procedure to create, retrieve, and modify data from tables and views. The syntax requirements for using the SQL procedure within the SAS System follow. PROC SQL < option(s) > ; CREATE create-statement; DELETE delete-statement; DROP drop-statement; SELECT select-statement; UPDATE update-statement; SQL Procedure Statements SQL procedure statements are presented on the following pages. SELECf Statement The SELECf statement tells the SQL processor what columns of data to use in the query, formats desired information, and displays it as output. Ge1U!ral Format - SELECT: SELECf query-expression; Exercise s: SELECf LAS1NAME, DOB FROM libref.patients; SELECf CLINIC, SYMPTOM, DIAGNOSE label='patient"s Diagnosis' FROM libref.patients; CREATE Statement The CREAm statement provides a way to create tables, views, and indexes for a data set (table). There are multiple ways of creating tables, views, and indexes. This workshop will illustrate the statement that creates a new table and the column definitions within the table. 434

General Formal - CREATE TABLE: CREATE TABLE table-name (column-definition( s»; Data Types and Wulths for Columns: Data Types: Widths: CHARActER, INTEGER, DECIMAL Character columns default to 8 characters, Numeric columns are created using the maximum precision possible by the SAS System. A LENGTH statement can be specified in the DATA step. Exercise 6: CREATE TABLE CUNICS(CllN_NO char(6), DIRECTOR char(20»; PROC CONTENTS DATA=CUNICS; RUN; In exercise 6, a table called CUNICS is defined with two columns, CUN NO and DIRECTOR. The CONTENTS procedure is used to display data set (table) information. - General Formal - CREATE VIEW: CREATE VIEW view-name AS query-expression; ti}e view-name should be kept to a maximum of seven characters due to constraints of some host systems. A view is not the same thing as a table. A view represents a stored query-expression (sometimes known as a virtual table) where a table represents stored data. 435

Exercise 7: CREA1E VIEW libref.contacts AS SELECT CllN_NO, LASlNAME FROM libref.patients; SELECT * FROM libref.contacts; In exercise 7, a view is created called CONTACTS using the data derived from the PATIENTS data set. A query is then performed on the new view. DELETE Statement The DELE1E statement removes one or more rows from a table as indicated in the WHERE clause. It is typically used with a WHERE clause in order to inform the SQL processor what row(s) to exclude. General Formal - DELETE FROM: DELE1E FROM table-name WHERE sql-expression; Exercise 8: DELE1E FROM CLINICS WHERE CllN_NO='011234'; /* Displays all fields in CLINICS * / SELECT CllN_NO, REGION FROM CLINICS; In exercise 8, rows containing the value of '011234' in the CUN_NOcolumn are removed from the CLINICS data set. A query is then performed to select and display the columns CllN_NO and REGION from CLINICS. 436

DROP Statement The DROP statement deletes a table, view, or index. You must specify the libref when tables, views, or indexes are stored permanently. General Formal - DROP: DROP TABLE table-name; DROP VIEW view-name; DROP INDEX index-name FROM table-name; Exercise 9: DROP TABLE libref.cllnics; DROP VIEW libref.contacts; In exercise 9, the table (data set) CllNICS is first deleted. Then the view CONTACTS is deleted. UPDATE Statement The UPDATE statement allows for columns within existing rows (observations) of a table or view to be changed. Caution should be exercised while updating tables. Make sure you back-up your tables prior to changing data since accidents can happen. General Formal - UPDATE: UPDATE table-name I libref.view-name SET set-clause < WHERE where-expression >; Exercise 10: UPDATE libref.patcopy r Backup Copy */ SET WEIGHf=WEIGHf + 1; SELECT LASlNAME, WEIGHf FROM libref.patcopy; 437

In exercise 10, an UPDATE statement is used to increment the WEIGIIT column because of improper calibration of the scale. A que!)' is then performed to select and display the columns LASTNAME.and WEIGHT from the PATCOPY data set. SQL Procedure Statement Components The SQL procedure statement components provide a way to further enhance the selection and/or update criteria. The following components are available. BETWEEN -- searches for data lying within certain parameters. Column-Definition -- defines data types and widths. Column-Modifier -- establishes column attributes. From-List -- indicates what table or view to use in a FROM clause. Group-by-ltem -- indicates the groups of variables values processed in a GROUP BY clause. Order-by-ltem -- indicates the order in which observations are. displayed in an ORDER BY clause. SQL-Expression -- identifies functions, expressions, and operators that are used to connect them. Conclusion Structured Que!)' Language (SQL) is a universal language that is available within the base SAS System product. It was developed to easily access data that is stored in relational databases or tables. Relations can be thought of as two-dimensional tables consisting of rows and columns (like a SAS System data set). Through the use of relational mathematics a series of guidelines for defining, manipulating, and controlling tables were developed by Dr. E. F. Codd.. The concept behind SQL is to free the user from having to specify physical attributes about the data such as data structure, location, and/or data type. The user concentrates on what data should be selected, but not how to select it. The SQL Procedure provides a standardized way to retrieve and display data, sum and display data in groups, and arrange results in ascending or descending order by columns. PROC SQL can run in both interactive and batch modes. It can be used to create, modify, and retrieve data from tables. Global statements such as TITLE and OPTIONS can be used with PROC SQL. Tables are accessed via two-level names where the first is the Ubref and the second level is the name of the table. 438

Acknowledgments My sincere thanks are extended to everyone involved with the SUGI Conference especially Michael Murphy and Paul Grant, Section Chairs of the Hands-on Workshops. SAS is a registered trademark of SAS Institute Inc., Cary, NC, USA Author Contact The author will be happy to answer questions and accept suggestions at the following address: Kirk Paul LaDer Software Intelligence Corporation P.O. Box 1390 Spring Valley, CA 91979 1390 Tel: (619) 67O-S0Ff or (619) 670-7638 439