SAS SQL

Contents

- Performing queries using PROC SQL
- Performing advanced queries using PROC SQL
- Combining tables horizontally using PROC SQL
- Combining tables vertically using PROC SQL

Performing Queries Using PROC SQL (1)

Introduction

PROC SQL uses statements that are written in Structured Query Language (SQL), a standardized language that is widely used to retrieve and update data in tables and in views that are based on those tables. When you want to examine relationships between data values, subset your data, or compute values, the SQL procedure provides an easy, flexible way to analyze your data.

PROC SQL differs from most other SAS procedures in several ways:

- Many statements in PROC SQL, such as the SELECT statement, are composed of clauses.
- The PROC SQL step does not require a RUN statement.
- PROC SQL continues to run after you submit a step. To end the procedure, you must submit another PROC step, a DATA step, or a QUIT statement.

    proc sql outobs=10;
       select idnum, jobcode, salary, salary*.06 as bonus
          from sql.payroll
          where salary<32000
          order by jobcode;
    quit;

Performing Queries Using PROC SQL (2)

Writing a PROC SQL Step

Before creating a query, you must assign a libref to the SAS data library in which the table to be used is stored. Then you submit a PROC SQL step, using the PROC SQL statement to invoke the SQL procedure.

General form, basic PROC SQL step to perform a query:

    PROC SQL;
       SELECT column-1<, ...column-n>
          FROM table-1 | view-1<, ...table-n | view-n>
          <WHERE expression>
          <GROUP BY column-1<, ...column-n>>
          <ORDER BY column-1<, ...column-n>>;

where

- PROC SQL invokes the SQL procedure
- SELECT specifies the column(s) to be selected
- FROM specifies the table(s) to be queried
- WHERE subsets the data based on a condition
- GROUP BY classifies the data into groups based on the specified column(s)
- ORDER BY sorts the rows that the query returns by the value(s) of the specified column(s).

Performing Queries Using PROC SQL (3)

Selecting Columns
To specify which column(s) to display in a query, you write a SELECT clause as the first clause in the SELECT statement. In the SELECT clause, you can specify existing columns and create new columns that contain either text or a calculation.

Specifying the Table
You specify the table to be queried in the FROM clause.

Specifying Subsetting Criteria
To subset data based on a condition, write a WHERE clause that contains an expression.

Ordering Rows
The order of rows in the output of a PROC SQL query cannot be guaranteed unless you specify a sort order. To sort rows by the values of specific columns, use the ORDER BY clause.

    proc sql outobs=10;
       select idnum, jobcode, salary, salary*.06 as bonus
          from sql.payroll
          where salary<32000
          order by jobcode;
    quit;

Performing Queries Using PROC SQL (4)

Querying multiple tables (1)

You can use a PROC SQL step to query data that is stored in two or more tables. In SQL terminology, this is called joining tables. Follow these steps to join multiple tables:

- Specify column names from one or both tables in the SELECT clause. If you are selecting a column that has the same name in multiple tables, prefix the table name to that column name.
- Specify each table name in the FROM clause.
- Use the WHERE clause to select rows from two or more tables, based on a condition.
- Use the ORDER BY clause to sort rows that are retrieved from two or more tables by the values of the selected column(s).

Performing Queries Using PROC SQL (5)

Querying multiple tables (2)

    proc sql;
       select payroll.idnum, payroll.salary, payroll2.salary as newsalary
          from sql.payroll, sql.payroll2
          where payroll.idnum=payroll2.idnum
          order by payroll.idnum;
    quit;

Performing Queries Using PROC SQL (6)

Summarizing groups of data

You can use a GROUP BY clause in your PROC SQL step to summarize data in groups. The GROUP BY clause is used in queries that include one or more summary functions. Summary functions produce a statistical summary for each group that is defined in the GROUP BY clause.

    proc sql;
       select date, sum(boarded) as DayCarry
          from sql.march
          group by date;
    quit;

Performing Queries Using PROC SQL (7)

Summary Functions

    Function          Definition
    AVG, MEAN         mean or average of values
    COUNT, FREQ, N    number of nonmissing values
    CSS               corrected sum of squares
    CV                coefficient of variation (percent)
    MAX               largest value
    MIN               smallest value
    NMISS             number of missing values
    PRT               probability of a greater absolute value of Student's t
    RANGE             range of values
    STD               standard deviation
    STDERR            standard error of the mean
    SUM               sum of values
    SUMWGT            sum of the WEIGHT variable values
    T                 Student's t value for testing the hypothesis that the population mean is zero
    USS               uncorrected sum of squares
    VAR               variance
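A few of the functions in the table can be tried on a small table. The sketch below is illustrative only: the table work.pay and its values are made up for this example, with one salary deliberately left missing so that COUNT and NMISS differ.

```sas
/* Hypothetical data: one salary is missing on purpose */
data work.pay;
   input jobcode $ salary;
   datalines;
FA1 30000
FA1 32000
ME1 41000
ME1 .
;
run;

proc sql;
   select jobcode,
          count(salary) as n_salary,      /* nonmissing values only */
          nmiss(salary) as n_missing,     /* missing values only */
          avg(salary)   as avg_salary,    /* mean of nonmissing values */
          range(salary) as salary_range   /* max minus min */
      from work.pay
      group by jobcode;
quit;
```

With this data, group FA1 has two nonmissing salaries, a mean of 31000, and a range of 2000. Group ME1 has one nonmissing and one missing salary, so its mean is 41000 and its range is 0.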

Performing Queries Using PROC SQL (8)

Creating output tables (1)

To create a new table from the results of your query, you can use the CREATE TABLE statement in your PROC SQL step. This statement stores your results in a table instead of displaying the query results as a report.

General form, basic PROC SQL step for creating a table from a query result:

    PROC SQL;
       CREATE TABLE table-name AS
          SELECT column-1<, ...column-n>
             FROM table-1 | view-1<, ...table-n | view-n>
             <WHERE expression>
             <GROUP BY column-1<, ...column-n>>
             <ORDER BY column-1<, ...column-n>>;

where table-name specifies the name of the table to be created.

Performing Queries Using PROC SQL (9)

Creating output tables (2)

    proc sql;
       create table marchdc as
          select date, sum(boarded) as DayCarry
             from sql.march
             group by date;
    quit;

Because the CREATE TABLE statement is used, this query does not create a report. The SAS log verifies that the table was created and indicates how many rows and columns the table contains.

Performing Queries Using PROC SQL (10)

Additional features

To further refine a PROC SQL query that contains a GROUP BY clause, you can use a HAVING clause. A HAVING clause works with the GROUP BY clause to restrict the groups that are displayed in the output, based on one or more specified conditions.

    proc sql;
       select jobcode, avg(salary) as Avg
          from sql.payroll
          group by jobcode
          having avg(salary)>40000
          order by jobcode;
    quit;

Performing Advanced Queries Using PROC SQL (1)

Viewing SELECT statement syntax

When you construct a SELECT statement, you must specify the clauses in the following order:

1. SELECT
2. FROM
3. WHERE
4. GROUP BY
5. HAVING
6. ORDER BY

Note: Only the SELECT and FROM clauses are required.
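The clause order above can be seen in a single query that uses all six clauses. This is a minimal sketch; the table work.staffpay and its values are hypothetical and exist only for this illustration.

```sas
/* Hypothetical data for illustration */
data work.staffpay;
   input jobcode $ salary;
   datalines;
FA1 30000
FA1 34000
ME1 41000
ME1 45000
TA1 28000
TA1 52000
;
run;

proc sql;
   select jobcode, avg(salary) as avgsal   /* 1. SELECT                   */
      from work.staffpay                   /* 2. FROM                     */
      where salary > 29000                 /* 3. WHERE filters rows       */
      group by jobcode                     /* 4. GROUP BY forms groups    */
      having avg(salary) > 40000           /* 5. HAVING filters groups    */
      order by avgsal desc;                /* 6. ORDER BY sorts the rows  */
quit;
```

Here WHERE drops the 28000 row before grouping, so the TA1 average is 52000. HAVING then removes FA1 (average 32000), leaving TA1 followed by ME1 (average 43000).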

Performing Advanced Queries Using PROC SQL (2)

Displaying all columns

To display all columns in the order in which they are stored in the table, use an asterisk (*) in the SELECT clause. To write the expanded list of columns to the SAS log, use the FEEDBACK option in the PROC SQL statement.

    proc sql;
       select *
          from sql.payroll2;
    quit;

Using the FEEDBACK Option

    proc sql feedback;
       select *
          from sql.payroll2;
    quit;

Performing Advanced Queries Using PROC SQL (3)

Limiting the number of rows displayed

To limit the number of rows that PROC SQL displays as output, use the OUTOBS=n option in the PROC SQL statement.

General form, PROC SQL statement with OUTOBS= option:

    PROC SQL OUTOBS=n;

where n specifies the number of rows.

    proc sql outobs=10;
       select idnum, salary
          from sql.payroll;
    quit;

Performing Advanced Queries Using PROC SQL (4)

Eliminating duplicate rows from output

To eliminate duplicate rows from your query results, use the keyword DISTINCT in the SELECT clause.

    proc sql;
       select distinct flight, miles
          from sql.march
          order by 1;
    quit;

Performing Advanced Queries Using PROC SQL (5)

Subsetting rows by using conditional operators (1)

In a PROC SQL query, use the WHERE clause with any valid SAS expression to subset data. The SAS expression can contain one or more operators, including the following conditional operators:

- the BETWEEN-AND operator selects within an inclusive range of values

      where salary between 70000 and 80000

- the CONTAINS or ? operator selects a character string

      where name contains 'ER'
      where name ? 'ER'

- the IN operator selects from a list of fixed values

      where code in ('PT', 'NA', 'FA')

- the IS MISSING or IS NULL operator selects missing values

      where dateofbirth is missing
      where dateofbirth is null
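The four operators above can be tried side by side on a small table. In this sketch the table work.emp, its columns, and its values are assumptions made for illustration; one salary is missing so that IS MISSING has something to find.

```sas
/* Hypothetical data: CHEN has a missing salary */
data work.emp;
   input name $ code $ salary;
   datalines;
BAKER PT 72000
MILLER NA 75000
CHEN FA .
WEBER ME 81000
;
run;

proc sql;
   /* inclusive range: BAKER and MILLER */
   select name from work.emp where salary between 70000 and 80000;

   /* substring match (case sensitive): BAKER, MILLER, WEBER */
   select name from work.emp where name contains 'ER';

   /* list of fixed values: BAKER and CHEN */
   select name from work.emp where code in ('PT', 'FA');

   /* missing values: CHEN */
   select name from work.emp where salary is missing;
quit;
```

Note that the BETWEEN-AND query does not return CHEN, because a missing numeric value falls outside the range.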

Performing Advanced Queries Using PROC SQL (6)

Subsetting rows by using conditional operators (2)

Using the LIKE Operator to Select a Pattern

    Special Character      Represents
    underscore ( _ )       any single character
    percent sign (%)       any sequence of zero or more characters

    proc sql;
       select *
          from sql.staff
          where lname like '%SON';
    quit;

The pattern '%SON' specifies the following sequence:

- any number of characters (%)
- the string SON.

    LIKE Pattern       Name(s) Selected
    LIKE 'D_an'        Dyan
    LIKE 'D_an_'       Diana, Diane
    LIKE 'D_an__'      Dianna
    LIKE 'D_an%'       all names from the list
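The pattern table can be checked against a small list of names. The sketch below assumes a hypothetical table work.names that holds only the four names mentioned in the table; each query uses one of the patterns listed there.

```sas
/* Hypothetical list containing the names from the pattern table */
data work.names;
   input name $;
   datalines;
Diana
Diane
Dianna
Dyan
;
run;

proc sql;
   /* D, any one character, "an": expected to select Dyan */
   select name from work.names where name like 'D_an';

   /* one more single character at the end: Diana, Diane */
   select name from work.names where name like 'D_an_';

   /* any sequence (including none) after "an": all four names */
   select name from work.names where name like 'D_an%';
quit;
```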

Performing Advanced Queries Using PROC SQL (7)

Subsetting rows by using calculated values

It is important to understand how PROC SQL processes calculated columns. When you use a column alias in the WHERE or the HAVING clause to refer to a calculated value, you must also use the keyword CALCULATED along with the alias.

    proc sql outobs=10;
       select idnum, jobcode, salary, salary*.06 as bonus
          from sql.payroll
          where calculated bonus < 1000
          order by jobcode;
    quit;

Performing Advanced Queries Using PROC SQL (8)

Enhancing query output

You can enhance PROC SQL query output by using SAS enhancements such as column formats and labels, titles and footnotes, and character constants.

    proc sql outobs=15;
       title 'Current Bonus Information';
       title2 'Employees with Salaries > $75,000';
       select idnum label='Employee ID',
              jobcode label='Job Code',
              salary,
              'bonus is:',
              salary*.10 format=dollar12.2
          from sql.payroll
          where salary>75000
          order by salary desc;
    quit;

Performing Advanced Queries Using PROC SQL (9)

Points to remember (1)

COUNT values:

- Counting nonmissing values:

      select count(Country) as Number

- Counting all rows:

      select count(*) as Number

Table aliases are usually optional. However, there are two situations that require their use:

- a table is joined to itself (called a self-join or reflexive join)

      from airline.staffmaster as s1, airline.staffmaster as s2

- you need to reference columns from same-named tables in different libraries

      from airline.flightdelays as af, work.flightdelays as wf
      where af.delay > wf.delay
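A self-join is easier to follow with concrete rows. The following sketch uses a made-up table work.staffmaster in which each row stores the ID of that employee's manager; the column names empid and mgrid are assumptions for this example and do not come from the airline.staffmaster table above.

```sas
/* Hypothetical data: Ann has no manager (missing mgrid) */
data work.staffmaster;
   input empid $ name $ mgrid $;
   datalines;
E1 Ann .
E2 Bob E1
E3 Cho E1
;
run;

proc sql;
   /* The same table appears twice, so aliases e and m are required */
   select e.name as employee, m.name as manager
      from work.staffmaster as e, work.staffmaster as m
      where e.mgrid = m.empid;
quit;
```

The query pairs Bob and Cho with Ann. Ann herself is not listed as an employee, because her missing mgrid matches no empid.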

Performing Advanced Queries Using PROC SQL (10)

Points to remember (2)

Unlike PROC SORT, which uses the keyword DESCENDING before the variable name, PROC SQL uses the keyword DESC, placed after the column name.

    order by Type desc, Name;

You can sort by any column within the SELECT clause by specifying its numerical position. By specifying a position instead of a name, you can sort by a calculated column that has no alias.

    select Name, Population, Area, Population/Area label='Density'
       from countries
       order by 4;

Performing Advanced Queries Using PROC SQL (11)

Points to remember (3)

If you specify a GROUP BY clause in a query that does not contain a summary function, your clause is changed to an ORDER BY clause, and a message to that effect is written to the SAS log.

    proc sql;
       select *
          from sql.payroll2
          group by jobcode;
    quit;

    WARNING: A GROUP BY clause has been transformed into an ORDER BY clause because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function.

Combining Tables Horizontally Using PROC SQL

Contents

- Understanding joins
- Generating a Cartesian product
- Using inner joins
- Using outer joins
- Comparing SQL joins and DATA step match-merges
- Understanding the advantages of PROC SQL joins

Understanding Joins

Joins combine tables horizontally (side by side) by combining rows. The tables being joined are not required to have the same number of rows or columns.

Generating a Cartesian Product

When you specify multiple tables in the FROM clause but do not include a WHERE clause to subset data, PROC SQL returns the Cartesian product of the tables. In a Cartesian product, each row in the first table is combined with every row in the second table. For example, two tables of 3 rows each produce 3 x 3 = 9 rows.

    proc sql;
       select *
          from one, two;
    quit;
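To see the row count grow, the query can be run on two tiny tables. The contents of tables one and two below are assumptions for illustration (the original slide showed them only as a picture); what matters is that each has three rows.

```sas
/* Hypothetical three-row tables */
data one;
   input x a $;
   datalines;
1 a
2 b
4 d
;
run;

data two;
   input x b $;
   datalines;
2 x
3 y
5 v
;
run;

proc sql;
   /* No WHERE clause: every row of one is paired with every row of two */
   select count(*) as nrows
      from one, two;
quit;
```

The count is 9, the product of the two row counts. Adding a condition such as where one.x = two.x turns the query into an inner join that, with this data, keeps only the single pair with x equal to 2.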

Using Inner Joins

An inner join combines and displays only the rows from the first table that match rows from the second table, based on the matching criteria (also known as join conditions) that are specified in the WHERE clause.

Note: An inner join is sometimes called a conventional join.

    proc sql;
       select *
          from one, two
          where one.x = two.x;
    quit;

You can combine a maximum of 32 tables in a single inner join.

Using Outer Joins (1)

An outer join combines and displays all rows that match across tables, based on the specified matching criteria (also known as join conditions), plus some or all of the rows that do not match.

    Type of outer join    Output
    Left                  All matching rows plus nonmatching rows from the first table specified in the FROM clause (the left table)
    Right                 All matching rows plus nonmatching rows from the second table specified in the FROM clause (the right table)
    Full                  All matching rows plus nonmatching rows in both tables

Using Outer Joins (2)

Examples of outer joins

Left join:

    proc sql;
       select *
          from one left join two
          on one.x=two.x;
    quit;

Right join:

    proc sql;
       select *
          from one right join two
          on one.x=two.x;
    quit;

Full join:

    proc sql;
       select *
          from one full join two
          on one.x=two.x;
    quit;

Note: An inner join that uses this syntax can be performed on only two tables or views at a time. When an inner join uses the syntax presented earlier, up to 32 tables or views can be combined at once.
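The three outer joins differ only in which nonmatching rows they keep, which is easiest to see by counting rows. The tables below are hypothetical (three rows each, with only x = 2 in common) and are created here so the sketch is self-contained.

```sas
/* Hypothetical tables: only x = 2 appears in both */
data one;
   input x a $;
   datalines;
1 a
2 b
4 d
;
run;

data two;
   input x b $;
   datalines;
2 x
3 y
5 v
;
run;

proc sql;
   title 'Left join: 3 rows (all rows of one)';
   select * from one left join two on one.x = two.x;

   title 'Right join: 3 rows (all rows of two)';
   select * from one right join two on one.x = two.x;

   title 'Full join: 5 rows (1 match + 2 + 2 nonmatches)';
   select * from one full join two on one.x = two.x;
quit;
title;
```

In each result, columns that come from the table without a matching row are filled with missing values.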

Comparing SQL Joins and DATA Step Match-Merges (1)

Let's compare the use of SQL joins and DATA step match-merges in the following situations:

- when all of the values of the selected variable (column) match
- when only some of the values of the selected variable (column) match.

Comparing SQL Joins and DATA Step Match-Merges (2)

When all of the values match

When all of the values of the BY variable match, you can use a PROC SQL inner join to produce the same results as a DATA step match-merge.

DATA step match-merge:

    data merged;
       merge one two;
       by x;
    run;

PROC SQL inner join:

    proc sql;
       select one.x, a, b
          from one, two
          where one.x = two.x
          order by x;
    quit;

Comparing SQL Joins and DATA Step Match-Merges (3)

When only some of the values match

Unlike the DATA step match-merge, a PROC SQL outer join does not overlay the two common columns by default.

DATA step match-merge:

    data merged;
       merge three four;
       by x;
    run;

PROC SQL full outer join:

    proc sql;
       select three.x, a, b
          from three full join four
          on three.x = four.x
          order by x;
    quit;

Comparing SQL Joins and DATA Step Match-Merges (4)

When only some of the values match: using the COALESCE function

When you add the COALESCE function to the SELECT clause of the PROC SQL outer join, the PROC SQL outer join can produce the same result as a DATA step match-merge.

PROC SQL full outer join:

    proc sql;
       select coalesce(three.x, four.x) as X, a, b
          from three full join four
          on three.x = four.x;
    quit;
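One way to check the equivalence is to build both results as tables and compare them. This is a sketch under assumptions: the contents of tables three and four are invented for illustration, the tables are already in order by x (which the match-merge requires), and an ORDER BY clause is added to the join so both tables have the same row order.

```sas
/* Hypothetical tables, already sorted by x; only x = 2 matches */
data three;
   input x a $;
   datalines;
1 a1
2 b1
4 d
;
run;

data four;
   input x b $;
   datalines;
2 x1
3 y
5 v
;
run;

/* DATA step match-merge: x is overlaid automatically */
data merged;
   merge three four;
   by x;
run;

/* Full join: COALESCE takes x from whichever table has it */
proc sql;
   create table joined as
      select coalesce(three.x, four.x) as x, a, b
         from three full join four
         on three.x = four.x
         order by x;
quit;

/* Should report that all compared values are equal */
proc compare base=merged compare=joined;
run;
```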

Understanding the Advantages of PROC SQL Joins

- PROC SQL joins do not require sorted tables.
- PROC SQL joins do not require that the columns in join expressions have the same name.

      select table1.x, lastname, status
         from table1, table2
         where table1.id = table2.custnum;

- PROC SQL joins can use comparison operators other than the equal sign (=).

      select a.itemnumber, cost, price
         from table1 as a, table2 as b
         where a.itemnumber = b.itemnumber
               and a.cost>b.price;

Combining Tables Vertically Using PROC SQL

Contents

- Introducing set operators
- Using the EXCEPT set operator
- Using the INTERSECT set operator
- Using the UNION set operator
- Using the OUTER UNION set operator
- Producing rows from the first query or the second query

Introducing Set Operators (1)

Understanding set operations

A set operation is a SELECT statement that contains

- two groups of query clauses (each group beginning with a SELECT clause)
- a set operator
- optionally, one or both of the keywords ALL and CORR (CORRESPONDING).

Introducing Set Operators (2)

    Set operator    Rows selected
    EXCEPT          unique rows from the first table that are not found in the second table
    INTERSECT       unique rows that are common to both tables
    UNION           unique rows from one or both tables
    OUTER UNION     all rows from both tables

Introducing Set Operators (3)

Processing unique vs. duplicate rows

When processing a set operation that displays only unique rows (a set operation that contains the set operator EXCEPT, INTERSECT, or UNION), PROC SQL makes two passes through the data by default:

1. PROC SQL eliminates duplicate (nonunique) rows in the tables.
2. PROC SQL selects the rows that meet the criteria and, where requested, overlays columns.

Using the EXCEPT Set Operator (1)

Using the EXCEPT operator alone (1)

Suppose you want to display the unique rows in table One that are not found in table Two.

    proc sql;
       select *
          from one
       except
       select *
          from two;
    quit;

Using the EXCEPT Set Operator (2)

Using the EXCEPT operator alone (2)

The set operator EXCEPT overlays columns by their position. In this output, the following columns are overlaid:

- the first columns, One.X and Two.X, both of which are numeric
- the second columns, One.A and Two.B, both of which are character.

The column names from table One are used, so the second column of output is named A rather than B.

Using the EXCEPT Set Operator (3)

Using the EXCEPT operator alone (3)

Let's take a closer look at this example to see exactly how PROC SQL selects rows from table One to display in output.

In the first pass, PROC SQL eliminates any duplicate rows from the tables. There is one duplicate row: in table One, the second row is a duplicate of the first row. All remaining rows in table One are still candidates in PROC SQL's selection process.

In the second pass, PROC SQL identifies any rows in table One for which there is a matching row in table Two and eliminates them. There is one matching row in the two tables, and it is eliminated.

Using the EXCEPT Set Operator (4)

Combining and overlaying columns

By default, the set operators EXCEPT, INTERSECT, and UNION overlay columns based on the relative position of the columns in the SELECT clause. Column names are ignored. You control how PROC SQL maps columns in one table to columns in another table by specifying the columns in the appropriate order in the SELECT clause. The first column specified in the first query's SELECT clause and the first column specified in the second query's SELECT clause are overlaid, and so on.

Using the EXCEPT Set Operator (5)

Modifying results by using keywords

ALL
Makes only one pass through the data and does not remove duplicate rows.

CORR (or CORRESPONDING)
Compares and overlays columns by name instead of by position:

- When used with EXCEPT, INTERSECT, and UNION, removes any columns that do not have the same name in both tables.
- When used with OUTER UNION, overlays same-named columns and displays columns that have nonmatching names without overlaying.

Using the EXCEPT Set Operator (6)

Using the keyword ALL with the EXCEPT operator (1)

To select all rows in the first table (both unique and duplicate) that do not have a matching row in the second table, add the keyword ALL after the EXCEPT set operator.

    proc sql;
       select *
          from one
       except all
       select *
          from two;
    quit;

Using the EXCEPT Set Operator (7)

Using the keyword ALL with the EXCEPT operator (2)

PROC SQL has again eliminated the one row in table One (the fifth row) that has a matching row in table Two (the fourth row). Remember that when the keyword ALL is used with the EXCEPT operator, PROC SQL does not make an extra pass through the data to remove duplicate rows within table One. Therefore, the second row in table One, which is a duplicate of the first row, is now included in the output.

Using the EXCEPT Set Operator (8)

Using the keyword CORR with the EXCEPT operator (1)

To display both of the following, add the keyword CORR after the set operator:

- only columns that have the same name
- all unique rows in the first table that do not appear in the second table.

    proc sql;
       select *
          from one
       except corr
       select *
          from two;
    quit;

Using the EXCEPT Set Operator (9)

Using the keyword CORR with the EXCEPT operator (2)

In the first pass, PROC SQL eliminates the second and third rows of table One from the output because they are not unique within the table; they contain values of X that duplicate the value of X in the first row of table One. Note that column A is not used.

In the second pass, PROC SQL eliminates the first, fourth, and fifth rows of table One because each contains a value of X that matches a value of X in a row of table Two.

The output displays the two remaining rows in table One, the rows that are unique in table One and that do not have a row in table Two that has a matching value of X.

Using the EXCEPT Set Operator (10)

Using the keywords ALL and CORR with the EXCEPT operator (1)

If the keywords ALL and CORR are used together, the EXCEPT operator displays all unique and duplicate rows in the first table that do not appear in the second table, and overlays and displays only columns that have the same name.

    proc sql;
       select *
          from one
       except all corr
       select *
          from two;
    quit;

Using the EXCEPT Set Operator (11)

Using the keywords ALL and CORR with the EXCEPT operator (2)

Because the ALL keyword is used, PROC SQL does not eliminate any duplicate rows in table One. PROC SQL does eliminate the first, fourth, and fifth rows in table One from the output because for each one of these three rows there is a corresponding row in table Two that has a matching value of X.

Table One contains three rows in which the value of X is 1, but table Two contains only one row in which the value of X is 1. That one row in table Two causes only the first of the three rows in table One that have a matching value of X to be eliminated from the output.
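The four EXCEPT variants can be run together on a pair of small tables. The data below is an assumption: the slides' pictures of tables One and Two did not survive, so these rows were chosen to agree with the row-by-row descriptions in this section (a duplicated first row in One, three rows with X = 1, and the fifth row of One matching the fourth row of Two).

```sas
/* Assumed contents, consistent with the descriptions in this section */
data one;
   input x a $;
   datalines;
1 a
1 a
1 b
2 c
3 v
4 e
6 g
;
run;

data two;
   input x b $;
   datalines;
1 x
2 y
3 z
3 v
5 w
;
run;

proc sql;
   title 'EXCEPT: 5 rows (duplicate 1 a and matching 3 v removed)';
   select * from one except select * from two;

   title 'EXCEPT ALL: 6 rows (duplicate 1 a kept)';
   select * from one except all select * from two;

   title 'EXCEPT CORR: X only, values 4 and 6';
   select * from one except corr select * from two;

   title 'EXCEPT ALL CORR: X only, values 1, 1, 4, 6';
   select * from one except all corr select * from two;
quit;
title;
```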

Using the INTERSECT Set Operator (1)

Using the INTERSECT operator alone

The INTERSECT operator compares and overlays columns in the same way as the EXCEPT operator, by column position instead of column name. However, INTERSECT selects rows differently and displays in output the unique rows that are common to both tables.

    proc sql;
       select *
          from one
       intersect
       select *
          from two;
    quit;

Using the INTERSECT Set Operator (2)

Using the keyword ALL with the INTERSECT operator

Adding the keyword ALL to the preceding PROC SQL query prevents PROC SQL from making an extra pass through the data. If there were any rows common to tables One and Two that were duplicates of other common rows, they would also be included in output.

    proc sql;
       select *
          from one
       intersect all
       select *
          from two;
    quit;

Using the INTERSECT Set Operator (3)

Using the keyword CORR with the INTERSECT operator

To display the unique rows that are common to the two tables based on the column name instead of the column position, add the CORR keyword to the PROC SQL set operation. Note that columns A and B are not used.

    proc sql;
       select *
          from one
       intersect corr
       select *
          from two;
    quit;

Using the INTERSECT Set Operator (4)

Using the keywords ALL and CORR with the INTERSECT operator

If the keywords ALL and CORR are used together, the INTERSECT operator displays all unique and nonunique (duplicate) rows that are common to the two tables, based on columns that have the same name.

    proc sql;
       select *
          from one
       intersect all corr
       select *
          from two;
    quit;
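The INTERSECT variants can be compared on the same kind of sample data. As before, the contents of tables One and Two are assumed for illustration, since the original pictures of the tables are not available.

```sas
/* Assumed contents of tables One and Two (illustrative only) */
data one;
   input x a $;
   datalines;
1 a
1 a
1 b
2 c
3 v
4 e
6 g
;
run;

data two;
   input x b $;
   datalines;
1 x
2 y
3 z
3 v
5 w
;
run;

proc sql;
   title 'INTERSECT: the one common row, 3 v';
   select * from one intersect select * from two;

   title 'INTERSECT CORR: X only, values 1, 2, 3';
   select * from one intersect corr select * from two;

   title 'INTERSECT ALL CORR: X only, still 1, 2, 3';
   select * from one intersect all corr select * from two;
quit;
title;
```

With this data INTERSECT ALL CORR returns no extra rows: although X = 1 occurs three times in One and X = 3 occurs twice in Two, each value occurs only once in the other table, so only one copy of each is common to both.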

Using the UNION Set Operator (1)

Using the UNION operator alone

To display all rows from the tables One and Two that are unique in the combined set of rows from both tables, use a PROC SQL set operation that includes the UNION operator:

    proc sql;
       select *
          from one
       union
       select *
          from two;
    quit;

Using the UNION Set Operator (2)

Using the keyword ALL with the UNION operator

When the keyword ALL is added to the UNION operator, the output displays all rows from both tables, both unique and duplicate.

    proc sql;
       select *
          from one
       union all
       select *
          from two;
    quit;

Using the UNION Set Operator (3)

Using the keyword CORR with the UNION operator

To display all rows from the tables One and Two that are unique in the combined set of rows from both tables, based on columns that have the same name rather than the same position, add the keyword CORR after the set operator.

    proc sql;
       select *
          from one
       union corr
       select *
          from two;
    quit;

Using the UNION Set Operator (4)

Using the keywords ALL and CORR with the UNION operator

If the keywords ALL and CORR are used together, the UNION operator displays all rows in the two tables, both unique and duplicate, based on the columns that have the same name.

    proc sql;
       select *
          from one
       union all corr
       select *
          from two;
    quit;
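Row counts make the difference between the UNION variants concrete. The sketch below again uses assumed contents for tables One (7 rows) and Two (5 rows); the counts in the titles follow from that data and would differ for other tables.

```sas
/* Assumed contents of tables One and Two (illustrative only) */
data one;
   input x a $;
   datalines;
1 a
1 a
1 b
2 c
3 v
4 e
6 g
;
run;

data two;
   input x b $;
   datalines;
1 x
2 y
3 z
3 v
5 w
;
run;

proc sql;
   title 'UNION: 10 unique rows';
   select * from one union select * from two;

   title 'UNION ALL: all 12 rows (7 + 5)';
   select * from one union all select * from two;

   title 'UNION CORR: X only, 6 unique values (1 to 6)';
   select * from one union corr select * from two;

   title 'UNION ALL CORR: X only, all 12 values';
   select * from one union all corr select * from two;
quit;
title;
```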

Using the OUTER UNION Set Operator (1)

The set operator OUTER UNION concatenates the results of the queries by

- selecting all rows (both unique and nonunique) from both tables
- not overlaying columns.

Let's see how OUTER UNION works when used alone and with the keyword CORR. The ALL keyword is not used with OUTER UNION because this operator's default action is to include all rows in output.

Using the OUTER UNION Set Operator (2)

Using the OUTER UNION operator alone

Suppose you want to display all rows from both of the tables One and Two, without overlaying columns.

    proc sql;
       select *
          from one
       outer union
       select *
          from two;
    quit;

In the output, the columns have not been overlaid. Instead, all four columns from both tables are displayed. Each row of output contains missing values in the two columns that correspond to the other table.

Using the OUTER UNION Set Operator (3)

Using the keyword CORR with the OUTER UNION operator

The output from the preceding set operation contains two columns with the same name. To overlay the columns with a common name, add the CORR keyword to the set operation:

    proc sql;
       select *
          from one
       outer union corr
       select *
          from two;
    quit;

The output from the modified set operation contains only three columns, because the two columns named X are overlaid.
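Both forms of OUTER UNION can be run back to back to compare the column layout. The tables below are small hypothetical stand-ins (two rows each) for One and Two; only the column names X, A, and B are taken from the text.

```sas
/* Hypothetical two-row tables sharing the column name x */
data one;
   input x a $;
   datalines;
1 a
2 b
;
run;

data two;
   input x b $;
   datalines;
2 y
3 z
;
run;

proc sql;
   title 'OUTER UNION: 4 rows, 4 columns (x, a, x, b)';
   select * from one outer union select * from two;

   title 'OUTER UNION CORR: 4 rows, 3 columns (x, a, b)';
   select * from one outer union corr select * from two;
quit;
title;
```

In the first result, rows from One have missing values in the second x and in b, and rows from Two have missing values in the first x and in a. In the second result the two x columns share one column, so only a or b is missing in each row.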

Using the OUTER UNION Set Operator (4)

Comparing Outer Unions and other SAS techniques (1)

A PROC SQL set operation that uses the OUTER UNION operator is just one SAS technique that you can use to concatenate tables.

Program 1: PROC SQL OUTER UNION Set Operation with CORR

    proc sql;
       create table three as
          select *
             from one
          outer union corr
          select *
             from two;
    quit;

Program 2: DATA Step, SET Statement, and PROC PRINT Step

    data three;
       set one two;
    run;

    proc print data=three noobs;
    run;

These two programs create the same table as output.

Using the OUTER UNION Set Operator (5)

Comparing Outer Unions and other SAS techniques (2)

When tables have a same-named column, the PROC SQL outer union will not produce the same output as the DATA step unless the keyword CORR is also used. CORR causes the same-named columns to be overlaid; without CORR, the OUTER UNION operator includes both of the same-named columns in the result set. The DATA step program generates only one column X.

The two concatenation techniques shown above also vary in efficiency. A PROC SQL set operation generally requires more computer resources but may be more convenient and flexible than the DATA step equivalent.

Producing Rows from the First Query or the Second Query

There is no keyword in PROC SQL that returns the unique rows that occur in the first table or the second table, but not the rows that occur in both. Here is one way you can simulate this operation:

    (query1 except query2)
    union
    (query2 except query1)

This example shows how to use this operation.

    proc sql;
       (select * from sql.a
        except
        select * from sql.b)
       union
       (select * from sql.b
        except
        select * from sql.a);
    quit;

The first EXCEPT returns one unique row from the first table (table A) only. The second EXCEPT returns one unique row from the second table (table B) only. The middle UNION combines the two results.
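The same pattern can be tried without the sql library. In this sketch the tables work.a and work.b and their rows are made up for illustration: two rows are shared, and each table has one row of its own.

```sas
/* Hypothetical tables: rows 1 and 2 are shared */
data work.a;
   input id label $;
   datalines;
1 one
2 two
3 three
;
run;

data work.b;
   input id label $;
   datalines;
1 one
2 two
4 four
;
run;

proc sql;
   /* Rows only in a, combined with rows only in b */
   (select * from work.a
    except
    select * from work.b)
   union
   (select * from work.b
    except
    select * from work.a);
quit;
```

The result contains two rows, 3 three and 4 four. The shared rows 1 one and 2 two are removed by both EXCEPT operations and so never reach the UNION.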