Efficient Techniques and Tips in Handling Large Datasets Shilong Kuang, Kelley Blue Book Inc., Irvine, CA
ABSTRACT

When we work with millions of records and hundreds of variables, how we process the data is crucial. To make SAS really ROCK, we need to pay more attention to SAS program efficiency, since a single DATA step or SQL query may take a few hours on such large datasets. In this paper, we present a few practical, efficient techniques and hands-on tips for handling large datasets, including the use of an INDEX, splitting a single step into multiple steps to improve efficiency, the classic WHERE vs. IF statement, and some tips for joining large datasets in PROC SQL. To show the efficiency of these techniques, we also provide experimental example output for each case: the time and resources consumed, in an "apple-to-apple" comparison of the process with and without the technique. With these tips in our large data practice, we can save a lot of space and time. SAS ROCKS!

Keywords: data analysis, large data manipulation, efficient techniques tips, create index, data mining

INTRODUCTION

Efficiency in SAS programming has traditionally been defined as the optimization of space (computer resources etc.) and time (CPU process time, data I/O time, programmer time etc.). It has become more and more crucial now that large datasets are everywhere. When we are sitting in front of a big dataset with millions of records and hundreds of variables, how do we work with it? Every single step may take a few hours to complete if we don't handle it carefully. In particular, during the data preparation or model testing stage, we are torturing ourselves if a single test run takes hours and we have to go back and forth testing several times. In this paper, we provide a few efficient techniques to help handle those situations carefully and make our SAS programs as efficient as possible.
WORK ON ONLY WHAT YOU NEED

Example: we want to sort a large dataset with 10 million records. There are 20 variables altogether, but in fact we need only 5 of them.

Select only the 5 variables needed:

    proc sort data=data_in(keep=var1-var5);
      by var1-var4;
    run;

    Real Time: seconds    CPU Time: seconds

Include some unnecessary variables:

    proc sort data=data_in(keep=var1-var10);
      by var1-var4;
    run;

    Real Time: 8:59.72 minutes    CPU Time: 1:54.64 minutes

Furthermore, if we include all 20 variables in the sort, there is still no response after waiting 25 minutes. We can see that the processing time is not linearly proportional to the number of variables: with more variables included in the sort procedure, the processing time grows several-fold.

MULTI-STEP VS. SINGLE STEP

Example: we want to sort a bigger dataset with 50 million records, with all 10 variables (var1-var10) needed. In our sort procedure, we need to sort with NODUPKEY on var1-var3. The code for a single sort step is simpler than for the multi-step approach, in which we first split the big dataset into two smaller parts, sort each part with NODUPKEY separately, then combine them and sort with NODUPKEY again.
Method I: Single-step sort with NODUPKEY

    proc sort data=data_in nodupkey;
      by var1-var3;
    run;

    Real Time: 73:31.08 minutes    CPU Time: 8:44.89 minutes

Method II: Multi-step sort with NODUPKEY

    proc sort data=data_in(firstobs=1 obs= ) out=out1 nodupkey;
      by var1-var3;
    run;

    Real Time: 18:20.50 minutes    CPU Time: 3:18.21 minutes

    proc sort data=data_in(firstobs= ) out=out2 nodupkey;
      by var1-var3;
    run;

    Real Time: 20:02.22 minutes    CPU Time: 3:51.10 minutes

    proc append base=out1 data=out2 force;
    run;

    Real Time: seconds    CPU Time: seconds

    proc sort data=out1 out=data_out nodupkey;
      by var1-var3;
    run;

    Real Time: 1:17.91 minutes    CPU Time: seconds

    Total time consumed:
    Real Time: < 41 minutes    CPU Time: < 9 minutes

We can easily see the big time difference between the multi-step and single-step approaches: instead of waiting 74 minutes for the single step, we can finish the same work within 41 minutes in multiple steps. What a difference! We believe an efficient programmer should not be stingy with SAS code; the extra coding work is easily repaid by the time it saves.

INDEX & WHERE > WHERE > IF

Example: we are still using the previous dataset with 50 million records and 10 variables, and we want to find the subset satisfying a certain condition (var1='key').

IF statement:

    data data_out1;
      set data_in;
      if var1='key';
    run;

    Real Time: seconds    CPU Time: seconds

WHERE statement:

    data data_out2;
      set data_in(where=(var1='key'));
    run;

    Real Time: seconds    CPU Time: seconds

We can see the significant time reduction with the WHERE statement. An INDEX is usually applied to optimize a WHERE statement or a BY statement. The WHERE statement is faster than the IF statement because, with WHERE in a DATA step, a record that does not satisfy the condition (var1='key') is not read into the Program Data Vector (PDV), which saves a lot of unnecessary reading time. We can create an INDEX with simple code such as the following (in PROC SQL, a simple index must have the same name as the column it indexes):

    proc sql;
      create index var1 on data_in(var1);
    quit;

To check whether the INDEX has been used in optimization, we can set the following option and check the log output:

    options msglevel=i;

As for when to use an index, the rule of thumb is that the subset should be only a small portion of the whole dataset: as long as the subset is less than 20% of the whole dataset, the index will generally improve performance.

WORK ON A SMALL SAMPLE FIRST TO TEST THE WHOLE PROCESS

After we have tested each process (a DATA step, a procedure, or an SQL query), and there are several of them in the whole process, we want to test our program as a whole, for instance, for the following process flow:

1. data one; set two(where=(var1='key'));
2. proc sql
3. proc glmselect

To test the whole process, we can simply choose a smaller subset for testing purposes. The SAS procedure SURVEYSELECT can help us get a well-distributed sample subset.

    proc surveyselect data=data_in seed= method=srs n=100 out=data_out;
    run;

To make life easier, we can just use the OBS= option in the DATA step:

    data one;
      set two(obs=100 where=(var1='key'));
    run;

SAVING SPACE FOR LARGE DATASETS

There are quite a few SUGI papers with detailed investigations of how to save data storage space. Since we don't want to get into too much theory, we provide a few practical techniques to save space, in particular for large datasets.

    options compress=yes;

This trivial option setting can save us a lot of space, especially for temporary files in the work folder. A dataset that takes 5 GB compressed can easily take more than 20 GB uncompressed, and a few such uncompressed files can easily take more than 100 GB of disk space altogether. Some people may object that the COMPRESS option costs more CPU processing time. The key is to consider which side matters more: on one side, a little more processing time (from 30 seconds to 1 minute); on the other side, the risk of running out of temporary folder space.
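As a minimal sketch of the trade-off described above, compression can also be requested for a single dataset with the COMPRESS= dataset option, which makes it easy to compare sizes before changing the session default. The dataset names work.big and work.big_compressed are hypothetical placeholders and do not come from the original paper.

```sas
/* Hypothetical names: work.big stands for any large existing dataset. */
data work.big_compressed(compress=yes);   /* compress this one output only */
  set work.big;
run;

/* PROC CONTENTS reports whether the dataset is compressed and how many   */
/* pages it occupies, which can be compared with the uncompressed copy.   */
proc contents data=work.big_compressed;
run;
```

The log for the DATA step typically also carries a note on how much compression changed the dataset size, so the space saved can be weighed against the extra CPU time for that particular dataset.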
In fact, we can even set the COMPRESS option in the SAS configuration file, sasv9.cfg. The configuration file is usually located at :\Program Files\SAS\SASFoundation\9.2\nls\en\SASV9.CFG. We can open that file in a plain text editor such as Notepad and add one line:

    -COMPRESS=YES

Then all the files generated in the work folder (and also any permanent files) will be in compressed format. This way we can easily control the space used by the temporary work folder. This is especially useful when several SAS programmers work on the same server, since we don't need to check their settings one by one.

Change numerical variables to character variables. This option is useful if we have a large dataset with large numerical values in some column. Because numbers are stored in a binary (0/1) representation, even the shortest numeric length takes 2 or 3 bytes; meanwhile, a single character takes only 1 byte of storage space. The conversion is also useful when we have a huge numerical value, say over 32 integer digits: SAS usually runs into rounding issues with such huge values (SAS has a limit on the number of integer digits it can display). After switching to character variables, this concern goes away.
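A minimal sketch of the numeric-to-character conversion, under stated assumptions: flags_num is a hypothetical dataset with a numeric 0/1 variable named flag stored at the default numeric length of 8 bytes. Neither name appears in the original paper.

```sas
/* Hypothetical input: flags_num with a numeric 0/1 variable FLAG. */
data flags_char;
  set flags_num;
  length flag_c $1;          /* one byte of storage per value        */
  flag_c = put(flag, 1.);    /* 0 becomes '0', 1 becomes '1'         */
  drop flag;                 /* keep only the character version      */
run;
```

The conversion saves space only when the character length is shorter than the numeric length it replaces, so it suits short codes and flags rather than long values.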
SEND NOTICE WHEN IT'S DONE

Last but not least, when we have to wait some time for our program to finish running, whether during initial data preparation, model testing, or the final production run, we can simply put the following SAS code at the end of the program, asking SAS to send us an email whenever it is done. Checking email is much easier than logging in to the SAS server to see whether the program has finished, and with mobile technology, checking email on a phone is more convenient than ever.

    filename mymail email "your company email address"
             cc="your gmail address"
             subject="SAS task finished"
             attach='directory\any file';

    data _null_;
      file mymail;
      put 'The SAS task has finished.';
    run;

Note: you may need to set up your email account appropriately if it is on a remote SAS server. If interested, see the recommended reading for more details.

CONCLUSION

In this paper, we provide a few practical techniques for large data analysis. Each technique might seem trivial to many SAS programmers, but combined they can be very powerful! Keeping these techniques in mind, we will have a more enjoyable, less painful experience on our long SAS journey (there is no shortcut to becoming a SAS expert!).

ACKNOWLEDGMENTS

We would like to thank our colleagues Alice Xie, Bisser Roussanov, Richard Umstaetter, Roger Yeh and Shelly Teh et al. for their help in our daily SAS large data practice. We would also like to thank our Vice-President Shawn Hushman for his trust and support of SAS training.

REFERENCE

1. The Use and Abuse of the Program Data Vector, Jim Johnson, Proceedings of the 2003 Conference of the Pharmaceutical Industry SAS Users Group, Cary, NC: SAS Institute, Inc.
2. The Basics of Using SAS Indexes, Michael A. Raithel, SAS Users Group International (SUGI), Proceedings 30, Tutorials.
3. Kirk's Korner, Quick & Simple Tips, Kirk Paul Lafler, Software Intelligence Corporation.
4. A Faster Index for Sorted SAS Datasets, Mark Keintz, SAS Global Forum 2009, Applications Development.
5. Using SAS Indexes with Large Databases, Alex Vinokurov, Lawrence Helbers, NESUG 15, Beginning Tutorials.
6. Keeping Your Data in Step - Utilizing Efficiencies, Michael G. Sadof, SUGI 24, Advanced Tutorials.
7. Are Your SAS Programs Running You? Marje Fecht, Larry Stewart, Proceedings of the 2008 SAS Global Forum, Paper

RECOMMENDED READING

SAS Tips I learnt while at Oxford, Philip Mason, SUGI 26, Advanced Tutorials.
You've Got Mail: E-mailing Messages and Output Using SAS EMAIL Engine, Jeanina Worden, Philip Jones, SUGI 29, Posters. Covers the syntax of the FILENAME and FILE statements to automatically send custom e-mails and files, using the FILENAME EMAIL access method.
Tutorial with more step-by-step details on how to send email via SAS: How to send an email in SAS, Part I and Part II.
SAS Coding Tips and Techniques,
CONTACT INFORMATION

Your comments and questions are valued and encouraged. Please contact the author at:

Name: Dr. Shilong Kuang
Enterprise: Kelley Blue Book, Inc.
Address: 195 Technology Drive
City, State ZIP: Irvine, CA,
Web:

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
More informationSEO - Access Logs After Excel Fails...
Server Logs After Excel Fails @ohgm Prepare for walls of text. About Me Former Senior Technical Consultant @ builtvisible. Now Freelance Technical SEO Consultant. @ohgm on Twitter. ohgm.co.uk for my webzone.
More informationMWSUG 2011 - Paper S111
MWSUG 2011 - Paper S111 Dealing with Duplicates in Your Data Joshua M. Horstman, First Phase Consulting, Inc., Indianapolis IN Roger D. Muller, First Phase Consulting, Inc., Carmel IN Abstract As SAS programmers,
More informationCleaning Up Your Outlook Mailbox and Keeping It That Way ;-) Mailbox Cleanup. Quicklinks >>
Cleaning Up Your Outlook Mailbox and Keeping It That Way ;-) Whether you are reaching the limit of your mailbox storage quota or simply want to get rid of some of the clutter in your mailbox, knowing where
More informationPaper FF-014. Tips for Moving to SAS Enterprise Guide on Unix Patricia Hettinger, Consultant, Oak Brook, IL
Paper FF-014 Tips for Moving to SAS Enterprise Guide on Unix Patricia Hettinger, Consultant, Oak Brook, IL ABSTRACT Many companies are moving to SAS Enterprise Guide, often with just a Unix server. A surprising
More informationDownloading, Configuring, and Using the Free SAS University Edition Software
PharmaSUG 2015 Paper CP08 Downloading, Configuring, and Using the Free SAS University Edition Software Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California Charles Edwin Shipp,
More informationA Gentle Introduction to Hash Tables. Kevin Martin, Kevin.Martin2@va.gov Dept. of Veteran Affairs July 15, 2009
A Gentle Introduction to Hash Tables Kevin Martin, Kevin.Martin2@va.gov Dept. of Veteran Affairs July 15, 2009 ABSTRACT Most SAS programmers fall into two categories. Either they ve never heard of hash
More informationSAS Office Analytics: An Application In Practice
PharmaSUG 2016 - Paper AD19 SAS Office Analytics: An Application In Practice Monitoring and Ad-Hoc Reporting Using Stored Process Mansi Singh, Roche Molecular Systems Inc., Pleasanton, CA Smitha Krishnamurthy,
More informationEnterpriseLink Benefits
EnterpriseLink Benefits GGY AXIS 5001 Yonge Street Suite 1300 Toronto, ON M2N 6P6 Phone: 416-250-6777 Toll free: 1-877-GGY-AXIS Fax: 416-250-6776 Email: axis@ggy.com Web: www.ggy.com Table of Contents
More informationStoring and Using a List of Values in a Macro Variable
Storing and Using a List of Values in a Macro Variable Arthur L. Carpenter California Occidental Consultants, Oceanside, California ABSTRACT When using the macro language it is not at all unusual to need
More informationE-Mail OS/390 SAS/MXG Computer Performance Reports in HTML Format
SAS Users Group International (SUGI29) May 9-12,2004 Montreal, Canada E-Mail OS/390 SAS/MXG Computer Performance Reports in HTML Format ABSTRACT Neal Musitano Jr Department of Veterans Affairs Information
More informationABSTRACT THE ISSUE AT HAND THE RECIPE FOR BUILDING THE SYSTEM THE TEAM REQUIREMENTS. Paper DM09-2012
Paper DM09-2012 A Basic Recipe for Building a Campaign Management System from Scratch: How Base SAS, SQL Server and Access can Blend Together Tera Olson, Aimia Proprietary Loyalty U.S. Inc., Minneapolis,
More informationA Performance Analysis of Distributed Indexing using Terrier
A Performance Analysis of Distributed Indexing using Terrier Amaury Couste Jakub Kozłowski William Martin Indexing Indexing Used by search
More informationSimply Accounting Intelligence Tips and Tricks Booklet Vol. 1
Simply Accounting Intelligence Tips and Tricks Booklet Vol. 1 1 Contents Accessing the SAI reports... 3 Running, Copying and Pasting reports... 4 Creating and linking a report... 5 Auto e-mailing reports...
More informationSAS Grid Manager Testing and Benchmarking Best Practices for SAS Intelligence Platform
SAS Grid Manager Testing and Benchmarking Best Practices for SAS Intelligence Platform INTRODUCTION Grid computing offers optimization of applications that analyze enormous amounts of data as well as load
More informationNeed for Speed in Large Datasets The Trio of SAS INDICES, PROC SQL and WHERE CLAUSE is the Answer, continued
PharmaSUG 2014 - Paper CC16 Need for Speed in Large Datasets The Trio of SAS INDICES, PROC SQL and WHERE CLAUSE is the Answer ABSTRACT Kunal Agnihotri, PPD LLC, Morrisville, NC Programming on/with large
More informationMake it SASsy: Using SAS to Generate Personalized, Stylized, and Automated Email Lisa Walter, Cardinal Health, Dublin, OH
Paper 89-2010 Make it SASsy: Using SAS to Generate Personalized, Stylized, and Automated Email Lisa Walter, Cardinal Health, Dublin, OH Abstract Email is everywhere! With the continuously growing number
More informationConditional Processing Using the Case Expression in PROC SQL Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California
Conditional Processing Using the Case Expression in PROC SQL Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California Abstract The SQL procedure supports conditionally selecting result
More informationFlat Pack Data: Converting and ZIPping SAS Data for Delivery
Flat Pack Data: Converting and ZIPping SAS Data for Delivery Sarah Woodruff, Westat, Rockville, MD ABSTRACT Clients or collaborators often need SAS data converted to a different format. Delivery or even
More informationSAS ODS HTML + PROC Report = Fantastic Output Girish K. Narayandas, OptumInsight, Eden Prairie, MN
SA118-2014 SAS ODS HTML + PROC Report = Fantastic Output Girish K. Narayandas, OptumInsight, Eden Prairie, MN ABSTRACT ODS (Output Delivery System) is a wonderful feature in SAS to create consistent, presentable
More information2015 Workshops for Professors
SAS Education Grow with us Offered by the SAS Global Academic Program Supporting teaching, learning and research in higher education 2015 Workshops for Professors 1 Workshops for Professors As the market
More informationCatalog Creator by On-site Custom Software
Catalog Creator by On-site Custom Software Thank you for purchasing or evaluating this software. If you are only evaluating Catalog Creator, the Free Trial you downloaded is fully-functional and all the
More informationDup, Dedup, DUPOUT - New in PROC SORT Heidi Markovitz, Federal Reserve Board of Governors, Washington, DC
CC14 Dup, Dedup, DUPOUT - New in PROC SORT Heidi Markovitz, Federal Reserve Board of Governors, Washington, DC ABSTRACT This paper presents the new DUPOUT option of PROC SORT and discusses the art of identifying
More informationEstablishing Environmental Best Practices. Brendan Law Blaw@td.com.au @FlamerNZ Flamer.co.nz/spag/
Establishing Environmental Best Practices Brendan Law Blaw@td.com.au @FlamerNZ Flamer.co.nz/spag/ Agenda Active Directory Service Accounts Database Platform Windows Platform Data Storage Planning Virtualisation
More informationWeb Service for SKF @ptitude Observer. Installation Manual. Part No. 32179700 Revision A
Web Service for SKF @ptitude Observer Part No. 32179700 Revision A Copyright 2009 by SKF Reliability Systems All rights reserved. Aurorum 30, 977 75 Luleå Sweden Telephone: +46 (0) 920 758 00, Fax: +46
More informationIntelligent Query and Reporting against DB2. Jens Dahl Mikkelsen SAS Institute A/S
Intelligent Query and Reporting against DB2 Jens Dahl Mikkelsen SAS Institute A/S DB2 Reporting Pains Difficult and slow to get information on available tables and columns table and column contents/definitions
More informationV16 Pro - What s New?
V16 Pro - What s New? Welcome to the V16 Pro. If you re an experienced V16+ and WinScript user, the V16 Pro and WinScript Live will seem like old friends. In fact, the new V16 is designed to be plug compatible
More information