Introduction to NonStop SQL/MP D30


Data Management Library

Introduction to NonStop SQL/MP

Abstract: This manual introduces the basic features of NonStop SQL/MP and is intended for nontechnical and technical readers. Readers do not have to be familiar with Tandem systems.
Part Number:
Edition: Fourth
Published: March 1995
Product Version: NonStop SQL/MP D30
Release ID: D30.01
Supported Releases: This manual supports D30.00 and all subsequent releases until otherwise indicated in a new edition.

Tandem Computers Incorporated

Document History

Edition   Part Number   Product Version      Earliest Supported Release   Published
Second                  NonStop SQL C30      C30.07                       December 1991
Third                   NonStop SQL C30      C30.07                       December 1994
Fourth                  NonStop SQL/MP D30   D30.01                       April 1995

New editions incorporate any updates issued since the previous edition. A plus sign (+) after a release ID indicates that this manual describes function added to the base release, either by an interim product modification (IPM) or by a new product version on a .99 site update tape (SUT).

Ordering Information
For manual ordering information: domestic U.S. customers, call ; international customers, contact your local sales representative.

Document Disclaimer
Information contained in a manual is subject to change without notice. Please check with your authorized Tandem representative to make sure you have the most recent information.

Export Statement
Export of the information contained in this manual may require authorization from the U.S. Department of Commerce.

Examples
Examples and sample programs are for illustration only and may not be suited for your particular purpose. Tandem does not warrant, guarantee, or make any representations regarding the use or the results of the use of any examples or sample programs in any documentation. You should verify the applicability of any example or sample program before placing the software into productive use.

U.S. Government Customers
FOR U.S. GOVERNMENT CUSTOMERS REGARDING THIS DOCUMENTATION AND THE ASSOCIATED SOFTWARE: These notices shall be marked on any reproduction of this data, in whole or in part.

NOTICE: Notwithstanding any other lease or license that may pertain to, or accompany the delivery of, this computer software, the rights of the Government regarding its use, reproduction and disclosure are as set forth in Section of the FARS Computer Software-Restricted Rights clause.

RESTRICTED RIGHTS NOTICE: Use, duplication, or disclosure by the Government is subject to the restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS

RESTRICTED RIGHTS LEGEND: Use, duplication or disclosure by the Government is subject to restrictions as set forth in paragraph (b)(3)(b) of the rights in Technical Data and Computer Software clause in DAR (a). This computer software is submitted with restricted rights. Use, duplication or disclosure is subject to the restrictions as set forth in NASA FAR SUP (April 1985) Commercial Computer Software Restricted Rights (April 1985). If the contract contains the Clause at Rights in Data General then the Alternate III clause applies.

U.S. Government Users Restricted Rights: Use, duplication or disclosure restricted by GSA ADP Schedule Contract.

Unpublished. All rights reserved under the Copyright Laws of the United States.

Contents

About This Manual

Section 1 Introduction to NonStop SQL/MP
What Is a Relational Database?
Why Use NonStop SQL/MP?
Scenario: High Availability Online Transaction Processing (OLTP)
Solution
Scenario: Decision Support Systems (DSS)
Solution
Scenario: Scalable Electronic Commerce
Solution
A High Performance DBMS
Scalability
Tandem's Parallel Hardware Architecture
Full Integration With System Software
Table Partitioning and Parallel Execution
A Highly Available Database
Availability Features of the NonStop Kernel
Availability Features of NonStop SQL/MP
Distributed Database Architecture
Support for Open Standards
Easy-to-Use ANSI SQL
Open System Services (OSS)
Accessing Data With Desktop Software
Additional Features
Cost-Based Query Optimization
Locking
Mixed Workload Environment
Support for National Languages
Active Data Dictionary
Constraints
Performance Features for DSS
Tools for Database Administration
Parallel Sorting Utility
Summary

Section 2 How to Use NonStop SQL/MP
Querying the Database
Sources of Queries
Writing Queries
Selecting Data
Joining Tables
Using the UNION Operator
Using Views
Shorthand Views
Protection Views
Modifying Data
Concurrent Updates and Locking
Transaction Management
Summary

Section 3 NonStop SQL/MP Architecture
Physical Database Structure
Logical Schema
Physical Structure
Table Organization
Indexes
Partitions
What Happens When a Query Is Submitted?
The SQL Compiler and SQL Optimizer
The SQL Executor
Parallel Execution
The Master Executor and ESPs
Parallel Index Maintenance
Other Architectural Features
The Data Access Manager (DP2 Disk Process)
Sequential Block Buffering
Cache Optimizations for Sequential Access
Summary

Index

Figures
Figure 1-1. A Table in a Relational Database
Figure 1-2. Comparing Traditional DBMS Architecture With NonStop SQL/MP Architecture
Figure 1-3. NonStop SQL/MP Processing a Local and a Remote Request in a Scalable Network
Figure 1-4. Distributed Data Dictionary
Figure 2-1. Personnel Database Example
Figure 2-2. A Simple Query
Figure 2-3. Selecting Data From Two Tables
Figure 2-4. A Shorthand View Derived From Two Tables
Figure 2-5. A Protection View
Figure 3-1. Database Structure
Figure 3-2. An Index for Faster Sorted Access
Figure 3-3. A Partitioned Table
Figure 3-4. Components of NonStop SQL/MP
Figure 3-5. Selecting an Access Plan With the Lowest Execution Cost
Figure 3-6. NonStop SQL/MP Components That Execute a Query
Figure 3-7. Parallel Execution of a Query Using an Aggregate Function
Figure 3-8. Single-Table Query Evaluation Performed by the Data Access Manager
Figure 3-9. Virtual Sequential Block Buffering (VSBB) Access

Tables
Table 2-1. Comparison of Protection Views and Shorthand Views


About This Manual

This manual introduces the NonStop SQL/MP relational database management system, which uses the standard Structured Query Language (SQL) approved by the American National Standards Institute (ANSI) and the International Standards Organization (ISO) to describe and manipulate data. NonStop SQL/MP can be used for online transaction processing (OLTP), decision support systems (DSS), scalable electronic commerce applications, and other business applications. NonStop SQL/MP provides high performance, high availability, and scalability for production applications, open access to popular personal computer software tools, and the ability to distribute data over geographically distributed Tandem NonStop systems.

Manual Organization

This manual contains a general overview of NonStop SQL/MP. If you are a nontechnical reader, you will learn how NonStop SQL/MP can help your business or organization manage its data more effectively. If you are interested in the technical aspects of database management, this manual will help you understand how NonStop SQL/MP works. The manual is organized as follows:

Section 1 summarizes NonStop SQL/MP features and explains how a business or organization can benefit from using NonStop SQL/MP. This section is meant for nontechnical readers.

Section 2 explains how to develop simple queries and briefly describes the relational model and the logical database architecture. This section is meant for nontechnical and technical readers.

Section 3 describes the physical architecture of a NonStop SQL/MP database, the basic internal components of NonStop SQL/MP, and run-time features and benefits such as parallel query execution. This section is intended for technical readers.

Audience

This manual is intended for anyone interested in a broad overview of NonStop SQL/MP. Only a minimal understanding of computer software technology and database management techniques is required to understand the concepts presented in this manual. For a general understanding of Tandem software, you should read Introduction to Tandem NonStop Systems. For a list of definitions of the terms used in NonStop SQL/MP, you should read NonStop SQL/MP Glossary.

Related Manuals

This manual is part of the NonStop SQL/MP library of manuals. The library includes the following manuals:

NonStop SQL/MP Glossary describes the terminology used in NonStop SQL/MP documentation.

NonStop SQL Quick Start describes how to run SQLCI, how to execute simple queries on a database, how to modify data, and how to produce a formatted report.

NonStop SQL/MP Reference Manual describes the SQL language elements, expressions, functions, and statements. (See this manual for the complete description and syntax of the report writer commands and the SELECT statement.)

NonStop SQL/MP Installation and Management Guide explains how to plan, install, create, and manage a NonStop SQL database; describes the syntax of installation and management commands; and describes NonStop SQL catalogs and file structures.

NonStop SQL/MP Query Guide describes how to retrieve and modify data in a NonStop SQL/MP database and how to analyze and improve query performance.

NonStop SQL/MP Report Writer Guide describes how to use report writer commands and SQLCI options to design and produce reports.

NonStop SQL/MP Version Management Guide describes the rules governing version management for the NonStop SQL software, catalogs, objects, messages, programs, and data structures.

NonStop SQL/MP Programming Manual for C and COBOL85 and the NonStop SQL Programming Manual for TAL and Pascal describe the programmatic interface for the particular language.

NonStop SQL/MP Messages Manual describes NonStop SQL/MP messages for the conversational interface, the application programming interface, and utilities.

The following document is also helpful to users of NonStop SQL/MP:

Query Processing Using NonStop SQL/MP describes how queries are executed within the DBMS.

Figure 1 shows the manuals in the NonStop SQL/MP library.

Figure 1. NonStop SQL/MP Library

Introductory Manuals
Introduction to NonStop SQL/MP
NonStop SQL Quick Start (C30.07)*

Usage Guides
NonStop SQL/MP Install and Management Guide
NonStop SQL/MP Version Management Guide
NonStop SQL/MP Query Guide
NonStop SQL/MP Report Writer Guide

Programming Manuals
NonStop SQL/MP Programming Manual for C
NonStop SQL/MP Programming Manual for COBOL85
NonStop SQL Programming Manual for Pascal (C30.07)*
NonStop SQL Programming Manual for TAL (C30.07)*

Reference Manuals
NonStop SQL/MP Messages Manual
NonStop SQL/MP Reference Manual

* C30-level documentation; does not include information about D30 enhancements.

The following manuals provide overviews of other software systems used with NonStop SQL and discussed in this manual:

FastSort Manual describes FastSort, the Tandem sort-merge product for NonStop systems.

Introduction to Data Management introduces the Tandem data management products.

Introduction to Tandem Data Communications presents an overview of the data communications products provided by Tandem.

Introduction to Transaction Manager/MP (TM/MP) introduces the transaction management facility used to monitor transactions and manage backup and recovery.

NonStop ODBC Server Manual describes the interface product that allows a variety of client applications to communicate with a NonStop SQL/MP database.

1 Introduction to NonStop SQL/MP

The NonStop SQL/MP relational database management system combines the standard features of a relational database with additional features that provide data integrity, high availability, high performance, and excellent scalability. NonStop SQL/MP is specifically designed for large data processing environments that maintain a variety of applications performing critical business tasks.

This section briefly describes a few types of business applications and shows how NonStop SQL/MP satisfies the needs of those applications. It then discusses the features that allow NonStop SQL/MP to provide high performance and availability, as well as parallel processing, distributed processing, and support for open standards.

What Is a Relational Database?

A relational database, like any database management system, is used to manage the storage and retrieval of data. A relational system differs from other database management systems in the way the end user or programmer accesses the data.

In a relational database, all data can be treated as two-dimensional tables, sometimes called relations, consisting of rows and columns, similar in form to a personal computer spreadsheet. Each row contains pieces of related data such as one employee's name, identification number, and salary. Each column contains data of the same type. For example, one column contains all the employee names; another contains all the salaries.

When you retrieve data from a table, the result of your operation is returned as another two-dimensional table. You can therefore perform sophisticated data analysis by passing the table resulting from one simple operation to another operation, continuing until the resultant table contains exactly the information you need.

In addition, you can access a whole set of data with a single command. A single SQL statement can retrieve or modify a set of records matching selection criteria. Thus, it can often replace several lines of code that read and process one record at a time.

In a relational database, data combined from multiple tables is joined by columns of common data rather than by links and pointers to physical locations. A relational database separates the application and the user view of data from the physical organization and storage of data. Ideally, you can reorganize the physical data storage without changing existing applications, or you can recombine the data for a new application by creating a logical view that joins data from multiple tables.

Figure 1-1 shows a relational table that makes data easy to visualize and easy to access. For example, you can select from the EMPLOYEE table only the rows describing employees in department 1000 or only the columns containing employee names and salaries.
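The row selection, column selection, and set-oriented update described above can be tried in any SQL system. The following is a minimal sketch that uses Python's built-in sqlite3 module rather than NonStop SQL/MP itself. The column names follow Figure 1-1; the rows (department numbers, job codes, salaries) are made-up sample values for illustration only.

```python
import sqlite3

# In-memory database standing in for a relational DBMS (not NonStop SQL/MP).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "empnum INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, "
    "deptnum INTEGER, jobcode INTEGER, salary REAL)"
)

# Hypothetical sample rows; only the column names come from Figure 1-1.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "ROGER", "GREEN", 9000, 100, 175500.00),
        (23, "JERRY", "HOWARD", 1000, 100, 137000.00),
        (29, "JANE", "RAYMOND", 3000, 100, 136000.00),
        (32, "JESSICA", "CRINER", 1000, 300, 39500.00),
    ],
)

# Select only the rows describing employees in department 1000.
dept_1000 = conn.execute(
    "SELECT empnum, last_name FROM employee "
    "WHERE deptnum = 1000 ORDER BY empnum"
).fetchall()

# Select only the columns containing names and salaries.
names_and_salaries = conn.execute(
    "SELECT first_name, last_name, salary FROM employee ORDER BY empnum"
).fetchall()

# One statement modifies every row that matches the selection criteria.
cursor = conn.execute(
    "UPDATE employee SET salary = salary * 1.05 WHERE deptnum = 1000"
)
rows_updated = cursor.rowcount
```

Each query result is itself a table (here, a list of rows), and the single UPDATE statement replaces a loop that would otherwise read and rewrite one record at a time.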

Figure 1-1. A Table in a Relational Database (the EMPLOYEE table, with its table name, column names, rows, and columns labeled; the columns are EMPNUM, FIRST_NAME, LAST_NAME, DEPTNUM, JOBCODE, and SALARY)

Why Use NonStop SQL/MP?

Many vendors offer relational databases. All these database systems support basic two-dimensional tables; they differ in the additional features they offer. NonStop SQL/MP contains features that make it especially appropriate for three general types of applications: high availability online transaction processing (OLTP), decision support systems (DSS), and scalable electronic commerce.

Scenario: High Availability Online Transaction Processing (OLTP)

Suppose your company provides financial services. Customers call your customer service representatives to inquire about account balances, transfer funds from one account to another, and authorize electronic payments. The account balances must be accurate and each transaction against an account must fully complete before the account balance can be updated.

You want to provide services whenever customers want them, not only during traditional banking hours. Many customers call at 9 a.m., but some call at midnight. Your services must be available 24 hours a day, 7 days a week, 365 days a year, and you must be able to handle the peak calling loads that occur every morning.

Because your customers usually call in their requests, they want fast responses. They cannot wait for several minutes while you process their requests. You need very fast response time for both inquiries and updates to the database.

Because of your excellent record in customer service, your business has grown rapidly. Originally, you had only 50 customer service representatives. You now employ over 1,000 customer service representatives who are all entering customer requests at the same time. You need a system that can handle a large volume of simultaneous transactions.

Solution

Reliability and transaction integrity. NonStop SQL/MP works with the NonStop Kernel operating system and the NonStop Transaction Manager/MP (TM/MP) subsystem to ensure that every piece of data exactly matches what was originally input and that either a transaction completes or the database is returned to the state it was in just before the transaction started. If a disk drive fails, NonStop SQL/MP automatically uses the mirrored copy of the disk drive, thus ensuring that no data is lost or misinterpreted because of the hardware failure.

Availability. NonStop SQL/MP is designed so that it can be in use continuously. You can perform all administration and database management functions while end users are updating the database. (Some operations that change the structure of the database can cause a very brief downtime for OLTP transactions.)

Scalability. You can easily increase the size of a NonStop SQL/MP database to accommodate peak loads and the demands of a growing application. The same data can be accessed by multiple processors, and no extra programming is required to support the additional processors. Therefore, you can add processors and disk drives to a system without having to change the application. Moreover, benchmarks have shown over 99 percent scalability for systems that scale to more than 100 processors. That is, when you add a processor, over 99 percent of that processor's power is available to your application.

Fast response. NonStop SQL/MP is tuned for OLTP performance. NonStop SQL Release 1.0 was the first relational database to provide acceptable OLTP performance. Since then, enhancements in parallel processing techniques and operations such as joining tables and sorting data continue to improve response time.

Concurrency. NonStop SQL/MP can lock data at the individual row level. A lock ensures that while one user updates data, no other users can access or update it. Thus, many users can access the same table with minimal contention, because each user is likely to access a different record. In addition, NonStop SQL/MP offers special locks that allow users to read the data without affecting concurrent writes or updates.

Most OLTP applications also involve batch updates and reports. Through the mixed workload feature of NonStop SQL/MP, batch jobs or long queries can be run at low priority while the OLTP tasks are executing, without adversely affecting the response time of the OLTP tasks.

Scenario: Decision Support Systems (DSS)

Suppose your company operates a chain of retail stores. You are responsible for ordering products to be sold in these stores. You must offer the products customers want to buy and have the right amount of product on your shelves--enough to satisfy demand, but not so much that you will have to sell outdated stock below your cost.

You have a separate system that manages your point-of-sale (POS) devices and tracks each item sold in each store. Your stores sell over a million items a day. Some items such as toothpaste turn over regularly. However, other items such as Valentine's Day gifts have dramatically different sales volumes in different stores and different weeks leading up to the holiday. You monitor each day's sales and place orders for additional product to be delivered from the manufacturer directly to your stores.

You work best with visual images, using a mouse to point and click on icons and menu items. You want to access data from your PC and do not want to have to issue SQL commands.

Solution

Technology for large tables. NonStop SQL/MP is especially appropriate for the large tables used by decision support applications. For example, NonStop SQL/MP includes features that eliminate the need to sort large tables in order to join them to other tables or to compute aggregate values such as SUM or MAXIMUM.

Fast loading. By using parallel processing and buffers, NonStop SQL/MP can load large amounts of data into a table at high rates. If the data is coming from environments outside of Tandem, Tandem and third-party vendors can provide tools to convert and edit the data before it is loaded into the database; these tools can also facilitate parallel loading operations. In addition, NonStop SQL/MP allows you to create indexes in parallel, which contributes to the fast loading of DSS data.

Availability. Although availability requirements are not as high for decision support as for OLTP, many businesses now rely on their data warehouses (decision support databases) for quick decisions. These businesses therefore require their data warehouses to be available at all times. In addition, some queries can need several hours or even days to complete. NonStop SQL/MP is designed to remain available, even during long-running tasks, so it does not need to be restarted if a hardware failure occurs.

Client tools. Query tools available on the PC and the Macintosh computer allow users to select data using point-and-click technology (such as a mouse) and display data in graphical format. NonStop SQL/MP supports the most popular methods for accessing data from PC client programs. (These data access methods include ODBC, DAL, EDA/SQL, SQL Server, and SQL*Connect.) In addition, Tandem has a Verification Center to ensure that PC applications work as they are supposed to with NonStop SQL/MP.

Scenario: Scalable Electronic Commerce

Your company provides a service for manufacturers by accepting electronic orders and routing them to the appropriate supplier. Billing and shipment acknowledgements are also handled electronically. Basically, your company receives an Electronic Data Interchange (EDI) message, examines the message to determine its destination, and ensures that the message gets delivered to its destination. Messages can come in bursts, and you need to make sure you do not lose any messages.

Your service has been so popular that business has doubled every six months. You need to handle this increased volume with the same speed and reliability that got you the business in the first place.

You have expanded your operations and maintain databases at six locations around the world. Some messages include data in European languages, some in Asian languages. You need to store these messages in their original languages and forward them to the appropriate geographical location. Because your business is growing rapidly, you do not know how many geographical locations you will have in the future.

Solution

Reliability and transaction integrity. As in an OLTP environment, NonStop SQL/MP uses architectural features such as mirrored disks as well as extensive data checking to ensure that data is stored and retrieved accurately and that transactions are not lost even if they are executed on remote systems.

Availability. NonStop SQL/MP is designed for continuous availability. Through the use of Tandem technology, application requests are automatically resubmitted if a processor failure occurs. In addition, SQL statements are automatically recompiled if a change in the environment renders the execution plans invalid. Also, you can increase the size of the database and build new indexes while the database is in full use.

Scalability. By adding processor boards and disks, you enable the Tandem system to keep pace with growing demand and maintain a response time equivalent to the one you had originally.

Distributed database. You can partition NonStop SQL/MP tables so that some partitions reside on a local system and some on remote systems. Programs operate as if all the data were local and, therefore, do not need to specify the location of the data. Thus, you can add new partitions at remote locations without changing any applications. In addition, if a query requires only local data, you do not need to access remote systems. NonStop SQL/MP ensures transaction integrity across system boundaries by using a two-phase commit protocol, which commits changes to the database only when all systems are able to complete their portions of the transaction.

Multiple character set support (MCSS). NonStop SQL/MP supports the use of a variety of character sets, including the ISO 8859 family of character sets as well as double-byte character sets. You can save and retrieve data in its original character set, allowing you to maintain different portions of the data in many different languages.

Online database configuration. NonStop SQL/MP allows the database administrator (DBA) to redistribute data while the database is in use. For example, the Move Partition operation moves a portion of the database from one location to another. This feature allows you to open a new data center for a particular part of the world; you can move some data from a central location to the new data center.
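The two-phase commit protocol mentioned under "Distributed database" can be illustrated with a short sketch. This is a generic, simplified model of the protocol in Python, not the TM/MP implementation; the Participant class, its method names, and the site names are hypothetical, and real implementations also log each step durably and handle timeouts.

```python
class Participant:
    """One system holding part of a distributed transaction (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote yes only if the local portion can be completed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes; else undo."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:  # phase 2: commit on every system
            p.commit()
        return True
    for p in participants:  # phase 2: back out on every system
        p.rollback()
    return False


# Both systems can complete their portions, so the transaction commits.
ok_sites = [Participant("local"), Participant("remote")]
committed = two_phase_commit(ok_sites)

# One system cannot complete its portion, so every system backs out.
bad_sites = [Participant("local"), Participant("remote", can_commit=False)]
aborted = two_phase_commit(bad_sites)
```

The point of the first phase is that no system makes its changes permanent until all systems have confirmed they can, which is what keeps a distributed database consistent across system boundaries.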

A High Performance DBMS

The NonStop SQL/MP database management system (DBMS) achieves high performance because it operates at the lowest levels of the NonStop server architecture. Instead of operating as a software layer above the operating system, NonStop SQL/MP is integrated with NonStop Kernel system processes such as the file system, message system, and data access manager (DP2 disk process).

Figure 1-2 compares the architecture of a traditional DBMS with that of NonStop SQL/MP. In a traditional DBMS, the DBMS must perform most of the query processing, communicating with low-level system processes only to retrieve the data. With NonStop SQL/MP, the low-level system processes themselves perform most of the query processing. The NonStop SQL/MP architecture considerably shortens the path length to the data residing on disk, significantly enhancing performance.

Figure 1-2. Comparing Traditional DBMS Architecture With NonStop SQL/MP Architecture (two layered stacks, each running from the application down to the data: a traditional DBMS, in which the DBMS sits as a separate layer, and NonStop SQL/MP, whose layers are the SQL File System, Message System, SQL Data Access Manager, and I/O Drivers)

The data access manager (also called the DP2 disk process) evaluates SQL requests made on a single table or table partition. Because the data access manager is fully responsible for data access to a single disk volume, it can be said that the access is encapsulated; that is, no other process can access the data and thus subvert the DBMS.

A data access manager can retrieve data from the table based on the selection criteria in the query; it can also delete and update data. The data access manager executes SQL requests as close as possible to the data, eliminating the message traffic to and from a higher-level DBMS layer that occurs in most DBMS systems. (See Figure 1-2.)

To optimize performance, NonStop SQL/MP is distributed within the NonStop server so multiple processes can execute separate SQL requests simultaneously or divide a large request into separate tasks and process the tasks in parallel. NonStop SQL/MP uses the message system to pass requests to the data access managers and return data to the processors in which the requests originated.

When the desired data resides on a remote system in a distributed network, the data access managers on the remote system filter the data, returning only what you requested. This design is especially beneficial for network performance because it reduces the size and number of messages sent across the communication lines.

NonStop SQL/MP organizes special memory pages, called cache, where data can be stored temporarily in memory. Data buffers retrieved by the data access manager are read into cache so that the most frequently used data is in memory. This strategy reduces the frequency with which data must be retrieved from disk, thereby improving performance for both transactional (random) and sequential access to data.

When database records need to be read sequentially to satisfy an SQL request, the data access manager can buffer the records, sending blocks of data rather than a single record at a time. A valuable function of the data access manager is that it can filter the data as soon as it is retrieved from disk and then send buffered blocks of data. Both types of buffering reduce the messages and data sent back to the SQL file system, improving performance for sequential access.

In addition, the data access manager can perform bulk (56-kilobyte) I/O transfers during sequential retrievals of data. It can also prefetch data into cache while the application is processing the previous block of data so that the new block is available in memory as soon as the application requires it.
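The benefit of filtering next to the data and returning blocks rather than single records can be seen by counting messages. The sketch below is a toy model in Python, not the DP2 disk process; the function names, the row layout, and the block size of 25 rows are all hypothetical and chosen only to make the arithmetic visible.

```python
def messages_without_filtering(rows):
    """Baseline: every row is sent to a higher layer, which filters it there."""
    return len(rows)  # one message per row


def filter_and_block(rows, predicate, block_size):
    """Filter the rows where they are stored, then group matches into blocks."""
    matches = [row for row in rows if predicate(row)]
    return [
        matches[i:i + block_size]
        for i in range(0, len(matches), block_size)
    ]


# Hypothetical table of 1,000 rows; every tenth row is in department 1000.
rows = [
    {"empnum": n, "deptnum": 1000 if n % 10 == 0 else 2000}
    for n in range(1, 1001)
]

baseline_messages = messages_without_filtering(rows)
blocks = filter_and_block(rows, lambda r: r["deptnum"] == 1000, block_size=25)
filtered_messages = len(blocks)  # one message per block of matching rows
```

In this model the baseline sends one message per stored row, while the filtered, blocked approach sends one message per block of qualifying rows, so both the number and the total size of the messages shrink.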

Scalability

A NonStop SQL/MP database is scalable: you can expand the size of the database simply by adding processors and disk volumes to the system. You do not have to change application code when you scale up the database. As the database and application grow, you can maintain the high performance of the original, smaller database. Moreover, you can speed up the performance of a database that is not growing, again simply by adding processors and disk volumes.

Tandem's Parallel Hardware Architecture

The scalability of a NonStop SQL/MP database is founded on the parallel hardware architecture of Tandem systems. A Tandem system contains from 2 to 16 processors, each with its own memory and disk storage, linked by a pair of high-speed interprocessor buses (IPBs). Because the processors do not share memory or disks, this is sometimes called a shared-nothing architecture.

The processors communicate by sending messages over the IPB, following a model similar to the client/server model. A client process executing in any processor submits a request for a service. A server process executing in the same processor or another processor responds to the request.

A typical transaction or request for database services comprises several client/server interactions. Suppose an end user running a program on a PC requests data. An application process on the Tandem host serves the request by calling NonStop SQL/MP, which passes the request to the data access manager (DP2), which retrieves the data from disk and passes it back. These processes can be distributed across the system (or across a network) so that no single processor is excessively burdened by a request.

When you add a processor to the system, the expansion in performance is linear; that is, the performance you get from the added processor is exactly the same as that of the first processor in the system. Linear expandability is possible because the processors do not share resources (memory and disks). A shared resource can cause a performance bottleneck when you expand the system, as the additional components all contend for use of the shared resource.

The Tandem TorusNet networking technology allows you to expand beyond a single system and connect up to 4,080 processors in a single network. A process executing anywhere in the network has easy access to any other process or resource in the network. Because of the message-based operating system, it makes no difference whether the process serving a request is located in the local system or in a system elsewhere on the TorusNet network.

Figure 1-3 shows how an application process in one system requests data from NonStop SQL/MP, which uses the message system to process the request, retrieving data from both a local and a remote system.
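The scalability figures quoted in this section can be put into numbers. The sketch below assumes a simple additive model, which is an illustration and not a Tandem formula: the first processor counts as one unit of capacity and each added processor contributes a fixed fraction of a unit. The function name is hypothetical. The last line assumes every system in the network is fully configured with 16 processors.

```python
def effective_capacity(processors, efficiency):
    """Capacity in units of one processor, assuming each added processor
    contributes `efficiency` of its power (1.0 means perfectly linear)."""
    if processors < 1:
        raise ValueError("a system needs at least one processor")
    return 1 + (processors - 1) * efficiency


# Perfectly linear expansion on a full 16-processor system.
linear = effective_capacity(16, 1.0)

# The "over 99 percent" benchmark figure, taken here as exactly 0.99.
near_linear = effective_capacity(16, 0.99)

# Number of fully configured 16-processor systems in a 4,080-processor network.
systems_at_max = 4080 // 16
```

Under this model a 16-processor system at 99 percent efficiency delivers roughly 15.85 processors' worth of capacity, and a maximum-size network corresponds to 255 fully configured systems.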

Figure 1-3. NonStop SQL/MP Processing a Local and a Remote Request in a Scalable Network
[Figure: an application and three software stacks, each labeled SQL File System, Message System, SQL Data Access Manager, and I/O Drivers, with each stack serving one partition of NonStop SQL/MP Table 1.]

Full Integration With System Software

NonStop SQL/MP is fully integrated with the NonStop Kernel operating system so that it can take advantage of the distributed processing made possible by the Tandem hardware architecture. Through the NonStop Kernel message system, a NonStop SQL/MP process accesses a remote resource simply by qualifying the resource with a node name. The NonStop Kernel, together with the TorusNet network software, presents a single system image to every process in the system. (A separate copy of the NonStop Kernel runs in each processor.)

NonStop SQL/MP is also fully integrated with other Tandem software to maintain high performance as well as the integrity and availability of the data. Integration with other parallel products is especially important in a large, distributed system in which many NonStop SQL/MP processes execute in parallel. For example, the NonStop Transaction Services/MP (TS/MP) application management system ensures that application server processes are evenly distributed among processors in a system. NonStop SQL/MP processes, which serve requests sent by the application servers, are also distributed among processors. As system activity changes, these products dynamically create (or delete) copies of the required processes.

NonStop SQL/MP uses the NonStop Transaction Manager/MP (TM/MP) subsystem to maintain data consistency amid concurrent transactions. The NonStop TM/MP subsystem ensures that data modifications are either completely applied to the database or not performed at all. NonStop TM/MP automatically backs out any incomplete transactions and manages recovery in case of system or media failures. A NonStop SQL/MP database thus remains in a consistent state even if, for example, a communication line failure interrupts a transaction.

NonStop SQL/MP is also integrated with system utilities that help configure and manage software in a Tandem system, with the Measure performance measurement tool, which collects performance statistics on processes and tables, and with system monitoring and event management tools.

Table Partitioning and Parallel Execution

NonStop SQL/MP allows you to partition the data in a table across many disk volumes. You can configure your hardware system so that the disk volumes (and, thus, the table partitions) are managed by different processors within a system or different processors across a distributed network.

Partitioning the database allows NonStop SQL/MP to take full advantage of the parallel processing provided by the Tandem multiprocessor architecture. Multiple NonStop SQL/MP executor processes can execute in parallel in separate processors. The SQL executors can process multiple small queries in parallel. For a large query, an SQL master executor process can create executor server processes that divide up the query into smaller tasks and process the tasks in parallel. The SQL executors can use separate data access managers to retrieve data from the partitions associated with their processor.
This strategy balances the I/O workload among the multiple processors, greatly improving both throughput (the total data the system can process at one time) and response time (the elapsed time it takes to complete one user transaction). In addition, you can partition indexes across disk volumes to provide faster access to the data.

NonStop SQL/MP also performs the following tasks in parallel:
- Parallel join operations by the SQL executor during query processing
- Parallel index maintenance, which reduces the effect of multiple indexes on performance
- Parallel sorting of large amounts of data by the FastSort utility

These features make it possible for a Tandem system to perform massively parallel processing against a very large database. Moreover, these features allow you to scale a database to a larger and larger size as your application requirements grow, without affecting performance.
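The idea of dividing one large query into per-partition tasks can be illustrated with a small sketch. This is not NonStop SQL/MP code: it is a plain Python illustration in which the partition contents, the `scan_partition` and `parallel_query` names, and the use of threads in place of executor server processes are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical key-range partitions of one table. Each list stands in for
# the (EMPNUM, DEPTNUM) rows stored on a separate disk volume.
PARTITIONS = [
    [(1, 1000), (29, 1500), (43, 4000)],      # EMPNUM 1 through 99
    [(104, 4000), (178, 3000), (201, 4000)],  # EMPNUM 100 through 299
    [(321, 9000), (337, 4000)],               # EMPNUM 300 and up
]


def scan_partition(rows, deptnum):
    """Scan one partition and return the EMPNUMs in the requested department."""
    return [empnum for empnum, dept in rows if dept == deptnum]


def parallel_query(deptnum):
    """Run one scan per partition concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(PARTITIONS)) as pool:
        partials = list(
            pool.map(scan_partition, PARTITIONS, [deptnum] * len(PARTITIONS))
        )
    # The "master" step: combine what each partition scan returned.
    return sorted(empnum for part in partials for empnum in part)


result = parallel_query(4000)
print(result)
```

Each worker touches only its own partition, so adding a partition adds a worker rather than lengthening any single scan, which is the property the text above attributes to partitioned tables.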

A Highly Available Database

A NonStop SQL/MP database is highly available, remaining online for current users even when, for example, you scale up the database to accommodate growing business. NonStop SQL/MP provides availability, just as it provides high performance and scalability, by combining features inherent in the Tandem system architecture and NonStop Kernel with features specific to NonStop SQL/MP.

Availability Features of the NonStop Kernel

The NonStop Kernel operating system uses process pairs to provide fault tolerance. A system process, or an application process that requests database services from an application server process, can have a backup process executing in another CPU. If a fault occurs in the primary CPU, the backup process can take over without interrupting the application. Moreover, process pairs do not require special programming. For transactions that access the database, NonStop TM/MP helps to ensure availability by recovering the transactions in case of a system failure.

You can enhance the availability of the database by storing the data for each disk volume on two identical disks. This feature, called disk mirroring, protects data from hardware faults that could affect any individual disk. If one disk becomes inaccessible, the mirrored disk continues to be available, and there is no impact on the application.

Tandem also provides software to protect against a disaster affecting an entire site. NonStop SQL/MP works with the Remote Duplicate Database Facility (RDF) product, which lets you maintain a duplicate database on a separate system at a remote location. As users modify the primary database, RDF replicates the changes on the duplicate (backup) database. You can configure RDF so that multiple remote systems maintain backup databases for one another, providing data replication.
Each system can then support primary application activity (such as OLTP or query processing) as well as provide disaster recovery for another system.

Availability Features of NonStop SQL/MP

NonStop SQL/MP provides several features that make it possible to keep the database continuously available to your application. Process pairs (mentioned earlier) are especially valuable when applied to the data access manager. You can configure the data access manager to have a backup process in another processor. If a failure should occur in the CPU in which the primary data access manager resides, the alternate data access manager can immediately take over the job of retrieving data so that the in-process query can complete successfully.

In addition, NonStop SQL/MP ensures that application programs never have to fail when they access database objects that have been changed by the DBA. The active data dictionary, which records all changes made to database objects and reflects the current status of the database, flags each application program that accesses a modified object. (Database objects include tables, indexes, views, programs, and collations.) NonStop SQL/MP automatically recompiles an SQL query if the existing execution plan (the compiled, executable code) has been invalidated because of a modified database object.

If a modification does not significantly alter a database object, the existing execution plan may execute acceptably, if perhaps inefficiently. NonStop SQL/MP provides an option to allow the existing plan to execute even after a table or other object the plan accesses is changed. For example, if a table is moved from one disk volume to another or if a new index is created, application programs will still run. However, some SQL queries might benefit by using the new index. You can choose when to schedule a recompilation of the SQL queries to take advantage of the new database environment.

In a distributed database, NonStop SQL/MP can continue to process a query that accesses several table partitions even if a partition is unavailable. That is, the database remains available to the user even when a portion of the database is unavailable. The user can optionally skip unavailable data. If some of the requested data resides on the unavailable partition, the user still gets partial results.

NonStop SQL/MP also allows you to manage and reconfigure physical components of the database online; that is, users can continue full read and write access to the database while these management operations take place. For example, users can continue to access the database while the DBA performs backup and restore operations, loads data, and reorganizes files to maintain efficient access. Also, these operations can be executed at a lower priority than user queries so that they have a minimal impact on the performance of user applications.

You can perform the following tasks while the database is available for read and write access:
- Backing up the database
- Restoring a part of the database from backup
- Loading a portion of the database
- Adding a partition to increase database size

The database is also available to user applications while you perform the following tasks:
- Moving a portion of a table's data to another disk (possibly a larger or faster disk) to balance I/O performance. This task is accomplished by splitting partitions or moving partition boundaries.
- Creating an index on a table or reorganizing an existing table or index for optimal performance

At the end of these operations, however, the database is unavailable very briefly while database file labels are updated to reflect the changed structure. You can schedule the downtime so that it does not affect the execution of user queries.

Distributed Database Architecture

A NonStop SQL/MP database can be fully distributed. You can partition tables and indexes across systems in a network. (A system is called a node when it is part of a network.)

Moreover, NonStop SQL/MP provides location independence. You can store data where it is used most frequently, and your applications can retrieve and change the data regardless of where it is located. Accessing distributed data is as simple as accessing local data: the applications provide the names of the tables that contain the data. NonStop SQL/MP determines the location of the data as well as the best way to retrieve it.

Applications can also be distributed. A client (or requester) process can run on one system, while the server processes run on other systems and perform I/O operations on the distributed database.

When a transaction updates distributed data, you want to be sure that an interruption at a remote node does not leave the data in an inconsistent state. The NonStop TM/MP subsystem protects all transactions, local and remote, for consistency and automatic recovery. NonStop TM/MP uses a two-phase commit protocol to ensure that changes are committed to the database only when all systems are able to complete their portions of the transaction.

NonStop SQL/MP maintains a distributed data dictionary that describes the objects in a distributed database. If a table or index is partitioned among several nodes, NonStop SQL/MP duplicates the data descriptions in catalogs residing on each node, so that each partition is described on its local system. Each catalog also identifies the location of all other partitions. The data dictionary thus provides a unified logical database schema that gives applications complete access to the distributed data.

Figure 1-4 shows an example of a distributed data dictionary. At each location, the dictionary maintains information about all the nodes in the distributed database.

Figure 1-4. Distributed Data Dictionary
[Figure: three nodes, New York, Los Angeles, and Montreal. The catalog tables at every node list all three partitions of Table 1 with their locations: partition 1 in New York, partition 2 in Los Angeles, and partition 3 in Montreal. The database tables at each node hold one of the partitions.]

In addition, NonStop SQL/MP provides local autonomy.
In a distributed database, all NonStop SQL/MP systems in the network cooperate with one another, yet each system can manage and access its own local data independently of the other systems.
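The two-phase commit protocol mentioned earlier in this subsection can be sketched in a few lines. The sketch below is a generic textbook illustration in Python, not the NonStop TM/MP implementation; the `Node` class, its methods, and the node names are hypothetical, and real systems also log each step durably so that a failed coordinator can recover.

```python
class Node:
    """A hypothetical participant that holds one partition of the data."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether this node can complete
        self.committed = {}           # durable state
        self.pending = None           # change staged during phase 1

    def prepare(self, change):
        """Phase 1: stage the change and vote yes (True) or no (False)."""
        if not self.can_commit:
            return False
        self.pending = change
        return True

    def commit(self):
        """Phase 2, success path: apply the staged change."""
        self.committed.update(self.pending)
        self.pending = None

    def rollback(self):
        """Phase 2, failure path: discard the staged change."""
        self.pending = None


def two_phase_commit(nodes, changes):
    """Commit on every node only if every node votes yes in phase 1."""
    prepared = []
    for node in nodes:
        if node.prepare(changes[node.name]):
            prepared.append(node)
        else:
            for done in prepared:
                done.rollback()
            return False
    for node in nodes:
        node.commit()
    return True


changes = {"NY": {"balance": 100}, "LA": {"balance": 250}}

healthy = [Node("NY"), Node("LA")]
ok = two_phase_commit(healthy, changes)

interrupted = [Node("NY"), Node("LA", can_commit=False)]
failed = two_phase_commit(interrupted, changes)
print(ok, failed)
```

In the second run the interrupted node votes no, so the change staged on the other node is discarded and neither node ends up with a partial update.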

Support for Open Standards

By using standard programming tools and network protocols, you can minimize training for multiplatform application development, and you can take advantage of a wide variety of tools that adhere to industry standards.

Easy-to-Use ANSI SQL

Access to NonStop SQL/MP is through ANSI-standard and ISO-standard SQL, a language specifically designed for easy access to relational databases. SQL is easy to learn. It is a concise language comprising a limited number of Data Definition Language (DDL) and Data Manipulation Language (DML) statements. You use only four DML statements to manipulate data: SELECT, INSERT, DELETE, and UPDATE. Yet you can use these statements to write complex queries to satisfy a variety of application needs.

Moreover, you can write DML statements that focus on the logical organization of the data without having to worry about the specifics of the physical database. For example, you do not need to be overly concerned about the order of WHERE clauses in a query, nor do you need to specify that an index should be used. NonStop SQL/MP automatically optimizes your queries to execute efficiently against the physical NonStop SQL/MP database.

Open System Services (OSS)

You can develop applications using either the traditional Tandem Guardian services or industry-standard services as provided by Tandem's Open System Services (OSS). Programs in COBOL using traditional Guardian calls and programs in C using POSIX calls (in the OSS environment) can concurrently access the same data and be governed by the same concurrency constraints and transaction controls. The active data dictionary dynamically manages both Guardian-based programs and OSS-based programs. Programs from either environment have equal access to NonStop SQL/MP data.

Accessing Data With Desktop Software

You can also develop applications using the latest in PC and workstation client tools.
NonStop SQL/MP supports a variety of industry-standard client/server application program interfaces (APIs) such as ODBC, EDA/SQL, DAL, SQL Server, and SQL*Connect. Because many popular client applications are compatible with these APIs, you can, in many cases, use software tools you are already familiar with to access NonStop SQL/MP data.

Further, business-oriented end users do not have to know anything about NonStop SQL/MP, because Tandem API products such as the NonStop ODBC Server automatically translate their requests into NonStop SQL/MP queries. With these packaged software tools, end users can easily create queries by pointing and clicking on icons and menu items. In addition, a number of application development tools provide a standardized environment in which client programmers can develop more specialized queries to satisfy their company's requirements. Both types of tools have open access to NonStop SQL/MP.

Additional Features

The following additional features help make NonStop SQL/MP a powerful tool for large applications requiring high performance, high availability, scalability, and data integrity:
- Cost-based query optimization
- Locking
- Mixed workload environment
- Support for national languages
- Active data dictionary
- Constraints
- Performance features for DSS
- Tools for database administration
- FastSort sorting utility

Cost-Based Query Optimization

The SQL optimizer, a component of the SQL compiler, determines the best (and quickest) way to retrieve the data you want. The optimizer examines the SQL query you have given it and combines this information with information about the physical database configuration. For example, the optimizer evaluates the size of tables, whether the tables are local or remote, and which indexes on the tables are available. Based on this information, it evaluates several alternatives and chooses the execution plan that will use the fewest system resources.

As an end user, you do not need to worry about the order in which you refer to tables and conditions, or whether an index exists that might improve the performance of the query. You simply issue the query, and the optimizer develops the plan that will execute your request most efficiently.

Locking

If an application has hundreds or thousands of users, it is likely that several users will want to access or modify the same data at the same time. In such cases it is important to protect the integrity of the data so that unexpected changes do not occur. Suppose you send your bank a change of address form and one clerk updates your address while another clerk is updating your account balance. Without some controls, the clerk who is updating your account balance could accidentally overwrite your new address with the old one.
Now suppose a bank manager simultaneously queries the customer account database to compare levels of business activity at various branches. If possible, the database should remain available to this read-only query even while update controls are in effect.

NonStop SQL/MP provides several mechanisms to ensure that multiple users do not create surprises for one another. Moreover, these locking mechanisms allow NonStop SQL/MP to sustain both OLTP and DSS workloads; that is, both updates and read-only access.

First, NonStop SQL/MP automatically issues a locking statement whenever you update, insert, or delete data. The lock ensures that while you are updating the data, no one else can access it or update it.

Second, NonStop SQL/MP locks individual records rather than a page of records, allowing more users to access the database concurrently. Users wait on a lock only if they want the particular record being locked. They do not have to wait if the record they want is adjacent to a record that is locked.

Third, you can control the characteristics of locks. For example, you can specify exclusive or shared access to data. You can also specify whether a lock will be released after you finish accessing an individual record or held until you are finished with all the records you are accessing.

Mixed Workload Environment

Most applications perform a variety of tasks concurrently. Some tasks can require immediate responses and others can be less critical. For example, the database activity performed by an order entry clerk needs a high priority, whereas the database retrievals needed to produce the nightly status report can execute at a lower priority. Or, possibly, the CEO needs a report for a board meeting in 15 minutes, whereas an end user might need a report by the next business day.

The data access manager (DP2), the system process that accesses the data on each disk volume, contains a scheduling algorithm through which it can temporarily suspend work on low-priority tasks in order to satisfy high-priority tasks.

Support for National Languages

Increasingly, companies operate around the globe, and end users want to access data in their own languages, whether it is English, French, German, or Japanese. NonStop SQL/MP supports a variety of character sets so that data entered in the major European and Asian languages can be represented in the database.

In addition, you can define your own collation sequence to determine the order in which characters appear when you sort data. (The collation sequence differs from language to language.) Moreover, different users can define and employ their own collation sequences on the same system.

Active Data Dictionary

The NonStop SQL/MP data dictionary contains descriptions of all the tables, views, indexes, collations, and SQL object programs that make up the database. Whenever you change any of these objects, NonStop SQL/MP immediately updates the data dictionary so that all database operations use consistent definitions. Also, if your tables are distributed on remote systems, NonStop SQL/MP ensures that the data dictionaries on all systems have consistent definitions of the tables.

Even currently running programs can be flagged when you change any database objects. NonStop SQL/MP can then recompile the invalid (flagged) programs and produce new SQL execution plans if the current plans are inconsistent with the altered environment.
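The effect of a user-defined collation sequence, described above under Support for National Languages, can be tried out with any SQL engine that accepts custom collations. The sketch below uses Python's built-in sqlite3 module as a stand-in; it does not show NonStop SQL/MP syntax. The collation name GERMAN_LIKE, the folding rule, and the WORDS table are assumptions made for the example.

```python
import sqlite3

# Hypothetical ordering rule: for sorting purposes, treat umlauted vowels
# like their base letters and ignore case.
FOLD = str.maketrans({"Ä": "A", "Ö": "O", "Ü": "U"})


def german_like(a, b):
    """Three-way comparison that the engine calls to order two strings."""
    ka = a.upper().translate(FOLD)
    kb = b.upper().translate(FOLD)
    return (ka > kb) - (ka < kb)


conn = sqlite3.connect(":memory:")
conn.create_collation("GERMAN_LIKE", german_like)
conn.execute("CREATE TABLE WORDS (WORD TEXT)")
conn.executemany(
    "INSERT INTO WORDS VALUES (?)",
    [("Zebra",), ("Ärger",), ("Apfel",), ("Bär",)],
)

# Default ordering compares the stored bytes, so the umlauted word sorts last.
binary_order = [w for (w,) in conn.execute("SELECT WORD FROM WORDS ORDER BY WORD")]

# The same query with the custom collation sorts by the folded spelling.
custom_order = [
    w
    for (w,) in conn.execute(
        "SELECT WORD FROM WORDS ORDER BY WORD COLLATE GERMAN_LIKE"
    )
]

print(binary_order)
print(custom_order)
```

The data is identical in both queries; only the comparison rule changes, which is why two users can apply different collations to the same table.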

Constraints

Constraints are conditions stored in the data dictionary to control the data values added to a database. You specify constraints to keep invalid data out of the database. For example, you can ensure that a particular code in the database is limited to values 1 or 2. By defining constraints in the data dictionary, you can avoid duplicating program code to check for invalid data, because the data is checked by NonStop SQL/MP rather than by the program.

Performance Features for DSS

A database used for decision support needs special features not required for online transaction processing. Typically, users examine large amounts of data rather than accessing a single record at a time, as in OLTP. NonStop SQL/MP provides several types of query execution plans that speed up scanning of large tables, take full advantage of the Tandem parallel architecture, and efficiently calculate aggregate values such as SUM, MINIMUM, MAXIMUM, or AVERAGE.

Often, DSS queries combine data from a very large table with one or more small tables. For example, you may join data from a sales history table with a table containing store names and addresses. NonStop SQL/MP provides three methods for joining tables: nested-loop, sort-merge, and hash join. Hash joins are particularly efficient for large-scale DSS queries for which sorting is inefficient. A hash join works by building a hash table in memory to perform the join. Another feature, cross product join, prejoins several small tables so you need only a single join operation with the big table.

In addition, DSS queries frequently request summarized data derived by grouping and aggregating the data. In NonStop SQL/MP, aggregate values can be calculated directly as data is being read from the disk or by using special hash algorithms to eliminate the need to sort data.
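The hash join and hash-based aggregation described above can be illustrated with a short, generic sketch. It shows the textbook technique in plain Python, not the NonStop SQL/MP executor; the `hash_join` function, the store and sales rows, and the column names are invented for the example.

```python
def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Join two lists of dict rows on equal key values without sorting."""
    # Build phase: hash the smaller input on its join column.
    buckets = {}
    for row in build_rows:
        buckets.setdefault(row[build_key], []).append(row)

    # Probe phase: read the larger input once and look up matches by hash.
    joined = []
    for row in probe_rows:
        for match in buckets.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined


# Small table: one row per store (hypothetical data).
stores = [
    {"STORE": 1, "CITY": "CHICAGO"},
    {"STORE": 2, "CITY": "MONTREAL"},
]

# Large table: sales history (hypothetical data).
sales = [
    {"STORE": 2, "AMOUNT": 40},
    {"STORE": 1, "AMOUNT": 25},
    {"STORE": 2, "AMOUNT": 10},
    {"STORE": 3, "AMOUNT": 99},  # no matching store, so the join drops it
]

joined = hash_join(stores, sales, "STORE", "STORE")

# Hash aggregation: keep one running SUM per group while rows stream past,
# so the input never has to be sorted by CITY.
totals = {}
for row in joined:
    totals[row["CITY"]] = totals.get(row["CITY"], 0) + row["AMOUNT"]

print(totals)
```

Building the hash table on the small input keeps memory use proportional to the small table, while the large table is read only once.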
In most decision support systems, you load data from another operational database into a separate DSS database. NonStop SQL/MP provides a utility for loading data. This utility uses special techniques to process large blocks of data rather than single rows at a time. You can use the utility to load data into multiple partitions in parallel to further improve the performance of the load operation.

Tools for Database Administration

The database administrator (DBA) can use SQL Data Definition Language (DDL) statements to create, alter, rename, and delete SQL objects. These statements can create new indexes, add columns to tables, and change security.

In addition, NonStop SQL/MP provides database operations that can move data from one disk volume to another while users are updating the database. These partition-configuration operations help to optimize system performance and can be executed online; they do not affect application availability.

NonStop SQL/MP and third-party software products also provide utilities that help to convert data, manage a database in a distributed environment, reorganize a table or database, and examine which programs use which tables in the database.

NonStop SQL/MP is integrated with system administration tools for the NonStop Kernel. These tools measure performance and manage backup and recovery of database files.

Parallel Sorting Utility

If a query requires that the data be sorted, NonStop SQL/MP determines whether to sort the data in the virtual memory allocated to the user or to call an external sort routine. If you need to sort more than 32,767 rows, NonStop SQL/MP sends your data to the FastSort utility. FastSort can perform the sort in parallel so that multiple processors can work on the sort concurrently. Also, if a sort that is started in the user process space runs out of memory, the partial results are gracefully migrated to the FastSort utility, which completes the task.

Summary

NonStop SQL/MP is ideally suited for database applications requiring easy access to the data, high performance, high availability, and the ability to scale up the database as the business grows. Conforming to industry-standard programming tools and protocols and accessible from popular PC software tools, NonStop SQL/MP combines open access with the parallel, distributed processing environment made possible by Tandem's multiprocessor architecture.

2 How to Use NonStop SQL/MP

This section describes the basic tools for querying a NonStop SQL/MP database. It shows how the relational model makes it easy to retrieve data by letting you use joins and views. Finally, it discusses how to modify data and maintain data integrity and consistency by using NonStop SQL/MP locking mechanisms and transaction management statements.

Querying the Database

The NonStop SQL/MP relational structure is simple and flexible. Data appears in the familiar form of tables with rows and columns. To retrieve data, you can select columns and rows from one or more tables. The SQL language includes simple statements, called Data Manipulation Language (DML) statements, that you use for querying the database. In NonStop SQL/MP, a query can modify data as well as retrieve it.

Sources of Queries

There are several ways to send a query to the NonStop SQL/MP subsystem. Many PC-based or workstation-based query tools can easily build and export SQL queries to a back-end server using popular interfaces such as ODBC or SQL Server. A business professional can build a query simply by clicking icons and menu items in this type of packaged client application. The application automatically translates the request into an SQL query and sends it to the necessary interface, such as the NonStop ODBC Server, which delivers the query to NonStop SQL/MP. Thus, this user can access the database without knowing anything about the SQL language.

A database administrator (DBA) or programmer can submit a query directly to NonStop SQL/MP by using the SQL Conversational Interface (SQLCI).

These two types of queries are likely to use dynamic SQL. NonStop SQL/MP compiles and executes a dynamic SQL statement as soon as it is submitted.

A programmer can also embed SQL statements in an application program written in a high-level language such as C or COBOL.
This method, called static SQL, allows NonStop SQL/MP to store the compiled query with the compiled program object file. The compiled query can be executed quickly, multiple times, with no overhead for compilation. Static SQL is ideal for large business applications such as high-volume OLTP.

Writing Queries

The rest of this section describes how to write queries to retrieve and modify data. It focuses on the logical functions of SQL statements, without regard to the method you use to access NonStop SQL/MP or the format of the returned data. (Where and how the returned data is displayed, printed, or manipulated depends on the application.)

You write a query by using a SELECT statement in which you specify columns and rows in a particular table or tables. To do this, you need to know the logical structure of the database: the way information is distributed in tables and the relationships among the tables. Generally, a database administrator (DBA) creates and manages the tables and other SQL objects, which are cataloged in the data dictionary. The DBA furnishes the logical picture of the database.

The DBA also manages the physical configuration of database files on the Tandem system. Because NonStop SQL/MP keeps the logical database structure independent of the underlying physical database structure, the query writer does not need to know anything about the physical structure.

Selecting Data

The SELECT statement enables you to write queries that select particular rows and columns, join two or more tables into a single result, and even make further selections from the result. In a SELECT statement, you can specify the result you want in the following ways:
- Select columns by naming the columns (this is called projection)
- Select rows by specifying conditions for selection (this is called restriction)
- Join tables by naming the tables, specifying the columns in the result, specifying the join conditions (the columns on which the join is performed), and specifying conditions for selection
- Specify a union operation between the results of two SELECT statements

Figure 2-1 shows a sample personnel database. The examples in this manual use the personnel database.

Figure 2-1. Personnel Database Example
[Figure: three sample tables. The EMPLOYEE table has the columns EMPNUM, FIRST_NAME, LAST_NAME, DEPTNUM, JOBCODE, and SALARY, with rows for ROGER GREEN, JERRY HOWARD, and JESSICA CRINER. The DEPT table has the columns DEPTNUM, DEPTNAME, MANAGER, RPTDEPT, and LOCATION, with rows for 1000 FINANCE, 1500 PERSONNEL, and 9000 CORPORATE, all located in CHICAGO. The JOB table has the columns JOBCODE and JOBDESC, with rows 100 MANAGER, 250 ASSEMBLER, and 900 SECRETARY.]

Suppose you want to know the names and employee numbers of all employees who work in department 4000. To answer this request, you can enter the following statement:

  >> SELECT FIRST_NAME, LAST_NAME, EMPNUM
  +> FROM EMPLOYEE
  +> WHERE DEPTNUM = 4000 ;

The SELECT statement specifies the columns, the FROM clause identifies the table, and the WHERE clause specifies the desired rows. Figure 2-2 shows the sample query and the list of employees returned by NonStop SQL/MP.

Figure 2-2. A Simple Query
[Figure: the sample query and its result, a list with the columns FIRST_NAME, LAST_NAME, and EMPNUM whose rows include RACHEL MCKAY and ERIC BROWN.]

In a large table, it is important to use the WHERE clause to limit the amount of returned data (called the result). If the sample query did not have a WHERE clause, the result would include every employee name and number in the organization. The application would then have to manipulate the result to give you only the employees in department 4000. By providing the means to write highly specific queries, NonStop SQL/MP relieves the application of extra work and improves application performance.
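To experiment with projection and restriction without access to a Tandem system, the sample query can be run against any SQL engine. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQLCI, so prompts and data types differ from NonStop SQL/MP. The employee names come from the figures; the EMPNUM and DEPTNUM values are invented, because the figure data did not survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    " EMPNUM INTEGER PRIMARY KEY,"
    " FIRST_NAME TEXT, LAST_NAME TEXT, DEPTNUM INTEGER)"
)

# Names follow the manual's figures; the numbers are made up for the example.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [
        (1, "ROGER", "GREEN", 9000),
        (23, "JERRY", "HOWARD", 1000),
        (65, "RACHEL", "MCKAY", 4000),
        (87, "ERIC", "BROWN", 4000),
    ],
)

# Projection: the select list names only the columns wanted.
# Restriction: the WHERE clause keeps only the rows for department 4000.
rows = conn.execute(
    "SELECT FIRST_NAME, LAST_NAME, EMPNUM"
    " FROM EMPLOYEE"
    " WHERE DEPTNUM = ?"
    " ORDER BY EMPNUM",
    (4000,),
).fetchall()

# Without the WHERE clause, every employee comes back and the application
# would have to filter the result itself.
unrestricted = conn.execute(
    "SELECT FIRST_NAME, LAST_NAME, EMPNUM FROM EMPLOYEE"
).fetchall()

for row in rows:
    print(row)
print(len(unrestricted), "rows without a WHERE clause")
```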

Joining Tables

When a query selects data from two or more tables, the SELECT statement effectively joins the tables to form a single, combined result table. For example, to list the managers of each department and the department name, you need information from the EMPLOYEE and DEPT tables. Figure 2-3 illustrates a SELECT statement that joins these two tables using the employee number, a value common to both tables. (The MANAGER column in the DEPT table lists the employee number of each department manager.)

Figure 2-3. Selecting Data From Two Tables

  SELECT FIRST_NAME, LAST_NAME, DEPTNAME      (select columns)
  FROM EMPLOYEE, DEPT                         (tables to be joined)
  WHERE EMPLOYEE.EMPNUM = DEPT.MANAGER        (select rows in which the employee is a manager)
  ORDER BY DEPT.DEPTNUM ;

  Result:

  FIRST_NAME  LAST_NAME  DEPTNAME
  JERRY       HOWARD     FINANCE
  MARY        MILLER     SHIPPING
  JANE        RAYMOND    MARKETING
  SHERRIE     WONG       ASIA SALES
  ROGER       GREEN      CORPORATE

[Figure: the statement is shown above the EMPLOYEE table (columns EMPNUM, FIRST_NAME, LAST_NAME, DEPTNUM, JOBCODE) and the DEPT table (columns DEPTNUM, DEPTNAME, MANAGER), from which the result rows are drawn.]
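The join in Figure 2-3 can be reproduced with a small, self-contained script. As before, this is a sketch that uses Python's sqlite3 module in place of NonStop SQL/MP. The names and the result order follow the figure, but the EMPNUM values and most DEPTNUM values are invented placeholders, because the numeric columns of the figure were lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (
        EMPNUM INTEGER PRIMARY KEY, FIRST_NAME TEXT, LAST_NAME TEXT);
    CREATE TABLE DEPT (
        DEPTNUM INTEGER PRIMARY KEY, DEPTNAME TEXT, MANAGER INTEGER);
    """
)

# Hypothetical employee numbers.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [
        (1, "ROGER", "GREEN"),
        (23, "JERRY", "HOWARD"),
        (29, "JANE", "RAYMOND"),
        (32, "SHERRIE", "WONG"),
        (43, "MARY", "MILLER"),
        (65, "RACHEL", "MCKAY"),  # manages no department, so not in the result
    ],
)

# MANAGER holds the EMPNUM of the employee who manages the department.
conn.executemany(
    "INSERT INTO DEPT VALUES (?, ?, ?)",
    [
        (1000, "FINANCE", 23),
        (2000, "SHIPPING", 43),
        (3000, "MARKETING", 29),
        (3500, "ASIA SALES", 32),
        (9000, "CORPORATE", 1),
    ],
)

# The join column (EMPNUM = MANAGER) drives the match but is not selected.
managers = conn.execute(
    "SELECT FIRST_NAME, LAST_NAME, DEPTNAME"
    " FROM EMPLOYEE, DEPT"
    " WHERE EMPLOYEE.EMPNUM = DEPT.MANAGER"
    " ORDER BY DEPT.DEPTNUM"
).fetchall()

for row in managers:
    print(row)
```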

Although the join shown in Figure 2-3 occurs on the employee number column, that column does not have to be part of the result. The sample statement selects from the joined tables only those columns that contain employee names and department names and only those rows that contain manager names. The result is sorted by department number.

Using the UNION Operator

The UNION operator combines the results of two SELECT statements. You can use union operations to combine data from logically similar tables. Suppose, for example, that a table called EMPLOYE1 contains information about corporate employees in North America. A separate table, EMPLOYE2, contains information about employees in Asia. To retrieve the names and employee numbers of both groups of employees who work in department 4000, you can specify the following query:

>> SELECT FIRST_NAME, LAST_NAME, EMPNUM
+> FROM \NY.$VOL1.PERSNL.EMPLOYE1
+> WHERE DEPTNUM = 4000
+> UNION
+> SELECT FIRST_NAME, LAST_NAME, EMPNUM
+> FROM \TOKYO.$VOL2.PERSNL.EMPLOYE2
+> WHERE DEPTNUM = 4000 ;

To specify a union operation, you must specify the same number of columns in each select list. Also, columns in corresponding positions must have compatible data types.

Using Views

A view is a specification of columns and rows from one or more base tables. NonStop SQL/MP does not store the data in a view separately but retrieves the data from the underlying base tables. Thus, a view is a virtual table. The database administrator (DBA) stores a view definition in the data dictionary with the CREATE VIEW statement, which assigns a view name to a SELECT statement.

You can select columns and rows from a view to retrieve part of the data represented by that view. Selecting from a predefined view is simpler and less error prone than writing a new SELECT statement each time you want to see a particular view of the data. Views let you customize the database to suit your business needs.
When you use one database for different applications, each application can access the database through views that make the database seem designed for that application. In addition, views require no replication of data. The DBA does not have to keep many copies of the same data for different users. If you frequently query the same table or related tables, creating a view saves time and effort. Instead of repeatedly rewriting the query, you can refer to the named view. In addition to enhancing productivity, views can improve performance because the DBA can save the most efficient form of the query as a view.

Shorthand Views

A shorthand view (the type of view discussed so far) provides convenient data selection from any number of tables and other views. (You cannot update data using a shorthand view.) Figure 2-4 shows a shorthand view derived from the EMPLOYEE and DEPT tables.

Figure 2-4. A Shorthand View Derived From Two Tables

CREATE VIEW MGRLIST (FIRST_NAME, LAST_NAME, DEPTNAME)
AS SELECT FIRST_NAME, LAST_NAME, DEPTNAME
FROM EMPLOYEE, DEPT
WHERE EMPLOYEE.EMPNUM = DEPT.MANAGER
CATALOG PERSNL ;

SELECT * FROM MGRLIST ;

Figure notes: View defines new column name. View contains only manager rows. Asterisk selects all columns and rows in the view.

MGRLIST View

FIRST_NAME  LAST_NAME  DEPTNAME
ERIC        BROWN      PLANNING
ROGER       GREEN      CORPORATE
JERRY       HOWARD     FINANCE
RACHEL      MCKAY      PERSONNEL
MARY        MILLER     SHIPPING
JANE        RAYMOND    MARKETING
THOMAS      RUDLOFF    INVENTORY

The CREATE VIEW statement in Figure 2-4 stores a join as a view named MGRLIST. (The SELECT statement in Figure 2-4, while similar to the one shown in Figure 2-3, does not order the result by department number.) After you create a view, you can refer to it as if it were a table. For example, you can request the name of the personnel manager by entering the following SELECT statement:

SELECT FIRST_NAME, LAST_NAME
FROM MGRLIST
WHERE DEPTNAME = "PERSONNEL" ;
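A view that stores a join, and a query against that view, can be sketched as follows. SQLite (via Python's sqlite3 module) is used as a stand-in, so the CATALOG clause and the explicit column list are omitted; the employee and department numbers are invented.

```python
import sqlite3

# SQLite stand-in for NonStop SQL/MP; numeric values are invented.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        empnum INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE dept (
        deptnum INTEGER PRIMARY KEY, deptname TEXT, manager INTEGER);
    INSERT INTO employee VALUES (1, 'ROGER', 'GREEN');
    INSERT INTO employee VALUES (2, 'RACHEL', 'MCKAY');
    INSERT INTO employee VALUES (3, 'THOMAS', 'RUDLOFF');
    INSERT INTO dept VALUES (9000, 'CORPORATE', 1);
    INSERT INTO dept VALUES (1500, 'PERSONNEL', 2);
    -- The view names a SELECT statement; it stores no rows of its own.
    CREATE VIEW mgrlist AS
        SELECT first_name, last_name, deptname
        FROM employee, dept
        WHERE employee.empnum = dept.manager;
    """
)

# Query the view as if it were a table.
personnel_manager = conn.execute(
    "SELECT first_name, last_name FROM mgrlist"
    " WHERE deptname = 'PERSONNEL'"
).fetchall()

# Because the view reads the base tables each time, a new department
# row shows up in the view without redefining anything.
conn.execute("INSERT INTO dept VALUES (4500, 'INVENTORY', 3)")
manager_count = conn.execute("SELECT COUNT(*) FROM mgrlist").fetchone()[0]
```

The second query illustrates the statement that a view is a virtual table: the count changes as soon as the underlying DEPT table changes.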

Protection Views

NonStop SQL/MP provides another type of view, the protection view, for security and privacy. A protection view provides column-level security by restricting access to individual columns. Consider, for example, the salary information in the EMPLOYEE table. You can prevent unauthorized users from seeing employee salaries by securing the table to prevent access to the SALARY column. For general users, you can create a protection view of the EMPLOYEE table without salary information. You can secure the view so that general users can retrieve and update data in the view columns. Only users with access to the underlying table can view or update the SALARY column. Figure 2-5 illustrates a protection view.

Figure 2-5. A Protection View

CREATE VIEW EMPPUB (EMPNUM, FIRST, LAST, DEPTNUM, JOBCODE)
AS SELECT EMPNUM, FIRST_NAME, LAST_NAME, DEPTNUM, JOBCODE
FROM EMPLOYEE
FOR PROTECTION ;

SELECT FIRST, LAST, DEPTNUM FROM EMPPUB ;

Figure notes: Only one table allowed. FOR PROTECTION specifies a protection view. The SELECT statement retrieves data from the table through the view.

(Figure: the EMPPUB view, with the columns FIRST, LAST, and DEPTNUM, listing ROGER GREEN, JERRY HOWARD, JANE RAYMOND, BEN HENDERSON, and JESSICA CRINER.)

Suppose a user who has access to the EMPPUB protection view, but not to the underlying table, issues the following query:

SELECT FIRST, LAST, SALARY FROM EMPPUB ;

NonStop SQL/MP returns an error message explaining that the SALARY column does not exist.
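The column-hiding half of this behavior can be sketched with SQLite through Python's sqlite3 module. This is an assumption-laden stand-in: SQLite has no FOR PROTECTION clause and no per-object security, so the sketch shows only that a column left out of a view cannot be selected through it. The salary figures and employee numbers are invented.

```python
import sqlite3

# SQLite stand-in; it has no FOR PROTECTION clause or object security,
# so only the "SALARY is not visible through the view" part is shown.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        empnum INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
        deptnum INTEGER, jobcode INTEGER, salary INTEGER);
    INSERT INTO employee VALUES (1, 'ROGER', 'GREEN', 9000, 100, 175500);
    INSERT INTO employee VALUES (23, 'JERRY', 'HOWARD', 1000, 100, 137000);
    -- The view exposes every column except SALARY.
    CREATE VIEW emppub AS
        SELECT empnum, first_name AS "first", last_name AS "last",
               deptnum, jobcode
        FROM employee;
    """
)

public_rows = conn.execute(
    'SELECT "first", "last", deptnum FROM emppub ORDER BY empnum'
).fetchall()

# Asking for SALARY through the view fails: the view has no such column.
try:
    conn.execute('SELECT "first", "last", salary FROM emppub')
    salary_visible = True
except sqlite3.OperationalError:
    salary_visible = False
```

In NonStop SQL/MP the view would additionally be secured separately from the EMPLOYEE table, which is what keeps general users from bypassing the view.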

A shorthand view can refer to a protection view just as it refers to a table. That means you can simplify data access at the same time as you restrict data access. Table 2-1 compares protection views and shorthand views.

Table 2-1. Comparison of Protection Views and Shorthand Views

Protection View                          Shorthand View
Derived from only one table              Derived from any number of tables
                                         and other views
Can be secured separately from the       Derives its security from tables or
table on which it is based               views on which it is based
Allows modification of data              Allows reading but not modification
                                         of data

Modifying Data

NonStop SQL/MP allows you to insert, delete, and update data as well as retrieve it. For example, to add information about a new employee, you can use the INSERT statement to add a row to the EMPLOYEE table. To give certain employees a five-percent raise, you can issue the following UPDATE statement:

UPDATE EMPLOYEE
SET SALARY = SALARY * 1.05
WHERE SALARY BETWEEN AND ;

Concurrent Updates and Locking

In large business applications, particularly OLTP applications, hundreds or thousands of users can modify the database at any time. Suppose, for example, that many customers simultaneously withdraw cash from the hundreds of ATMs belonging to a large bank. NonStop SQL/MP has the power to retrieve and modify the data for all these transactions and quickly respond to each customer.

This flexibility, however, raises the issue of data integrity and consistency. Each user must be able to finish modifying a particular value (for example, debiting a checking account by a certain amount) without interference from other concurrent modifications. NonStop SQL/MP provides efficient locking mechanisms, including row-level and range locking, to ensure that only one user at a time can update a particular row (or rows). When a row is updated, NonStop SQL/MP automatically chooses an exclusive lock, preventing any other users from having access to that row.
To ensure the highest performance, NonStop SQL/MP can lock a single row, a range of rows, a partition of a table, or an entire table. Normally, NonStop SQL/MP chooses the appropriate locking strategy, but in some instances you might want to impose a specific strategy. NonStop SQL/MP gives you some control over the extent of locking by providing the LOCK TABLE and CONTROL TABLE statements. For more information about locking, refer to the NonStop SQL/MP Query Guide and the NonStop SQL/MP Reference Manual.

Transaction Management

When you modify data, you must also ensure that the database does not contain corrupted data. If, for example, a communication line fails after a user has requested a cash withdrawal but before the ATM has dispensed the cash, you do not want the database system to record the withdrawal in the row containing that user's checking account balance. To ensure that the database remains consistent when you modify data, you can make each SQL query part of a transaction.

A transaction is a logical unit of work that changes the database from one consistent state to another. If a transaction does not finish successfully, the NonStop TM/MP subsystem makes sure that the database returns to the state it was in before the transaction began. In the ATM example, if the ATM does not return a message to the application (and to NonStop SQL/MP) that it has successfully dispensed the money, NonStop TM/MP rolls back the transaction, and NonStop SQL/MP keeps the original value in the checking account balance.

NonStop TM/MP manages transactions and protects database changes when the application is executing. You implement transaction management when you write an SQL query by defining the beginning and end of the transaction in the application code. You use the BEGIN WORK statement to indicate the beginning of the transaction. One or more queries (UPDATE statements) can be included in a transaction. If the updates finish successfully, you end the transaction with the COMMIT WORK statement. This statement commits the changes to the database and ends the transaction. If the updates do not finish successfully, you use the ROLLBACK WORK statement to return the database to its original state. NonStop SQL/MP automatically rolls back the work if a failure (such as an application abort) occurs.

Summary

NonStop SQL/MP uses the ANSI-standard and ISO-standard SQL language for querying the database.
NonStop SQL/MP presents data in an accessible format, makes the relations between data elements easy to visualize, and allows any tables with a column in common to be joined. Views make it easy to retrieve frequently accessed data and provide a convenient way to give different users access to different portions of the database. You can also use NonStop SQL/MP queries to update the database. NonStop SQL/MP automatically ensures that the data remains consistent and accurate even when hundreds of transactions execute concurrently.
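The begin, commit, and roll back pattern described under Transaction Management can be sketched with the ATM example. The sketch uses SQLite through Python's sqlite3 module as a stand-in: SQLite spells the statements BEGIN, COMMIT, and ROLLBACK rather than BEGIN WORK, COMMIT WORK, and ROLLBACK WORK, and nothing here models NonStop TM/MP itself. The account table, the withdraw function, and the dispense_cash callback are all hypothetical names.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so the
# transaction boundaries below are the ones written explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE account (acctnum INTEGER PRIMARY KEY, balance INTEGER)"
)
conn.execute("INSERT INTO account VALUES (1, 500)")


def balance(connection, acctnum):
    """Return the current balance of one account."""
    row = connection.execute(
        "SELECT balance FROM account WHERE acctnum = ?", (acctnum,)
    ).fetchone()
    return row[0]


def withdraw(connection, acctnum, amount, dispense_cash):
    """Debit the account only if the ATM confirms it dispensed the cash."""
    connection.execute("BEGIN")
    try:
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE acctnum = ?",
            (amount, acctnum),
        )
        if not dispense_cash(amount):
            raise RuntimeError("ATM did not confirm the withdrawal")
        connection.execute("COMMIT")
        return True
    except Exception:
        # Undo the debit so the balance returns to its original value.
        connection.execute("ROLLBACK")
        return False


failed = withdraw(conn, 1, 100, lambda amount: False)
balance_after_failure = balance(conn, 1)
succeeded = withdraw(conn, 1, 100, lambda amount: True)
balance_after_success = balance(conn, 1)
```

The failed withdrawal leaves the balance unchanged even though the UPDATE statement had already run inside the transaction.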


3 NonStop SQL/MP Architecture

This section describes the physical structure of a NonStop SQL/MP database and the way the logical and physical structures are linked through the NonStop SQL/MP data dictionary. The section then describes the basic physical components: files, table organizations, indexes, and partitions. Next, it explains how an SQL query is compiled and executed, showing how NonStop SQL/MP optimizes a query plan to achieve the best performance. Finally, the section examines other architectural features such as parallel query execution, the data access manager, and sequential block buffering.

Physical Database Structure

A database has a logical and a physical structure. The tables and views that you see and manipulate directly through the SQL language are the logical structure (logical schema). Underlying the logical database is a set of physical files on disks managed by the SQL file system and accessed through the data access manager (DP2 disk process).

Logical Schema

The logical schema represents the business information contained in the database. The logical schema includes the database elements (such as columns, rows, tables, and views) that you can query using SQL statements. The definitions of these elements, and their relationships to one another, are stored in the NonStop SQL/MP data dictionary. The data dictionary also records the relationships between the logical elements and the corresponding physical objects. The data dictionary consists of the SQL catalogs, sets of relational SQL tables that describe the database objects, and the corresponding file labels, which describe the underlying physical files and also contain catalog information.

Physical Structure

When you insert data into a table (or an associated protection view), you are actually storing data in a physical file in the Guardian environment. Every table, and indirectly every view, is associated with one or more files.
The files are the physical storage structures that hold the data in a database. Figure 3-1 illustrates this two-level structure.

Figure 3-1. Database Structure
(Figure: at the logical level, the EMPLOYEE database table, with the columns EMPNUM, FIRST_NAME, and LAST_NAME, is described by an entry in the TABLES catalog table whose TABLENAME is \SYS1.$VOL1.PERSNL.EMPLOYEE. At the physical level, the $VOL1 disk volume holds the EMPLOYEE file, which consists of the file label and the physical data. The file label contains the EMPLOYEE file name and location, the name of the catalog table describing the file, and other information.)

Although Figure 3-1 shows only one file, a single logical table can be partitioned over many physical files.

As shown in Figure 3-1, a database table (or view) is defined logically in a set of entries in catalog tables and is associated with one or more disk files. Each file has a file label that describes its physical attributes such as its location on disk and the number of extents. The file label also contains the names of the associated catalog tables as well as catalog information about the logical structure of the object; thus, the physical and logical structures are always synchronized.

Because the data dictionary resides in and describes both the logical and physical structures, the dictionary is active; that is, whenever a change occurs in the physical environment, the logical structure reflects that change.

Moreover, the data dictionary keeps track of all programs that use the database objects described in the catalog and file labels. The catalog is sometimes called the compile-time component of the dictionary because, when programs are SQL-compiled, their relationship to the database objects they access is recorded in the catalog. The file labels are called the run-time component of the dictionary because, when the programs execute, NonStop SQL/MP can access the file labels to find the current database information required to execute the SQL access plans. If the logical or physical structure is changed, NonStop SQL/MP can, if requested, recompile the SQL access plans to execute efficiently in the altered environment.

Table Organization

When you create a table in NonStop SQL/MP, you specify both logical and physical attributes. Logical attributes include column names and data types. Physical attributes include block size, maximum extents, and the physical organization of the file. The physical organization determines the structure of the table's underlying files.

The physical files can be organized in one of three ways: key-sequenced, relative, or entry-sequenced. The organization determines how rows are stored in the physical file. Each row must be identified by a unique value, called its primary key. Your requirements for accessing and updating data determine which organization is best for a particular table.

Key-Sequenced Tables

Key-sequenced is the most commonly used table organization (and the default). When you create a key-sequenced table, you define the primary key as one or more columns in the table; the values in the key columns determine the order in which rows are stored. A typical key might be the employee number, social security number, or job code. You can easily access a row if you know the value of the key. You also can easily retrieve data in the same order as the key.
Key-sequenced organization is best suited for random-access processing, in which the end-user or application accesses rows based on external criteria, not on the row order. This organization type is also efficient for sequential access. The EMPLOYEE table (shown in Figure 2-1) would probably use key-sequenced organization because each employee has a unique employee ID, and users typically access the rows in random order. If the information you want to organize in a table does not have a unique key value, you can create a key-sequenced table with a clustering key. A clustering key is one or more columns (such as LAST_NAME, FIRST_NAME) whose values determine the row sequence but do not uniquely identify rows. Clustering permits related rows of a table to be stored physically close to one another, which allows the system to retrieve them quickly. To identify each row, NonStop SQL/MP appends a unique eight-byte value to the end of the clustering key.
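The idea of a clustering key made unique by an appended value can be pictured with a toy model. The sketch below is a conceptual illustration in Python, not the NonStop SQL/MP file format: the class name ClusteredFile is hypothetical, and a simple counter stands in for the eight-byte value that NonStop SQL/MP appends.

```python
import bisect
import itertools


class ClusteredFile:
    """Toy key-sequenced file ordered by a non-unique clustering key."""

    def __init__(self):
        self._keys = []                    # sorted (clustering_key, suffix)
        self._rows = {}                    # full key -> row
        self._suffix = itertools.count(1)  # stand-in for the 8-byte value

    def insert(self, clustering_key, row):
        """Store a row; the appended suffix makes the full key unique."""
        full_key = (clustering_key, next(self._suffix))
        bisect.insort(self._keys, full_key)
        self._rows[full_key] = row
        return full_key

    def scan(self, clustering_key):
        """Return all rows that share a clustering key, in stored order."""
        start = bisect.bisect_left(self._keys, (clustering_key, 0))
        found = []
        for key in self._keys[start:]:
            if key[0] != clustering_key:
                break
            found.append(self._rows[key])
        return found

    def all_rows(self):
        """Return every row in key sequence."""
        return [self._rows[key] for key in self._keys]


names = ClusteredFile()
key_a = names.insert(("BROWN", "ERIC"), "row 1")
key_b = names.insert(("ALBERT", "HERB"), "row 2")
key_c = names.insert(("BROWN", "ERIC"), "row 3")
browns = names.scan(("BROWN", "ERIC"))
```

Rows with the same clustering key end up adjacent in the sorted sequence, which is why a scan for one key value touches neighboring entries only.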

Relative Tables

Relative table organization is useful when an application requires random access to the table and the row number can function as the key to the table. For relative tables, NonStop SQL/MP automatically defines a primary key based on the relative positions of the rows. Typically, the row number has no meaning for the end user and is managed by the application. Access to a specific row is very fast if you know the row number. Because rows are stored relative to the beginning of the table, gaps in row numbers cause gaps to exist in the physical file. However, newly inserted rows can reuse these gaps.

You might use a relative table to store reservation records if they are listed by confirmation number. The row numbers (the key) can then have the same values as the confirmation numbers. Clerks can access a reservation directly by using the confirmation number. If a clerk deletes a reservation, the space in the file used by that reservation can be reused if someone reuses the confirmation number.

Entry-Sequenced Tables

Tables created using entry-sequenced organization are best suited for sequential processing. In an entry-sequenced table, NonStop SQL/MP stores rows in the order in which they are entered into the system. (NonStop SQL/MP automatically generates the primary key value for the row address.) New rows in an entry-sequenced table are appended to the end of the table. Typically, you use entry-sequenced tables for transaction logs in which entries are inserted based on the time they occurred.

Indexes

You can usually find a value faster in a primary-key column than in another table column. If you often need to look up table values in a sequence different from the primary-key sequence, you can create an index table. (The table on which an index is based is called the base table.)
(In NonStop SQL/MP, the primary-key column is part of the structure of the base table and is the clustering column for the table. Thus, the primary key is functionally equivalent to what is called the primary index in some other SQL environments. All other indexes in NonStop SQL/MP are separate files.) Suppose, for example, that you want an alphabetical list of employees from the EMPLOYEE table. An index based on the LAST_NAME and FIRST_NAME columns can speed up access by employee name. You can always select rows in a sequence based on nonprimary-key columns, with or without an index. For example, if there is no index on employee names, NonStop SQL/MP sequentially scans the entire EMPLOYEE table and sorts the rows to retrieve the names in the requested order. However, with an index, NonStop SQL/MP can scan the index in much less time than it takes to scan the base table and then retrieve the requested data by fetching the corresponding rows in the base table.

Index-Only Access

For some queries, NonStop SQL/MP achieves even greater access speed if the index columns contain all the requested data. To satisfy these queries, the system retrieves data from the index table only and does not read the base table at all. This feature is called index-only access.

Suppose an index on the EMPLOYEE table includes the LAST_NAME, FIRST_NAME, and EMPNUM columns. Now suppose a query selects only those columns and requests the result to be ordered by employee name. In this case, NonStop SQL/MP uses index-only access. If the query also selects the DEPTNUM column, NonStop SQL/MP scans the index, uses it to locate the appropriate rows in the base table, and then reads the base table to retrieve the data.

Figure 3-2 illustrates the index XEMPNAME, which speeds access to the EMPLOYEE table for selections sequenced by employee name. (When you create the index, you do not have to specify the EMPNUM column because it is the primary key of the EMPLOYEE table. NonStop SQL/MP automatically includes the primary key in an index.)

Figure 3-2. An Index for Faster Sorted Access

CREATE INDEX XEMPNAME ON EMPLOYEE
( LAST_NAME, FIRST_NAME )
CATALOG PERSNL ;

Figure note: The column list orders the index by LAST_NAME, FIRST_NAME.

SELECT EMPNUM, FIRST_NAME, LAST_NAME, DEPTNUM
FROM EMPLOYEE
ORDER BY LAST_NAME, FIRST_NAME

(Figure: the XEMPNAME index holds the columns LAST_NAME, FIRST_NAME, and EMPNUM in name order: ALBERT HERB, BARTON RICHARD, BONNY MARLENE, BROWN ERIC. The EMPLOYEE table holds the columns EMPNUM, FIRST_NAME, LAST_NAME, and DEPTNUM in employee-number order. The query result lists EMPNUM, FIRST_NAME, LAST_NAME, and DEPTNUM in the order of the index: HERB ALBERT, RICHARD BARTON, MARLENE BONNY, ERIC BROWN.)

Index definitions are stored separately from base table definitions. Each index has a name and is stored in a key-sequenced file of the same name as the index. Index files are not tables, and you cannot query an index directly. NonStop SQL/MP uses indexes to provide faster access to tables. Each index is an access path to a table. Even when you expect the system to use index-only access, your SQL query must refer to the base table, not the index.
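Index-only access has a close analogue in SQLite, where the query planner reports a "covering index" when an index alone can answer a query. The sketch below uses that analogue through Python's sqlite3 module; it shows SQLite's planner, not the NonStop SQL/MP optimizer, and the helper name plan is hypothetical. As in the text, the query names the base table, never the index.

```python
import sqlite3

# SQLite stand-in. EMPNUM is the primary key, so SQLite, like the text
# describes for NonStop SQL/MP, carries it in every index entry.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        empnum INTEGER PRIMARY KEY, first_name TEXT,
        last_name TEXT, deptnum INTEGER);
    CREATE INDEX xempname ON employee (last_name, first_name);
    """
)


def plan(sql):
    """Return SQLite's query-plan description for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Every selected column is in the index: the base table is not read.
index_only = plan(
    "SELECT last_name, first_name, empnum FROM employee"
    " ORDER BY last_name, first_name"
)

# DEPTNUM is not in the index, so the base table must also be read.
needs_base_table = plan(
    "SELECT last_name, first_name, empnum, deptnum FROM employee"
    " ORDER BY last_name, first_name"
)
```

The exact wording of the plan text varies between SQLite versions, so the useful signal is whether the phrase COVERING INDEX appears.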

You can add and delete indexes as needed without having to change or redefine the base table. If you determine that a new index would improve performance, you can add the index and receive performance benefits immediately.

Partitions

Partitions can make data more accessible in a large table or a table used at different geographical locations. A partition of a table or index holds all the rows within a range of key values. Partitions provide the following benefits:

  Partitions improve transaction throughput (for small queries) and query execution (for large queries) by allowing simultaneous, parallel disk access to different partitions of the same table. Partitions also enable other system resources (such as disk cache and memory for hashing, grouping, aggregation, and sorting) to operate simultaneously on a query, further improving performance.
  Partitions allow you to store the data close to where it is used in a geographically distributed database.
  Partitions are independent of one another for access. Only the accessed partition needs to be available.
  Partitions require no special access procedures. NonStop SQL/MP manages partition access for you automatically.
  All partitions of a table or index automatically receive the same security settings.
  If a table is too large to fit on a single disk volume, you can partition the table over two or more volumes.
  Partitions provide an efficient method for performing database management functions, which can be run against single partitions, thus reducing the impact of database management on application availability.

Even if a large table fits on a single volume, it is sometimes more efficient to partition the table across many disks to spread the I/O requests over many data access manager processes. Partitioning the table allows you to take advantage of parallel execution for some queries. (See Parallel Execution later in this section.)
Suppose your company stores parts at warehouses in New York, Los Angeles, and Montreal, and you want to maintain the parts information at the three sites. You can create a parts description table with three partitions, one for each location. Figure 3-3 shows a table containing the location and quantity on hand of each part. The table, PARTLOC, is partitioned by its primary key: the location code and part number.
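Partitioning by key range amounts to a lookup from a row's key to the partition whose range contains it. The sketch below is a conceptual Python illustration, not how NonStop SQL/MP is implemented: the function name partition_for is hypothetical, the first-key values "G00" and "P00" and the volume names are the ones used for PARTLOC in Figure 3-3, and the sketch assumes the first partition receives every key below "G00".

```python
import bisect

# First key of the second and third partitions (values from Figure 3-3).
# Assumption: the first partition takes every key lower than "G00".
FIRST_KEYS = [("G00", 0), ("P00", 0)]
VOLUMES = ["$WHS1", "$WHS2", "$WHS3"]  # New York, Los Angeles, Montreal


def partition_for(loc_code, partnum):
    """Return the volume whose key range contains (loc_code, partnum)."""
    # bisect_right counts the first keys that are <= the row key, which
    # is the position of the partition that owns the row.
    position = bisect.bisect_right(FIRST_KEYS, (loc_code, partnum))
    return VOLUMES[position]


new_york = partition_for("A21", 1403)
los_angeles = partition_for("G00", 0)
montreal = partition_for("P12", 7)
```

An application never performs this lookup itself; the sketch only makes visible the routing that happens when a statement names the table and supplies a key.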

Figure 3-3. A Partitioned Table

CREATE TABLE \NY.$WHS1.INVENT.PARTLOC (
  LOC_CODE CHARACTER (3) NO DEFAULT NOT NULL,
  PARTNUM NUMERIC (4) UNSIGNED NO DEFAULT NOT NULL,
  QTY_ON_HAND NUMERIC (7) NO DEFAULT NOT NULL,
  PRIMARY KEY (LOC_CODE, PARTNUM) )
CATALOG \NY.$WHS1.INVENT
PARTITION (
  \LA.$WHS2.INVENT.PARTLOC
    CATALOG \LA.$WHS2.INVENT
    FIRST KEY ( "G00", 0000 ),
  \MONT.$WHS3.INVENT.PARTLOC
    CATALOG \MONT.$WHS3.INVENT
    FIRST KEY ( "P00", 0000 ) ) ;

Figure notes: The table itself is the New York partition, with "A00" as the assumed first key value for LOC_CODE. The two entries in the PARTITION clause define the Los Angeles and Montreal partitions.

(Figure: three partitions of the PARTLOC table, each with the columns LOC, PART, and QTY. Partition 1 is on $WHS1 in New York, with first key "A00". Partition 2 is on $WHS2 in Los Angeles, with first key "G00". Partition 3 is on $WHS3 in Montreal, with first key "P00".)

The first partition, located in New York, contains all rows with a location code equal to or greater than A00 but less than G00. The second partition contains location codes equal to or greater than G00 and less than P00. The third partition has location codes equal to or greater than P00.

NonStop SQL/MP handles partitions for you automatically. You insert, retrieve, or update data in a partitioned table just as if the table were not partitioned. For example, you update a row in the partition on $WHS1, shown in Figure 3-3, like this:

UPDATE PARTLOC
SET QTY_ON_HAND = QTY_ON_HAND + 20
WHERE LOC_CODE = "A21" AND PARTNUM = 1403 ;

A partitioned table in NonStop SQL/MP is highly available. Only the partitions being accessed must be available when you retrieve, update, or insert data. Even if network access became temporarily unavailable, you would still have access to your local partition. Also, the DBA can add, drop, move, and split partitions while the entire table remains online and available for querying. (In most cases, these database configuration operations will make the affected partitions unavailable for only a very short time.)

What Happens When a Query Is Submitted?

When you submit an SQL query to NonStop SQL/MP, the database system does three things:

  Determines the most efficient plan for executing the SQL statement
  Compiles the statement into an executable object
  Executes the plan

The manner in which NonStop SQL/MP performs these functions depends first on whether the query uses dynamic or static SQL. Ad hoc queries submitted through an interface such as the NonStop ODBC Server and queries submitted directly through the conversational interface (SQLCI) are likely to be dynamic. NonStop SQL/MP prepares these queries for execution, compiles them, and executes them as soon as they are submitted.

Host-language programs containing embedded SQL statements are likely to use static SQL. NonStop SQL/MP compiles static SQL statements when the program is developed, after the language compiler compiles the host-language source code. The compiled SQL statements are placed into production as part of the host object program. When a user invokes the application logic that calls a particular SQL statement, that statement is executed.

Figure 3-4 shows a simple diagram of the main components of NonStop SQL/MP.
When you submit a dynamic SQL statement, the SQL executor invokes the SQL compiler, which compiles the statement and returns it to the SQL executor. The SQL executor then manages the retrieval of data from the database tables and returns the results to the client submitting the query.

Figure 3-4. Components of NonStop SQL/MP
(Figure: a source program containing static SQL passes through the language compiler to become an object program containing static SQL. The SQL compiler, which includes the SQL optimizer and consults the data dictionary, produces the object program file, which holds the compiled SQL query plans and the static SQL source. A user application submits dynamic SQL to the SQL executor, which receives compiled dynamic SQL from the SQL compiler. Below the SQL executor are the low-level system components: the SQL file system, the data access manager, and the disk files.)

The SQL Compiler and SQL Optimizer

When you embed static SQL statements in a program, the language compiler transforms the host-language source code into a compiled object program. (You can also bind program modules at this stage.)

As shown in Figure 3-4, the next step is SQL compilation. Using the object program as its source, the SQL compiler transforms the embedded SQL statements into executable query plans. The SQL compiler creates an object program file containing the host-language object program, the compiled SQL query plans, and the SQL source statements.

SQL Statements Designed for Multiple Execution

Because the compiled query plans and SQL source statements are stored in the object program file, they are easy to access at run time. Several benefits result when you make both the compiled plans and source statements available to the SQL executor.

First, whenever an SQL statement is invoked by the application, the SQL executor can immediately execute that query plan without having to recompile it. This strategy improves performance, especially for OLTP and electronic commerce applications in which the same transaction type (and SQL statements) can execute hundreds or thousands of times a day.

Second, keeping the SQL source statements in the program file makes it easy for the SQL executor to access a statement if the statement should need to be recompiled. Occasionally, the DBA might need to change a table (or another SQL object) in such a way that the existing SQL query plan would not access the changed table efficiently or properly. Because the data dictionary keeps track of programs (and their embedded query plans) that access SQL objects, the SQL executor can recognize an out-of-date query plan and can automatically recompile the SQL statement at run time. The SQL executor can retrieve the source statement directly from the object program file.

If you use certain compiler options, the SQL executor checks the existing query plan against the changed table. If the change does not affect the existing plan (that is, if the plan will execute properly), the SQL executor reuses it, saving the cost of the recompilation. These architectural features help to provide both high performance and high availability, two main objectives of NonStop SQL/MP.

The SQL Optimizer

When a query is compiled, the SQL optimizer, a component of the SQL compiler, generates an access plan for the query. The SQL optimizer is cost-based; it evaluates the query and determines the most efficient plan for retrieving data from the database. The optimizer estimates the number of I/O operations, CPU resources, and interprocess messages needed to execute each plan it considers for a query; it then selects the best plan.
A typical query can join tables, set conditions for selection in WHERE clauses, and include subqueries (SELECT statements nested within the main SELECT statement). A query may also ask to group or order data (which may involve sorting) or calculate an aggregate value such as a sum or average. These elements in a query can be executed in different ways and in different sequences that affect the execution performance. The size and configuration of the database also affect the efficiency of a given access plan. For example, the tables referenced in the query can have indexes and be partitioned. Figure 3-5 shows a simple example of the generation and selection of an access plan. In the example, the optimizer selects Plan C, which has the lowest execution cost.
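Cost-based selection can be pictured as estimating a cost for each candidate plan and keeping the cheapest. The sketch below is a conceptual Python illustration only: the Plan class, the cost and choose_plan functions, the equal weighting of the three estimates, and every number are assumptions made for the example, not the formula or values the NonStop SQL/MP optimizer uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate access plan with invented resource estimates."""

    name: str
    io_operations: int
    cpu_units: int
    messages: int


def cost(plan, io_weight=1.0, cpu_weight=1.0, message_weight=1.0):
    """Combine the estimates into one number (assumed weighted sum)."""
    return (
        plan.io_operations * io_weight
        + plan.cpu_units * cpu_weight
        + plan.messages * message_weight
    )


def choose_plan(plans):
    """Return the candidate with the lowest estimated cost."""
    return min(plans, key=cost)


# Hypothetical candidates, loosely named after the plans in Figure 3-5.
CANDIDATES = [
    Plan("A: read the table", io_operations=100, cpu_units=15, messages=5),
    Plan("B: read an index", io_operations=30, cpu_units=10, messages=5),
    Plan("C: parallel read", io_operations=12, cpu_units=10, messages=8),
]

best = choose_plan(CANDIDATES)
```

With these made-up estimates the parallel plan wins even though it sends more messages, because its I/O estimate is much lower.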

Figure 3-5. Selecting an Access Plan With the Lowest Execution Cost

[Figure: A query enters the SQL optimizer, which considers three plans. Plan A: read Table A (execution cost = 12). Plan B: read an index to access Table A (execution cost = 8). Plan C (parallel execution): read index partitions in parallel and access Table A partitions in parallel.]

When it chooses an access plan, the optimizer analyzes the relationships between the query elements and the database configuration. For example, the optimizer may choose to read an index if that will result in scanning fewer rows than reading the base table directly. Consider the following query:

    SELECT EMPNUM, LAST_NAME, FIRST_NAME, DEPTNUM
    FROM EMPLOYEE
    WHERE LAST_NAME LIKE "MACDONALD%" ;

For this plan, the optimizer uses the XEMPNAME index (discussed earlier in this section) because the condition in the WHERE clause refers to an index column, LAST_NAME, and narrows the selection to relatively few rows: only those containing employees named MacDonald. (The LIKE predicate compares character strings, and the percent sign is a wild-card symbol permitting any characters in that position.) If the XEMPNAME index were not used, NonStop SQL/MP would have to read the entire EMPLOYEE table to find all employees named MacDonald. Thus, the selectivity, or percentage of rows in a table that satisfy a search condition, is important in query optimization.

If a query selects from more than one table or a view derived from more than one table, the optimizer chooses a strategy for joining the tables. If a query contains subqueries, the optimizer selects the best order for executing the subqueries. If sorting the data speeds up access, the optimizer determines how to sort the data. If a table is partitioned across many disk volumes managed by different processors, the optimizer determines whether to partition work among the different processors.
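To make the benefit of the index concrete, the following Python sketch imitates a prefix lookup like the LIKE "MACDONALD%" query above against a sorted list standing in for the XEMPNAME index. The index contents, the `prefix_scan` helper, and the `selectivity` helper are hypothetical; the real index is a key-sequenced file, not a Python list.

```python
from bisect import bisect_left

# Hypothetical index entries (LAST_NAME, EMPNUM), kept in LAST_NAME order.
XEMPNAME = sorted([
    ("ADAMS", 1),
    ("MACDONALD", 7),
    ("MACDONALDSON", 12),
    ("MACDOUGAL", 3),
    ("SMITH", 9),
    ("ZIMMER", 4),
])


def prefix_scan(index, prefix):
    """Return the EMPNUM values whose LAST_NAME starts with prefix.

    Binary search finds the first candidate entry; the scan then stops at
    the first entry that no longer matches, so rows outside the matching
    range are never read.
    """
    position = bisect_left(index, (prefix,))
    matches = []
    while position < len(index) and index[position][0].startswith(prefix):
        matches.append(index[position][1])
        position += 1
    return matches


def selectivity(matching_rows, total_rows):
    """Fraction of rows in the table that satisfy the search condition."""
    return matching_rows / total_rows


found = prefix_scan(XEMPNAME, "MACDONALD")
print(found, selectivity(len(found), len(XEMPNAME)))
```

The lower the selectivity, the more rows the index lets the scan skip compared with reading the whole table.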

The optimizer has access to current statistics, maintained in the data dictionary, about the tables, indexes, and other SQL objects. If data descriptions change or indexes are added or deleted, the optimizer uses the current information to choose a query plan.

The SQL Executor

The SQL executor is a set of system library procedures that executes compiled SQL statements against database tables, views, or the database catalogs. To execute an SQL statement, the executor uses the query plan generated by the SQL optimizer.

The executor manages logical names for tables and other SQL objects. By using logical names, programmers can write embedded SQL statements without having to know about the physical characteristics of the database. Logical names make application programs easier to maintain and help keep them independent of the physical database configuration. When a statement is executed, the executor maps the logical names to the underlying physical file names.

Managing Low-Level Components

The executor manages the low-level components of the NonStop SQL/MP system. A complex query can consist of several individual requests, each of which contains a single set of variables and searches a single table. For each single-table request, the executor calls the SQL file system.

The file system, also a set of system library procedures, manages the physical schema of the database. It opens files, indexes, and partitions, and sends requests to the appropriate data access managers. The file system also manages updates to base tables and their associated indexes.

The data access manager retrieves and updates data on individual disk volumes. A separate group of data access manager processes handles each disk volume. The data access manager performs all the selection and simple evaluation of data that can be executed against a single disk volume. When possible, the data access manager also performs grouping and aggregation of data.

Figure 3-6 shows the components of NonStop SQL/MP that execute a query.

Figure 3-6. NonStop SQL/MP Components That Execute a Query

[Figure: An application server process contains the SQL executor and the SQL file system. The SQL executor joins data selected from the two tables. Below them, two data access managers evaluate and select data from their respective disk volumes: one volume containing Table A and one volume containing Table B.]

In Figure 3-6, a query requests a join of selected columns and rows from tables A and B. As the figure suggests, NonStop SQL/MP pushes most of the work of data access to the lowest level (the data access managers) and minimizes the amount of data that needs to be evaluated at higher levels.

The data access managers evaluate and select the data from the tables residing on their respective disk volumes. When the data access managers return data to the file system, the file system buffers these responses before sending them to the executor.

The executor manipulates all the single-table requests in the query. In the example in Figure 3-6, the executor implements the join of data retrieved from tables A and B.

If the data returned by the file system needs to be sorted, the executor performs the sort. The executor uses an in-memory process, the user process sort (UPS), if the returned data is less than or equal to 32,767 rows. For larger tasks, the executor calls the FastSort utility to perform the sort.

Finally, the executor returns the results of the query to the host-language variables in the application program or to the interface to the client program.
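The division of labor described above can be sketched in Python as follows. The sketch is a loose model, not the product's implementation: the function names, the sample tables, and the dictionary-based join are assumptions made for illustration (the manual does not say which join method the executor uses here). Only the 32,767-row threshold comes from the text.

```python
UPS_ROW_LIMIT = 32_767  # row threshold quoted in the text


def data_access_manager(rows, predicate):
    """Lowest level: select qualifying rows from one table on one volume."""
    return [row for row in rows if predicate(row)]


def executor_join(left_rows, right_rows, key):
    """Executor level: join the already-filtered rows of two tables.

    A dictionary lookup is used purely for brevity in this sketch.
    """
    by_key = {}
    for row in right_rows:
        by_key.setdefault(row[key], []).append(row)
    return [{**left, **right}
            for left in left_rows
            for right in by_key.get(left[key], [])]


def choose_sort(row_count):
    """Pick the sort facility from the number of rows to be sorted."""
    return "UPS" if row_count <= UPS_ROW_LIMIT else "FastSort"


# Hypothetical contents of tables A and B.
TABLE_A = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 3, "a": "z"}]
TABLE_B = [{"id": 2, "b": 20}, {"id": 3, "b": 30}, {"id": 4, "b": 40}]

selected_a = data_access_manager(TABLE_A, lambda r: r["id"] >= 2)
selected_b = data_access_manager(TABLE_B, lambda r: r["b"] < 40)
joined = sorted(executor_join(selected_a, selected_b, "id"), key=lambda r: r["id"])
print(joined, choose_sort(len(joined)))
```

Because each table is filtered before the join, the executor only ever sees rows that already passed the single-table conditions.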

Managing Transactions

If the SQL query is part of an update transaction, the executor invokes the NonStop TM/MP subsystem for transaction management services. You define a transaction in an application program by placing a BEGIN WORK statement before the SQL statements that will execute within the transaction. At run time, the SQL executor processes the BEGIN WORK statement, calling NonStop TM/MP to initiate the transaction. The SQL executor then executes the SQL statements, but the updates are not committed to the database until the entire transaction has been completed successfully.

When all the results are returned successfully to the application, the executor calls NonStop TM/MP to commit the transaction. (A COMMIT WORK statement following the SQL statements in the program indicates successful completion of the transaction.) If any part of the transaction does not finish successfully, NonStop TM/MP rolls back the entire transaction, leaving the database in a consistent state.

Parallel Execution

In most OLTP and electronic commerce applications, many small transactions execute concurrently. The Tandem multiprocessor architecture makes it possible for these transactions to execute in parallel. Different SQL queries are assigned to different processors, where copies of NonStop SQL/MP concurrently retrieve and update the data. This type of processing, in which multiple transactions are performed in parallel, is called inter-query parallelism.

An executing query can access any portion of the database, no matter where it is located. Tandem's message-based operating system allows a query executing in one processor to communicate with a data access manager in another processor in the same system or even another node in a geographically distributed network.

NonStop SQL/MP can also execute large queries, which scan large portions of the database, in parallel. These queries are typical of DSS applications, in which users might need to examine large subsets of the data to make complex decisions. To perform these large queries efficiently, NonStop SQL/MP carries out parallel processing within a query. Such intra-query parallelism fully exploits the resources provided by the Tandem multiprocessor, multidisk architecture and improves performance on large queries.

When you select the option to allow parallel execution for a query, NonStop SQL/MP can automatically perform intra-query parallelism by dividing a large query into smaller tasks and assigning the tasks to separate processors. When a database spans multiple disks, these separate SQL processes can access the disks in parallel, greatly improving the performance of a query. Parallel query execution can improve the response time of all the basic SQL operations, including selects, inserts, updates, deletes, joins, and aggregate functions.

The Master Executor and ESPs

To implement parallel execution of a query, NonStop SQL/MP uses a master executor process, which starts the query execution. The master executor invokes multiple executor server processes (ESPs), assigning one ESP to each partition that must be accessed. The ESPs process a statement or part of a statement in parallel, performing the data access against their respective partitions and returning data or status information to the master executor. The master executor then processes that information and returns the result to the user.

Suppose you want to know the total number of employees in your company. You can issue the following query:

    SELECT COUNT(*) FROM EMPLOYEE ;

The query uses an aggregate function called COUNT to count the number of rows in the EMPLOYEE table. (You can use other aggregate functions to compute the sum of values in a column or the average, minimum, or maximum value in a column.)

Suppose that the EMPLOYEE table has four partitions. Because the preceding query scans the entire table, it is a good candidate for parallel query execution. Figure 3-7 shows how a master executor assigns four ESPs to manage the execution of the COUNT operation against the four partitions of the EMPLOYEE table.

At run time, the master executor starts an ESP (or uses an existing ESP) in the current primary processor of each partition's disk volume. Each ESP directs the data access manager associated with its partition to count the rows in that partition and compute the partial aggregate result. The data access managers return the partial results to the ESPs, which pass them to the master executor. The master executor uses the SUM function to add up the partial COUNTs and returns the final result to the user.

Figure 3-7 shows a simple example of parallel query execution. ESPs can also perform more complex operations such as parallel joins.
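The COUNT example above follows a scatter-and-gather pattern that can be sketched in Python. Threads stand in for ESPs and plain lists stand in for partitions; both are simplifications, and the partition contents and function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical EMPLOYEE table split into four partitions of employee numbers.
EMPLOYEE_PARTITIONS = [
    [1, 2, 3],
    [101, 102],
    [201, 202, 203, 204],
    [301],
]


def count_partition(partition):
    """Work done for one partition: compute the partial COUNT."""
    return sum(1 for _ in partition)


def parallel_count(partitions):
    """Master-executor role: fan out one task per partition, then SUM the COUNTs."""
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_partition, partitions))
    return sum(partial_counts), partial_counts


total, partials = parallel_count(EMPLOYEE_PARTITIONS)
print(total, partials)
```

Only one small number per partition travels back to the coordinating step, which is why an aggregate over a partitioned table parallelizes so well.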

Figure 3-7. Parallel Execution of a Query Using an Aggregate Function

[Figure: An application server process contains the SQL master executor, which computes the final aggregate result as the SUM of COUNTs. Four ESPs (each an SQL executor with a file system) manage separate tasks. Beneath each ESP, a data access manager computes a partial aggregate result (COUNT) for its volume. Each volume contains one partition of the EMPLOYEE table.]

If a table is not partitioned or is partitioned in a way that does not facilitate parallel execution for this query, the optimizer can request the executor to repartition (reorganize) a copy of the data at run time. During repartitioning, NonStop SQL/MP distributes the data over a set of temporary partitions using a hash algorithm. A separate ESP then processes the data in each temporary partition.

NonStop SQL/MP can also repartition data in tables and indexes participating in a join. The master executor and ESPs use the repartitioned data to perform a parallel join operation.
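Hash repartitioning, as described above, can be pictured with the short Python sketch below. The manual does not say which hash function NonStop SQL/MP uses; CRC-32 is an arbitrary stand-in chosen because it gives the same result on every run, and the sample rows and the `repartition` helper are hypothetical.

```python
import zlib


def repartition(rows, key, partition_count):
    """Distribute rows over temporary partitions by hashing the key column.

    Rows with equal key values always land in the same partition, which is
    what lets each partition be joined or aggregated independently.
    """
    partitions = [[] for _ in range(partition_count)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        partitions[digest % partition_count].append(row)
    return partitions


# Hypothetical rows keyed by department number.
ROWS = [
    {"DEPTNUM": 1000, "EMPNUM": 1},
    {"DEPTNUM": 2000, "EMPNUM": 2},
    {"DEPTNUM": 1000, "EMPNUM": 3},
    {"DEPTNUM": 3000, "EMPNUM": 4},
    {"DEPTNUM": 2000, "EMPNUM": 5},
]

temporary_partitions = repartition(ROWS, "DEPTNUM", 3)
for number, partition in enumerate(temporary_partitions):
    print(number, partition)
```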

If you choose to use the parallel execution option, the optimizer examines both parallel and nonparallel execution plans and chooses the one that will use the fewest resources, even if it is a nonparallel plan. Partitioning a table and its indexes increases the likelihood that the optimizer will choose a parallel execution plan. The partitions might reside on one system or on many nodes in a network.

Parallel Index Maintenance

Indexes can improve the speed of data access. As you add indexes to a table, you increase the number of quick paths to the data. A table with many indexes allows you to write a variety of queries that meet requirements for high performance. However, each time a user inserts or updates a row in the base table, the system must also update the corresponding rows in all the indexes on the affected columns.

To achieve high performance for update operations, NonStop SQL/MP updates multiple indexes on a table in parallel. After completing a change in the base table, the file system sends asynchronous update requests to each data access manager serving an index. Thus, NonStop SQL/MP can modify multiple indexes in approximately the same elapsed time it would take to modify a single index.

To take advantage of parallel index maintenance, the DBA should define the indexes for a given table on separate disk volumes and configure them on separate processors. This strategy allows a separate data access manager to access each index, eliminating contention during parallel operations.

You do not have to specify a statement or directive to achieve parallel index maintenance. The system updates indexes automatically whenever a row is inserted or deleted or values change in indexed columns.

Other Architectural Features

NonStop SQL/MP provides additional architectural features that enhance query performance by allowing data-access operations to execute at the lowest levels of the operating system, as close as possible to the I/O subsystems that physically retrieve the data. The integration of SQL operations with low-level operating system processes (such as the data access managers) reduces the number of messages passed to higher-level SQL processes (such as the executor) and reduces the overall path length of the operations. Thus, low-level integration improves the efficiency and performance of data-access operations.

The Data Access Manager (DP2 Disk Process)

The SQL executor communicates with the Tandem data access manager (DP2 disk process), a component of the NonStop Kernel operating system, through the file system. A group of data access manager processes performs data access to a single disk volume.

The data access manager handles disk space, access paths, and a main-memory buffer pool of recently used blocks called the cache. It implements locking of rows, partitions, or tables on the disk volume. The data access manager also records database updates in the audit trail (log) used by NonStop TM/MP to roll back or recover transactions if a failure occurs.

When the file system sends an OPEN request, the data access manager authorizes the application process to access the table. The data access manager also enforces table constraints specified for columns of the table being updated.

Single-Table Query Evaluation

The data access manager performs single-table query evaluations (requests) against the data on its disk volume. Each evaluation is a simple selection expression consisting of ORs, ANDs, and other simple operators evaluated against literals and columns.

The data access manager filters and groups the data it retrieves from disk, returning only the qualified data to the file system. It can also perform aggregations. By returning only the data needed for the result, the data access manager reduces message traffic and improves query performance. Reducing message traffic is even more important when the executor requests data residing at a remote location and the returned data must travel across the network.

For example, suppose you want to know the locations and quantity on hand of a group of related parts in your inventory database. The related part numbers are between 2000 and 2010. Consider the following query:

    SELECT LOC, PARTNAME, QTY
    FROM INVENT
    WHERE PARTNUM BETWEEN 2000 AND 2010 ;

The INVENT table is partitioned across geographically distributed systems, as is the PARTLOC table shown in Figure 3-3. The INVENT table, however, contains 30 columns instead of 3.

Figure 3-8 shows how the data access manager executes its part of the preceding query, evaluating a large amount of data and efficiently returning a very small result.
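The effect of evaluating the INVENT query at the data access manager can be sketched in Python. The rows below are invented for illustration and show only five of the table's columns (BIN is a made-up column name); `evaluate_single_table` is a hypothetical helper, not a product interface.

```python
def evaluate_single_table(rows, predicate, columns):
    """Apply restriction (the WHERE condition) and projection (the select
    list) next to the data, so that only qualified values are returned."""
    return [{column: row[column] for column in columns}
            for row in rows if predicate(row)]


# Hypothetical rows from one partition of the INVENT table.
INVENT_PARTITION = [
    {"LOC": "G10", "PARTNUM": 1995, "QTY": 12, "PARTNAME": "BOLT", "BIN": "A1"},
    {"LOC": "G20", "PARTNUM": 2001, "QTY": 40, "PARTNAME": "COPPER", "BIN": "B4"},
    {"LOC": "G30", "PARTNUM": 2007, "QTY": 7, "PARTNAME": "STEEL", "BIN": "C2"},
    {"LOC": "G40", "PARTNUM": 2500, "QTY": 90, "PARTNAME": "STEEL", "BIN": "D9"},
]

result = evaluate_single_table(
    INVENT_PARTITION,
    predicate=lambda row: 2000 <= row["PARTNUM"] <= 2010,
    columns=("LOC", "PARTNAME", "QTY"),
)

values_stored = sum(len(row) for row in INVENT_PARTITION)
values_returned = sum(len(row) for row in result)
print(result, values_stored, values_returned)
```

With a 30-column table and a three-column select list, the gap between what is stored and what is returned is correspondingly larger than in this small sample.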

Figure 3-8. Single-Table Query Evaluation Performed by the Data Access Manager

[Figure: The data access manager evaluates Partition 2 of the INVENT table (columns shown: LOC, PARTNUM, QTY, PARTNAME) and returns only the qualified data (columns LOC, QTY, PARTNAME).]

In Figure 3-8, the partial result retrieved from Partition 2 is sent across the network, which enhances the value of performing query evaluation before returning data to the requesting SQL executor. Moreover, in this example, several data access managers can operate in parallel against the partitions of the INVENT table.

Set-Oriented Update Operations

For a query that updates or deletes data, the data access manager executes the request and returns an acknowledgment to the file system. This strategy also reduces message traffic, especially when a set of rows is modified or deleted. Instead of returning messages and data to the file system for all the rows to be modified, the data access manager updates the file and sends a single message when it has completed the operation.

Mixed Workload Environment

The data access manager prioritizes requests to make sure that a long-running query does not monopolize access to the disk. This feature provides for a mixed workload environment, which allows a NonStop SQL/MP database to support different applications that concurrently issue different types of queries.

In a DSS environment, for example, users can issue ad hoc queries that differ greatly in size and duration, causing the DSS workload to fluctuate unpredictably. With the mixed workload feature, the DBA or system manager can establish different priorities for different queries. For example, the DBA can give a lower priority to long-running queries so that queries of shorter duration can execute quickly, thereby helping to balance resource utilization.

In an OLTP environment, user queries should execute quickly to meet users' expectations of low response times. A batch query that updates an OLTP table or generates a report would receive a lower priority than the OLTP queries. Essentially, the batch query runs in the background and does not interfere with the performance of the OLTP application.

The mixed workload feature is especially helpful when NonStop SQL/MP uses parallel query execution for a large query. Without the mixed workload feature, data access managers in many processors could become tied up as they scan all the partitions of a table in parallel. Instead, each data access manager checks its queue at intervals while it is executing a long-running request. If a high-priority request is pending, the data access manager preempts processing the long-running request and serves the high-priority request. Moreover, the priority of a long-running query is decreased to give another query of equal priority a chance to execute. In this way, long-running queries are serviced only if the concurrently running OLTP or high-priority DSS workload does not need processor and disk resources.

The mixed workload feature also ensures that a low-priority query will eventually resume executing after it has been preempted by a number of high-priority requests. A cyclical scheduling algorithm allows the low-priority query to execute a short time before it is once again preempted.
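The queue-checking behavior described above can be approximated with a small Python simulation. It is a deliberately simplified model under stated assumptions: time advances in fixed ticks, each tick does one unit of work for the highest-priority pending request, and requests of equal priority take turns. The request names, priorities, and work amounts are hypothetical, and the sketch does not reproduce the priority-decay or cyclical scheduling rules of the actual data access manager.

```python
import heapq
from itertools import count


def simulate(arrivals):
    """Run requests given as (arrival_tick, name, priority, work_units).

    Returns (trace, finished): which request ran at each tick, and the
    order in which requests completed.
    """
    pending = sorted(arrivals)          # ordered by arrival tick
    ready = []                          # heap of (-priority, sequence, name, work)
    sequence = count()
    trace, finished = [], []
    tick = 0
    while pending or ready:
        # Check the queue for newly arrived requests before each unit of work.
        while pending and pending[0][0] <= tick:
            _, name, priority, work = pending.pop(0)
            heapq.heappush(ready, (-priority, next(sequence), name, work))
        if ready:
            neg_priority, _, name, work = heapq.heappop(ready)
            trace.append(name)
            work -= 1
            if work:
                # Re-queue behind any request of the same priority.
                heapq.heappush(ready, (neg_priority, next(sequence), name, work))
            else:
                finished.append(name)
        tick += 1
    return trace, finished


trace, finished = simulate([
    (0, "batch-scan", 1, 5),   # long-running, low priority
    (2, "oltp-1", 9, 1),       # short, high priority
    (3, "oltp-2", 9, 1),
])
print(trace)
print(finished)
```

In this run the batch scan is interrupted when each OLTP request arrives and resumes once the queue holds nothing more urgent.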
Sequential Block Buffering

The most efficient way to satisfy some large DSS queries and batch reports and updates is to scan the table sequentially. For sequential access, the data access manager can use an I/O method called sequential block buffering, which returns data to the file system one block at a time instead of one row at a time. (A block can be 512, 1024, 2048, or 4096 bytes long.)

If a query accesses most rows in a table and most columns in each row, sequential block buffering reduces the number of messages it takes to return the qualifying data to the file system. Consider the following query:

    SELECT * FROM EMPLOYEE FOR BROWSE ACCESS ;

If each row in the EMPLOYEE table is 96 bytes and blocks are 1024 bytes, the data access manager can return 10 rows at a time, reducing the number of messages by approximately 90 percent.

This method is also called real sequential block buffering (RSBB) because the data access manager passes the actual block of rows to the file system without first manipulating the data. The file system then performs any projection and restriction required by the query.

If a query requires extensive projection and restriction (if only a few rows and columns need to be retrieved), the data access manager can perform the single-table query evaluation. However, instead of returning a row at a time, the data access manager can build a virtual block containing only the qualified rows and columns. It then returns the virtual block to the file system. This method, called virtual sequential block buffering (VSBB), reduces both message traffic and the amount of data passed to the file system.

The optimizer determines which I/O method to use when it builds a query plan. The optimizer might choose VSBB for the following query:

    SELECT FIRST_NAME, LAST_NAME, SALARY
    FROM EMPLOYEE
    WHERE EMPNUM <= 1000 AND SALARY > ;

Figure 3-9 shows a sample block returned by the data access manager when it uses VSBB to execute the preceding query.

Figure 3-9. Virtual Sequential Block Buffering (VSBB) Access

[Figure: The file system sends a "Get Next Row Where EMPNUM <= 1000 AND SALARY > " request to the data access manager, which returns one block of rows with projection and restriction already applied.]
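The arithmetic behind the RSBB example (96-byte rows, 1024-byte blocks, 10 rows per message, roughly 90 percent fewer messages) can be checked with a few lines of Python. The calculation assumes that a block holds only whole rows and ignores any per-block overhead, which the text does not describe; the 10,000-row table size is a hypothetical figure used only to make the comparison concrete.

```python
import math


def rows_per_block(block_bytes, row_bytes):
    """Whole rows that fit in one block (block overhead is ignored)."""
    return block_bytes // row_bytes


def messages_needed(row_count, rows_per_message):
    """Messages required to return row_count rows."""
    return math.ceil(row_count / rows_per_message)


ROW_BYTES = 96        # row size from the text
BLOCK_BYTES = 1024    # block size from the text
TABLE_ROWS = 10_000   # hypothetical table size

per_block = rows_per_block(BLOCK_BYTES, ROW_BYTES)
row_at_a_time = messages_needed(TABLE_ROWS, 1)
block_at_a_time = messages_needed(TABLE_ROWS, per_block)
reduction = 1 - block_at_a_time / row_at_a_time

print(per_block, row_at_a_time, block_at_a_time, round(reduction * 100))
```

The same functions show why a virtual block helps further: when projection shrinks each returned row, more rows fit in a block, and restriction lowers the number of rows that need to be returned at all.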

VSBB can be as much as three times more efficient than RSBB. VSBB is especially important when data is geographically distributed across long-distance communication lines.

The optimizer can also choose VSBB for insert and update operations. VSBB significantly improves the performance of these operations as well.

Cache Optimizations for Sequential Access

If the optimizer has chosen sequential access for a query, the data access manager can read blocks of data into cache (memory) using a minimal number of I/O operations. The data access manager performs bulk I/O operations, transferring up to 56 kilobytes of data per I/O.

In addition, the data access manager can asynchronously prefetch data into cache (that is, transfer a new block of data into cache immediately after completing the previous transfer). Because the data access manager can anticipate the next (sequential) block of rows to be requested, it can perform the I/O operation while the application process is still processing the previous block. When the application process requests the next block, it is already in cache.

Similarly, when rows are updated sequentially, the data access manager can minimize I/O operations by asynchronously writing blocks to disk and by using large bulk transfers of data. Thus, cache optimizations considerably enhance the performance of both scanning and sequential updating of tables or key ranges.

Summary

Users can configure the physical components of a NonStop SQL/MP database (indexes, table organizations, and table partitions) to maximize the performance of data access operations. When an SQL query is compiled, the SQL optimizer automatically determines the best execution plan for the query. At run time, the SQL executor uses low-level system components such as the data access manager to retrieve and update data, which further improves query performance. NonStop SQL/MP also provides parallel query execution, support for mixed workloads, virtual sequential block buffering, and other architectural features designed for high performance and high availability.


Index

A
Access plan 3-11
Active data dictionary 1-16
Aggregate function
  example 3-16
  hash algorithm 1-17
ANSI-standard SQL and NonStop SQL/MP 1-14
Application
  decision support systems (DSS) 1-3
  distributed 1-12
  electronic commerce 1-4
  online transaction processing (OLTP) 1-2
Architecture
  hardware 1-8
  NonStop SQL/MP 1-6
Automatic recompilation 1-11, 3-11
Availability
  features in NonStop Kernel 1-11
  features in NonStop SQL/MP 1-11
  system features 1-11
AVERAGE function 1-17

B
BEGIN WORK statement 3-15
Bulk I/O operations 3-23
Business use of NonStop SQL/MP 1-2

C
C language 1-14
Cache
  description 1-7
  optimizations 3-23
Catalog 3-1, 3-2
Client tools, PC 1-14
Clustering key 3-3
Collation 1-16
Column 1-1
COMMIT WORK statement 3-15
Compiler 3-10
Concurrent updates 2-8
Constraints

COUNT function 3-16
Create Index operation 1-12
Cross product 1-17

D
Data
  bulk I/O operations 3-23
  changing 2-8
  integrity 1-15
  modification 2-8
  prefetching into cache 3-23
  selection
    description 2-2
    from a table 2-1
    from a view 2-5
  sorting 3-14
Data access manager (DP2) 3-13, 3-18
Data Definition Language (DDL) 1-14
Data dictionary
  active 1-16
  distributed 1-13
  SQL catalog and 3-1
Data Manipulation Language (DML) 1-14
Database
  distributed 1-12
  relational 1-1
Database administration 1-17
Database configuration operations 1-12
Database structure 3-1
Decision support systems (DSS) 1-3, 1-17
DELETE statement 2-8
Desktop software 1-14
Disaster protection 1-11
Disk
  cache 1-7
  mirrored 1-11
Disk process 3-13, 3-18
Distributed application 1-12
Distributed data dictionary 1-13
Distributed database 1-12

DP
DSS 1-3, 1-17
Dynamic SQL 3-9

E
Electronic commerce applications 1-4
Entry-sequenced
  file 3-4
  table structure 3-4
ESP 3-16
Execution of a query 3-13
Execution plan 3-11
Executor server process (ESP) 3-16

F
FastSort utility 1-18, 3-14
File
  description of 3-1
  entry-sequenced 3-4
  key-sequenced 3-3
  label 3-2
  online reorganization of 1-12
  relation to table or view 3-1
  relative 3-4
File system 3-13
First key example 3-8
Function
  aggregate 3-16
  AVERAGE 1-17
  COUNT 3-16
  MAXIMUM 1-17
  MINIMUM 1-17
  SUM 1-17, 3-16

H
Hardware architecture 1-8
Hash aggregate 1-17
Hash join

I
I/O operations
  bulk 3-23
  cache optimizations 3-23
  prefetching data into cache 3-23
Index
  creating 3-6
  description of 3-4
  example 3-6
  parallel maintenance of 3-18
  partition 3-7
Index-only access 3-5
INSERT statement 2-8
Integration with Tandem software 1-9
Interprocessor bus (IPB) 1-8
ISO-standard SQL and NonStop SQL/MP 1-14

J
Join
  description of 2-4
  parallel execution 3-17

K
Key
  clustering 3-3
  column 3-3
  primary 3-3
Key-sequenced
  file 3-3
  table structure 3-3

L
Local autonomy feature 1-13
Locking 1-15, 2-8
Logical database structure 3-1
Logical names for SQL objects 3-13
Logical schema 3-1

M
Master executor 3-16
MAXIMUM function 1-17
Measure performance measurement tool 1-10
MINIMUM function 1-17
Mirrored disks 1-11
Mixed workload environment 1-16, 3-20
Modification of data 2-8
Move Partition operation 1-12

N
National languages 1-16
NonStop Kernel
  availability 1-11
  integration with NonStop SQL/MP 1-9
NonStop ODBC Server 1-14
NonStop TM/MP
  distributed database support 1-13
  integration with NonStop SQL/MP 1-10
  invoking 3-15
  SQL executor and 3-15
  transaction management 2-9
NonStop TS/MP 1-9

O
Object program file 3-10
ODBC 1-14
OLTP 1-2
Online file reorganization 1-12
Online transaction processing (OLTP) 1-2
Open standards 1-14
Open system services (OSS) 1-14
Optimization of a query plan 3-11
Optimizer 1-15, 3-10, 3-11
Organization of a table 3-3
OSS

P
Parallel execution
  description 3-15
  inter-query 3-15
  intra-query 3-15
  partitioning 1-10
  repartitioning 3-17
Parallel hardware architecture 1-8
Parallel index maintenance 3-18
Parallel join operation 3-17
Parallel query execution 1-10, 3-15
Partition
  example 3-8
  index 3-7
  online database configuration of 3-9
  skipping unavailable partition 1-12
  splitting and moving operations 1-12
  table 3-7, 3-16
Partitioning data 1-10
PC client tools 1-14
Performance
  features for DSS 1-17
  low-level integration 1-6
Physical database
  configuration 1-12
  structure 3-1
POSIX 1-14
Prefetching data into cache 3-23
Primary key 3-3
Process pairs 1-11
Protection view
  description of 2-7
  example 2-7
  shorthand view compared to 2-8

Q
Query
  execution 3-13
  in a distributed database 1-13
  optimization 1-15, 3-11
  overview 2-1
  parallel execution 1-10, 3-15
  plan 3-11
  saved as view 2-5
  single-table evaluation 3-19
  skipping unavailable partition 1-12
  sources of 2-1
  submission 3-9
  writing 2-1

R
RDBMS 1-1
Real sequential block buffering (RSBB) 3-21
Recompilation, automatic 1-11, 3-11
Relational database 1-1
Relative
  file 3-4
  table structure 3-4
Remote Duplicate Database Facility (RDF) 1-11
Repartitioning data 3-17
Row 1-1

S
Scalability 1-8
SELECT statement 2-2
Selecting data
  description 2-2
  from a table 2-1
  from a view 2-5
Selectivity 3-12
Sequential access, cache optimizations 3-23
Sequential block buffering 3-21
Set-oriented update operations

Shorthand view
  description of 2-6
  protection view compared to 2-8
Single-table query evaluation 3-19
Single-table request 3-13, 3-19
Skipping unavailable partition 1-12
Software, desktop 1-14
Sorting data 3-14
Sources of queries 2-1
Split Partition operation 1-12
SQL catalog 3-1, 3-2
SQL compiler 3-10
SQL executor 3-13
  automatic recompilation 3-11
  description of 3-13
SQL file system 3-1, 3-13
SQL objects, logical names 3-13
SQL optimizer 1-15, 3-10, 3-11
SQL query plan 3-10, 3-11
SQL source statements, compiling 3-10
SQLCI queries 2-1
Statements
  compiling 3-10
  DELETE 2-8
  INSERT 2-8
  SELECT 2-2
  UPDATE 2-8
Static SQL 3-9
Submitting an SQL query 3-9
SUM function 1-17, 3-16
Support for national languages 1-16

T
Table
  entry-sequenced 3-4
  joining 2-4
  key-sequenced 3-3
  logical names 3-13
  organization 3-3
  partition 3-7, 3-8
  physical structure 3-3

Table (continued)
  relation to files 3-1
  relational 1-1
  relative 3-4
  single-table request 3-19
TorusNet 1-8
Transaction 3-15
Transaction management 2-9

U
UNION operator 2-5
Update operations, set-oriented 3-20
UPDATE statement 2-8
Usability 1-14
User process sort (UPS) 3-14

V
View
  protection
    description of 2-7
    example 2-7
    shorthand view compared to 2-8
  relation to files 3-1
  shorthand
    description of 2-6
    protection view compared to 2-8
  using 2-5
Virtual sequential block buffering (VSBB) 3-21, 3-22

W
Workstation client tools