Department of Computer Science and Engineering 2013/2014
Database Administration and Tuning
Lab 8
2nd semester

In this lab class we will cover the following topics:

1. Query Tuning
   1. Rules of thumb for query tuning
2. Index Tuning
   1. Using the Database Engine Tuning Advisor
3. Experiments and Exercises
   1. A Practical Exercise Using the Database Engine Tuning Advisor
   2. Implementing the Database Engine Tuning Advisor recommendations
   3. Exercise

1. Query Tuning

SQL Server uses cost-based optimization, i.e. it tries to find the execution plan with the lowest possible cost, where cost means both the time the query will take to execute and the hardware resources that will be used. Basically, the query optimizer is looking to minimize the number of logical reads required to fetch the required data. The bad news is that it is not magic, and the optimizer does not always come up with the best solution. A database administrator should be aware of the factors that govern query optimization, what pitfalls there are, and how the query optimizer can be assisted in its job. Database administrators who know their data well can often influence the optimizer to choose certain indexes, in order to come up with the most efficient solution.

1.1. Rules of Thumb for Query Tuning

There are some very basic guidelines for writing efficient SQL code. These guidelines largely constitute nothing more than writing queries in the proper way. In fact, it might be quite surprising to learn that, as you work with a relational database, one of the most common causes of performance problems is poorly coded queries. We will now discuss in general terms what, in SQL statements, is good for performance, and what is not.

- Make careful use of the HAVING clause. HAVING is intended to filter records from the result of a GROUP BY, and a common mistake is using it to filter records that can be more efficiently filtered using a WHERE clause.
- Make careful use of the DISTINCT clause. DISTINCT can cause an additional sort operation, and it should be avoided except when strictly necessary.
IST/DEI Pág. 1 de 10
- Make careful use of functions. They simply should not be used where you expect an SQL statement to use an index. When using a function in a query, do not execute it against a table field if possible. Instead, apply it to the search value, for example:

  SELECT * FROM customer WHERE zip = TO_NUMBER('94002')

  (TO_NUMBER is Oracle syntax; the SQL Server equivalent is CAST('94002' AS INT).) Data type conversions are often a problem and will likely conflict with existing indexes (e.g., an index is created over a VARCHAR attribute named datestr, but a query accesses the attribute as if it were a date, for instance through a conversion function such as CONVERT(DATETIME, datestr)). Unless implicit data type conversion occurs, e.g. between number and string, indexes will likely be ignored.
- Make careful use of the ORDER BY clause. Avoid sorting the results when that is not strictly necessary.
- Use UNION ALL instead of UNION. When you use the UNION clause to concatenate the results from two or more SELECT statements, duplicate records are removed. This duplicate removal requires additional computing. If you are not concerned that your results may include duplicate records, use the UNION ALL clause, which concatenates the full results from the SELECT statements.
- Avoid anti-comparisons. Avoid operators such as != or NOT: because they look for what is not in a table, the entire table must be read regardless.
- Use IN to test against literal values and EXISTS to create a correlation between a calling query and a subquery. IN will cause a subquery to be executed in its entirety before passing the result to the calling query. EXISTS will stop once a result is found.
- Avoid using the OR or the IN operators. Notice, for instance, that SELECT * FROM employees WHERE state IN ('CA', 'IL', 'KS') is the same as SELECT * FROM employees WHERE state = 'CA' OR state = 'IL' OR state = 'KS', and both are costly queries to execute.
The query optimizer will always perform a table scan (or a clustered index scan on an indexed table) if the WHERE clause in the query contains an OR operator, and if any of the referenced columns in the OR clause does not have an index with the column as the search key. If you use many queries containing OR clauses or IN operators, you will want to ensure that each referenced column has an index. A query with one or more OR clauses, or using the IN operator, can sometimes be rewritten as a series of queries that are combined with a UNION statement, in order to boost the performance.

For optimizing joins, the following rules of thumb apply:
- Use equality first, and only use range operators where equality does not apply.
- Avoid the use of negatives in the form of != or NOT.
- Avoid LIKE pattern matching.
- In the relations being joined, try to retrieve specific rows, and in small numbers, so that only a small number of rows is actually involved in the join operation(s).
- Filter large tables before applying a join operation, to reduce the number of rows that are joined. Also access tables from the most highly filtered, preferably the largest, downward. Notice that this is important to reduce the number of rows involved in the join operation(s).
- Use indexes wherever possible except for very small tables.

Regarding joins, nested sub-queries can be difficult to tune but can often be a viable tool, and sometimes highly effective, for tuning mutable complex joins, with three and sometimes many more tables in a single query. The term mutable complex joins refers to join queries involving more than two tables, that are mutable in the sense that different join orders can be considered, and that are complex in the sense that they involve other selections on the tables that are being joined. Using nested sub-queries, it might be easier to tune such queries, because one can tune each sub-query independently.

2. Index Tuning

In terms of tuning, the option that produces maximum gains with least impact on existing systems and processes is to examine your indexing strategy. However, the task of identifying the right indexes is not necessarily straightforward. It requires a sound knowledge of the sort of queries that will be run against the data, the distribution of that data, and the volume of data, as well as an understanding of what type of index will best suit your needs.
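Whether an index helps depends on how the query is written as well as on which indexes exist. The following is a minimal sketch of that interaction, tying the indexing strategy back to the rule of Section 1.1 about applying functions to the search value rather than to the column. It rests on several assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (whose optimizer and plan output differ), the customer table follows the example of Section 1.1, and the data, the index name idx_customer_zip and the helper plan are hypothetical.

```python
import sqlite3


def plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Hypothetical data: 5000 customers with zip codes 90000..94999.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, zip INTEGER)")
conn.executemany(
    "INSERT INTO customer (name, zip) VALUES (?, ?)",
    [(f"customer{i}", 90000 + i) for i in range(5000)],
)
conn.execute("CREATE INDEX idx_customer_zip ON customer (zip)")

# Conversion applied to the search value: the indexed column is compared as-is.
on_value = "SELECT id FROM customer WHERE zip = CAST('94002' AS INTEGER)"
# Conversion applied to the column: every row is converted before comparing.
on_column = "SELECT id FROM customer WHERE CAST(zip AS TEXT) = '94002'"

for label, sql in [("function on value ", on_value), ("function on column", on_column)]:
    print(label, conn.execute(sql).fetchall(), plan(conn, sql))
```

Both statements return the same row, but in this sketch only the first one can be answered with an index search; the second one has to scan. In SQL Server the same comparison would be made by inspecting the execution plans of the two statements.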
Consider the following query:

SELECT A, COUNT(*) FROM T WHERE X < 10 GROUP BY A;

The following different physical design structures can reduce the execution cost of this query: (i) A clustered index on X; (ii) Table range partitioned on X; (iii) A non-clustered index with key X and including the additional attribute A; (iv) A materialized view that matches the query, and so on. These alternatives can have widely varying storage and update characteristics. Thus, in the presence of storage constraints, or for a workload containing updates, making a global choice for a workload is difficult. For example, a clustered index on a table and horizontal partitioning of a table are both non-redundant structures (i.e., they incur negligible additional storage overhead) whereas non-clustered indexes and materialized views can be potentially storage intensive and involve higher update costs. However, non-clustered indexes and materialized views can often be much more beneficial than a clustered index or a horizontally partitioned table. Clearly, a physical design tool that can give an integrated physical design recommendation can greatly reduce/eliminate the need for a DBA to make ad-hoc decisions.

While understanding the basics is still essential, SQL Server does offer a helping hand in the form of some tools, in particular the Database Engine Tuning Advisor, that can help to determine, tune and monitor your indexes. It can be used to get answers to the following questions:

- Which indexes are needed for specific queries?
- How to monitor index usage and its effectiveness?
- How to identify redundant indexes that could negatively impact performance?
- As the workload changes, how to identify missing indexes that could enhance performance for the new queries?

2.1. Using the Database Engine Tuning Advisor

Determining exactly the right indexes for your system can be quite a taxing process. For example, you have to consider:

- Which columns should be indexed, based on the knowledge of how the data is queried.
- Whether to choose a single-column index or a multiple-column index.
- Whether to use a clustered index or a non-clustered index.
- Whether one could benefit from an index with included columns.
- How to utilize indexed (i.e., materialized) views.

Moreover, once you have determined the perfect set of indexes, your job is not finished.
Your workload will change over time (i.e., new queries will be added, and older ones removed) and this might warrant revisiting existing indexes, analyzing their usage and making adjustments (i.e., modifying or dropping existing indexes and creating new ones). Maintenance of indexes is critical to ensure optimal performance in the long run.

The Database Engine Tuning Advisor (DTA) is a physical design tool providing an integrated console where DBAs can tune all physical design features supported by the server. The DTA takes into account all aspects of performance that the query optimizer can model, including the impact of multiple processors, amount of memory on the server, and so on. It is important to note, however, that query optimizers typically do not model all the aspects of query execution (e.g., impact of indexes on locking behavior, impact of data layout etc.). Thus, DTA's estimated improvement can be different from the actual improvement in execution time.

Taking as input a workload to fine-tune, i.e., a set of SQL statements that execute against the database server, the DTA produces a set of physical design recommendations, consisting of indexes, materialized views, and strategies for horizontal range partitioning of tables, indexes and views. The basis of DTA's recommendations is a what-if analysis provided by the SQL Server query optimizer, which allows the computation of an estimated cost as if a given configuration (e.g., the existence of some indexes) was materialized in the database. Similarly to the actual evaluation of a given query plan, the query optimizer component can do an evaluation considering the what-if existence of a given physical design structure.

You can tune a single query or the entire workload to which your server is subjected. A workload can be obtained, for instance, by using SQL Server Profiler, i.e., a tool for logging events (e.g., queries) that execute on a server. In this case, the workload would be given to the DTA in the form of a trace file, obtained with the SQL Server Profiler. The Profiler tool is just used to collect the workload, whereas the DTA performs the actual analysis and makes the tuning suggestions. Alternatively, a workload can be specified as an SQL file containing an organization or industry benchmark. In this case, a text file with the SQL for each query in the workload would be given to the DTA. The DTA can also take as input workloads referring to either a single database or a set of databases, as many applications use more than one database simultaneously.
Based on the options that you select, you can use the DTA to make recommendations for several Physical Design Structures (PDS), including:

- Clustered indexes
- Non-clustered indexes
- Indexes with included columns (to avoid bookmark lookups)
- Indexed views
- Partitions

The first step is to collect a workload for DTA to analyze. You can do this in one of two ways:
Using the Management Studio
If you need to optimize the performance of a single query, you can use Management Studio to provide an input directly to DTA. Type the query in Management Studio, highlight it and then right click on it to choose Analyze in Database Engine Tuning Advisor.

Using the Profiler
If you want to determine the optimum index set for the entire workload, corresponding to the actual queries that are being executed against an SQL Server instance, you should collect a profiler trace with the TUNING template (i.e., one of the possible options for the trace file that is generated by the SQL Server Profiler, and that contains all the information that is required by the Database Engine Tuning Advisor).

To fully exploit the effectiveness of DTA, you should always use a representative profiler trace. For instance, the indexes and partitioning considered by the DTA are limited only to interesting column groups (i.e., those columns that appear in a large fraction of the queries in the workload that have the highest cost), in order to improve scalability with little impact on quality. If the profiler trace is not representative of a true workload, important queries will likely be missing. You should make sure that you subject your server to all the queries that will typically be run against the data, while you are collecting the trace. This could lead to a huge trace file, but that is normal. If you simply collect a profiler trace over a 5-10 minute period, you can be pretty sure it will not be truly representative of all the queries executed against your database. In the SQL Profiler, the TUNING template captures only minimal events, so there should not be any significant performance impact on your server. A technique for workload compression is also employed, partitioning workloads on the basis of a signature of each query (i.e., two queries have the same signature if they are identical in all aspects except for the constants referenced in the query).

3. Experiments and Exercises

3.1. A Practical Exercise Using the Database Engine Tuning Advisor

As materials for this class, we have provided a workload (Queries4Workload.sql) against the AdventureWorks2012 database. We recommend that you use this workload, to get a hands-on perspective of the DTA. As an alternative to providing an SQL script with the workload, you can use the SQL Profiler tool to gather a system trace, and provide this trace as input to the DTA.
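The signature-based workload compression mentioned at the end of Section 2.1 can be illustrated with a small sketch. This is not the algorithm used by the DTA, whose internals are not described here: it is a rough approximation that assumes constants can be recognized with regular expressions on the query text, whereas a real tool would work on the parsed query. The function names signature and compress, and the sample statements, are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: constants are quoted strings or plain numbers in the query text.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_NUMERIC_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")
_WHITESPACE = re.compile(r"\s+")


def signature(sql):
    """Replace constants by '?' and normalize spacing, case and trailing ';'."""
    text = _STRING_LITERAL.sub("?", sql)
    text = _NUMERIC_LITERAL.sub("?", text)
    text = _WHITESPACE.sub(" ", text).strip().rstrip(";").strip()
    return text.lower()


def compress(workload):
    """Partition a list of statements into groups sharing the same signature."""
    groups = defaultdict(list)
    for statement in workload:
        groups[signature(statement)].append(statement)
    return dict(groups)


workload = [
    "SELECT * FROM employees WHERE state = 'CA'",
    "SELECT * FROM employees WHERE state = 'IL'",
    "select *  from employees where state = 'KS';",
    "SELECT A, COUNT(*) FROM T WHERE X < 10 GROUP BY A",
    "SELECT A, COUNT(*) FROM T WHERE X < 25 GROUP BY A",
]

for sig, statements in compress(workload).items():
    print(len(statements), "x", sig)
```

Here the five statements collapse into two groups, so a tuning tool would only need to analyze one representative per group, weighted by the group size.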
Assuming that the given workload is representative of the queries that would be executed against the database, you can use it as an input to the DTA, which will then generate recommendations. You can perform one of the following two types of analysis.

A. Keep my existing Physical Design Structures and tell me what else I am missing

This type of analysis is common and is useful if you have previously established the set of indexes that you deem to be most useful for your given workload, and are seeking further recommendations. To conduct this analysis:

1. Initiate a new session in the DTA, by launching the corresponding tool from the Windows menu with the SQL Server Performance Tools, or by selecting this tool from the Tools menu within the SQL Server Management Studio.
2. Choose the provided workload as the input to this session.
3. In the Select databases and tables to tune section, select AdventureWorks2012.
4. In the Database for workload analysis dropdown, use AdventureWorks2012.
5. At the Tuning Options tab, select the following options:
   (1) Physical Design Structures to use in database -> Indexes and Indexed views
   (2) Physical Design Structures to keep in database -> Keep all existing PDS
6. Uncheck the checkbox for limit tuning time.
7. Hit START ANALYSIS. The DTA will start consuming your workload.

Once DTA finishes consuming the workload, it will list its recommendations under the recommendations tab.

B. Ignore my existing Physical Design Structures and tell me what query optimizer needs

In the previous scenario, DTA makes recommendations for any missing indexes. However, this does not necessarily mean your existing indexes are optimal for the query optimizer. You may also consider conducting an analysis whereby DTA ignores all existing physical design structures and recommends what it deems the best possible set for the given workload. This way, you can validate your assumptions about what indexes are required. To conduct this analysis, follow steps 1 to 6 of the previous scenario, except that at step 5(2), choose Do not keep any existing PDS.
Contrary to how this might sound, the DTA will not actually drop or delete any existing physical design structures. This is the biggest advantage of using DTA, as it means you can use the tool to perform what-if analysis without actually introducing any changes to the underlying schema.
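The idea of evaluating a candidate structure without changing the schema can be imitated, very crudely, as follows. This sketch does not reproduce the DTA's mechanism: the DTA asks the optimizer to cost hypothetical structures that are never built, while the code below actually builds the candidate index inside a transaction and then rolls it back. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server; the table T and the query come from Section 2, the data and the index name candidate_x_a are hypothetical, and the composite key (X, A) only approximates a non-clustered index on X that includes A, since SQLite has no included columns.

```python
import sqlite3


def access_path(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# isolation_level=None: transactions are controlled explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE T (id INTEGER PRIMARY KEY, A INTEGER, X INTEGER)")
conn.executemany(
    "INSERT INTO T (A, X) VALUES (?, ?)",
    [(i % 7, i % 100) for i in range(10000)],
)

query = "SELECT A, COUNT(*) FROM T WHERE X < 10 GROUP BY A"

before = access_path(conn, query)

conn.execute("BEGIN")
conn.execute("CREATE INDEX candidate_x_a ON T (X, A)")  # candidate structure
during = access_path(conn, query)                       # plan with the candidate
conn.execute("ROLLBACK")                                # schema is left untouched

after = access_path(conn, query)

print("before:", before)
print("during:", during)
print("after: ", after)
```

The plan obtained while the candidate index exists shows whether the optimizer would use it, and after the rollback the database is exactly as it was. Unlike the DTA's what-if analysis, this approach pays the full cost of building the index, so it is only practical on small test databases.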
After consuming the workload, DTA presents, under the recommendations tab, a set of tuning recommendations. A good idea is to focus on the following sections:

- Recommendation - this is the action that you need to take. Possible values include Create or Drop.
- Target of Recommendation - this is the proposed name of the physical design structure to be created. The naming convention is typical of DTA and generally starts with _dta*. However, it is recommended that you change this name based on the naming convention in your database.
- Definition - this is the list of columns that this new physical design structure will include. If you click on the hyperlink, it will open up a new window with the T-SQL script to implement this recommendation.
- Estimated Improvements - this is the estimated percentage improvement that you can expect in your workload performance, if you implement all the recommendations made by DTA.
- Space used by recommendation (MB) - under the Tuning Summary section of the Reports tab, you can find out the extra space in MB that you would need, if you decide to implement these recommendations.

The Reports tab features several built-in analysis reports. There are 15 built-in reports, but the following three are the most important.

- Current Index Usage Report - Start with this report to see how your existing indexes are being used by the queries running against your server. Each index that has been used by a query is listed here. Each referenced index has a Percent Usage value which indicates the percentage of statements in your workload that referenced this index. If an index is not listed here, it means that it has not been used by any query in your workload. If you are certain that all the queries that run against your server have been captured by your profiler trace, then you can use this report to identify indexes that are not required and possibly delete them.
- Recommended Index Usage Report - Look at this report to identify how index usage will change if the recommended indexes are implemented.
If you compare these two reports, you will see that the index usage of some of the current indexes has fallen while some new indexes have been included with a higher usage percentage, indicating a different execution plan for your workload and improved performance.
- Statement Cost Report - This report lists individual statements in your workload and the estimated performance improvement for each one. Using this report, you can identify your poorly performing queries and see the sort of improvement you can expect if you implement the recommendations made by DTA. You will find that some statements don't have any improvements (Percent improvement = 0). This is because either the statement was not tuned for some reason or it already has all the indexes that it needs to perform optimally.

3.2. Implementing the Database Engine Tuning Advisor Recommendations

By now, we have collected a workload using Profiler, consumed it using the DTA, and got a set of recommendations to improve performance. You then have the choice to either:

- Save recommendations - you can save the recommendations in an SQL script by navigating to ACTIONS -> SAVE RECOMMENDATIONS. You can then manually run the script in Management Studio to create all the recommended physical design structures.
- Apply recommendations using the DTA - if you are happy with the set of recommendations, then simply navigate to ACTIONS -> APPLY RECOMMENDATIONS. You can also schedule a later time to apply these recommendations, for instance during off-peak hours so that interference with other operations is minimal.

Performing what-if analysis is a very useful feature of the DTA. You may not want to apply all the recommendations that the DTA provided. However, since the Estimated Improvement value can only be achieved if you apply all of these recommendations together, you are not really sure what kind of impact it will have if you only choose to apply a sub-set of these recommendations. To do a what-if analysis, deselect the recommendations that you do not want to apply. Now, go to ACTIONS -> EVALUATE RECOMMENDATIONS. This will launch another session with the same options as the earlier one. However, when you click on START ANALYSIS, the DTA will provide data on estimated performance improvements, based on just this sub-set of the recommendations. Again, the key thing to remember is that the DTA performs this what-if analysis without actually implementing anything in the database.

3.3. Exercise

Consider the following normalized relation where the primary key is ID:

Employees(ID, name, salary, department, contract_year)

Consider as well the following four queries, equally important and frequent:

a) What is the average number of employees per department?
b) Which are the IDs of the employees with the highest salary?
c) What is the total amount of salaries paid by each department?
d) How many employees were hired in the current year?
Considering each query individually, which indices would you create over the relation? For each index, indicate the type (hash or B+tree) and indicate if the index is clustered or non-clustered. Justify.
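As a starting point for the exercise, it may help to write the four queries in SQL and look at which attributes each one selects on, groups by or aggregates. The sketch below is one possible formulation, not part of the original exercise statement, and it deliberately does not propose any index. It runs on SQLite through Python's sqlite3 module; the sample rows are invented, query (a) is interpreted as the total number of employees divided by the number of distinct departments, and query (d) assumes that contract_year holds the hiring year.

```python
import sqlite3
from datetime import date

this_year = date.today().year

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "ID INTEGER PRIMARY KEY, name TEXT, salary REAL, "
    "department TEXT, contract_year INTEGER)"
)
# Invented sample data, only to make the queries runnable.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Ana", 3000.0, "Sales", this_year),
        (2, "Bruno", 4500.0, "Sales", this_year - 3),
        (3, "Carla", 4500.0, "IT", this_year - 1),
        (4, "Duarte", 2000.0, "IT", this_year),
        (5, "Eva", 2500.0, "IT", this_year - 2),
        (6, "Filipe", 3800.0, "HR", this_year - 5),
    ],
)

queries = {
    # a) average number of employees per department
    "a": ("SELECT COUNT(*) * 1.0 / COUNT(DISTINCT department) FROM Employees", ()),
    # b) IDs of the employees with the highest salary
    "b": ("SELECT ID FROM Employees "
          "WHERE salary = (SELECT MAX(salary) FROM Employees)", ()),
    # c) total amount of salaries paid by each department
    "c": ("SELECT department, SUM(salary) FROM Employees GROUP BY department", ()),
    # d) number of employees hired in the current year
    "d": ("SELECT COUNT(*) FROM Employees WHERE contract_year = ?", (this_year,)),
}

results = {
    label: conn.execute(sql, params).fetchall()
    for label, (sql, params) in queries.items()
}
for label, rows in results.items():
    print(label, rows)
```

Note which queries involve an equality or a maximum on a single attribute and which ones group by department: these access patterns are what the choice between hash and B+tree, and between clustered and non-clustered, should be justified against.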