The Art of SQL Server Performance Tuning Greg Linwood Managing Director MyDBA http://facebook.com\greglinwood
About Greg Linwood Founder of MyDBA Pty Ltd (2002) Microsoft s first SQL Server MVP Australian awardee (2003) https://mvp.support.microsoft.com/profile/greg.linwood Founder of the Australian SQL Server User Group (2004) www.sqlserver.org.au Working with SQL Server since 1993
Agenda Why do we still need to optimize databases? Things we can see, others we can t Reading between the lines Important aspects of: Database Tuning Infrastructure Tuning
Why we still need to optimize databases?(1 of 5) 1 st, because things change.. Code changes (releases, upgrades) Data changes (db growth, data patterns) Usage patterns change (users do things) Full implications of changes are rarely accounted for by developers (often impossible) Load testing is intrinsically hard (lack of toolset)
Why we still need to optimize databases?(2 of 5) 2 nd because query optimization is imprecise SQL is declarative not imperative CSS & RegEx are also declarative, C#, VB, Java etc imperative Returns CustID, OrderID & OrderDate for orders > 1st Jan 2005 No processing rules included in SQL statement, just the set of data to be returned Optimization often / sometimes goes wrong it can only work with indexes that exist.. sometimes queries are just too complex
Why we still need to optimize databases?(3 of 5) 2 nd because query optimization is imprecise (cont) Huge number of possible query plans available for complex queries Optimization is purely statistically based Sometimes optimizer has to guess in complex situations Lack of precision leads to inconsistency Parameter driven ( parameter sniffing )
Why we still need to optimize databases?(3 of 5) 3 rd, underlying technologies are inefficient Rotating disk drives are mechanical, not electronic Huge consequences RAID, battery caching Can be eliminated with SSDs Dependence on RAM caching to avoid HDD issues Dependence on aging algorithms to avoid caching issues Virtualization introduces new levels of inefficiency SQL OS within Windows OS within VM OS.. Layer upon layer upon layer of resource mgmt
Things we can see, others we can t (1 of 5) Things we CAN easily see CPU utilisation Taskman, Perfmon Yep, that s about all that s EASY to see
Things we can see, others we can t (2 of 5) Things we CAN T easily see Code / structure changes DDL Trigger / Event Notifications are options on SQL 2005+ Changing query plans No SQL Server feature to track this crucial metric yet Not just query plans, but query performance stats in general Requires manual trace capture / analysis or DMV polling Missing indexes DMVs are beginning to improve things, though unreliable and present recommendations for individual queries without considering wider workload
Things we can see, others we can t (3 of 5) Things we CAN T easily see (cont) Memory performance SQL Server uses RAM very differently to most apps SQL OS intelligently uses as much RAM as possible (whatever s available) to cache frequently accessed data, to avoid I/O as much as possible Requires analysis of internal SQLOS memory managers Page Life Expectancy is the best generic perfmon counter SQL Server:Buffer Manager\Page Life Expectancy Storage I/O performance I/O requests get queued, then cached at RAID controller Confuses analysis measuring queue depth isn t enough SQL Server Virtual File Stats provides waits analysis, full picture Requires manual scripting
Things we can see, others we can t (4 of 5) Things we CAN T easily see (cont) Physical Database Fragmentation Page Splitting System-wide perfmon counter available SQL Server:Access Methods\Page Splits/sec No-where near detailed enough, index level required There s no way of monitoring which indexes are fragmenting Leads to generic index maintenance practises Scanning indexes / defragging or rebuilding all indexes Deadlocking / Blocking SQLTrace events are available, but provide confusing information Deadlocking still best performed through trace flag 1204
Reading between the lines (1 of 5) High CPU consumption is usually an effect Usually the result of query plan inefficiency Very often lack of ideal indexes, sometimes poor compilation Sufficient CPU resources still necessary of course Determined by number of peak concurrent queries Query plan inefficiency also usually an effect Lack of ideal indexes a TRUE CAUSE Maybe poor compilation a rarer TRUE CAUSE Sometimes excessive SQL, TVFs, cursors
Reading between the lines (2 of 5) High memory consumption is by design SQL Server is designed to consume as much RAM as possible Cache as much data as possible to relieve disk I/O Cache as many query plans as possible to relieve compilations Very different from most other applications Controlled by SQL Server Operating System Fundamentally incompatible with Virtualization resource management I/O performance analysis requires reading layers O/S queues I/O requests Storage controller caches I/O requests HDDs experience latency during spindle rotation SQLOS waits for overall I/O request SQLOS measure only available at FILE level
Query Tuning (1 of 3) Splitting single huge queries Queries with too many tables joined require complex optimization, which is often done inconsistently Can be controlled by breaking into smaller query units, using temp tables to combine results Forces processing consistency, sometimes at slight performance trade-off. Don t always do this with complex queries, just with those that compile inconsistently & can t be easily hinted
Query Tuning (2 of 3) Query Hints Don t be afraid to use hints! WITH (INDEX = ) For when you want a specific index used WITH (MERGE, HASH, LOOPS JOIN) For when you want a specific join operator OPTION (FORCE ORDER) When you have ordered the tables specifically in SQL syntax for processing rules
Query Tuning (3 of 3) Splitting queries that use innocent little ORs OR is often better implemented with separate queries Particularly for flexible searching queries where a Stored Procedure is used & user chooses from optional parameters Separated query output can be piped to temp tables & requeried at the end of SP, or simply UNIONED together. Consider dynamic SQL multiple plans Modularization with Table Valued Functions SQL Server s query optimiser doesn t flatten out or fold query plans from TVFs into that of the calling query. Leads to massive inefficiencies eg a query that returns 1000 rows & refers to a TVF in select list, will call that TVF individually 1000 times
Physical Tuning (1 of 2) Indexing Good indexes are THE most crucial aspect of tuning SQL Server There are no universal rules, however: Generally use Clustered Indexes Especially if only index per table Sequential, narrow keys are ideal Natural / random keys tend to increase splitting Generally index columns referred to in foreign keys Beyond these basics, always look at indexing as a database or workload specific activity.
Physical Tuning (2 of 2) Filegroup / File placement Separate large databases into discrete filegroups Place large, intensely accessed tables onto separate FGs Allows discrete measurement of Virtual File Stats Always separate Tlogs from data files in storage Not just on separate volumes, must be physically seperated Less of an issue with SSDs, BUT Tlogs better on battery backed cache than SSDs anyway
Infrastructure Tuning (1 of 5) CPU # of CPUs required depends on # of concurrent queries Current Intel options: 6 core + HT (12 threads) @ 3.6Ghz (Xeon 5xxx) 8 core + HT (16 threads) @ 2.4Ghz (Xeon 7xxx) 10 core + HT (20 threads) @ 2.6Ghz (E7 8xxx) Choice depends on # of concurrent queries Fewer, faster threads for larger reporting workloads More, slower threads for highly concurrent workloads
Infrastructure Tuning (2 of 5) Memory Critical to cache workload away from disk I/O The slower your storage, the more Memory you need SANs, Virtualization particularly sensitive The more RAM, the better But no point provisioning substantially more RAM than DB sizes Other than to cater for future growth Current market price is ~AUD$9k for 256GB RAM is getting cheaper Versions & Editions of SQL Server & Windows define RAM limits Note SQL 2008 R2 Std Edn has new 64GB limit SQL OS manages RAM in a way that VMWare/ HyperV can t see Therefore, you can t rely on virtualization to dynamically manage memory this concept doesn t apply to SQL, due to SQLOS
Infrastructure Tuning (3 of 5) Storage HDDs Physical Spindles The more, faster, smaller spindles in arrays the better RAID 1+0 (mirroring + striping) faster than RAID 5 (parity) Write speeds significantly different, read speeds similar RAID 1+0 loses half of capacity for mirroring RAID 5 loses N-1 for parity RAID Controller Cache important (always battery backed) Set for read or write oriented depending on workload Remember, reads are synchronous, writes asynch
Infrastructure Tuning (4 of 5) Storage SSDs Solid State Drives No moving parts Massively faster Massively cheaper (for performance / capacity) Important to Over-provision Format to 75% of capacity for moderate write workloads Format to 50% of capacity for intense write workloads Keeps garbage collection from backing SSD device back into write cliff scenario Tlogs should not be on SSDs Still better on battery backed RAID controller cache TempDB data file is a primary candidate for SSD (no risk)
Infrastructure Tuning (5 of 5) Virtualization? Never a good idea for intense SQL workloads Beyond empirical overhead from extra host O/S layer Anywhere from 1% to 100% overhead Confusion during troubleshooting accounts for far more overhead than any empirical overhead. Trouble-shooting requires co-operation between VM admin & DBA, also often with SAN admin & very often these parties have competing agendas. Delays troubleshooting, extends downtime Virtualization really only appropriate for small scale, performance insensitive systems. Benefits are all IT maintenance, users bear perf consequences