Common mistakes developers make in SQL Server Amateurs work until they get it right. Professionals work until they can't get it wrong.
Who am I? Self-employed 6 years, Visual Basic 3.0 to 6.0, Access 1.0 to 95. Long time consultant 15 years, Visual Basic 6.0, SQL Server 4.21 to 2008R2. Employed 3 years, SQL Server 2000 to 2008R2 and Denali (2011?). MVP Microsoft Valuable Professional SQL Server since 2009. Active on several forums as Peso, SwePeso or Pesomannen www.sqlteam.com www.sqlservercentral.com msdn.microsoft.com
Agenda Faster hardware Row By Agonizing Row (pronounced Ree-bar) Bad indexing Index misuse Triggers Indeterministic functions
Before we start Before you make any change on a production SQL Server, be sure you test it first in a test environment NO EXCEPTIONS! I mean it! Really! No kidding Iwouldn t lie to you You d be crazy not listening to this advice Never forget, DBAs and developers are the protectors of the organization s data You took this oath when you accepted the job
The golden bullet FASTER HARDWARE
Faster hardware You can throw faster hardware at a problem if you want but don't be surprised if it doesn't actually help much. True performance can only be found in the code and the design of the database. Let's stop and think about faster hardware to do the same thing Let's say the CPU speed and disk throughput is doubled. That would mean a 2 hour query might drop to 1 hour. Rewriting the old code could drop the query time to 3seconds.
Faster hardware More cores mean more queries can request massive numbers of locks simultaneously and the badly written queries need many locks that may cause performance to deteriorate under faster hardware. SQL Server spends more time waiting and less time working.
Faster hardware
Faster hardware
Faster hardware
Faster hardware
Faster hardware
Faster hardware
Faster hardware Query Index Parallellism Reads CPU Duration Old query No No 1,805,000 216.00 217.00 Old query No Yes 1,845,000 221.00 113.00 Old query Yes No 59,000 3.00 4.00 Old query Yes Yes 59,000 4.00 3.00 New query No No 120 0.10 1.00 New query Yes No 120 0.05 1.00
Loops, cursors and triangular joins ROW BY AGONIZING ROW
Row By Agonizing Row Phrased by Jeff Moden Good friend and fellow SQL Server MVP from SQLServerCentral. http://www.sqlservercentral.com/articles/t-sql/61539/ Learn to recognize Triangular Joins They appear in the SELECT list as correlated subqueriesusually with a stand-alone inequality of some sort They can also appear in WHERE clauses with the same type of standalone inequality. Not all Triangular Joins are bad With some restraint and the right criteria, Triangular Joins can be used for some pretty remarkable things Make finite schedules Do high speed dupe checks But, you've really got to be careful. Improperly written Triangular Joins are worse than Cursors or While Loops and can bring a CPU and Disk System right to it's knees. Watch out for "Hidden RBAR"
Row By Agonizing Row Many folks think that set-based programming means one of two things It simply doesn't have a Cursor or a While Loop. It's all done in a single query. Both of those are very bad misconceptions as to what set based programming is Touch each row of data only once if you can and as few times as possible if you can't. A loop (cursor) Works row-by-agonizing-row. Overrides the natural abilities of the optimizer. Only operates on a single row instead of a set of rows. Most explicit loops are not set based programming and they just crush performance.
Row By Agonizing Row Cursor-based solution Set-based solution
Row By Agonizing Row SELECT a.rownum AS RowNum_A, b.rownum AS RowNum_B FROM TableA AS a, TableB AS b WHERE a.rownum <= b.rownum
Row By Agonizing Row Good examples from the Phil Factor Speed Phreak competitions http://ask.sqlservercentral.com/questions/92/the-subscriptionlist-sql-problem Cursor 780 seconds (13 minutes). Set-based 0.3 seconds - 2,500 times faster! http://ask.sqlservercentral.com/questions/826/the-fifo-stockinventory-sql-problem Cursor 2,400 seconds (40 minutes). Set-based 1.3 seconds 1,800 times faster! http://ask.sqlservercentral.com/questions/6529/the-ssnmatching-sql-problem Cursor 4,000 4,500 seconds (70-75 minutes). Set-based 0.5 seconds 9,000 times faster! SQLCLR 0.4 seconds 11,000 times faster!
Row By Agonizing Row Kathi Kellenberger, former SQL Server MVP, provides in-depth descriptions and analyses of the different solutions here http://www.simple-talk.com/sql/performance/writing-efficientsql-set-based-speed-phreakery/ http://www.simple-talk.com/sql/performance/set-based-speedphreakery-the-fifo-stock-inventory-sql-problem/ http://www.simple-talk.com/sql/performance/ssn-matchingspeed-phreakery/
The hidden performance crusher BAD INDEXING
Bad Indexing SELECT ent.entengagementid,thd.thdplaceofsale,count(*) FROM dbo.engagement ent INNER LOOP JOIN dbo.transactionhead thd ON thd.entengagementid = ent.entengagementid INNER JOIN dbo.engagementparameter epr ON epr.teptypeengagementparameterid = (SELECT teptypeengagementparameterid FROM TypeEngagementParameter WHERE tepcode = 'StorePeriodLen') AND DATEDIFF(MM, DATEADD(MM, -CONVERT(INT,eprParameterValue), GETDATE()), thd.thdtransactiondate) > 0 AND epr.tettypeengagementid = ent.tettypeengagementid INNER JOIN dbo.typetransactionhead tth ON tth.tthtypetransactionheadid = thd.tthtypetransactionheadid AND tth.tthcode IN ('Purchase', 'Return', 'Mixed') GROUP BY ent.entengagementid,thdplaceofsale
Bad Indexing
Bad Indexing SELECT ent.entengagementid, thd.thdplaceofsale, COUNT(*) FROM dbo.typeengagementparameter AS tep INNER JOIN dbo.engagementparameter AS epr ON epr.teptypeengagementparameterid = tep.teptypeengagementparameterid INNER JOIN dbo.engagement AS ent ON ent.tettypeengagementid = epr.tettypeengagementid INNER JOIN dbo.transactionhead AS thd ON thd.entengagementid = ent.entengagementid INNER JOIN dbo.typetransactionhead AS tth ON tth.tthtypetransactionheadid = thd.tthtypetransactionheadid AND tth.tthcode IN ('Purchase', 'Return', 'Mixed') WHERE thd.thdtransactiondate >= DATEADD(MONTH, DATEDIFF(MONTH, '19000101', GETDATE()) - CONVERT(INT, epr.eprparametervalue), '19000201') AND tep.tepcode = 'StorePeriodLen' GROUP BY ent.entengagementid, thdplaceofsale
Bad Indexing
Bad Indexing
Bad Indexing
O Index, Where Art Thou? INDEX MISUSE
Index misuse -- Query 1 SELECT * FROM dbo.testindexes WHERE YEAR(MyDate) = 2005 -- Query 2 SELECT * FROM dbo.testindexes WHERE MyDate >= '20050101' AND MyDate < '20060101' -- Query 3 SELECT * FROM dbo.testindexes WHERE CONVERT(CHAR(4), MyDate, 112) = '2005'
Index misuse
Index misuse
Index misuse
Index misuse
Index misuse Query Records Scan Count Logical Reads CPU Duration Query 1 1,000 1 5 0 0 Query 2 1,000 1 2 0 0 Query 3 1,000 1 5 0 0 Query 1 10,000 1 25 0 3 Query 2 10,000 1 2 0 0 Query 3 10,000 1 25 0 8 Query 1 100,000 1 226 31 30 Query 2 100,000 1 3 0 0 Query 3 100,000 1 226 78 81 Query 1 1,000,000 1 2,250 312 300 Query 2 1,000,000 1 12 0 1 Query 3 1,000,000 1 2,250 811 819 Query 1 10,000,000 17 22,679 3,620 295 Query 2 10,000,000 1 85 0 180 Query 3 10,000,000 17 22,679 10,329 739
To batch or not to batch? TRIGGERS
Triggers Create TRIGGER [dbo].[fault_parked_update] ON [dbo].[detail] FOR INSERT, UPDATE AS Declare @CalliD Varchar(8), @Park1Start datetime,@park1end datetime, @Park1 INT, @TotalParkedTime INT Declare @FaultEnd datetime, @FaultTotal INT, @FaultMinusParked INT, @FaultStart datetime Select @callid=callid FROM inserted WHERE isdate(faultstart)=1 and isdate(faultend)=1 IF @Callid IS NOT NULL Begin Select @Park1Start = CASE WHEN ISDATE(@Park1Start)= 1 THEN Cast([ParkedDate1]+' '+[ParkedTime1] as datetime) ELSE CAST('1900-01-01 00:00:00' as DateTime) END from detail Where callid = @Callid Select @Park1End = CASE WHEN ISDATE(@Park1End)= 1 THEN Cast([UnParkedDate1]+' '+[UnParkedTime1] as datetime) ELSE CAST('1900-01-01 00:00:00' as DateTime) END from detail Where callid = @Callid Select @Park1 = datediff(ss,@park1start,@park1end)/60 Select @TotalParkedTime = @Park1 Select @FaultStart = Cast([FaultStart]+' '+[FaultStartTime] as datetime) from detail Where callid = @Callid Select @FaultEnd = Cast([FaultEnd]+' '+[FaultEndTime] as datetime) from detail Where callid = @Callid Select @FaultTotal = datediff(ss,@faultstart,@faultend)/60 Select @FaultMinusParked = @FaultTotal - @TotalParkedTime UPDATE CallLog set Park1 = @Park1, TotalParkedTime = @TotalParkedTime, FaultMinusParked = @FaultMinusParked, FaultTotal = @FaultTotal Where calllog.callid = @CallID END
Triggers CREATE TRIGGER dbo.fault_parked_update ON dbo.detail FOR INSERT, UPDATE AS ;WITH ctevalid(callid, Park1, FaultTotal) AS ( SELECT CallID, CASE WHEN Park1End > Park1Start AND ISDATE(ParkedDate1) = 1 THEN DATEDIFF(SECOND, Park1Start, Park1End) / 60 ELSE 0 END AS Park1, DATEDIFF(SECOND, FaultStart, FaultEnd) / 60 AS FaultTotal FROM ( SELECT CallID, ParkedDate1, CAST(ParkedDate1 + ' ' + ParkedTime1 AS DATETIME) AS Park1Start, CAST(UnParkedDate1 + ' ' + UnParkedTime1 AS DATETIME) AS Park1End, CAST(FaultStart + ' ' + FaultStartTime AS DATETIME) AS FaultStart, CAST(FaultEnd + ' ' + FaultEndTime AS DATETIME) AS FaultEnd FROM inserted WHERE ISDATE(FaultStart) = 1 AND ISDATE(FaultEnd) = 1 AS d ) UPDATE cl SET cl.park1 = v.park1, cl.totalparkedtime = v.park1, cl.faultminusparked = v.faulttotal - v.park1, cl.faulttotal = v.faulttotal FROM dbo.calllog AS cl INNER JOIN ctevalid AS v ON v.callid = cl.callid
I didn t see that coming INDETERMINISTIC FUNCTIONS
Simple indeterministic function select top(10) case abs(checksum(newid()))%4 when 0 then 0 when 1 then 1 when 2 then 2 when 3 then 3 end from sys.objects CASE WHEN abs(checksum(newid())) % 4 = 0 THEN 0 ELSE WHEN abs(checksum(newid())) % 4 = 1 THEN 1 ELSE WHEN abs(checksum(newid())) % 4 = 2 THEN 2 ELSE WHEN abs(checksum(newid())) % 4 = 3 THEN 3 ELSE NULL END END END END
Advanced indeterministic function CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 1 THEN 'Jan' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 2 THEN 'Feb' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 3 THEN 'Mar' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 4 THEN 'Apr' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 5 THEN 'May' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 6 THEN 'Jun' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 7 THEN 'Jul' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 8 THEN 'Aug' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 9 THEN 'Sept' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 10 THEN 'Oct' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 11 THEN 'Nov' ELSE CASE WHEN datepart(month, dbo.fngetorderdate(salesorderid)) = 12 THEN 'Dec' ELSE NULL END END END END END END END END END END END END
Want to know more? Married with children 6 year girl, 3 year girl and 1 year boy. One girl to be born on May 11. Live outside Helsingborg Blog at http://weblogs.sqlteam.com/peterl/ Homepage and contact http://www.developerworkshop.net/ peso@developerworkshop.net Co-founder of PASS Scania http://www.pass-scania.se/ Now part of SQLUG Phil Factor Speed Phreak challenges 3 time winner Another presentation Execution Plan basics