Understanding and Detec.ng Real- World Performance Bugs Gouliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, and Shan Lu Presented by Cindy Rubio- González Feb 10 th, 2015
Mo.va.on Performance bugs pervasively degrade somware performance Performance bugs lead to reduced throughput, increased latency, and wasted resources Performance bugs exist widely in released somware Performance bugs are costly to diagnose 2
Contribu.ons Characteris.cs Study Conducts comprehensive study of 109 real- world performance bugs Finds how performance bugs are introduced, exposed, and fixed Bug Detec.on Iden.fies useful efficiency rules Implements 25 rule checkers Finds 332 poten.al performance bugs in real- world applica.ons 3
Characteris.cs Study
Benchmarks Applica'on Descrip'on Language(s) # Bugs Apache Suite HTTPD: Web Server TomCat: Web App Server Ant: Build Management U.lity Chronium Suite Google Chrome Browser GCC Suite gcc and g++ Compilers Mozilla Suite Firefox: Web Browser Thunderbird: Email Client MySQL Suite Server: Database Server Connector: DB Client Libraries Total C Java Java C/C++ C/C++ C++, JavaScript C++ JavaScript C/C++ C/C++, Java,.Net 25 10 10 36 28 109 5
Bug Selec.on GCC, Mozilla, and MySQL Developers use special tags for performance bugs compile- )me- hog, perf, and S5 Apache and Chrome Used performance related keywords Slow, performance, latency, throughput, etc. Randomly sampled 109 fixed bugs 27 reported before 2004 38 reported between 2004 and 2007 44 reported amer 2008 6
Example 1: Transparent Draw Mozilla Bug #66461 A Mozilla bug drawing transparent images Useless computa.on when image is transparent Transparent images were not expected to be widely used for layout purposes nsimage::draw( ) { + if(mistransparent) return; } // render the input image Patch condi.onally skips Draw nsimagegtk.cpp 7
Example 2: Intensive GC Mozilla Bug #515287 A Mozilla bug performing intensive GCs Firefox 10x slower than Safari at idle GMail pages Bug observed un.l Web 2.0 XMLHttpRequest::OnStop() { // at the end of each XHR } - mscriptcontext- >GC(); Patch removes call to GC nsxmlhfprequest.cpp 8
Example 3: Bookmark All Mozilla Bug #490742 A Mozilla bug with un- batched DB opera.ons Firefox hangs when number of tabs >= 20 One of the first few batchable database tasks in Firefox for (i = 0; i < tabs.length; i++) { - tabs[i].dotransact(); } + doaggregatetransact(tabs); Patch adds a new API to aggregate DB transac.ons nsplacestransac.onsservice.js 9
Example 4: Slow Fast- Lock MySQL Bug #38941 A MySQL bug with over synchroniza.on fastmutex_lock calls random, which contains a lock Users reported a 40x slowdown when using the fast lock int fastmutex_lock (fmutex_t *mp) { } - maxdelay += (double)random(); + maxdelay += (double)park_rng(); Patch replaces random with non- synchronized number generator thr_mutex.c 10
How are Performance Bugs introduced? Workload Mismatch Input paradigm could shim later on Workload much diverse and complex } 41 API Misunderstanding Sensi.vity with respect to parameter values Code encapsula.on hides details } 31 Others Example: Side- effect of func.onal bugs Failing to reset a busy flag } 38 11
How are Performance Bugs introduced? 16 14 13 14 13 13 12 10 8 6 4 2 6 6 2 3 5 5 1 4 10 7 8 Workload API Others 0 Apache Chrome GCC Mozilla MySQL Workload: 41 API: 31 Others: 38 12
How are Performance Bugs exposed? Always Ac.ve Bugs Happen during start- up, shut down, and in heavily executed code In general, affect every single run }15 Input Feature and Scale Condi.ons Require specialized inputs Require large- scale inputs }198 13
How are Performance Bugs exposed? 25 20 15 10 5 0 23 22 19 19 18 17 15 13 13 10 10 10 6 6 4 3 2 2 1 0 Apache Chrome GCC Mozilla MySQL Always Ac.ve Special Feature Special Scale Feature+Scale Always Ac.ve: 15 Special Feature: 75 Special Scale: 71 Feature+Scale: 52 14
How are Performance Bugs fixed? Change Call Sequence Examples: Bookmark All and Intensive GC Change Condi.on Example: Transparent Draw }47 }35 Adjust Func.on Parameters }13 15
How are Performance Bugs fixed? 25 20 20 15 10 5 0 12 10 9 10 10 7 7 6 5 4 4 3 3 3 2 2 1 0 0 Apache Chrome GCC Mozilla MySQL Call Sequence Condi.on Parameter Others Call Sequence: 47 Condi.on: 35 Parameter: 13 Others: 23 16
Other Characteris.cs Life Span for 36 Mozilla bugs 935 days on average to be discovered 140 days to be fixed Bug Loca.on 75% inside input- dependent loop or input- event handler Server vs. Client Performance Bugs 41 bugs from servers applica.ons 68 bugs from client applica.ons Synchroniza.on issues different 17
Lessons Learned Performance bugs: hide for long periods of.me can emerge from non- buggy code as somware evolves cannot be avoided most of the.me can escape tes.ng even when coverage is good There is a need for: annota.on systems workload changes and performance impact tes.ng diagnos.c techniques beyond profiling 18
Bug Detec.on
Efficiency Rules in Patches Code transforma.on + condi.on Improve performance while preserving func.onality Inspected 109 patches 50 contain efficiency rules Most of them related to the Change Call Sequence Fix Condi.ons on Func.on call sequences Parameter/return variables Calling contexts 20
Rule Checkers Sta.c vs. Dynamic checks Benchmarks Apache, MySQL, and Mozilla 40 patches contain rules, 25 sta.c checkable Checkers implementa.on 14 LLVM checkers for C/C++ 11 Python checkers for Java, JavaScript, and C# Intraprocedural dataflow analysis in LLVM checkers 21
Methodology for Rule Checking SoMware Versions Original version (buggy version) Latest version Other somware (when cross- applica.on checking) Categoriza.on of Results Poten.al performance problems (PPPs) Bad and good prac.ces False posi.ves 22
PPPs and False Posi.ves 250 219 200 150 125 113 PPPs 100 FPs 50 41 30 46 0 Original Latest Other SoMware 23
Performance Bug Example MySQL Bug 15811 - char *end=str+strlen(str); - if (ismbchar(cs, str, end)) + if (ismbchar(cs, str, str + cs- >mbmaxlen)) strings/ctype- mb.c // end is only used in the ismbchar checking - for (end=s; *end ; end++) ; - if (ismbchar(mysqlcs, s, end) ) + if (ismbchar(mysqlcs, s, s+mysqlcs- >mbmaxlen) libmysql/libmysql.c 24
False Posi.ves 25% false posi.ve rate Three main sources: 1. Python checkers lack type informa.on 2. Non- func.on rules difficult to express and check 3. Some rules require workloads, and measuring run.me 25
Bad and Good Prac.ces 300 250 252 200 150 100 50 191 29 27 94 49 Good Prac.ces Bad Prac.ces 0 Original Latest Other SoMware 26
Performance and Effec.veness Analysis Performance Python checkers take 90 sec per 10 MLOC Each LLVM checker takes 4 to 1270 sec per applica.on Analysis Effec.veness Saving effort Improving Performance Maintaining Code Readability Other Usage: Performance Specifica.ons 27
Can PPPs be detected by other tools? Copy- Paste Detectors? Mistakes go beyond copy and paste errors Compiler Op.miza.ons? Many PPPs involve library func.ons and algorithmic inefficiencies Might require expensive interprocedural and points to analyses General Rule- Based Detectors? Unlike func.onal correctness rules, efficiency rules are not rare 28
Summary Studied 109 real- world performance bugs Provided guidance for future research on Performance- bug avoidance Performance tes.ng Performance- bug detec.on Explored rule- based performance- bug detec.on Iden.fied efficiency rules from real- world code Found hundreds of performance bugs 29