Java Performance
Adrian Dozsa
TM-JUG, 18.09.2014

Agenda
- Requirements
- Performance Testing
- Micro-benchmarks
- Concurrency
- GC
- Tools
Why is performance important?
- We hate slow web pages/apps
- We hate timeouts even more (Black Friday :)
- You don't want to see a timeout at an ATM
- More HW costs money
- A slow site/app is slow business
Trade-offs
- Performance vs Scalability
- Latency vs Throughput
- Availability vs Consistency
Levels
- Just don't do stupid things
- Use best practices
- Consider performance in design
- Performance drives design
- Extreme edge cases
Requirements
- It all starts at the requirements
- If we don't know our destination, how can we get to it?
- "It should be fast" is not a requirement
- We need hard numbers, and lots of context info
- We also need HW details
- Upfront, not at the end
- Non-functional requirements (NFRs) can greatly influence your design
Requirements
Examples of requirements you might need:
- TPS (average, 90th and 95th percentile)
- Latency (average, 90th and 95th percentile)
- CPU consumption
- Memory consumption
- Data volume (count, size) and distribution
- Data retention
- On what HW and middleware
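To make the "avg, 90%, 95%" figures concrete, here is a minimal sketch of how they could be derived from raw latency samples. The class name `LatencyStats`, the sample values, and the choice of the nearest-rank percentile definition are assumptions for illustration; measurement tools may use other interpolation rules.

```java
import java.util.Arrays;

final class LatencyStats {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMillis.clone(); // do not reorder the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    static double average(long[] samplesMillis) {
        long sum = 0;
        for (long s : samplesMillis) {
            sum += s;
        }
        return (double) sum / samplesMillis.length;
    }

    public static void main(String[] args) {
        // Hypothetical response times in milliseconds, with two slow outliers.
        long[] latencies = {12, 15, 11, 14, 13, 250, 16, 12, 13, 900};
        System.out.printf("avg=%.1f ms, p90=%d ms, p95=%d ms%n",
                average(latencies), percentile(latencies, 90), percentile(latencies, 95));
    }
}
```

With these samples the two outliers pull the average far above the typical request, which is one reason a requirement usually states percentiles as well as an average.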
Performance Testing
- Never guess: measure, measure and measure
- Start with the requirements
- Define performance use cases
- And test with real-world conditions
- Beware of micro-benchmarks
Performance Team anti-pattern
- Performance starts with requirements
- Performance drives design and dev
- Too late and expensive to address during performance testing
- Performance is everyone's responsibility
Micro-benchmarks
- Do not expect too much from micro-benchmarks
- Always include a warmup phase
- Always run with -XX:+PrintCompilation, -verbose:gc
- Be aware of deoptimization and recompilation effects
- Reduce noise in your measurements (stable environment)
- Beware of System.currentTimeMillis (use System.nanoTime)
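A hand-rolled sketch of the warmup and timing advice above. The class `NaiveBenchmark`, the `work` method, and the iteration counts are hypothetical; the code illustrates the structure only and is not a substitute for a dedicated benchmark harness.

```java
final class NaiveBenchmark {

    // Written after the loops so the JIT cannot discard the workload as dead code.
    static volatile double sink;

    /** Hypothetical workload under test: sum of square roots from 1 to n. */
    static double work(int n) {
        double acc = 0;
        for (int i = 1; i <= n; i++) {
            acc += Math.sqrt(i);
        }
        return acc;
    }

    /** Runs a warmup phase, then returns the average nanoseconds per call. */
    static long measureNanosPerCall(int warmupIterations, int measuredIterations, int n) {
        double acc = 0;
        // Warmup: give the JIT a chance to compile work() before timing starts.
        for (int i = 0; i < warmupIterations; i++) {
            acc += work(n);
        }
        // System.nanoTime is intended for elapsed-time measurement;
        // System.currentTimeMillis is wall-clock time with coarser resolution.
        long start = System.nanoTime();
        for (int i = 0; i < measuredIterations; i++) {
            acc += work(n);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return elapsed / measuredIterations;
    }

    public static void main(String[] args) {
        long nanos = measureNanosPerCall(20_000, 20_000, 1_000);
        System.out.println("approx. ns per call: " + nanos);
    }
}
```

Running it as `java -XX:+PrintCompilation -verbose:gc NaiveBenchmark` shows whether compilation or GC activity still occurs inside the measured phase, in which case the numbers should not be trusted.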
JIT Optimizations
- Method inlining
- Dead code elimination
- Loop optimization
- Control flow optimization
- Branch prediction
- Method de-virtualizing
Concurrency
- Use parallelism (more threads)
- Avoid sharing data (share-nothing architecture)
- Stateless services
- Favor immutable data
- Avoid/minimize synchronization (avoid blocking)
- Use lock-free algorithms or data structures
- Avoid/minimize context switching
- Know your data structures (queue/map implementations)
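As one possible illustration of minimizing synchronization, the sketch below counts events from several threads without a global lock, using `ConcurrentHashMap` and `LongAdder` from `java.util.concurrent` (Java 8). The class `HitCounter` and the key `"home"` are hypothetical. Note that `computeIfAbsent` may briefly lock a single map bin when a key is first inserted; the increments themselves are non-blocking.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

final class HitCounter {

    private final ConcurrentHashMap<String, LongAdder> hits = new ConcurrentHashMap<>();

    /** Records one hit for the key; no synchronized block is needed. */
    void record(String key) {
        hits.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    long count(String key) {
        LongAdder adder = hits.get(key);
        return adder == null ? 0 : adder.sum();
    }

    /** Increments one key from several threads and returns the final count. */
    static long runDemo(int threads, int incrementsPerThread) {
        HitCounter counter = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.record("home");
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.count("home");
    }

    public static void main(String[] args) {
        System.out.println("hits: " + runDemo(4, 100_000));
    }
}
```

`LongAdder` spreads contended updates over several internal cells, which tends to help when many threads update the same counter; for counters that are rarely contended, a plain `AtomicLong` is a simpler choice.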
Garbage Collection
- Can greatly influence performance (throughput/latency)
- Understand the basics of GC
- Understand GC metrics (GC logs)
- Know the GC algorithms (and their specifics)
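Besides GC logs, basic GC metrics can also be read from inside the JVM through the standard `java.lang.management` API. The sketch below prints the collection count and accumulated collection time per collector; the class name `GcSnapshot` is an assumption, and collector names depend on the GC in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcSnapshot {

    /** Total number of collections so far, summed over all collectors. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds, summed over all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

Sampling these totals periodically and taking the differences gives a rough GC frequency and GC time share; GC logs remain the more detailed source.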
Garbage Collection
[Figure: HotSpot heap structure]
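The heap layout of a running JVM can be inspected with the standard `MemoryPoolMXBean` API. This is a small sketch with a hypothetical class name `HeapPools`; the pool names it prints (for example eden, survivor and old generation pools) vary by collector and JVM version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class HeapPools {

    /** Names of the memory pools that belong to the Java heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long maxKb = usage.getMax() < 0 ? -1 : usage.getMax() / 1024; // -1 means undefined
            System.out.printf("%-35s %-16s used=%d KB max=%d KB%n",
                    pool.getName(), pool.getType(), usage.getUsed() / 1024, maxKb);
        }
    }
}
```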
Garbage Collection
- Reduce GC frequency
- Reduce GC pause time
- Heap size choice: impacts pause time and frequency
- Object allocation rate and liveness
- GC choice based on throughput vs latency
- New G1 GC option
- Azul C4 pauseless GC
Garbage Collection
- Frequency of minor GC is dictated by:
  - Application object allocation rate
  - Size of the eden space
- Frequency of object promotion into old generation is dictated by:
  - Frequency of minor GCs (how quickly objects age)
  - Size of the survivor spaces (large enough to age effectively)
- Object retention impacts latency more than object allocation
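The relationships above can be turned into a back-of-the-envelope estimate. The sketch below assumes a steady allocation rate and that objects are promoted after surviving a fixed number of minor GCs (a tenuring threshold, which is not mentioned on the slide and is introduced here only for illustration). The class name and the numbers are hypothetical, and real behaviour also depends on survivor space sizing.

```java
final class MinorGcEstimate {

    /** Rough time between minor GCs: eden fills at the allocation rate. */
    static double secondsBetweenMinorGcs(double edenMb, double allocationMbPerSecond) {
        return edenMb / allocationMbPerSecond;
    }

    /**
     * Rough time an object must stay alive before it is promoted, assuming it
     * has to survive tenuringThreshold minor GCs and survivor spaces do not overflow.
     */
    static double secondsUntilPromotion(double edenMb, double allocationMbPerSecond,
                                        int tenuringThreshold) {
        return secondsBetweenMinorGcs(edenMb, allocationMbPerSecond) * tenuringThreshold;
    }

    public static void main(String[] args) {
        double edenMb = 512;      // hypothetical eden size
        double allocRate = 128;   // hypothetical allocation rate in MB/s
        System.out.println("minor GC roughly every "
                + secondsBetweenMinorGcs(edenMb, allocRate) + " s");
        System.out.println("promotion after roughly "
                + secondsUntilPromotion(edenMb, allocRate, 15) + " s of object lifetime");
    }
}
```

Under these assumptions, doubling eden or halving the allocation rate doubles the interval between minor GCs, which also gives objects more time to die before they can be promoted.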
Garbage Collection
- Object allocation is very cheap!
- Reclamation of new objects is also very cheap!
- Don't be afraid to allocate short-lived objects
- GCs love small immutable objects and short-lived objects
- But don't go overboard
- It is better to use short-lived immutable objects than long-lived mutable objects
- Ideal situation: after the application initialization phase, only experience minor GCs, and old generation growth is negligible
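A small immutable value class is one way to follow the advice above: each operation returns a new short-lived object instead of mutating shared state. The `Money` class below is a hypothetical example, not taken from the talk.

```java
final class Money {

    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        if (currency == null) {
            throw new IllegalArgumentException("currency is required");
        }
        this.cents = cents;
        this.currency = currency;
    }

    long cents() {
        return cents;
    }

    String currency() {
        return currency;
    }

    /** Returns a new instance; neither operand is modified. */
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(cents + other.cents, currency);
    }

    public static void main(String[] args) {
        Money total = new Money(0, "EUR");
        for (int i = 1; i <= 3; i++) {
            // Each intermediate total becomes garbage almost immediately.
            total = total.plus(new Money(i * 100, "EUR"));
        }
        System.out.println(total.cents() + " " + total.currency());
    }
}
```

Because instances never change, they can be shared between threads without synchronization, and intermediate results that die young are reclaimed cheaply by minor GCs.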
Advice on choosing a GC
- Start with Parallel GC (-XX:+UseParallel[Old]GC)
  - Parallel GC offers the fastest minor GC times
  - If you can avoid full GCs, you'll likely achieve the best throughput, smallest footprint and lowest latency
- Move to CMS or G1 if needed (for old gen collections)
  - CMS minor GC times are slower due to promotion into free lists
  - CMS full GC avoided via old generation concurrent collection
  - G1 minor GC times are slower due to remembered set overhead
  - G1 full GC avoided via concurrent collection, and fragmentation avoided by partial old generation collection
Database
- I/O is expensive
- Database calls will be most of the work (do them wisely)
- N+1 problem (ORM)
- Caching data
- Resource pooling
- Indexing
- Minimize round-trips
- Use query explain plans
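The N+1 problem and the "minimize round-trips" advice can be illustrated without a real database. In the sketch below, `FakeOrderDb` is a hypothetical in-memory stand-in that only counts calls; every name in it is an assumption for illustration, and a real ORM or JDBC layer would behave analogously only in the number of queries issued.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RoundTripDemo {

    /** Hypothetical stand-in for a database: each method call counts as one round-trip. */
    static final class FakeOrderDb {
        private final Map<Integer, List<String>> itemsByOrder = new HashMap<>();
        int roundTrips = 0;

        FakeOrderDb(int orders) {
            for (int id = 1; id <= orders; id++) {
                itemsByOrder.put(id, Arrays.asList("item-" + id + "-a", "item-" + id + "-b"));
            }
        }

        List<Integer> findOrderIds() {
            roundTrips++;
            return new ArrayList<>(itemsByOrder.keySet());
        }

        List<String> findItems(int orderId) {
            roundTrips++;
            return itemsByOrder.get(orderId);
        }

        Map<Integer, List<String>> findItemsForOrders(Collection<Integer> orderIds) {
            roundTrips++;
            Map<Integer, List<String>> result = new HashMap<>();
            for (Integer id : orderIds) {
                result.put(id, itemsByOrder.get(id));
            }
            return result;
        }
    }

    /** N+1 access: one query for the orders, then one query per order. */
    static int loadNPlusOne(FakeOrderDb db) {
        for (int id : db.findOrderIds()) {
            db.findItems(id);
        }
        return db.roundTrips;
    }

    /** Batched access: one query for the orders, one query for all their items. */
    static int loadBatched(FakeOrderDb db) {
        List<Integer> ids = db.findOrderIds();
        db.findItemsForOrders(ids);
        return db.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("N+1 round-trips:     " + loadNPlusOne(new FakeOrderDb(100)));
        System.out.println("batched round-trips: " + loadBatched(new FakeOrderDb(100)));
    }
}
```

The data returned is the same in both cases; only the number of calls differs, and with real network latency per call that difference usually dominates.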
Transactions
- Good for correct programs
- Not so good for performance and scalability
- Beware of distributed transactions!
- ACID is good in short doses (keep transactions short)
Tools
Multiple categories:
- Load generation
- Performance measurement
- Application monitoring (for production)
- Profilers (CPU, memory, concurrency)
- Database tools
Perf4J
- "Perf4J is to System.currentTimeMillis() as log4j is to System.out.println()"
- Simple API
- Write to log file
- Parse log file
- Provides statistics
- Provides graphs
Apache JMeter
- Generates load
- Measures performance
- Many different server/protocol types
- Desktop app, pure Java
- Graphs and reports
Java VisualVM
- Part of the JDK
- Powerful tool to troubleshoot, monitor and profile Java apps
- Provides info on CPU, memory, GC, concurrency, JMX
- Covers most development needs
YourKit Java Profiler
- Commercial Java profiler
- Rich set of features (JEE, IDE integration, SQL, etc.)
- Complex CPU, memory, GC, thread profiling
Java Mission Control
- Part of the JDK (since Java 7u40), formerly JRockit Mission Control
- Free for development purposes
- Tools to monitor, manage and profile Java apps
- More production oriented (JVM ops)
- Java Flight Recorder (collects data from app, JVM, OS)
Query plans
- Learn to read and use query plans to understand how your query will behave
- Use:
  - SQL Developer for Oracle
  - IBM Data Studio for DB2
  - Microsoft SQL Server Management Studio for SQL Server
Questions?