In and Out of the PostgreSQL Shared Buffer Cache

Similar documents

Outline. Failure Types

COS 318: Operating Systems

Q & A From Hitachi Data Systems WebTech Presentation:

Virtuoso and Database Scalability

Response Time Analysis

Storage in Database Systems. CMPSCI 445 Fall 2010

Operating Systems. Virtual Memory

Google File System. Web and scalability

The World According to the OS. Operating System Support for Database Management. Today s talk. What we see. Banking DB Application

Database Management Systems

Enhancing SQL Server Performance

Check Please! What your Postgres database wishes you would monitor. / Presentation

The Classical Architecture. Storage 1 / 36

Response Time Analysis

How to Optimize the MySQL Server For Performance

PGCon PostgreSQL Performance Pitfalls

Chapter 11 I/O Management and Disk Scheduling

Informix Performance Tuning using: SQLTrace, Remote DBA Monitoring and Yellowfin BI by Lester Knutsen and Mike Walker! Webcast on July 2, 2013!

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

Throwing Hardware at SQL Server Performance problems?

Lecture 16: Storage Devices

Storing Data: Disks and Files. Disks and Files. Why Not Store Everything in Main Memory? Chapter 7

Lecture 17: Virtual Memory II. Goals of virtual memory

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Recovery and the ACID properties CMPUT 391: Implementing Durability Recovery Manager Atomicity Durability

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card

Chapter 13: Query Processing. Basic Steps in Query Processing

PostgreSQL when it s not your job. Christophe Pettus PostgreSQL Experts, Inc. DjangoCon Europe 2012

Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl

Which Database is Better for Zabbix? PostgreSQL vs MySQL. Yoshiharu Mori SRA OSS Inc. Japan

SDFS Overview. By Sam Silverberg

Performance White Paper

Response Time Analysis

Azure VM Performance Considerations Running SQL Server

Audit & Tune Deliverables

PostgreSQL when it s not your job. Christophe Pettus PostgreSQL Experts, Inc. DjangoCon US 2012

Configuring Apache Derby for Performance and Durability Olav Sandstå

Tech Tip: Understanding Server Memory Counters

Chapter 11 I/O Management and Disk Scheduling

TUTORIAL WHITE PAPER. Application Performance Management. Investigating Oracle Wait Events With VERITAS Instance Watch

Database Hardware Selection Guidelines

KVM & Memory Management Updates

File Management. Chapter 12

One of the database administrators

Performance Tuning and Optimizing SQL Databases 2016

1 Storage Devices Summary

SQL Server Transaction Log from A to Z

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

File System Management

PERFORMANCE TUNING IN MICROSOFT SQL SERVER DBMS

Capacity Planning Process Estimating the load Initial configuration

Secondary Storage. Any modern computer system will incorporate (at least) two levels of storage: magnetic disk/optical devices/tape systems

Boost SQL Server Performance Buffer Pool Extensions & Delayed Durability

Cognos Performance Troubleshooting

ENHANCEMENTS TO SQL SERVER COLUMN STORES. Anuhya Mallempati #

Two Parts. Filesystem Interface. Filesystem design. Interface the user sees. Implementing the interface

Deploying and Optimizing SQL Server for Virtual Machines

The 5-minute SQL Server Health Check

VI Performance Monitoring

Apache Derby Performance. Olav Sandstå, Dyre Tjeldvoll, Knut Anders Hatlen Database Technology Group Sun Microsystems

Deep Dive: Maximizing EC2 & EBS Performance

Before you install ProSeries software for network use

Postgres Plus Advanced Server

I/O Management. General Computer Architecture. Goals for I/O. Levels of I/O. Naming. I/O Management. COMP755 Advanced Operating Systems 1

Windows Server Performance Monitoring

DATABASE. Pervasive PSQL Performance. Key Performance Features of Pervasive PSQL. Pervasive PSQL White Paper

Using Continuous Operations Mode for Proper Backups

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Performance Monitoring with Dynamic Management Views

MySQL: Cloud vs Bare Metal, Performance and Reliability

These sub-systems are all highly dependent on each other. Any one of them with high utilization can easily cause problems in the other.

How to handle Out-of-Memory issue

Optimising SQL Server CPU performance

About Me: Brent Ozar. Perfmon and Profiler 101

Windows NT File System. Outline. Hardware Basics. Ausgewählte Betriebssysteme Institut Betriebssysteme Fakultät Informatik

Benchmarking FreeBSD. Ivan Voras

Outline. Windows NT File System. Hardware Basics. Win2K File System Formats. NTFS Cluster Sizes NTFS

Web Server (Step 1) Processes request and sends query to SQL server via ADO/OLEDB. Web Server (Step 2) Creates HTML page dynamically from record set

File Systems Management and Examples

Configuring RAID for Optimal Performance

Optimizing LTO Backup Performance

Outline. Principles of Database Management Systems. Memory Hierarchy: Capacities and access times. CPU vs. Disk Speed

Using Synology SSD Technology to Enhance System Performance Synology Inc.

PERFORMANCE TUNING ORACLE RAC ON LINUX

Current Status of FEFS for the K computer

Postgres Plus Advanced Server Performance Features Guide

Important Differences Between Consumer and Enterprise Flash Architectures

How To Make Your Database More Efficient By Virtualizing It On A Server

Introduction. AppDynamics for Databases Version Page 1

ACME Intranet Performance Testing

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database

Microsoft SharePoint Server 2010

DESKTOP VIRTUALIZATION SIZING AND COST MODELING PROCESSOR MEMORY STORAGE NETWORK

Novell File Reporter 2.5 Who Has What?

Page replacement in Linux 2.4 memory management

Performance Modeling and Analysis of a Database Server with Write-Heavy Workload

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

The Complete Performance Solution for Microsoft SQL Server

Distribution One Server Requirements

MSU Tier 3 Usage and Troubleshooting. James Koll

Transcription:

In and Out of the PostgreSQL Shared Buffer Cache 2ndQuadrant US 05/21/2010

About this presentation The master source for these slides is http://projects.2ndquadrant.com You can also find a machine-usable version of the source code to the later internals sample queries there

Database organization Databases are mainly a series of tables Each table gets a subdirectory In that directory are a number of files A single files holds up to 1GB of data (staying well below the 32-bit 2GB size limit) The file is treated as a series of 8K blocks

Buffer cache organization shared buffers sets the size of the cache (internally, NBuffers) The buffer cache is a simple array of that size Each cache entry points to an 8KB block (sometimes called a page) of data In many scanning cases the cache is as a circular buffer; when all buffers are used, scanning the buffer cache start over at 0 Initially all the buffers in the cache are marked as free

Entries in the cache Each buffer entry has a tag The tag says what file (and therefore table) this entry is buffering and which block of that file it contains A series of flags show what state this block of data is in Pinned buffers are locked by a process and can t be used for anything else until that s done. Dirty buffers have been modified since they were read from disk The usage count estimates how popular this page has been recently Good read cache statistics available in views like pg statio user tables

Buffer Allocation When a process wants to access a block, it requests a buffer allocation for it If the block is already cached, its returned with increased usage count Otherwise, a new buffer must be found to hold this data If there are no buffers free (there usually aren t) a buffer is evicted to make space for the new one If that page is dirty, it is written out to disk, and the backend waits for that write The block on disk is read into the page in memory The usage count of an allocated buffer starts at 1

Eviction with usage counts The usage count is used to sort popular pages that should be kept in memory from ones that are safer to evict Buffers are scanned sequentially, decreasing their usage counts the whole time Any page that has a non-zero usage count is safe from eviction The maximum usage count any buffer can get is set by BM MAX USAGE COUNT, currently fixed at 5 This means that a popular page that has reached usage count=5 will survive 5 passes over the entire buffer cache before it s possible to evict it.

Interaction with the Operating System cache PostgreSQL is designed to rely heavily on the operating system cache The shared buffer cache is really duplicating what the operating system is already doing: caching popular file blocks Exactly the same blocks can be cached by both the buffer cache and the OS page cache It s a bad idea to give PostgreSQL too much memory But you don t want to give it too little. The OS is probably using a LRU scheme, not a database optimized clock-sweep You can spy on the OS cache using pg fincore

Looking inside the buffer cache: pg buffercache You can take a look into the shared buffer cache using the pg buffercache module 8.3 and later versions includes the usage count information cd contrib/pg buffercache make make install psql -d database -f pg buffercache.sql

Limitations of pg buffercache Module is installed into one database, can only decode table names in that database Viewing the data takes many locks inside the database, very disruptive When you d most like to collect this information is also the worst time to do this expensive query Frequent snapshots will impact system load, might collect occasionallly via cron or pgagent,. Cache the information if making more than one pass over it

Simple pg buffercache queries: Top 10 SELECT c.relname,count(*) AS buffers FROM pg class c INNER JOIN pg buffercache b ON b.relfilenode=c.relfilenode INNER JOIN pg database d ON (b.reldatabase=d.oid AND d.datname=current database()) GROUP BY c.relname ORDER BY 2 DESC LIMIT 10; Join against pg class to decode the file this buffer is caching Top 10 tables in the cache and how much memory they have Remember: we only have the information to decode tables in the current database

Buffer contents summary relname buffered buffers % % of rel accounts 306 MB 65.3 24.7 accounts pkey 160 MB 34.1 93.2 usagecount count isdirty 0 12423 f 1 31318 f 2 7936 f 3 4113 f 4 2333 f 5 1877 f

General shared buffers sizing rules Anecdotal tests suggest 15% to 40% of total RAM works well Start at 25% and tune from there Systems doing heavy amounts of write activity can discover checkpoints are a serious problem Checkpoint spikes can last several seconds and essentially freeze the system. The potential size of these spikes go up as the memory in shared buffers increases. There is a good solution for this in 8.3 called checkpoint completion target, but in 8.2 and before it s hard to work around. Only memory in shared buffers participates in the checkpoint Reduce that and rely on the OS disk cache instead, the checkpoint spikes will reduce as well.

Monitoring buffer activity with pg stat bgwriter select * from pg stat bgwriter - added in 8.3 Statistics about things moving in and out of the buffer cache Need to save multiple snapshots with a timestamp on each to be really useful buffer alloc is the total number of calls to allocate a new buffer for a page (whether or not it was already cached) Comparing checkpoints timed and checkpoints req shows whether you ve set checkpoint segments usefully

Three ways for a buffer to be written buffers checkpoint: checkpoint reconciliation wrote the buffer buffers backend: client backend had to write to satisfy an allocation buffers clean: background writer cleaned a dirty buffer expecting an allocation maxwritten clean: The background writer isn t being allowed to work hard enough

Derived statistics Timed checkpoint % % of buffers written by checkpoints, background writer cleaner, backends If you have two snapshots with a time delta, can compute figures in real-world units Average minutes between checkpoints Average amount written per checkpoint Buffer allocations per second * buffer size / interval = buffer allocations in MB/s Total writes per second * buffer size / interval = avg buffer writes in MB/s

Spreadsheet Sometimes slides are not what you want

Iterative tuning with pg buffercache and pg stat bgwriter Increase checkpoint segments until time between is reasonable Increase shared buffers until proportion of high usage count buffers stop changing Positive changes should have a new MB/s write figure and changed checkpoint statistics Optimize system toward more checkpoint writes, and total writes should drop Go too far and the size of any one checkpoint may be uncomfortably large, causing I/O spikes When performance stops improving, you ve reached the limits of usefully tuning in this area

Wrap-up Database buffer cache is possible to instrument usefully Saving regular usage snapshots allows tracking internals trends It s possible to measure the trade-offs made as you adjust buffer cache and checkpoint parameters No one tuning is optimal for everyone, workloads have very different usage count profiles

Credits Contributors toward the database statistics snapshots in the spreadsheet: Kevin Grittner (Wisconsin Courts) Ben Chobot (Silent Media)

Questions? The cool kids hang out on pgsql-performance