High-Volume Writes with PostgreSQL


Major parameters to set
- shared_buffers: 512MB to 8GB
- checkpoint_segments: 16 to 256
- effective_cache_size: typically ¾ of RAM
- wal_buffers: typically 16MB
  - Auto-tuned in 9.1

Checkpoints
- Dirty data in the buffer cache must be flushed
- WAL segments are 16MB
- Requested checkpoint: after checkpoint_segments of writes
- Timed checkpoint: checkpoint_timeout (5 minute default)
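The requested-checkpoint trigger above is plain arithmetic: checkpoint_segments segments of 16MB each. A minimal sketch of that calculation follows; the function name is illustrative only and is not a PostgreSQL API.

```python
WAL_SEGMENT_MB = 16  # WAL segment size stated on the slide


def wal_mb_before_requested_checkpoint(checkpoint_segments: int) -> int:
    """Approximate WAL volume (in MB) written before a requested checkpoint."""
    return checkpoint_segments * WAL_SEGMENT_MB


# The two ends of the range suggested for checkpoint_segments.
for segments in (16, 256):
    print(segments, "segments ->", wal_mb_before_requested_checkpoint(segments), "MB")
```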

Checkpoint spikes
- 8.3 added Spread Checkpoints
- Aims to finish writing at 50% of the way to the next checkpoint
- fsync flush to disk at end of checkpoint
- Optimal behavior: OS wrote data out before fsync call
- Spreading sync out didn't work usefully
- Spikes still happen

A bad checkpoint
LOG: checkpoint complete: wrote 127961 buffers (12.2%); 0 transaction log file(s) added, 1818 removed, 0 recycled; write=80.190 s, sync=359.823 s, total=520.913 s

A funding checkpoint
LOG: checkpoint complete: wrote 141563 buffers (13.5%); 0 transaction log file(s) added, 1109 removed, 257 recycled; write=944.601 s, sync=10635.130 s, total=11613.953 s
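Checkpoint log lines like the ones above are easier to compare once the timings are pulled out as numbers. The sketch below does that for the first sample; the regular expression and helper name are assumptions written for this log format, not part of any PostgreSQL tool.

```python
import re

# Matches the buffer count and the three timings in a "checkpoint complete" line.
CHECKPOINT_RE = re.compile(
    r"wrote (?P<buffers>\d+) buffers \((?P<pct>[\d.]+)%\).*?"
    r"write=(?P<write>[\d.]+) s, sync=(?P<sync>[\d.]+) s, total=(?P<total>[\d.]+) s"
)


def parse_checkpoint(line: str) -> dict:
    """Return the numeric fields of a checkpoint-complete log line as floats."""
    match = CHECKPOINT_RE.search(line)
    if match is None:
        raise ValueError("not a checkpoint-complete line")
    return {key: float(value) for key, value in match.groupdict().items()}


line = (
    "LOG: checkpoint complete: wrote 127961 buffers (12.2%); "
    "0 transaction log file(s) added, 1818 removed, 0 recycled; "
    "write=80.190 s, sync=359.823 s, total=520.913 s"
)
stats = parse_checkpoint(line)
# Fraction of the checkpoint spent in the sync phase.
sync_share = stats["sync"] / stats["total"]
print(f"sync phase share of total: {sync_share:.1%}")
```

A large sync share, as here, points at the fsync phase rather than the write phase as the source of the spike.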

Types of writes
- Checkpoint write: most efficient
- Background writer write: still good
- Backend write, fsync
  - Fine if absorbed by background writer
  - Write will be cached by OS later
- Backend write, BGW queue filled
  - Backend does fsync itself
  - Very bad, multi-hour checkpoints possible
  - Improved in 9.1

bgwriter monitoring
$ psql -x -c "select * from pg_stat_bgwriter"
checkpoints_timed     | 0
checkpoints_req       | 4
buffers_checkpoint    | 6
buffers_clean         | 0
maxwritten_clean      | 0
buffers_backend       | 654685
buffers_backend_fsync | 84
buffers_alloc         | 1225
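One way to read the sample above is to ask which writer is doing the work. The following sketch computes each source's share of buffers written, using the counter values from the slide; the helper name is hypothetical.

```python
WRITE_SOURCES = ("buffers_checkpoint", "buffers_clean", "buffers_backend")


def write_share(stats: dict) -> dict:
    """Fraction of all written buffers attributed to each write source."""
    total = sum(stats[name] for name in WRITE_SOURCES)
    if total == 0:
        return {name: 0.0 for name in WRITE_SOURCES}
    return {name: stats[name] / total for name in WRITE_SOURCES}


# Counter values copied from the pg_stat_bgwriter sample on the slide.
sample = {"buffers_checkpoint": 6, "buffers_clean": 0, "buffers_backend": 654685}
shares = write_share(sample)
for name, share in shares.items():
    print(f"{name}: {share:.4%}")
```

In this sample nearly every write comes from backends, the least efficient path in the list of write types.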

Time analysis
$ psql -c "select now(),* from pg_stat_bgwriter"
- Sample two points
- Buffers are 8K each (normally)
- Compute time delta, value delta
  - Buffers allocated: read MB/s
  - Sum of buffers written: write MB/s
- Compute or graph
- Munin has an example
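The two-sample method above can be sketched as follows. The sample timestamps and counter values here are made up for illustration, and the 8192-byte block size is the usual default rather than something read from the server.

```python
from datetime import datetime

BLOCK_SIZE = 8192  # bytes per buffer; assumes the default build setting
WRITE_COUNTERS = ("buffers_checkpoint", "buffers_clean", "buffers_backend")


def buffers_to_mb(buffers: int) -> float:
    return buffers * BLOCK_SIZE / (1024 * 1024)


def bgwriter_rates(t0: datetime, s0: dict, t1: datetime, s1: dict) -> dict:
    """Read and write MB/s between two pg_stat_bgwriter samples."""
    seconds = (t1 - t0).total_seconds()
    if seconds <= 0:
        raise ValueError("second sample must be later than the first")
    allocated = s1["buffers_alloc"] - s0["buffers_alloc"]
    written = sum(s1[name] - s0[name] for name in WRITE_COUNTERS)
    return {
        "read_mb_s": buffers_to_mb(allocated) / seconds,
        "write_mb_s": buffers_to_mb(written) / seconds,
    }


# Hypothetical samples taken 60 seconds apart.
first = {"buffers_alloc": 1000, "buffers_checkpoint": 100,
         "buffers_clean": 0, "buffers_backend": 50}
second = {"buffers_alloc": 8680, "buffers_checkpoint": 3100,
          "buffers_clean": 500, "buffers_backend": 390}
rates = bgwriter_rates(datetime(2011, 11, 1, 12, 0, 0), first,
                       datetime(2011, 11, 1, 12, 1, 0), second)
print(rates)
```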

bgwriter trends

Cache refill

Linux tuning
- ext3 on old kernels does blocky fsync
- dirty_ratio lowers write cache size in %
- Kernel 2.6.29 is finer grained
  - dirty_bytes sets exact amount of RAM
- Cannot go too far
  - OS write caching is expected
  - VACUUM slows a lot: 50% drop possible

VACUUM
- Cleans up after UPDATE and DELETE
- The hidden cost of MVCC
- Must happen eventually
- Frozen ID cleanup

Autovacuum
- Cleans up dead rows
- Also updates database stats
- Large tables: 20% change required
  - autovacuum_vacuum_scale_factor = 0.2
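The 20% rule comes from how autovacuum decides a table is due: a fixed base threshold plus the scale factor (a fraction, 0.2 by default) times the table's row count. A small sketch, assuming the default autovacuum_vacuum_threshold of 50; the function name is illustrative.

```python
def autovacuum_trigger_rows(reltuples: float,
                            scale_factor: float = 0.2,
                            base_threshold: int = 50) -> float:
    """Dead rows needed before autovacuum will vacuum a table.

    scale_factor stands for autovacuum_vacuum_scale_factor and
    base_threshold for autovacuum_vacuum_threshold (defaults assumed).
    """
    return base_threshold + scale_factor * reltuples


# On large tables the fixed threshold is negligible next to the 20% term.
for rows in (1_000, 10_000_000):
    print(rows, "rows ->", autovacuum_trigger_rows(rows), "dead rows")
```

For a 10 million row table this means about 2 million dead rows accumulate before cleanup starts, which is why large tables are often given a lower per-table setting.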

VACUUM Overhead
- Intensive when it happens
- Focus on smoothing and scheduling
- Putting it off makes it worse
- Dead rows add invisible overhead
- Table bloat can be very large
- Thresholds can be per-table

Index Bloating
- Indexes can become less efficient after deletes
- VACUUM FULL before 9.0 makes this worse
- REINDEX helps, but it locks the table
- CREATE INDEX can run CONCURRENTLY
  - Rename: simulate REINDEX CONCURRENTLY
  - All transactions must end to finish
- CLUSTER does a full table rebuild
  - Same fresh performance as after dump/reload
  - Full table lock to do it

VACUUM Gone Wrong
- Aim at a target peak performance
- VACUUM isn't accounted for
- Just survive peak load?
- You won't survive VACUUM

VACUUM monitoring
- Watch pg_stat_user_tables timestamps
- Beware long-running transactions
- log_autovacuum_min_duration
- Sizes of tables/indexes critical too

Improving efficiency
- maintenance_work_mem: up to 2GB
- shared_buffers & checkpoint_segments (again)
- Hardware write caches
- Tune read-ahead

VACUUM Cost Limits
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 200
autovacuum_vacuum_cost_delay = 20ms

autovacuum Cost Basics
- Every 20 ms = 50 runs/second
- Each run accumulates 200 cost units
- 200 * 50 = 10000 cost / second

Costs and Disk I/O
- 20ms = 10000 cost/second
- All misses @ 10 cost?
  - 10000 / 10 = 1000 reads/second
  - 1000*8192/(1024*1024) = 7.8 MB/s read
- All dirty @ 20 cost?
  - 10000 / 20 = 500 writes/second
  - 500*8192/(1024*1024) = 3.9 MB/s write
- Halve the delay to 10ms?
  - Doubles the rate: 15.6 MB/s / 7.8 MB/s
- Double the delay to 40ms?
  - Halves the rate: 3.9 MB/s / 1.95 MB/s
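The cost arithmetic above can be wrapped in one function to try other settings. This is a sketch of the slide's calculation only, using the cost values listed earlier; it ignores the time vacuum spends doing the work itself, so real throughput will be lower.

```python
BLOCK_SIZE = 8192  # bytes per page


def vacuum_mb_per_second(cost_limit: int = 200,
                         cost_delay_ms: float = 20,
                         page_cost: int = 10) -> float:
    """Upper bound on vacuum I/O rate when every page has the same cost."""
    runs_per_second = 1000 / cost_delay_ms
    pages_per_second = cost_limit * runs_per_second / page_cost
    return pages_per_second * BLOCK_SIZE / (1024 * 1024)


# page_cost=10 is the all-miss (read) case, page_cost=20 the all-dirty (write) case.
for delay in (10, 20, 40):
    read = vacuum_mb_per_second(cost_delay_ms=delay, page_cost=10)
    write = vacuum_mb_per_second(cost_delay_ms=delay, page_cost=20)
    print(f"{delay} ms delay: {read:.2f} MB/s read, {write:.2f} MB/s write")
```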

Submission for 9.2
- Displaying accumulated autovacuum cost
- In November CommitFest
- Easily applies to older versions
- Not very risky to production
  - Just adds some logging
- Useful for learning how to tune costs

Sample logging output
LOG: automatic vacuum of table "pgbench.public.pgbench_accounts": index scans: 1
    pages: 0 removed, 163935 remain
    tuples: 2000000 removed, 2928356 remain
    buffer usage: 117393 hits, 123351 misses, 102684 dirtied, 2.168 MiB/s write rate
    system usage: CPU 2.54s/6.27u sec elapsed 369.99 sec
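The write rate in the sample output can be reproduced from the other fields, which is a handy check when reading these logs. The sketch below assumes 8192-byte pages; the read-rate line is an extra calculation by the same method, not a figure from the log.

```python
BLOCK_SIZE = 8192  # bytes per page, assumed default

# Values copied from the sample log entry.
misses = 123351
dirtied = 102684
elapsed_seconds = 369.99


def mib_per_second(pages: int, seconds: float) -> float:
    """Convert a page count over an interval into MiB/s."""
    return pages * BLOCK_SIZE / (1024 * 1024) / seconds


write_rate = mib_per_second(dirtied, elapsed_seconds)
read_rate = mib_per_second(misses, elapsed_seconds)
print(f"write rate: {write_rate:.3f} MiB/s")
print(f"read rate:  {read_rate:.3f} MiB/s")
```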

Common tricks
- Manual VACUUM during slower periods
  - Make sure to set vacuum_cost_delay
  - Start with daily
  - Break down by table size
- Alternate fast/slow configurations
  - Two postgresql.conf files, or edit script
  - Swap/change using cron or pgAgent
- Aggressive freezing

Write to disk, slow way
- Data page change to pg_xlog WAL
- Checkpoint pushes page to disk
- Hint bits update page for faster visibility
- Autovacuum marks free space
- Freeze old transaction IDs

Manually maintained path
- Data page change to pg_xlog WAL
- Checkpoint pushes page to disk
- Manually freeze old transaction IDs
- Tweak vacuum_freeze_min_age and/or vacuum_freeze_table_age

Hardware for fast writes
- Log checkpoints to catch spikes
- Battery-backed write cache
  - Fast commits
- Beware volatile write caches
- http://wiki.postgresql.org/wiki/reliable_writes

Hard Drive Latency
Type        Latency (ms)   Transactions/sec
5400 RPM    11.1           90
7200 RPM    8.3            120
10K RPM     6.0            167
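The table values match the time for one full platter rotation, with at most one commit per rotation when there is no write cache to absorb the flush. A short sketch under that assumption reproduces them; the function names are illustrative.

```python
def rotation_ms(rpm: int) -> float:
    """Time for one full rotation, in milliseconds."""
    return 60_000 / rpm


def max_commits_per_second(rpm: int) -> float:
    """Commit ceiling assuming one synchronous commit per rotation."""
    return rpm / 60


for rpm in (5400, 7200, 10_000):
    print(f"{rpm} RPM: {rotation_ms(rpm):.1f} ms, "
          f"{max_commits_per_second(rpm):.0f} transactions/sec")
```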

Latency driving TPS

Partitioning
- Time-series data splits most easily
- Monthly partitions typical
- Setup is manual and requires some code
- Queries can only exclude partitions
- Old partitions don't need vacuum
  - Once frozen, they're ignored
- Indexes are smaller
  - Less used indexes fade from cache
- Oldest data can be truncated
  - No deletion VACUUM cleanup!
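As an illustration of the "manual and requires some code" setup, the sketch below generates the DDL for one monthly child table using inheritance and a CHECK constraint, the partitioning style available in the PostgreSQL versions these slides cover. The table name events and column ts are hypothetical, and routing inserts to the right child (for example with a trigger) is not shown.

```python
from datetime import date


def next_month(d: date) -> date:
    """First day of the month after d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def monthly_partition_ddl(parent: str, column: str, month_start: date) -> str:
    """CREATE TABLE statement for one monthly child of an inheritance parent."""
    end = next_month(month_start)
    child = f"{parent}_{month_start:%Y_%m}"
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK ({column} >= DATE '{month_start}' AND {column} < DATE '{end}')\n"
        f") INHERITS ({parent});"
    )


# Hypothetical parent table and timestamp column.
ddl = monthly_partition_ddl("events", "ts", date(2011, 12, 1))
print(ddl)
```

The CHECK constraint is what lets the planner exclude partitions a query cannot match, and dropping or truncating the oldest child removes a month of data without leaving dead rows behind.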

Skytools
- Proven to handle write scaling
- Database access wrapped in functions
- PL/Proxy routes to appropriate nodes
  - Any, all, etc.
- Replication used for shared data
- Rebalancing is tricky
  - Pause feature in pgbouncer helps
- Hard to retrofit to existing system

Special thanks
- Some monitoring samples provided by: http://runkeeper.com/
  - Track, measure, and improve your fitness
  - Clients for Android and iPhone

PostgreSQL Books http://www.2ndquadrant.com/books/

Questions
- Slides at 2ndQuadrant.com, under Resources / Talks