Module 7 Backup and Recovery
Objectives
  Backup Types
  SQL Dump
  Cluster Dump
  Offline Copy Backup
  Online Backups
  Point-in-Time Recovery
Backup
As with any database, a PostgreSQL database should be backed up regularly. There are three fundamentally different approaches to backing up PostgreSQL data:
  SQL dump
  File system level backup
  On-line backup
Let's discuss them in detail.
Backup SQL Dump
  Generate a text file with SQL commands. PostgreSQL provides the utility program pg_dump for this purpose.
  pg_dump does not block readers or writers.
  pg_dump does not operate with special permissions. In particular, it must have read access to all tables that you want to back up, so in practice you almost always have to run it as a database superuser.
  Dumps created by pg_dump are internally consistent: the dump represents a snapshot of the database as of the time pg_dump begins running.
  Syntax: pg_dump [options] [dbname]
Backup SQL Dump cont..
pg_dump Options
  -a                          Data only. Do not dump the data definitions (schema)
  -s                          Data definitions (schema) only. Do not dump the data
  -n <schema>                 Dump from the specified schema only
  -t <table>                  Dump the specified table only
  -f <path/file name.backup>  Send the dump to the specified file
  -Fp                         Dump as a plain-text SQL script (default)
  -Ft                         Dump in tar format
  -Fc                         Dump in compressed, custom format
  -v                          Verbose option
  -o                          Use OIDs
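The options above can be combined as in the following sketch. It is illustrative only: the database name mydb, the table name customers, and the directory /tmp/pg_backups are hypothetical placeholders, and the run helper is an assumption added here so that the script prints each command by default instead of executing it.

```shell
#!/bin/sh
# Sketch only. OUT_DIR, "mydb" and "customers" are hypothetical placeholders.
OUT_DIR="${OUT_DIR:-/tmp/pg_backups}"

# Dry-run helper (an assumption of this sketch, not part of PostgreSQL):
# prints the command unless DRY_RUN=0 is set, in which case it is executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = database name
dump_plain()  { run pg_dump -Fp -f "$OUT_DIR/$1.sql" "$1"; }         # plain-text SQL script
dump_custom() { run pg_dump -Fc -f "$OUT_DIR/$1.dump" "$1"; }        # compressed custom format
dump_schema() { run pg_dump -s -f "$OUT_DIR/$1_schema.sql" "$1"; }   # schema only
# $1 = database name, $2 = table name
dump_table()  { run pg_dump -t "$2" -f "$OUT_DIR/$1_$2.sql" "$1"; }  # one table only

dump_plain mydb
dump_custom mydb
dump_schema mydb
dump_table mydb customers
```

Set DRY_RUN=0 to execute the commands for real; the output directory must already exist and be writable by the user running pg_dump.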
Restore SQL Dump
  The text files created by pg_dump are intended to be read in by the psql program. The general command form to restore a dump is
    psql dbname < infile
  where infile is what you used as outfile for the pg_dump command.
  The database dbname will not be created by this command, so you must create it yourself.
  pg_restore is used to restore a database backed up with pg_dump that was saved in an archive format, i.e., a non-text format.
  Archive files are portable across architectures.
  Syntax: pg_restore [options] [filename.backup]
Restore SQL Dump cont..
pg_restore Options
  -d <database name>  Connect to the specified database. The dump is also restored into this database if the -C option is omitted
  -C                  Create the database named in the dump file and restore directly into it
  -a                  Restore the data only, not the data definitions (schema)
  -s                  Restore the data definitions (schema) only, not the data
  -n <schema>         Restore only objects from the specified schema
  -t <table>          Restore only the specified table
  -v                  Verbose option
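A minimal sketch of the two restore paths, psql for plain-text dumps and pg_restore for archive-format dumps. The database name mydb and the file names are hypothetical, and the run helper is an assumption of this sketch that prints commands by default rather than executing them.

```shell
#!/bin/sh
# Sketch only. Database and file names are hypothetical placeholders.

# Dry-run helper (an assumption of this sketch): prints unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Plain-text dump: read it in with psql. $1 = existing database, $2 = dump file.
restore_plain()   { run psql -d "$1" -f "$2"; }
# Archive-format dump into an existing database. $1 = database, $2 = dump file.
restore_archive() { run pg_restore -d "$1" "$2"; }
# With -C, connect to "postgres" only to create the database named in the
# dump, then restore into that new database. $1 = dump file.
restore_create()  { run pg_restore -C -d postgres "$1"; }
# Restore a single table. $1 = database, $2 = table, $3 = dump file.
restore_table()   { run pg_restore -d "$1" -t "$2" "$3"; }

restore_plain mydb /tmp/pg_backups/mydb.sql
restore_archive mydb /tmp/pg_backups/mydb.dump
restore_create /tmp/pg_backups/mydb.dump
restore_table mydb customers /tmp/pg_backups/mydb.dump
```

For the first two functions the target database must already exist, for example created beforehand with createdb.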
Entire Cluster SQL Dump
  pg_dumpall is used to dump an entire database cluster in plain-text SQL format.
  It also dumps global objects: users, groups, and associated permissions.
  Use psql to restore.
  Syntax: pg_dumpall [options] > filename.backup
Entire Cluster SQL Dump
pg_dumpall Options
  -a                  Data only. Do not dump schema
  -s                  Data definitions (schema) only
  -g                  Dump global objects only, not databases
  -c                  Clean (drop) databases before recreating
  -O                  Skip restoration of object ownership
  -x                  Do not dump privileges (grant/revoke)
  --disable-triggers  Disable triggers during data-only restore
  -v                  Verbose option
Restore SQL Dump cont..
  Syntax (restoring a pg_dumpall dump):
    psql -d template1 < filename.backup
  or
    psql -d template1 -f filename.backup
  Any database in the cluster can be used for the initial connection; it doesn't have to be template1.
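The cluster-level dump and restore can be sketched as follows. The file paths are hypothetical, and the run helper is an assumption of this sketch that prints each command by default instead of executing it.

```shell
#!/bin/sh
# Sketch only. File paths are hypothetical placeholders.

# Dry-run helper (an assumption of this sketch): prints unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Dump every database plus global objects. $1 = output file.
dump_cluster()    { run pg_dumpall -f "$1"; }
# Dump global objects (roles and their permissions) only. $1 = output file.
dump_globals()    { run pg_dumpall -g -f "$1"; }
# Restore a pg_dumpall file; template1 is used only for the initial connection.
restore_cluster() { run psql -d template1 -f "$1"; }

dump_cluster /tmp/pg_backups/cluster.sql
dump_globals /tmp/pg_backups/globals.sql
restore_cluster /tmp/pg_backups/cluster.sql
```

The -f option is used here in place of shell redirection so that the output file name is part of the command itself; redirecting standard output as in the slide syntax is equivalent.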
Backup - File system level backup
  An alternative backup strategy is to directly copy the files that PostgreSQL uses to store the data in the database.
  You can use whatever method you prefer for doing usual file system backups, for example:
    tar -cf backup.tar /usr/local/pgsql/data
  The database server must be shut down in order to get a usable backup.
  File system backups only work for complete backup and restoration of an entire database cluster.
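A minimal sketch of an offline copy backup: stop the server, archive the data directory, start the server again. The data directory and archive path are taken from the slide example or are hypothetical, and the run helper is an assumption of this sketch that prints commands by default rather than executing them.

```shell
#!/bin/sh
# Sketch only. Paths are examples, not a recommendation.

# Dry-run helper (an assumption of this sketch): prints unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = data directory, $2 = tar archive to create
offline_backup() {
    data_dir="$1"
    archive="$2"
    # The server must be down for the copy to be usable.
    run pg_ctl -D "$data_dir" stop -m fast || return 1
    run tar -cf "$archive" "$data_dir"
    status=$?
    # Restart the server even if tar failed, then report tar's status.
    run pg_ctl -D "$data_dir" start || return 1
    return "$status"
}

offline_backup /usr/local/pgsql/data /tmp/backup.tar
```

Run it as the operating system user that owns the data directory, since pg_ctl refuses to manage a cluster owned by another user.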
Backup - On-line backup
  Use when the database must stay up while the backup is occurring.
    postgres=# select pg_start_backup('label_goes_here');
    Copy the files/directory
    postgres=# select pg_stop_backup();
  archive_command must be set in postgresql.conf; it archives the WAL logs and supports PITR.
Point-in-Time Recovery (PITR)
  Point-in-time recovery (PITR) is the ability to restore a database cluster up to the present or to a specified point of time in the past.
  Uses a full database cluster backup and the write-ahead logs found in the /pg_xlog subdirectory.
  Must be configured before it is needed (write-ahead log archiving must be enabled).
Point-in-Time Recovery
Step 1: Edit the postgresql.conf file and set the archive_command parameter
  Unix:
    archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null'
  Windows:
    archive_command = 'copy "%p" "c:\\mnt\\server\\archivedir\\%f"'
  %p is replaced by the path of the WAL file to archive (relative to the data directory).
  %f is replaced by the file name only; it is unique and becomes the name of the copy in the archive directory.
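As an alternative to an inline cp, archive_command can call a small script. The following is a sketch of one possible helper, not a standard PostgreSQL tool: the script name archive_wal.sh and the ARCHIVE_DIR variable are hypothetical. It refuses to overwrite a file that already exists in the archive and returns a non-zero status on failure, which tells the server that the segment has not been archived and must be retried.

```shell
#!/bin/sh
# archive_wal.sh: hypothetical helper, intended to be called as
#   archive_command = '/path/to/archive_wal.sh %p %f'
# ARCHIVE_DIR is an assumption of this sketch; adjust it to your archive location.
ARCHIVE_DIR="${ARCHIVE_DIR:-/mnt/server/archivedir}"

archive_wal() {
    src="$1"    # %p: path of the WAL file to archive
    name="$2"   # %f: file name only
    # Never overwrite an already archived segment.
    if [ -f "$ARCHIVE_DIR/$name" ]; then
        echo "refusing to overwrite $ARCHIVE_DIR/$name" >&2
        return 1
    fi
    cp "$src" "$ARCHIVE_DIR/$name"
}

# When invoked with the two arguments from archive_command, do the copy
# and exit with its status.
if [ "$#" -eq 2 ]; then
    archive_wal "$1" "$2"
    exit $?
fi
```

The exit status matters: the server treats zero as success and may then recycle the WAL segment, so the script must only return zero once the copy is complete.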
Point-in-Time Recovery
Step 2: Make a base backup
  Connect using edb-psql and issue the command:
    SELECT pg_start_backup('any useful label');
  Use a standard file system backup utility to back up the /data subdirectory.
  Connect using edb-psql and issue the command:
    SELECT pg_stop_backup();
  Continuously archive the WAL segment files.
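The base backup steps can be sketched as one script. This follows the single-argument pg_start_backup and pg_stop_backup calls named in the slide; the function names and calling conventions differ in newer PostgreSQL releases, so treat it as an illustration for the version this module targets. The database name postgres, the paths, and the label are hypothetical, psql stands in for edb-psql, and the run helper is an assumption of this sketch that prints commands by default instead of executing them.

```shell
#!/bin/sh
# Sketch only. Paths, label and connection database are hypothetical.

# Dry-run helper (an assumption of this sketch): prints unless DRY_RUN=0.
# Note: the printed form drops the shell quoting around the SQL text.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = data directory, $2 = tar archive to create, $3 = backup label
online_backup() {
    data_dir="$1"
    archive="$2"
    label="$3"
    run psql -d postgres -c "SELECT pg_start_backup('$label');" || return 1
    run tar -cf "$archive" "$data_dir"
    status=$?
    # Always leave backup mode, even if the copy failed.
    run psql -d postgres -c "SELECT pg_stop_backup();" || return 1
    return "$status"
}

online_backup /usr/local/pgsql/data /tmp/base_backup.tar nightly
```

The server keeps running throughout; WAL archiving must already be configured so that the segments written during the copy are preserved.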
Point-in-Time Recovery
Final Step: Recovering the database
  Clean out all existing files in the /data directory and subdirectories (be sure to back up configuration files if you have not already done so).
  Restore the database files from the base backup.
  Copy any unarchived WAL files into the /pg_xlog directory.
  Create a recovery.conf file in the /data directory.
  Restart the database server.
Point-in-Time Recovery
Settings in the recovery.conf file:
  restore_command (string)
    Unix:
      restore_command = 'cp /mnt/server/archivedir/%f "%p"'
    Windows:
      restore_command = 'copy "c:\\mnt\\server\\archivedir\\%f" "%p"'
  recovery_target_time (timestamp)
  recovery_target_xid (string)
  recovery_target_inclusive (boolean)
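A minimal sketch of generating a recovery.conf with the settings listed above. The function name write_recovery_conf is hypothetical and the timestamp in the usage comment is only an example value. It sets a time target; recovery_target_xid could be used instead, but only one target should be given.

```shell
#!/bin/sh
# Sketch only. write_recovery_conf is a hypothetical helper name.
# Usage (example values):
#   write_recovery_conf /usr/local/pgsql/data /mnt/server/archivedir '2024-01-01 12:00:00'

# $1 = data directory, $2 = WAL archive directory, $3 = target timestamp
write_recovery_conf() {
    data_dir="$1"
    archive_dir="$2"
    target_time="$3"
    # %f and %p are left literal; the server substitutes them at recovery time.
    cat > "$data_dir/recovery.conf" <<EOF
restore_command = 'cp $archive_dir/%f "%p"'
recovery_target_time = '$target_time'
recovery_target_inclusive = true
EOF
}
```

With recovery_target_inclusive set to true, recovery stops just after the target rather than just before it.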
Summary
In this module we learned:
  Online Backups
  Point-in-Time Recovery
  Scheduling Backup
Lab Exercise
The EDBStore website database is all set, and now as a DBA you need to plan a proper backup strategy and implement it.
  As the root user, create a folder /postgres_backup and assign ownership to the postgres user using the chown utility (or, on Windows, the Security tab in the folder properties).
  Take a full database dump of the edbstore database with the pg_dump utility. The dump should be in plain-text format. Name the dump file edbstore_full.sql and store it in the postgres_backup folder.
Lab Exercise
  Take a schema-only dump of the edbstore database and name the file edbstore_schema.sql.
  Take a data-only dump of the edbstore database, disable all triggers for faster restore, use INSERT commands instead of COPY, and name the file edbstore_data.sql.
  Take a full dump of only the customers table and name the file edbstore_customers.sql.
Lab Exercise
  Take a full database dump of edbstore in compressed format using the pg_dump utility, and name the file edbstore_full_fc.dmp.
  Take a full database cluster dump of the cluster running on port 5445 using pg_dumpall. Remember that pg_dumpall supports only plain-text format. Name the file edbdata_5445.sql.
In these exercises you learned how to take different types of dump backups. Let's now get some hands-on practice restoring dumps.
Lab Exercise
  Drop database edbstore.
  Create database edbstore with edbstore as owner.
  Restore the full dump from edbstore_full.sql and verify all the objects and their ownership.
  Drop database edbstore.
  Create database edbstore with edbstore as owner.
  Restore the full dump from the compressed file edbstore_full_fc.dmp and verify all the objects and their ownership.
Lab Exercise
  Create a directory /opt/arch or c:\arch and give ownership to the postgres user.
  Open the postgresql.conf file of your edbdata cluster and configure the cluster to run in archive mode, with the archive log location set to /opt/arch or c:\arch.
  Take a full online base backup of your cluster in the postgres_backup directory.