PostgreSQL backups with NetWorker
Release number 1.0
302-001-174 REV 01
June 30, 2014

Contents
    Audience
    Requirements
    Terminology
    PostgreSQL backup methodologies
    PostgreSQL dump backup
    Configuring NetWorker for PostgreSQL dump backups
    Recovering backup issued by pg_dump command
    PostgreSQL WAL backup
    Configuring NetWorker for WAL backups
    Recovering WAL backup
    Known limitations
    Conclusion

Audience

This document is intended for use by system administrators of NetWorker. Readers of this document are expected to have DBA-level knowledge of PostgreSQL as well as NetWorker administration skills to successfully implement backups and recoveries.

Requirements

Be aware of the following requirements before attempting any backups:
- You should know the specific folder or filesystems to store the archive logs.
- PostgreSQL configuration files must have been updated with the relevant attributes for backup to succeed.
- The NetWorker client must be installed and running on the PostgreSQL host.
- No NetWorker Module (for example, NMDA) is required to be installed on the PostgreSQL host.

Terminology

You should be familiar with the following terms and their definitions.

PostgreSQL: an open source object-relational database management system (ORDBMS) with an emphasis on extensibility and standards compliance.

Write Ahead Log (WAL): a standard method for ensuring data integrity. The central concept of WAL is that changes to data files, where tables and indexes reside, must be written only after those changes have been logged; that is, after log records which describe the changes have been flushed to permanent storage. If we follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we know that in the event of a crash we will be able to recover the database using the log: any change that has not been applied to the data pages can be redone from the log records. This is roll-forward recovery, also known as REDO.

PostgreSQL backup methodologies

You can perform PostgreSQL backups in two different ways. One way, as with many other databases, is to dump the database, schema, and content to a file at a certain point in time and back up that file. Another way is to interface the backup software with the database in order to have consistent online backups allowing Point in Time (PiT) recoveries.
PostgreSQL supports both backup methods: its dump process can be interfaced with the backup software, and it provides an online backup functionality called Write Ahead Logging (WAL). WAL creates redo log files that enable incremental backups and PiT recoveries.

Both the dump backup and the WAL backup methodologies are discussed in this document. Dump backups are easy and fast to set up but allow recovery only to the time of the dump itself. WAL backups are a bit more complex to set up but provide full flexibility over recovery, because you can recover your database up to a specific second.

PostgreSQL backups with NetWorker 1.0 Technical Notes

PostgreSQL dump backup

PostgreSQL dumps are text files containing all the SQL commands that recreate the database in the exact same state as it was at the time of the dump. These dumps provide the minimum safeguard a backup administrator needs to guarantee the backup and recovery of a database.

PostgreSQL dump backup is the most commonly used backup method. The easiest way to back up a database is to dump it to a file and back up this dump with NetWorker.

pg_dump command usage

There are two different commands to create PostgreSQL dumps:
- pg_dump, to dump a specific database
- pg_dumpall, to dump all databases of a specific PostgreSQL server

These create a full backup. Full documentation on the pg_dump command is available at http://www.postgresql.org/docs under Dump.

Command usage example

These commands can be used with a script, either directly on the client definition as pre commands starting with NetWorker 8.1, or within a savepnpc script. An example of a pre command script is:

    "C:\Program Files\PostgresPlus\9.3AS\bin\pg_dump.exe" -p 5432 -d "testdb" -U postgres -w > "C:\Program Files\PostgresPlus\Backups\dump_testdb.dump"

where:
    -p 5432        specifies the port number 5432 that is being used to connect to the PostgreSQL database. The port number is configurable.
    -d "testdb"    specifies the name of the database to dump (testdb in this example).
    -U postgres    specifies that the username is postgres.
    -w             specifies that the application does not prompt for a password.

Configuring NetWorker for PostgreSQL dump backups

Configuring NetWorker for PostgreSQL dump backups involves using the pg_dump commands. These can be used in a script which can be triggered by a pre command attribute in the NetWorker client or in a savepnpc script.

Before you begin

The pre command attribute in the NetWorker client is supported on NetWorker version 8.1 and later.
Ensure that:
- the name of the pre command script is prefixed with nsr
- you save the pre command script in the NetWorker client /bin folder
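
As an illustration only, the body of such a pre command script could assemble the pg_dump command line with a dump file name that carries the date and time of the dump. The sketch below is in Python; the function name build_pg_dump_command and the file name pattern dump_<database>_<timestamp>.dump are hypothetical and not part of NetWorker or PostgreSQL. It uses the pg_dump -f option to name the output file instead of shell redirection.

```python
from datetime import datetime
from pathlib import PureWindowsPath


def build_pg_dump_command(pg_dump_exe, database, dump_dir, when,
                          port=5432, user="postgres"):
    """Return (argv, dump_path) for a pg_dump run writing a timestamped dump.

    Hypothetical helper: the name and the file name pattern are assumptions
    made for this example.
    """
    # Date and time in the file name keep successive dumps distinct.
    stamp = when.strftime("%Y%m%d_%H%M%S")
    dump_path = PureWindowsPath(dump_dir) / f"dump_{database}_{stamp}.dump"
    argv = [
        str(pg_dump_exe),
        "-p", str(port),        # port of the PostgreSQL server
        "-d", database,         # database to dump
        "-U", user,             # user name to connect as
        "-w",                   # never prompt for a password
        "-f", str(dump_path),   # write the dump to this file
    ]
    return argv, dump_path


if __name__ == "__main__":
    argv, dump_path = build_pg_dump_command(
        r"C:\Program Files\PostgresPlus\9.3AS\bin\pg_dump.exe",
        "testdb",
        r"C:\Program Files\PostgresPlus\Backups",
        datetime.now(),
    )
    print(dump_path)
    # A real script would then run the command, for example with
    # subprocess.run(argv, check=True).
```

Returning the argument list rather than a single string avoids quoting problems with paths that contain spaces, such as C:\Program Files.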

Below is an example of the NetWorker client pre command setting:

[Figure: NetWorker client pre command setting]

Procedure
1. Create the script. Ensure that the script contains the pg_dump or pg_dumpall command, with the appropriate arguments to create the dump in a specific location, which can be shared with the archive logs generated by WAL. The section "pg_dump command usage" provides more information.
2. Generate a specific dump name that contains the date and time of the dump. Doing this prevents issues during recovery.
3. Create a specific post command, in the NetWorker client attribute or in the savepnpc script, to remove the dump that was created. If dumps are stored in the same folder as WAL archive logs, the dumps are backed up as well, which increases the incremental backup time and could affect your SLAs.

Recovering backup issued by pg_dump command

This procedure describes PostgreSQL in-place recovery using the psql program. PostgreSQL dumps are recovered by having psql read back in and execute the SQL commands contained in the dump file, such as the following:

    --
    -- EnterpriseDB database dump
    --
    SET statement_timeout = 0;
    SET lock_timeout = 0;
    SET client_encoding = 'UTF8';
    SET standard_conforming_strings = on;
    SET check_function_bodies = false;
    SET client_min_messages = warning;
    --
    -- Name: Countries; Type: SCHEMA; Schema: -; Owner: postgres
    --
    CREATE SCHEMA "Countries";
    ALTER SCHEMA "Countries" OWNER TO postgres;

This recreates all the databases and schemas and inserts data in the databases to match the exact point in time when the dump was taken.

Procedure
1. Create the database dbname to be used by the psql dbname command. You can create it graphically or from the command line. For example:

    createdb -T template0 dbname

2. Run the following command to restore the dump:

    psql dbname < dumpfile
where dumpfile is the file output by the pg_dump command.

PostgreSQL WAL backup

Integrating WAL with NetWorker enables incremental backups of PostgreSQL databases as well as Point-in-Time recoveries, by applying the logs to the recovered database up to a certain point.

The high-level process for a WAL PostgreSQL backup is:
1. Configure the postgresql.conf file to copy the PostgreSQL logs to a specific location, at a specific point in time, so that they are available for backup.
2. Configure a base backup in NetWorker to back up those logs. A base backup is a specific backup created by using either the pg_basebackup command or low-level API calls. Only the pg_basebackup command is discussed in this document.
3. Carry out an initial full backup of the database.
4. After that, incremental backups are enough to recover the database entirely.

Best practice is to run one full backup a day in order to speed up incremental recoveries, because the transaction log can be large.

Setting up PostgreSQL WAL backups

The WAL feature is configured directly in the postgresql.conf file located in the \data folder of your PostgreSQL installation.

Before you begin
- Create a folder or filesystem to save the archive log files.
- Create a script using the pg_archivecleanup command to delete the archive log files. "After you finish" provides more information.
- Ensure that you create a specific user to run the PostgreSQL service, and that this user's privileges match the PostgreSQL admin privileges. This applies to both Windows and UNIX. Doing this eases recoveries.
- For Windows: Ensure that the PostgreSQL user is granted permissions on the /backup folder; otherwise the logs are not populated.
- For Linux/UNIX: Use the cp command, or another applicable Linux/UNIX command, to copy the files when you archive.

Procedure
1. Go to the postgresql.conf file located in the \data folder of your PostgreSQL installation.
2. Edit the postgresql.conf file attributes according to the following table to enable consistent backups:

Attribute | Value to set | Details
----------|--------------|--------
wal_level | archive | Enables Write-Ahead Logging at the level required for archiving.
fsync | on | Ensures that updates are physically written to disk.
synchronous_commit | on | Ensures that transaction commit waits for WAL records to be written to disk before command success is indicated.
wal_sync_method | fsync | Calls the fsync command at each commit to ensure consistency.
full_page_writes | on | Ensures that PostgreSQL writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint.
archive_mode | on | Enables archiving of completed WAL files.
archive_command | Windows: copy "%p" "D:\\backup\\%f"  UNIX: cp "%p" "/backup/%f" | Secures the system archive logs. Note that on UNIX/Linux the archive command calls cp, or any other suitable OS command, to copy the files.
archive_timeout | Your RPO value; default is 600. | Specifies when the archive log files will be switched and thus copied to the /backup folder. This value must match your Recovery Point Objective (RPO). The PostgreSQL default archive log size is 16 MB, with an archive_timeout default of 600 seconds. The default generates approximately 2.3 GB of archive logs per day (144 files of 16 MB). "After you finish" provides additional information.

3. Save and close the file.

After you finish

Archiving errors can be seen in the pg_logs folder (on Windows, the /pg_logs folder). For example:

    2013-12-30 15:54:42 PST LOG: archive command failed with exit code 1
    2013-12-30 15:54:42 PST DETAIL: The failed archive command was: copy "pg_xlog\000000010000000000000001" "C:\Program Files\PostgresPlus\Backups\000000010000000000000001"
    Access is denied

To delete the archive log files, use the pg_archivecleanup command. For example:

    pg_archivecleanup archivelocation restartwalfile

Recommended best practices for deleting the archive log files:
- Run the pg_archivecleanup command daily.
- Delete the previous day's log files each time you run the pg_archivecleanup command.

You can use a script to launch the pg_archivecleanup command after the backup or at any point in time.
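
The daily archive log volume quoted for archive_timeout can be checked with a short calculation. The following Python sketch assumes that one 16 MB log file is switched and archived every archive_timeout seconds; the function name is hypothetical. A busier database fills log files before the timeout expires and produces more, so the result is an estimate for sizing the backup folder, not a limit.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_archive_volume_mb(archive_timeout_s=600, segment_mb=16):
    """Estimate the MB of archive logs produced per day.

    Assumes exactly one log file switch every archive_timeout_s seconds,
    which is an approximation made for this example.
    """
    switches_per_day = SECONDS_PER_DAY / archive_timeout_s
    return switches_per_day * segment_mb


if __name__ == "__main__":
    # Compare a few candidate RPO values, in seconds.
    for timeout in (600, 300, 60):
        mb = daily_archive_volume_mb(timeout)
        print(f"archive_timeout={timeout}s: {mb:.0f} MB per day")
```

With the values from the table (600 seconds, 16 MB), there are 144 switches per day, or 2304 MB. Halving the RPO to 300 seconds doubles the estimate.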

pg_basebackup command usage

You use the native command pg_basebackup to create a base backup of the PostgreSQL databases. The base backup is mandatory for point-in-time recoveries because it informs the database of the backup and creates a checkpoint that transaction logging refers to. Full documentation on the pg_basebackup command is available at http://www.postgresql.org/docs under Backup.

Command usage example

    pg_basebackup.exe -p 5432 -U postgres -w -D D:\backups\unique_folder_name -X stream -F plain

where:
    -p 5432        specifies the port number 5432 that is being used to connect to the PostgreSQL database. The port number is configurable.
    -U postgres    specifies that the username is postgres.
    -w             specifies that the application does not prompt for a password.
    -D             specifies the folder location where the base backup is to be stored. For ease of configuration and recovery, the folder location should be the same as the one specified to store archive logs when you enable WAL.
    -X stream      specifies that the transaction log generated during the base backup is streamed along with it, so that no data is lost during the base backup.
    -F plain       specifies that the backup is written as plain files, for ease of recovery. It could instead be stored as a tarball.

Configuring NetWorker for WAL backups

To carry out a complete PostgreSQL WAL backup, you must perform a base backup by using the pg_basebackup command and you must back up the transaction logs. Both the base backup and the backup of the transaction logs can be performed as part of an incremental backup. The pg_basebackup command creates a copy of the files needed in a specific folder or tarball.

On NetWorker, backups need a specific pre command to create the base backup. The section "Configuring NetWorker for PostgreSQL dump backups" provides information on pre commands. The backup must include the backup folder location you specify in the PostgreSQL configuration, as in the procedure below:

Procedure
1. In Client Properties, General tab:
   a. Ensure that the following options are selected: in Backup, Scheduled backup and Client direct; in Group, Posgresql; and Backup renamed directories.
   b. Set the following file paths in the Save set field: C:\Program Files\PostgresPlus\9.3AS and C:\Program Files\PostgresPlus\Backups

   [Figure: example of the save set files and other option settings]

2. In Group Properties, Advanced tab, set the backup interval. A special configuration might be needed in the Group Properties settings. For example, the backup interval could be set according to your SLAs or to whatever plan you deem convenient.

   [Figure: backup interval set to every 30 minutes, with one full backup every Friday]
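
The pre command that creates the base backup could build the pg_basebackup command line from the usage example in this section. The Python sketch below is one possible shape, under stated assumptions: the function name and the base_<timestamp> folder naming are hypothetical, chosen only so that each base backup lands in a unique folder under the backup location that NetWorker saves.

```python
from datetime import datetime
from pathlib import PureWindowsPath


def build_pg_basebackup_command(backup_root, when, exe="pg_basebackup.exe",
                                port=5432, user="postgres"):
    """Return (argv, target_dir) for a base backup in a unique folder.

    Hypothetical helper: the folder naming scheme is an assumption made
    for this example.
    """
    # One folder per base backup, named after the time it was started.
    target_dir = PureWindowsPath(backup_root) / f"base_{when:%Y%m%d_%H%M%S}"
    argv = [
        exe,
        "-p", str(port),         # port of the PostgreSQL server
        "-U", user,              # user name to connect as
        "-w",                    # never prompt for a password
        "-D", str(target_dir),   # folder that receives the base backup
        "-X", "stream",          # stream the transaction log during the backup
        "-F", "plain",           # write plain files rather than a tarball
    ]
    return argv, target_dir


if __name__ == "__main__":
    argv, target_dir = build_pg_basebackup_command(r"D:\backups", datetime.now())
    print(" ".join(argv))
```

The backup root passed in should be a folder listed in the client Save set field, so that the new base backup is picked up by the next scheduled backup.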

Recovering WAL backup

This procedure describes PostgreSQL in-place recovery for a WAL backup. It is also applicable to directed recoveries and disaster recoveries.

Procedure
1. Go to the /postgresql/data folder and delete the contents.
2. Go to the backup directory where the archive logs are stored and delete the contents.
3. Launch NetWorker.
4. Run the recovery of the backup directory where the archive logs are stored, at the closest point in time to which you want to recover.

   [Figure: examples of the backup directory recoveries]

5. Log in to the PostgreSQL server and move the data contained in the base backup folder, where the transaction logs are stored, to the data directory.
6. Create a recovery.conf file under the /postgresql/data folder.
7. Open the recovery.conf file in the /postgresql/data folder and edit the following attributes:

Attribute | Value to set | Details
----------|--------------|--------
restore_command | 'copy "C:\\Program Files\\PostgresPlus\\Backups\\%f" %p' | Copies the archived log files from the PostgreSQL backups folder back to PostgreSQL during recovery.
recovery_target_time | '[timestamp]' | Ensures that recovery roll-forward stops at a specific time. For example: '2013-12-31 10:30:00 PST'
recovery_target_inclusive | true | Ensures that the recovery roll-forward stop includes the given target time.

8. Restart PostgreSQL and verify the backup logs.

After you finish

After the PostgreSQL database is recovered, the recovery.conf file is automatically renamed to recovery.done. Restarts do not affect the file. Depending on your backup policy, you may need to reinstall PostgreSQL.

Known limitations

You should be aware of the following limitations for WAL backups. The PostgreSQL documentation provides additional information on known limitations.

- Operations on hash indexes are not presently WAL-logged, so replay will not update these indexes. This means that any new inserts will be ignored by the index, updated rows will apparently disappear, and deleted rows will still retain pointers. In other words, if you modify a table with a hash index on it, you will get incorrect query results on a standby server. After completing a recovery operation, it is recommended that you manually REINDEX each such index.
- If a CREATE DATABASE command is executed while a base backup is being taken, and then the template database that the CREATE DATABASE copied is modified while the base backup is still in progress, it is possible that recovery will cause those modifications to be propagated into the created database as well. This is of course undesirable. To avoid this risk, it is best not to modify any template databases while taking a base backup.
- CREATE TABLESPACE commands are WAL-logged with the literal absolute path, and will therefore be replayed as tablespace creations with the same absolute path. This might be undesirable if the log is being replayed on a different machine. It can be dangerous even if the log is being replayed on the same machine, but into a new data directory: the replay will still overwrite the contents of the original tablespace. To avoid potential "gotchas" of this sort, the best practice is to take a new base backup after creating or dropping tablespaces.
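
For the hash index limitation, the manual REINDEX step after a recovery can be scripted. The Python sketch below is a minimal example under stated assumptions: the catalog query is one way to list hash indexes from the pg_class, pg_am, and pg_namespace system catalogs, the function names are hypothetical, and running the query and the generated statements (for example through psql) is left to the administrator.

```python
# Catalog query that lists every hash index as (schema, index name).
# Run it in each recovered database, for example with psql.
HASH_INDEX_QUERY = """
SELECT n.nspname, c.relname
FROM pg_class c
JOIN pg_am a ON a.oid = c.relam
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'i' AND a.amname = 'hash'
ORDER BY n.nspname, c.relname;
"""


def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def reindex_statements(hash_indexes):
    """Build one REINDEX INDEX statement per (schema, index) pair."""
    return [
        f"REINDEX INDEX {quote_ident(schema)}.{quote_ident(index)};"
        for schema, index in hash_indexes
    ]


if __name__ == "__main__":
    # Example rows as the query above might return them (made-up names).
    rows = [("public", "idx_city_hash"), ("Countries", "idx_code_hash")]
    for statement in reindex_statements(rows):
        print(statement)
```

Quoting both the schema and the index name keeps mixed-case names, such as the "Countries" schema shown in the dump example, intact.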

Conclusion

PostgreSQL WAL functionality, when configured with NetWorker, enables you to run successful backups and recoveries, so that you can meet company RPOs and RTOs with a minimum amount of work. This interaction gives you the ability to recover data up to a specific second or to a point before specific actions, so that you can restore a complete system as it was before a crash or corruption.

Copyright 2014 EMC Corporation. All rights reserved. Published in USA.

Published June 30, 2014

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners.

For the most up-to-date regulatory document for your product line, go to EMC Online Support (https://support.emc.com).