We happen to run our ETL processes only once per day. So that allows us to simply use pg_dump for our database backups. But there’s the old adage “Your backups are only as good as your last restore.” So I do a restore of the production server backup on the development server at least every 45 days.
However, with some recent development work involving extracting data from MS SQL Server databases, I have grappled with some interesting encoding issues. So I should have expected problems when doing my most recent test restore…
While I have self-authored bash scripts for doing backups & restores, I had to do a simple yet significant change to ensure that I can complete a restore.
First, excluding tables from being backed up with pg_dump can be done with -T / --exclude-table option, however, using it may introduce some interesting–& undesired–side effects above & beyond not backing up data that you may want to restore. My approach therefore is to backup everything & figure out the problems when restoring.
Note: Of course I’m going to make the huge assumption that you have already verified that your dump file was created successfully without any errors.
Determining What You Actually Want to Restore
It all starts with generating a list of objects contained in a dump file.
pg_restore --list ${basepath}/${backupfile} > ${basepath}/recovery.lst
Open recovery.lst in your editor of choice to view & make sure that the list is readable. Assuming that’s the case, let’s find a table to not restore. But before we go removing lines from our recovery list file, it is worth pointing out that there are at least 3 lines per table–one to create the table, one for permissions for the table, & one for loading the data into the table. Those 3 lines look like this (but almost certainly will not be found consecutively in the recovery list):
647; 1259 37177 TABLE public stage_subscription_orders bi
14971; 0 0 ACL public TABLE stage_subscription_orders bi
...
13615; 0 37177 TABLE DATA public stage_subscription_orders bi
It’s well worth pointing out here: to avoid getting into a death spiral of being unable to restore other objects (namely views) that depend on the undesired table, you really will only want to remove the third line–the one for restoring the data–but keep the table creation & permissions lines.
Editing the Recovery List File and Performing the Restore
Back in the editor of your choice, locate & remove the ... TABLE DATA ... line, save changes, and exit. Then use the -L / --use-list option for explicitly telling pg_restore what objects to restore:
pg_restore --dbname=postgres --exit-on-error \
--use-list=${basepath}/recovery.lst \
--verbose ${basepath}/${backupfile}
And if you don’t manage to exclude all of the necessary items the first go round, simply re-edit recovery.lst & re-run pg_restore until you get a successful restore.