Backup and restore

This section explains how to manage backups for PostgreSQL database and the Object Storage.

Important

This section assumes that you deployed Parsec following the instructions from Server deployment section. If you deployed Parsec differently, you might need to adapt this section to your custom deployment.

Notes on data consistency

The user data accessible to the user depends primarily on the metadata stored in the PostgreSQL database and secondly on the Object Storage. This is because the metadata contain references to file blocks stored in the Object Storage.

During backup and restore, the following situations may occur:

  • The PostgreSQL database is up to date, but some referenced objects are missing in the Object Storage: Parsec will consider files with missing objects as corrupted. These files should still be visible to the user, but they cannot be downloaded or opened.

  • The PostgreSQL database is not up to date, and there are some objects non referenced by any in the Object Storage: Objects that are not referenced by any file are considered orphaned and therefore will be ignored by Parsec. All files displayed in Parsec should still be accessible.

  • The PostgreSQL database is not up to date and some objects are not referenced: In this scenario, the effect of the previous two points are cumulative.

It should be noted that no block is deleted or modified from the Object Storage, even in the case of deleting a file or folder for historical purposes.

Important

To ensure data consistency between databases, you must back up the PostgreSQL database *before* backing up the Object Storage, as any excess objects will have no consequences. The backup date to consider is that of the PostgreSQL database.

In conclusion, it is not necessary to ensure exact consistency between the databases, since the PostgreSQL database is the authoritative source; rather the PostgreSQL database backup simply needs to be older (previous) than the Object Storage backup.

PostgreSQL database

Before starting, make sure you have the necessary permissions to perform these operations and that the PostgreSQL service is running. If you encounter any errors, check the error messages for clues as to what might be going wrong.

Backing up the database

You can create a backup file of the database using pg_dump.

Open a terminal or command prompt and run the following command:

pg_dump -U $USER -h $HOST -p $PORT "$DATABASE_NAME" > backup.sql

Where $USER is your PostgreSQL username, $HOST is the database host address (use localhost if the database is on your computer), $PORT is the port on which PostgreSQL listens (usually 5432) and $DATABASE_NAME is the database name.

If your database has a password, you will be prompted to enter it.

The previous command will create a backup.sql file containing the structure of the PostgreSQL database and all its data.

Restoring the database

To restore the database from the backup file, you must first ensure that the target database exists. If it doesn’t, create it with PostgreSQL.

Create the database (if needed)

To create the database, run the following command:

createdb -U $USER -h $HOST -p $PORT "$DATABASE_NAME"

Where $USER is your PostgreSQL username, $HOST is the database host address (use localhost if the database is on your computer), $PORT is the port on which PostgreSQL listens (usually 5432) and $DATABASE_NAME is the database name.

After making sure the database exists, you can restore the database with the backup.sql file with a single command depending on the format of the backup.

For an SQL file, use psql:

psql -U $USER -h $HOST -p $PORT -d "$DATABASE_NAME" < backup.sql

For a binary file (if you used pg_dump -Fc), use pg_restore:

pg_restore -U $USER -h $HOST -p $PORT -d "$DATABASE_NAME" -1 backup.bin

Where $USER is your PostgreSQL username, $HOST is the database host address (use localhost if the database is on your computer), $PORT is the port on which PostgreSQL listens (usually 5432) and $DATABASE_NAME is the database name.

Object Storage (S3)

This section covers Object Storage backup and restore in AWS S3.

Before starting, make sure your AWS account has the necessary permissions to access the S3 bucket and perform these operations.

Backing up the bucket

Use aws to manage buckets compatible with Amazon’s S3 service.

Synchronize the S3 bucket with a local directory:

aws s3 sync s3://bucket_name /path/local/backup

Where:

  • s3://bucket_name is the path to the S3 bucket

  • /path/local/backup is the path to the local directory where you want to store the backup

This command will download all files in the “bucket_name” bucket to the local directory specified.

Restoring the bucket

Restore all objects to an S3 bucket

To restore data from backup, use aws s3 sync in the opposite direction, i.e. from the local directory to the S3 bucket.

aws s3 sync /path/local/backup s3://bucket_name
  • s3://bucket_name is the path to the S3 bucket

  • /path/local/backup is the path to the local directory where you stored the backup.

This command will send all files in the specified local directory to the “bucket_name” bucket.

Tip

Incremental backup: aws s3 sync is smart enough to copy only those files that have been modified. This makes subsequent backups faster after the first full backup.