Skip to content

restore

restore command

The restore command is used to restore a database from a previously created dump. You can specify the dump to restore by providing the dump ID or use the latest keyword to restore the latest completed dump.

greenmask --config=config.yml restore DUMP_ID

Alternatively, to restore the latest completed dump, use the following command:

greenmask --config=config.yml restore latest

Note that the restore command shares the same parameters and environment variables as pg_restore, allowing you to configure the restoration process as needed.

Mostly it supports the same flags as the pg_restore utility, with some extra flags for Greenmask-specific features.

Supported flags
      --batch-size int                  the number of rows to insert in a single batch during the COPY command (0 - all rows will be inserted in a single batch)
  -c, --clean                           clean (drop) database objects before recreating
  -C, --create                          create the target database
  -a, --data-only                       restore only the data, no schema
  -d, --dbname string                   connect to database name (default "postgres")
      --disable-triggers                disable triggers during data-only restore
      --enable-row-security             enable row security
  -N, --exclude-schema strings          do not restore objects in this schema
  -e, --exit-on-error                   exit on error, default is to continue
  -f, --file string                     output file name (- for stdout)
  -P, --function strings                restore named function
  -h, --host string                     database server host or socket directory (default "/var/run/postgres")
      --if-exists                       use IF EXISTS when dropping objects
  -i, --index strings                   restore named index
      --inserts                         restore data as INSERT commands, rather than COPY
  -j, --jobs int                        use this many parallel jobs to restore (default 1)
      --list-format string              use table of contents in format of text, json or yaml (default "text")
      --no-comments                     do not restore comments
      --no-data-for-failed-tables       do not restore data of tables that could not be created
  -O, --no-owner string                 skip restoration of object ownership
  -X, --no-privileges                   skip restoration of access privileges (grant/revoke)
      --no-publications                 do not restore publications
      --no-security-labels              do not restore security labels
      --no-subscriptions                ddo not restore subscriptions
      --no-table-access-method          do not restore table access methods
      --no-tablespaces                  do not restore tablespace assignments
      --on-conflict-do-nothing          add ON CONFLICT DO NOTHING to INSERT commands
      --overriding-system-value         use OVERRIDING SYSTEM VALUE clause for INSERTs
      --pgzip                           use pgzip decompression instead of gzip
  -p, --port int                        database server port number (default 5432)
      --restore-in-order                restore tables in topological order, ensuring that dependent tables are not restored until the tables they depend on have been restored
  -n, --schema strings                  restore only objects in this schema
  -s, --schema-only                     restore only the schema, no data
      --section string                  restore named section (pre-data, data, or post-data)
  -1, --single-transaction              restore as a single transaction
      --strict-names                    restore named section (pre-data, data, or post-data) match at least one entity each
  -S, --superuser string                superuser user name to use for disabling triggers
  -t, --table strings                   restore named relation (table, view, etc.)
  -T, --trigger strings                 restore named trigger
  -L, --use-list string                 use table of contents from this file for selecting/ordering output
      --use-set-session-authorization   use SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands to set ownership
  -U, --username string                 connect as specified database user (default "postgres")
  -v, --verbose string                  verbose mode

Extra features

Inserts and error handling

Warning

Insert commands are a lot slower than COPY commands. Use this feature only when necessary.

By default, Greenmask restores data using the COPY command. If you prefer to restore data using INSERT commands, you can use the --inserts flag. This flag allows you to manage errors that occur during the execution of INSERT commands. By configuring an error and constraint exclusion list in the config, you can skip certain errors and continue inserting subsequent rows from the dump.

This can be useful when adding new records to an existing dump, but you don't want the process to stop if some records already exist in the database or violate certain constraints.

By adding the --on-conflict-do-nothing flag, it generates INSERT statements with the ON CONFLICT DO NOTHING clause, similar to the original pg_dump option. However, this approach only works for unique or exclusion constraints. If a foreign key is missing in the referenced table or any other constraint is violated, the insertion will still fail. To handle these issues, you can define anexclusion list in the config.

example with inserts and error handling
```shell title="example with inserts and on conflict do nothing"
greenmask --config=config.yml restore DUMP_ID --inserts --on-conflict-do-nothing

By adding the --overriding-system-value flag, it generates INSERT statements with the OVERRIDING SYSTEM VALUE clause, which allows you to insert data into identity columns.

example of GENERATED ALWAYS AS IDENTITY column
CREATE TABLE people (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    generated text GENERATED ALWAYS AS (id || first_name) STORED,
    first_name text
);
example with inserts
greenmask --config=config.yml restore DUMP_ID --inserts --overriding-system-value

Restoration in topological order

By default, Greenmask restores tables in the order they are listed in the dump file. To restore tables in topological order, use the --restore-in-order flag. This flag ensures that dependent tables are not restored until the tables they depend on have been restored.

This is useful when you have the schema already created with foreign keys and other constraints, and you want to insert data into the tables in the correct order or catch-up the target database with the new data.

Warning

Greenmask cannot guarantee restoration in topological order when the schema contains cycles. The only way to restore tables with cyclic dependencies is to temporarily remove the foreign key constraint (to break the cycle), restore the data, and then re-add the foreign key constraint once the data restoration is complete.

If your database has cyclic dependencies you will be notified about it but the restoration will continue.

2024-08-16T21:39:50+03:00 WRN cycle between tables is detected: cannot guarantee the order of restoration within cycle cycle=["public.employees","public.departments","public.projects","public.employees"]

Pgzip decompression

By default, Greenmask uses gzip decompression to restore data. In mist cases it is quite slow and does not utilize all available resources and is a bootleneck for IO operations. To speed up the restoration process, you can use the --pgzip flag to use pgzip decompression instead of gzip. This method splits the data into blocks, which are decompressed in parallel, making it ideal for handling large volumes of data.

example with pgzip decompression
greenmask --config=config.yml restore latest --pgzip

Restore data batching

The COPY command returns the error only on transaction commit. This means that if you have a large dump and an error occurs, you will have to wait until the end of the transaction to see the error message. To avoid this, you can use the --batch-size flag to specify the number of rows to insert in a single batch during the COPY command. If an error occurs during the batch insertion, the error message will be displayed immediately. The data will be committed only if all batches are inserted successfully.

This is useful when you want to be notified of errors as immediately as possible without waiting for the entire table to be restored.

Warning

The batch size should be chosen carefully. If the batch size is too small, the restoration process will be slow. If the batch size is too large, you may not be able to identify the error row.

In the example below, the batch size is set to 1000 rows. This means that 1000 rows will be inserted in a single batch, so you will be notified of any errors immediately after each batch is inserted.

example with batch size
greenmask --config=config.yml restore latest --batch-size 1000