> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-fbfa8bee.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

> Overview of ClickHouse backup and restore

# Backup and restore in ClickHouse

> This section broadly covers backups and restores in ClickHouse. For a more
> detailed description of each backup method, see the pages for specific methods
> in the sidebar.

<h2 id="introduction">
  Introduction
</h2>

While [replication](/reference/engines/table-engines/mergetree-family/replication) provides protection from hardware failures, it doesn't
protect against human errors: accidental deletion of data, deletion of the wrong
table or a table on the wrong cluster, and software bugs that result in incorrect
data processing or data corruption.

In many cases mistakes like these will affect all replicas. ClickHouse has built-in
safeguards to prevent some types of mistakes, for example, by [default](/reference/settings/session-settings#max_table_size_to_drop)
you can't just drop tables with a `MergeTree` family engine containing more than
50 Gb of data. However, these safeguards don't cover all possible cases and
problems can still occur.

To effectively mitigate possible human errors, you should carefully prepare a
strategy for backing up and restoring your data **in advance**.

Each company has different resources available and business requirements, so
there's no universal solution for ClickHouse backups and restores that will fit
every situation. What works for one gigabyte of data likely won't work for tens
of petabytes of data. There are a variety of possible approaches with their own pros
and cons, which are presented in this section of the docs. It is a good idea to
use several approaches instead of just one such as to compensate for their various
shortcomings.

<Note>
  Keep in mind that if you backed something up and never tried to restore it,
  chances are that the restore won't work properly when you actually need it (or at
  least it will take longer than the business can tolerate). So whatever backup
  approach you choose, make sure to automate the restore process as well, and practice
  it on a spare ClickHouse cluster regularly.
</Note>

The following pages detail the various backup and
restore methods available in ClickHouse:

| Page                                                                                          | Description                                                                 |
| --------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- |
| [Backup/restore using local disk or S3 disk](/concepts/features/backup-restore/local-disk)    | Details backup/restore to or from a local disk or S3 disk                   |
| [Backup/restore using S3 endpoint](/concepts/features/backup-restore/s3-endpoint)             | Details backup/restore to or from an S3 endpoint                            |
| [Backup/restore using AzureBlobStorage](/concepts/features/backup-restore/azure-blob-storage) | Details backup/restore to or from Azure blob storage                        |
| [Alternative methods](/concepts/features/backup-restore/alternative-methods)                  | Discusses alternative backup methods                                        |
| [Snapshot backup](/concepts/features/backup-restore/snapshot)                                 | Lightweight snapshots for SharedMergeTree tables using cloud object storage |

Backups can:

* be [full or incremental](#backup-types)
* be [synchronous or asynchronous](#synchronous-vs-asynchronous)
* be [concurrent or non-concurrent](#concurrent-vs-non-concurrent)
* be [compressed or uncompressed](#compressed-vs-uncompressed)
* use [named collections](#using-named-collections)
* be password protected
* be taken of [system tables, log tables, or access management tables](#system-backups)

<h2 id="backup-types">
  Backup types
</h2>

Backups can be either full or incremental. Full backups are a complete copy of the
data, while incremental backups are a delta of the data from the last full backup.

Full backups have the advantage of being a simple, independent (of other backups)
and reliable recovery method. However, they can take a long time to complete and
can consume a lot of space. Incremental backups, on the other hand, are more
efficient in terms of both time and space, but restoring the data requires all
the backups to be available.

Depending on your needs, you may want to use:

* **Full backups** for smaller databases or critical data.
* **Incremental backups** for larger databases or when backups need to be done frequently and cost effectively.
* **Both**, for instance, weekly full backups and daily incremental backups.

<h2 id="synchronous-vs-asynchronous">
  Synchronous vs asynchronous backups
</h2>

`BACKUP` and `RESTORE` commands can also be marked `ASYNC`. In this case, the
backup command returns immediately, and the backup process runs in the background.
If the commands aren't marked `ASYNC`, the backup process is synchronous and
the command blocks until the backup completes.

<h2 id="concurrent-vs-non-concurrent">
  Concurrent vs non-concurrent backups
</h2>

By default, ClickHouse allows concurrent backups and restores. This means you
can initiate multiple backup or restore operations simultaneously. However,
there are server-level settings that let you disallow this behavior. If you set
these settings to false, only one backup or restore operation is allowed to run
on a cluster at a time. This can help avoid resource contention or potential
conflicts between operations.

To disallow concurrent backup/restore, you can use these settings respectively:

```xml theme={null}
<clickhouse>
    <backups>
        <allow_concurrent_backups>false</allow_concurrent_backups>
        <allow_concurrent_restores>false</allow_concurrent_restores>
    </backups>
</clickhouse>
```

The default value for both is true, so by default concurrent backup/restores are
allowed. When these settings are false on a cluster, only a single backup/restore
is allowed to run on a cluster at a time.

<h2 id="compressed-vs-uncompressed">
  Compressed vs uncompressed backups
</h2>

ClickHouse backups support compression through the `compression_method` and `compression_level` settings.

When creating a backup, you can specify:

```sql theme={null}
BACKUP TABLE test.table
  TO Disk('backups', 'filename.zip')
  SETTINGS compression_method='lzma', compression_level=3
```

<h2 id="using-named-collections">
  Using named collections
</h2>

Named collections allow you to store key-value pairs (like S3 credentials, endpoints, and settings) that can be reused across backup/restore operations.
They help to:

* Hide credentials from users without admin access
* Simplify commands by storing complex configuration centrally
* Maintain consistency across operations
* Avoid credential exposure in query logs

See ["named collections"](/concepts/features/configuration/server-config/named-collections) for further details.

<h2 id="system-backups">
  Backing up system, log or access management tables
</h2>

System tables can also be included in your backup and restore workflows, but their
inclusion depends on your specific use case.

System tables that store historic data, such as those with a `_log` suffix (e.g.,
`query_log`, `part_log`), can be backed up and restored like any other table.
If your use case relies on analyzing historic data - for example, using `query_log`
to track query performance or debug issues - it's recommended to include these
tables in your backup strategy. However, if historic data from these tables is
not required, they can be excluded to save backup storage space.

System tables related to access management, such as users, roles, row\_policies,
settings\_profiles, and quotas, receive special treatment during backup and restore operations.
When these tables are included in a backup, their content is exported to a special
`accessXX.txt` file, which encapsulates the equivalent SQL statements for creating
and configuring the access entities. Upon restoration, the restore process
interprets these files and re-applies the SQL commands to recreate the users,
roles, and other configurations. This feature ensures that the access control
configuration of a ClickHouse cluster can be backed up and restored as part of
the cluster's overall setup.

This functionality only works for configurations managed through SQL commands
(referred to as ["SQL-driven Access Control and Account Management"](/concepts/features/security/access-rights#enabling-access-control)).
Access configurations defined in ClickHouse server configuration files (e.g. `users.xml`)
aren't included in backups and can't be restored through this method.

<h2 id="syntax">
  General syntax
</h2>

```sql theme={null}
-- core commands
BACKUP | RESTORE 
--- what to backup/restore (or exclude)
TABLE [db.]table_name           [AS [db.]table_name_in_backup] |
DICTIONARY [db.]dictionary_name [AS [db.]name_in_backup] |
DATABASE database_name          [AS database_name_in_backup] |
TEMPORARY TABLE table_name      [AS table_name_in_backup] |
VIEW view_name                  [AS view_name_in_backup] |
[EXCEPT TABLES ...] |
ALL [EXCEPT {TABLES|DATABASES}...] } [,...]
--- 
[ON CLUSTER 'cluster_name']
--- where to backup or restore to or from
TO|FROM 
File('<path>/<filename>') | 
Disk('<disk_name>', '<path>/') | 
S3('<S3 endpoint>/<path>', '<Access key ID>', '<Secret access key>', '<extra_credentials>') |
AzureBlobStorage('<connection string>/<url>', '<container>', '<path>', '<account name>', '<account key>')
--- additional settings
[SETTINGS ...]
[ASYNC]
```

**See ["command summary"](/concepts/features/backup-restore/overview#command-summary) for more details
of each command.**

<h3 id="command-summary">
  Command summary
</h3>

Each of the commands above is detailed below:

| **Command**                                                            | **Description**                                                                                                                                      |   |
| ---------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | - |
| `BACKUP`                                                               | Creates a backup of specified objects                                                                                                                |   |
| `RESTORE`                                                              | Restores objects from a backup                                                                                                                       |   |
| `TABLE [db.]table_name [AS [db.]table_name_in_backup]`                 | Backs up/restores a specific table (can be renamed)                                                                                                  |   |
| `[PARTITION[S] partition_expr [,...]]`                                 | Only backup/restore specific partitions of the table                                                                                                 |   |
| `DICTIONARY [db.]dictionary_name [AS [db.]name_in_backup]`             | Backs up/restores a dictionary object                                                                                                                |   |
| `DATABASE database_name [AS database_name_in_backup]`                  | Backs up/restores an entire database (can be renamed)                                                                                                |   |
| `TEMPORARY TABLE table_name [AS table_name_in_backup]`                 | Backs up/restores a temporary table (can be renamed)                                                                                                 |   |
| `VIEW view_name [AS view_name_in_backup]`                              | Backs up/restores a view (can be renamed)                                                                                                            |   |
| `[EXCEPT TABLES ...]`                                                  | Exclude specific tables when backing up a database                                                                                                   |   |
| `ALL`                                                                  | Backs up/restores everything (all databases, tables, etc.). Prior to version 23.4 of ClickHouse, `ALL` was only applicable to the `RESTORE` command. |   |
| `[EXCEPT {TABLES\|DATABASES}...]`                                      | Exclude specific tables or databases when using `ALL`                                                                                                |   |
| `[ON CLUSTER 'cluster_name']`                                          | Execute the backup/restore across a ClickHouse cluster                                                                                               |   |
| `TO\|FROM`                                                             | Direction: `TO` for backup destination, `FROM` for restore source                                                                                    |   |
| `File('<path>/<filename>')`                                            | Store to/restore from local file system                                                                                                              |   |
| `Disk('<disk_name>', '<path>/')`                                       | Store to/restore from a configured disk                                                                                                              |   |
| `S3('<S3 endpoint>/<path>', '<Access key ID>', '<Secret access key>')` | Store to/restore from Amazon S3 or S3-compatible storage                                                                                             |   |
| `[SETTINGS ...]`                                                       | See below for complete list of settings                                                                                                              |   |
| `[ASYNC]`                                                              | Makes the operation run asynchronously (returns immediately with an ID you can monitor)                                                              |   |

<h3 id="settings">
  Settings
</h3>

**Generic backup/restore settings**

| Setting                                                      | Description                                                                                                                                                                                                                                                                                                                                          | Default value |
| ------------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- |
| `id`                                                         | ID of backup or restore operation, randomly generated UUID is used if not specified. If there's already a running operation with the same ID, an exception is thrown.                                                                                                                                                                                |               |
| `compression_method`                                         | Specifies the compression method for the backup. See section ["column compression codecs"](/reference/statements/create/table#column_compression_codec)                                                                                                                                                                                              |               |
| `compression_level`                                          | Specifies the compression level for the backup                                                                                                                                                                                                                                                                                                       |               |
| `password`                                                   | Password for the backup archive. Only supported for ZIP archives (`.zip`, `.zipx`).                                                                                                                                                                                                                                                                  |               |
| `base_backup`                                                | The destination of the base backup used for incremental backups. For example: `Disk('backups', '1.zip')`                                                                                                                                                                                                                                             |               |
| `use_same_password_for_base_backup`                          | Whether base backup archive should inherit the password from the query.                                                                                                                                                                                                                                                                              |               |
| `structure_only`                                             | If enabled, only backs up or restores the CREATE statements without the actual table data.                                                                                                                                                                                                                                                           |               |
| `storage_policy`                                             | Storage policy for the tables being restored. See ["using multiple block devices for data storage](/reference/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-multiple-volumes). Only applicable to the `RESTORE` command. Applies only to tables with an engine from the `MergeTree` family.                                |               |
| `allow_non_empty_tables`                                     | Allows `RESTORE TABLE` to insert data into non-empty tables. This will mix earlier data in the table with the data extracted from the backup. This setting can therefore cause data duplication in the table, use with caution.                                                                                                                      | `0`           |
| `backup_restore_keeper_max_retries`                          | Max retries for \[Zoo]Keeper operations in the middle of a BACKUP or RESTORE operation. Should be big enough so the whole operation won't fail because of a temporary \[Zoo]Keeper failure.                                                                                                                                                          | `1000`        |
| `backup_restore_keeper_retry_initial_backoff_ms`             | Initial backoff timeout for \[Zoo]Keeper operations during backup or restore                                                                                                                                                                                                                                                                         | `100`         |
| `backup_restore_keeper_retry_max_backoff_ms`                 | Max backoff timeout for \[Zoo]Keeper operations during backup or restore                                                                                                                                                                                                                                                                             | `5000`        |
| `backup_restore_failure_after_host_disconnected_for_seconds` | If a host during a `BACKUP ON CLUSTER` or `RESTORE ON CLUSTER` operation doesn't recreate its ephemeral 'alive' node in ZooKeeper for this amount of time then the whole backup or restore is considered as failed. This value should be bigger than any reasonable time for a host to reconnect to ZooKeeper after a failure. Zero means unlimited. | `3600`        |
| `backup_restore_keeper_max_retries_while_initializing`       | Max retries for \[Zoo]Keeper operations during the initialization of a `BACKUP ON CLUSTER` or `RESTORE ON CLUSTER` operation.                                                                                                                                                                                                                        | `20`          |
| `backup_restore_keeper_max_retries_while_handling_error`     | Max retries for \[Zoo]Keeper operations while handling an error of a `BACKUP ON CLUSTER` or `RESTORE ON CLUSTER` operation.                                                                                                                                                                                                                          | `20`          |
| `backup_restore_finish_timeout_after_error_sec`              | How long the initiator should wait for other host to react to the 'error' node and stop their work on the current `BACKUP ON CLUSTER` or `RESTORE ON CLUSTER` operation.                                                                                                                                                                             | `180`         |
| `backup_restore_keeper_value_max_size`                       | Maximum size of data of a \[Zoo]Keeper's node during backup                                                                                                                                                                                                                                                                                          | `1048576`     |
| `backup_restore_batch_size_for_keeper_multi`                 | Maximum size of batch for multi request to \[Zoo]Keeper during backup or restore                                                                                                                                                                                                                                                                     | `1000`        |
| `backup_restore_batch_size_for_keeper_multiread`             | Maximum size of batch for multiread request to \[Zoo]Keeper during backup or restore                                                                                                                                                                                                                                                                 | `10000`       |
| `backup_restore_keeper_fault_injection_probability`          | Approximate probability of failure for a keeper request during backup or restore. Valid value is in interval `[0.0f, 1.0f]`                                                                                                                                                                                                                          | `0`           |
| `backup_restore_keeper_fault_injection_seed`                 | `0` for a random seed, otherwise the setting value                                                                                                                                                                                                                                                                                                   | `0`           |
| `backup_restore_s3_retry_attempts`                           | Setting for Aws::Client::RetryStrategy, Aws::Client does retries itself, 0 means no retries. It takes place only for backup/restore.                                                                                                                                                                                                                 | `1000`        |
| `max_backup_bandwidth`                                       | The maximum read speed in bytes per second for particular backup on server. Zero means unlimited.                                                                                                                                                                                                                                                    | `0`           |
| `max_backups_io_thread_pool_size`                            | ClickHouse uses threads from the Backups IO Thread pool to do S3 backup IO operations. `max_backups_io_thread_pool_size` limits the maximum number of threads in the pool.                                                                                                                                                                           | `1000`        |
| `max_backups_io_thread_pool_free_size`                       | If the number of **idle** threads in the Backups IO Thread pool exceeds `max_backup_io_thread_pool_free_size`, ClickHouse will release resources occupied by idling threads and decrease the pool size. Threads can be created again if necessary.                                                                                                   | `0`           |
| `backups_io_thread_pool_queue_size`                          | The maximum number of jobs that can be scheduled on the Backups IO Thread pool. It is recommended to keep this queue unlimited due to the current S3 backup logic. Note: A value of `0` (default) means unlimited.                                                                                                                                   | `0`           |
| `backup_threads`                                             | The maximum number of threads to execute `BACKUP` requests.                                                                                                                                                                                                                                                                                          |               |
| `max_backup_bandwidth_for_server`                            | The maximum read speed in bytes per second for all backups on server. Zero means unlimited.                                                                                                                                                                                                                                                          | `0`           |
| `shutdown_wait_backups_and_restores`                         | If set to true ClickHouse will wait for running backups and restores to finish before shutdown.                                                                                                                                                                                                                                                      | `1`           |

**S3 specific settings**

| Setting                                   | Description                                                                                | Default value |
| ----------------------------------------- | ------------------------------------------------------------------------------------------ | ------------- |
| `use_same_s3_credentials_for_base_backup` | Whether base backup to S3 should inherit credentials from the query. Only works with `S3`. |               |
| `s3_storage_class`                        | The storage class used for S3 backup. For example: `STANDARD`                              |               |

**Azure specific settings**

| Setting                             | Description                                                                                             | Default value |   |   |
| ----------------------------------- | ------------------------------------------------------------------------------------------------------- | ------------- | - | - |
| `azure_attempt_to_create_container` | When using Azure Blob Storage, whether to attempt creating the specified container if it doesn't exist. | `true`        |   |   |

<h2 id="check-the-status-of-backups">
  Administration and troubleshooting
</h2>

The backup command returns an `id` and `status`, and that `id` can be used to
get the status of the backup. This is very useful to check the progress of long
`ASYNC` backups. The example below shows a failure that happened when trying to
overwrite an existing backup file:

```sql theme={null}
BACKUP TABLE helloworld.my_first_table TO Disk('backups', '1.zip') ASYNC
```

```response theme={null}
┌─id───────────────────────────────────┬─status──────────┐
│ 7678b0b3-f519-4e6e-811f-5a0781a4eb52 │ CREATING_BACKUP │
└──────────────────────────────────────┴─────────────────┘

1 row in set. Elapsed: 0.001 sec.
```

```sql theme={null}
SELECT
*
FROM system.backups
WHERE id='7678b0b3-f519-4e6e-811f-5a0781a4eb52'
FORMAT Vertical
```

```response highlight={5,9} theme={null}
Row 1:
──────
id:                7678b0b3-f519-4e6e-811f-5a0781a4eb52
name:              Disk('backups', '1.zip')
status:            BACKUP_FAILED
num_files:         0
uncompressed_size: 0
compressed_size:   0
error:             Code: 598. DB::Exception: Backup Disk('backups', '1.zip') already exists. (BACKUP_ALREADY_EXISTS) (version 22.8.2.11 (official build))
start_time:        2022-08-30 09:21:46
end_time:          2022-08-30 09:21:46

1 row in set. Elapsed: 0.002 sec.
```

Along with the [`system.backups`](/reference/system-tables/backups) table, all backup and restore operations are also tracked in the system log table
[`system.backup_log`](/reference/system-tables/backup_log):

```sql theme={null}
SELECT *
FROM system.backup_log
WHERE id = '7678b0b3-f519-4e6e-811f-5a0781a4eb52'
ORDER BY event_time_microseconds ASC
FORMAT Vertical
```

```response highlight={26} theme={null}
Row 1:
──────
event_date:              2023-08-18
event_time_microseconds: 2023-08-18 11:13:43.097414
id:                      7678b0b3-f519-4e6e-811f-5a0781a4eb52
name:                    Disk('backups', '1.zip')
status:                  CREATING_BACKUP
error:
start_time:              2023-08-18 11:13:43
end_time:                1970-01-01 03:00:00
num_files:               0
total_size:              0
num_entries:             0
uncompressed_size:       0
compressed_size:         0
files_read:              0
bytes_read:              0

Row 2:
──────
event_date:              2023-08-18
event_time_microseconds: 2023-08-18 11:13:43.174782
id:                      7678b0b3-f519-4e6e-811f-5a0781a4eb52
name:                    Disk('backups', '1.zip')
status:                  BACKUP_FAILED
error:                   Code: 598. DB::Exception: Backup Disk('backups', '1.zip') already exists. (BACKUP_ALREADY_EXISTS) (version 23.8.1.1)
start_time:              2023-08-18 11:13:43
end_time:                2023-08-18 11:13:43
num_files:               0
total_size:              0
num_entries:             0
uncompressed_size:       0
compressed_size:         0
files_read:              0
bytes_read:              0

2 rows in set. Elapsed: 0.075 sec.
```
