Backup Using Gibby
Overview
The Gibby utility allows you to back up all of your data that is stored on Iguazio's MLOps Platform. Gibby can run on an application node as a Kubernetes job, or as a standalone utility on a dedicated server.
Gibby as a Kubernetes Job (preferred method)
This section describes how to deploy and use Gibby as a Kubernetes job.
Note: The --verify false flag must be added to the backup and restore yaml files below.
Backing Up
STEP 1 - Create a Persistent Volume and Claim
1. Create a directory named pv under /home/iguazio on k8s-node1 (the first app node).
2. Use the following yaml to create a persistent volume on app node 1 by running kubectl apply -f <yaml name>. Edit the size and path as needed (the path should point to the pv directory you created in the previous item):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: gibby-pv
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /tmp/pv
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-node1
3. Use the following yaml to create a persistent volume claim by running kubectl apply -f <yaml name> (edit the size as needed):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gibby-pv-claim
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 99Gi
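For example, assuming the two manifests above are saved as gibby-pv.yaml and gibby-pvc.yaml (hypothetical filenames), you can apply and check them as follows. Depending on the storage class's volume binding mode, the claim may stay Pending until the backup job's pod is scheduled:

kubectl apply -f gibby-pv.yaml
kubectl apply -f gibby-pvc.yaml
# Check that the volume exists and whether the claim has bound to it
kubectl get pv gibby-pv
kubectl get pvc gibby-pv-claim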
STEP 2 - Create Control Access Key and Data Access Key
Run these commands on the data node.
1. Run:
http --session sys --auth sys:<sys PW> post http://127.0.0.1:8001/api/sessions
2. Run:
echo '{"data": {"type": "access_key", "attributes": {"plane": "control"}}}' | http --session sys post http://127.0.0.1:8001/api/access_keys
and from the output, save the value of the ID key (located above relationships) for future use.
3. Run:
echo '{"data": {"type": "access_key", "attributes": {"plane": "data"}}}' | http --session sys post http://127.0.0.1:8001/api/access_keys
and from the output, save the value of the ID key (located above relationships) for future use.
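The response from the access_keys endpoint looks roughly like the abridged, illustrative sketch below (exact attributes vary by platform version); the value to save is the id field, which appears above relationships:

{
  "data": {
    "type": "access_key",
    "id": "<ACCESS KEY ID - save this value>",
    "attributes": { ... },
    "relationships": { ... }
  }
}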
STEP 3 - Run the backup
Fill in the following yaml with all the relevant details, then start the backup by running kubectl apply -f <yaml name>.
Change the path as needed, and fill in --control-access-key and --data-access-key with the ID values saved in the previous step (items 2 and 3).
Modify --control-plane-url and --data-plane-url based on your system paths. Verify that each URL starts with https://.
apiVersion: batch/v1
kind: Job
metadata:
  name: gibbybackupjob
  # Optional:
  # namespace: <ANY NAMESPACE>
spec:
  template:
    spec:
      containers:
        - name: gibby
          image: gcr.io/iguazio/gibby:0.8.4
          volumeMounts:
            - mountPath: /tmp/bak
              name: backups-volume
          args:
            - "create"
            - "snapshot"
            - "--control-plane-url=<DASHBOARD URL (example - https://dashboard.default-tenant.app.dev84.lab.iguazeng.com)>"
            - "--data-plane-url=<WEBAPI URL (example - https://webapi.default-tenant.app.dev84.lab.iguazeng.com)>"
            - "--control-access-key=<CONTROL ACCESS KEY EXTRACTED IN STEP 2>"
            - "--data-access-key=<DATA ACCESS KEY EXTRACTED IN STEP 2>"
            - "--backup-name=<BACKUP NAME>"
            - "--path=/tmp/bak"
            # Optional:
            # Comma separated list of containers to backup
            # - "--containers=users,bigdata"
            # Split size threshold for backup files [MB]
            # - "--file-size-limit=512"
            # Max data size to include along with object attributes [Bytes] (<256KB) (v3.0.1+ only)
            # - "--object-scanner-max-included-data-size=131072"
            # Number of objects scanners replicas
            # - "--object-scanner-replicas=<N>"
            # Comma separated list of object types to skip during backup [objects,tiny_objects,streams]
            # - "--skip-object-types=tiny_objects,streams"
            # Enable/disable recovery after failure [enabled, disabled]
            # - "--recovery-mode=enabled"
            # Comma separated list of backuppers to run [v3io,platform_resources]
            # - "--backupppers=v3io,platform_resources"
            # List of platform resources to backup separated by semicolon and including URL query
            # - "--platform-resources=users?include=user_groups,primary_group,data_policy_layers,data_lifecycle_layers;user_groups;data_policy_groups;data_policy_layers;data_lifecycle_layers"
            # Outputs logger to a file
            # - "--logger-file-path=/backups/<LOG-FILENAME>"
            # Logger won't output with colors
            # - "--logger-no-color"
      restartPolicy: Never
      volumes:
        - name: backups-volume
          persistentVolumeClaim:
            claimName: <PVC NAME CREATED IN STEP 1>
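After applying the manifest, you can track the backup with standard kubectl commands, for example:

kubectl get job gibbybackupjob
kubectl logs -f job/gibbybackupjob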
Restore
STEP 1 - Run the Restore
Fill in the following yaml with all the relevant details, then start the restore by running kubectl apply -f <yaml name>:
apiVersion: batch/v1
kind: Job
metadata:
  name: gibbyrestorejob
  # Optional:
  # namespace: <ANY NAMESPACE>
spec:
  template:
    spec:
      containers:
        - name: gibby
          image: quay.io/iguazio/gibby:0.7.3
          volumeMounts:
            - mountPath: /tmp/bak
              name: backups-volume
          args:
            - "restore"
            - "backup"
            - "--control-plane-url=<DASHBOARD URL (example - https://dashboard.default-tenant.app.dev84.lab.iguazeng.com)>"
            - "--data-plane-url=<WEBAPI URL (example - https://webapi.default-tenant.app.dev84.lab.iguazeng.com)>"
            - "--control-access-key=<CONTROL ACCESS KEY EXTRACTED IN STEP 2>"
            - "--data-access-key=<DATA ACCESS KEY EXTRACTED IN STEP 2>"
            - "--backup-name=<BACKUP NAME>"
            - "--path=/tmp/bak"
            # Optional:
            # Comma separated list of containers to restore
            # - "--containers=users,bigdata"
            # Comma separated list of containers to restore under a different name
            # - "--target-containers=original_container_name1:target_container_name1,original_container_name2:target_container_name2"
            # Specific snapshot to restore
            # - "--snapshot-id=<snapshotID>"
            # Number of objects restorers replicas
            # - "--object-restorers-replicas=<N>"
            # Comma separated list of object types to skip during restore [objects,tiny_objects,streams]
            # - "--skip-object-types=tiny_objects,streams"
            # Enable/disable recovery after failure [enabled, disabled]
            # - "--recovery-mode=enabled"
            # Outputs logger to a file
            # - "--logger-file-path=/backups/<LOG-FILENAME>"
            # Logger won't output with colors
            # - "--logger-no-color"
      restartPolicy: Never
      volumes:
        - name: backups-volume
          persistentVolumeClaim:
            claimName: <PVC NAME CREATED IN STEP 1>
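As with the backup, you can follow the restore job and wait for it to complete, for example:

kubectl logs -f job/gibbyrestorejob
kubectl wait --for=condition=complete job/gibbyrestorejob --timeout=1h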
Gibby as a Stand-Alone Utility
Gibby can be run as a standalone backup/restore utility.
Backup
1. Contact Iguazio Support for the latest binary package of Gibby.
2. Run the following command, substituting your values for the data-access-key and control-access-key fields:
./gibctl-<version>-linux-amd64 create snapshot \
  --data-plane-url <WEBAPI URL for TENANT NAMESPACE> --control-plane-url <DASHBOARD URL for TENANT NAMESPACE> \
  --data-access-key <DATA ACCESS KEY> --control-access-key <CONTROL ACCESS KEY> \
  --backup-name <BACKUP NAME> --path <PATH TO GIBBY BACKUP LOCATION>
Example
./gibctl-<version>-linux-amd64 create snapshot \
  --control-plane-url https://dashboard.default-tenant.app.satsdl.satsnet.com.sg --data-access-key 40b59ba2-0e59-4ab7-8843-1bcdb1fb79ef \
  --control-access-key 1eec7eaa-9064-4699-8c9b-b2327943b0ae --backup-name efi-test --path /home/iguazio/backup
Restore
Run the following command, substituting your values for the data-access-key and control-access-key fields:
./gibctl-<version>-linux-amd64 restore backup \
--data-plane-url <WEBAPI URL for TENANT NAMESPACE> --control-plane-url <DASHBOARD URL for TENANT NAMESPACE> \
--data-access-key <DATA ACCESS KEY> --control-access-key <CONTROL ACCESS KEY> \
--backup-name <BACKUP NAME> --path <PATH TO GIBBY BACKUP LOCATION>
Example
./gibctl-<version>-linux-amd64 restore backup \
  --control-plane-url https://dashboard.default-tenant.app.satsdl.satsnet.com.sg --data-access-key 40b59ba2-0e59-4ab7-8843-1bcdb1fb79ef \
  --control-access-key 1eec7eaa-9064-4699-8c9b-b2327943b0ae --backup-name backup-test --path /home/iguazio/backup
Gibby as a Docker Image
Backup
Run the following command, substituting your values for the data-access-key and control-access-key fields:
docker run --rm -v <PATH TO GIBBY BACKUP LOCATION>:/gibby_backup gcr.io/iguazio/gibby:0.7.10 \
create snapshot \
--data-plane-url <WEBAPI URL for TENANT NAMESPACE> --control-plane-url <DASHBOARD URL for TENANT NAMESPACE> \
--data-access-key <DATA ACCESS KEY> --control-access-key <CONTROL ACCESS KEY> \
--path /gibby_backup \
--backup-name <BACKUP NAME>
Restore
Run the following command, substituting your values for the data-access-key and control-access-key fields:
docker run --rm -v <PATH TO GIBBY BACKUP LOCATION>:/gibby_backup gcr.io/iguazio/gibby:0.7.10 \
restore backup \
--data-plane-url <WEBAPI URL for TENANT NAMESPACE> --control-plane-url <DASHBOARD URL for TENANT NAMESPACE> \
--data-access-key <DATA ACCESS KEY> --control-access-key <CONTROL ACCESS KEY> \
--path /gibby_backup \
--backup-name <BACKUP NAME>
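For illustration, a backup of the example system referenced earlier might look as follows (the host path and backup name here are hypothetical; substitute your own URLs and access keys):

docker run --rm -v /home/iguazio/backup:/gibby_backup gcr.io/iguazio/gibby:0.7.10 \
  create snapshot \
  --data-plane-url https://webapi.default-tenant.app.dev84.lab.iguazeng.com \
  --control-plane-url https://dashboard.default-tenant.app.dev84.lab.iguazeng.com \
  --data-access-key <DATA ACCESS KEY> --control-access-key <CONTROL ACCESS KEY> \
  --path /gibby_backup \
  --backup-name nightly-backup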
Optional Backup Run Arguments
Use the following optional arguments (options) when running your backup.
Options | Description |
---|---|
-c, --containers=users,bigdata | Comma separated list of containers to backup |
--skip-object-types=tiny_objects,streams | Comma separated list of object types to skip during backup [objects,tiny_objects,streams] |
--recovery-mode=enabled | Enable/disable recovery after failure [enabled, disabled] |
--backupppers=v3io,platform_resources | Comma separated list of backuppers to run [v3io,platform_resources] |
--platform-resources=users?include=user_groups,primary_group,data_policy_layers,data_lifecycle_layers;user_groups;data_policy_groups;data_policy_layers;data_lifecycle_layers | List of platform resources to backup, separated by semicolons and including the URL query |
--attribute-scattering=enabled | Enable/disable attribute scattering [enabled/disabled] |
--available-storage-space-margin=25 | Additional storage space overhead required for backup [percentage]. Set to 0 to disable storage space validation |
--backup-config-spec= | Path to yaml file with backup config spec. Yaml parameters override command line arguments |
--v3io-retries=5 | Number of retries in case of v3io request failures |
--v3io-retry-interval=3 | Interval between v3io retries [sec] |
--request-timeout=60 | REST request timeout (against v3io and internal REST servers) [sec] |
--file-size-limit=512 | Split size threshold for backup (pak) files [MB] |
--object-scanner-task-listen-addr=:48541 | Listening port for object task server |
--report-max-failed-items-limit=50 | Maximum number of failed item details to include in the backup report |
--report-server-listen-addr=:58541 | Listening port for report server |
--logger-file-path=/backups/<LOG-FILENAME> | Outputs logger to a file |
--logger-no-color | Logger won't output with colors |
--object-scanner-max-included-data-size=131072 | Max data size to include along with object attributes [Bytes] (<256KB) (v3.0.1+ only) |
--object-scanner-replicas= | Number of object scanner replicas. Defaults to the number of VNs, or is derived from the available memory size |
--object-scanner-max-concurrent-writes= | Maximum number of concurrent object body write tasks per each object scanner (object body throttling). Set to negative number for unlimited |
--object-scanner-max-pending-object-bodies=500 | Max number of pending object bodies write tasks per each object scanner |
--object-scanner-max-pending-object-attributes=250 | Max number of object attributes pending for packing per each object scanner |
--object-scanner-choke-get-items=0 | Duration of object-scanner sleep (in milliseconds) before sending each get-items request |
--object-writer-ack-retries=5 | Number of retries in case of object writer ACK request failures against object task server |
--object-writer-ack-retry-interval=3 | Interval between object writer ACK retries against object task server [sec] |
--object-writer-replicas=24 | Number of replicas to perform object writes |
--tree-scanner-container-content-max-nodes-limit=1000 | Max number of tree nodes in get container contents response |
--tree-scanner-max-pending-tree-nodes=500 | Max number of pending tree node indexing task |
--tree-scanner-num-of-tree-node-directory-scanners=3 | Number of tree node directory fetchers per each tree node scanner |
--tree-scanner-replicas=3 | Number of replicas to perform tree node scanners |
--max-unused-buffers-pool-size=36 | Maximum number of unused buffers in buffer pool |
--packing-buffer-size=10 | The size of buffers used for packing/unpacking objects [MB] |
--profiling=disabled | Enable/disable profiling |
--profiling-port=6060 | Port for profiling (pprof) web server |
--backup-config-spec= | Path to .yaml file that specifies the backup configuration. This flag can be specified multiple times: Gibby merges the .yaml files. You could use, for example, one config file with control/data access-keys and another config file with the select configuration to exclude and/or include directories/subtrees. |
Example: Configuration file using the select option
This example shows how to filter files during creation of the snapshot.
spec:
  select:
    - container: container_a # container name
      # An object is part of the snapshot only if it matches one or more of the include patterns
      # and does not match any of the exclude patterns. No include patterns means include all.
      # Back up the subtree /A/B (anything not under /A/B will not be backed up).
      # Exclude the directory /A/B/C/D and the subtree /A/B/C/E from the backup.
      spec:
        - kind: include-subtree # include-dir/exclude-dir/include-subtree/exclude-subtree
          path: "/A/B"
        - kind: exclude-dir
          path: "/A/B/C/D" # notice that /A/B/C/D/F will be included ("exclude-dir" kind)
        - kind: exclude-subtree
          path: "/A/B/C/E" # notice that /A/B/C/E/F will not be included ("exclude-subtree" kind)
    - container: container_b
      .
      .
      .
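Assuming the spec above is saved as select.yaml (a hypothetical filename), it can be passed to a backup run through --backup-config-spec alongside the usual connection flags:

./gibctl-<version>-linux-amd64 create snapshot \
  --data-plane-url <WEBAPI URL for TENANT NAMESPACE> --control-plane-url <DASHBOARD URL for TENANT NAMESPACE> \
  --data-access-key <DATA ACCESS KEY> --control-access-key <CONTROL ACCESS KEY> \
  --backup-name <BACKUP NAME> --path <PATH TO GIBBY BACKUP LOCATION> \
  --backup-config-spec select.yaml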
Optional Restore Run Arguments
Use the following optional arguments (options) when running your restore.
Options | Description |
---|---|
--containers=users,bigdata | Comma separated list of containers to restore |
--target-containers=original_container_name1:target_container_name1,original_container_name2:target_container_name2 | Comma separated list of containers to restore under a different name |
--snapshot-id= | Specific snapshot to restore |
--skip-object-types=tiny_objects,streams | Comma separated list of object types to skip during restore [objects,tiny_objects,streams] |
--recovery-mode=enabled | Enable/disable recovery after failure [enabled, disabled] |
--scattering-attributes-limit=1700000 | Attributes are split into chunks during restore if their total size is above the limit |
--backup-config-spec= | Path to yaml file with backup config spec. Yaml parameters override command line arguments |
--v3io-retries=5 | Number of retries in case of v3io request failures |
--v3io-retry-interval=3 | Interval between v3io retries [sec] |
--request-timeout=60 | REST request timeout (against v3io and internal REST servers) [sec] |
--report-max-failed-items-limit=50 | Maximum number of failed item details to include in the restore report |
--report-server-listen-addr=:58541 | Listening port for report server |
--logger-file-path=/backups/<LOG-FILENAME> | Outputs logger to a file |
--logger-no-color | Logger won't output with colors |
--object-restorers-replicas= | Number of objects restorers replicas |
--object-restorer-num-of-object-writers=1 | Number of object writers per each object restorer |
--tree-restorer-replicas=3 | Number of replicas to perform tree node restore |
--tree-restorer-max-pending-tree-node-writes=100 | Max number of pending tree node write tasks |
--max-unused-buffers-pool-size=36 | Maximum number of unused buffers in buffer pool |
--packing-buffer-size=10 | The size of buffers used for packing/unpacking objects [MB] |
--profiling=disabled | Enable/disable profiling |
--profiling-port=6060 | Port for profiling (pprof) web server |
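For example, a restore that brings back only the users container under a different name, from a specific snapshot, might combine these options as follows (the target container name is hypothetical):

./gibctl-<version>-linux-amd64 restore backup \
  --data-plane-url <WEBAPI URL for TENANT NAMESPACE> --control-plane-url <DASHBOARD URL for TENANT NAMESPACE> \
  --data-access-key <DATA ACCESS KEY> --control-access-key <CONTROL ACCESS KEY> \
  --backup-name <BACKUP NAME> --path <PATH TO GIBBY BACKUP LOCATION> \
  --containers=users --target-containers=users:users_restored \
  --snapshot-id=<snapshotID>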