Backing up an OVHcloud Managed Kubernetes cluster using Velero
Objective
In this tutorial, we are using Velero to back up and restore an OVHcloud Managed Kubernetes cluster.
Velero is an open source tool to safely back up and restore, perform disaster recovery, and migrate Kubernetes cluster resources.
For cluster configuration backup, we are using our Public Cloud's Swift Object Storage with the Swift S3 API as the storage backend for Velero. Velero uses the Amazon S3 protocol to store the cluster backups on an S3* compatible object storage.
For Persistent Volumes backup, we are using Velero's CSI snapshot support, which enables Velero to back up and restore CSI-backed volumes using the Kubernetes CSI Snapshot Beta APIs.
Before you begin
This tutorial presupposes that you already have a working OVHcloud Managed Kubernetes cluster, and some basic knowledge of how to operate it. If you want to know more about those topics, please look at the deploying a Hello World application documentation.
Instructions
Creating the Object Storage bucket for Velero
Velero needs an S3-compatible bucket as its storage backend to store the data from your cluster. In this section you will create your S3 bucket on OVHcloud Object Storage.
Preparing your working environment
Before creating your Object Storage bucket you need to:
Setting the OpenStack environment variables
You should now have access to your OpenStack RC file, with a filename like <user_name>-openrc.sh, and the username and password for your OpenStack account.
Set the environment variables by sourcing the OpenStack RC file:
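As a minimal sketch, sourcing the RC file could look like the following. The helper name load_openstack_env is hypothetical, and the file name in the example call stands for whichever RC file you downloaded.

```shell
# Hypothetical helper: load the OpenStack variables into the current shell.
# The RC file typically exports OS_* variables and may prompt for a password.
load_openstack_env() {
  rc_file="$1"
  if [ ! -f "$rc_file" ]; then
    echo "RC file not found: $rc_file" >&2
    return 1
  fi
  # "." is the POSIX spelling of "source"
  . "$rc_file"
}

# Example call, using your own file name:
# load_openstack_env ./my-user-openrc.sh
```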
The shell will prompt you for your OpenStack password.
Creating EC2 credentials
Object Storage tokens work differently: you need two parameters (access and secret) to generate an Object Storage token.
These credentials will be safely stored in Keystone. To generate them with the python-openstackclient:
Please write down the access and secret parameters:
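A possible way to create the credentials and display only the two fields you need is sketched below. The helper name create_ec2_credentials is hypothetical; it assumes the openstack client is installed and the RC file has been sourced.

```shell
# Hypothetical helper: create EC2 credentials in Keystone and show only the
# "access" and "secret" fields ("-c" selects columns in the OpenStack CLI).
create_ec2_credentials() {
  openstack ec2 credentials create -c access -c secret
}

# Example call:
# create_ec2_credentials
```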
Configuring awscli client
Install the awscli client:
Create the credentials file at ~/.aws/credentials:
Where <AWS_ACCESS_KEY_ID> and <AWS_SECRET_ACCESS_KEY> are the access and secret Object Storage credentials generated in the previous step.
Complete the configuration and write it to ~/.aws/config:
Replace s3_region with the Public Cloud region without digits (e.g. gra, sbg, bhs).
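A minimal sketch of the credentials file, written from the shell, could look like this. Be aware that it overwrites any existing ~/.aws/credentials file, and that the two placeholders must be replaced with your own values.

```shell
# Write (and overwrite) the AWS credentials file with a single default profile.
mkdir -p "$HOME/.aws"
cat > "$HOME/.aws/credentials" <<'EOF'
[default]
aws_access_key_id = <AWS_ACCESS_KEY_ID>
aws_secret_access_key = <AWS_SECRET_ACCESS_KEY>
EOF
# Keep the secret readable by your user only.
chmod 600 "$HOME/.aws/credentials"
```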
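The configuration file could be sketched as follows. Several points are assumptions to check against the OVHcloud Object Storage documentation: the S3_REGION shell variable exists only in this sketch, the endpoint pattern https://s3.<region>.cloud.ovh.net is assumed, and the endpoint_url setting is only read by recent versions of the AWS CLI (with older versions, pass --endpoint-url on the command line instead).

```shell
# S3_REGION is a variable introduced for this sketch; "gra" is one of the
# example regions listed above.
S3_REGION="gra"

# Write (and overwrite) the AWS CLI configuration for the default profile.
mkdir -p "$HOME/.aws"
cat > "$HOME/.aws/config" <<EOF
[default]
region = ${S3_REGION}
endpoint_url = https://s3.${S3_REGION}.cloud.ovh.net
EOF
```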
You can test your settings by running this command:
If your .aws/config only contains one profile, the argument --profile default is optional.
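One way to test the settings is to list your buckets, as in this sketch. The helper name check_s3_settings is hypothetical, and the endpoint pattern is an assumption; the endpoint is passed explicitly so the call does not depend on the AWS CLI version.

```shell
# Hypothetical helper: list buckets with the default profile.
# $1: Public Cloud region without digits (for example gra)
check_s3_settings() {
  region="$1"
  aws --profile default \
    --endpoint-url "https://s3.${region}.cloud.ovh.net" \
    s3 ls
}

# Example call:
# check_s3_settings gra
```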
Create an Object Storage bucket for Velero
Create a new bucket:
Make sure your bucket name is specific enough, or a BucketAlreadyExists error will occur, as bucket names must be unique across all S3 users.
List your buckets:
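The two bucket steps above can be sketched as a single helper. The name create_velero_bucket, the bucket name velero-s3 in the example call and the endpoint pattern are assumptions made for illustration.

```shell
# Hypothetical helper: create the bucket, then list buckets to confirm.
# $1: bucket name, $2: Public Cloud region without digits
create_velero_bucket() {
  bucket="$1"
  region="$2"
  endpoint="https://s3.${region}.cloud.ovh.net"
  aws --endpoint-url "$endpoint" s3 mb "s3://${bucket}" &&
    aws --endpoint-url "$endpoint" s3 ls
}

# Example call:
# create_velero_bucket velero-s3 gra
```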
Installing Velero
We strongly recommend that you use an official release of Velero. The tarballs for each release contain the velero command-line client. Expand the tarball and add it to your PATH.
Install Velero, including all prerequisites, into the cluster and start the deployment. This will create a namespace called velero, and place a deployment named velero in it.
Example for velero v1.16.2:
Replace s3_region with the Public Cloud region without digits (e.g. gra, sbg, bhs).
Starting with version 1.14, the plugin-for-csi is integrated into Velero. To upgrade from an older version, follow the upgrade notes: Upgrade-to-1.14. Please refer to these links to check the compatibility of Velero's plugins: velero-plugin-for-aws and velero-plugin-for-csi.
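An install command along these lines is one possible sketch. The helper name install_velero, the default plugin tag and the endpoint pattern are assumptions: check the velero-plugin-for-aws compatibility matrix for the tag that matches your Velero release before using it.

```shell
# Hypothetical helper: install Velero with the AWS (S3) plugin and CSI support.
# $1: bucket name, $2: region without digits, $3: optional plugin image
install_velero() {
  bucket="$1"
  region="$2"
  # Assumed default tag; verify it against the plugin compatibility matrix.
  plugin_image="${3:-velero/velero-plugin-for-aws:v1.12.0}"
  velero install \
    --features=EnableCSI \
    --provider aws \
    --plugins "$plugin_image" \
    --bucket "$bucket" \
    --secret-file "$HOME/.aws/credentials" \
    --backup-location-config "region=${region},s3ForcePathStyle=true,s3Url=https://s3.${region}.cloud.ovh.net" \
    --snapshot-location-config "region=${region}"
}

# Example call:
# install_velero velero-s3 gra
```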
In order to allow Velero to do Volume Snapshots, we need to deploy a new VolumeSnapshotClass.
Create a velero-snapclass.yaml file with this content:
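A VolumeSnapshotClass of this kind could look like the sketch below. The class name is hypothetical. The label is the one Velero uses to select a snapshot class, and the driver name is the one used by the OpenStack Cinder CSI driver; the force-create parameter is assumed to be needed so that volumes in use can be snapshotted.

```shell
# Write the VolumeSnapshotClass manifest to the current directory.
cat > velero-snapclass.yaml <<'EOF'
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  # Hypothetical name chosen for this sketch
  name: csi-cinder-snapclass-velero
  labels:
    # Tells Velero to use this class for CSI snapshots
    velero.io/csi-volumesnapshot-class: "true"
driver: cinder.csi.openstack.org
deletionPolicy: Delete
parameters:
  # Assumption: allow snapshots of volumes that are attached to a Pod
  force-create: "true"
EOF
```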
Apply the new class:
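Applying the class and listing the snapshot classes known to the cluster could be sketched as follows. The helper name apply_snapclass is hypothetical.

```shell
# Hypothetical helper: apply the manifest, then list VolumeSnapshotClasses.
apply_snapclass() {
  kubectl apply -f velero-snapclass.yaml &&
    kubectl get volumesnapshotclass
}

# Example call:
# apply_snapclass
```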
In our case, the result looks like this:
Verifying Velero is working without Persistent Volumes
To verify that Velero is working correctly, let's test with one example deployment:
Copy the following code into a nginx-example-without-pv.yml file:
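As an illustration, a manifest of this kind could contain a namespace, a small nginx deployment and a service. Everything except the nginx-example namespace and the file name is an assumption made for this sketch (resource names, image tag, replica count).

```shell
# Write a minimal example application without any Persistent Volume.
cat > nginx-example-without-pv.yml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: nginx-example
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
EOF
```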
Deploy it to your cluster:
Check that the Pods have been created:
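Deploying the manifest and checking the Pods can be sketched like this. The helper name deploy_example_app is hypothetical.

```shell
# Hypothetical helper: apply a manifest, then list the Pods it created.
# $1: path to the manifest file
deploy_example_app() {
  kubectl apply -f "$1" &&
    kubectl get pods -n nginx-example
}

# Example call:
# deploy_example_app nginx-example-without-pv.yml
```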
Create a backup of the namespace:
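Since Velero 1.14, the CSI plugin is built into Velero, so CSI-backed Persistent Volumes are backed up through CSI VolumeSnapshots when the EnableCSI feature is active. The --snapshot-move-data flag is not required in this setup and can be safely omitted. It is only needed when you also want Velero to copy the snapshot data to the object storage (CSI snapshot data movement), which relies on node agents.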
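A minimal sketch of the backup command follows. The helper name backup_namespace and the backup name nginx-backup in the example call are assumptions.

```shell
# Hypothetical helper: back up a single namespace.
# $1: backup name, $2: namespace to include
backup_namespace() {
  velero backup create "$1" --include-namespaces "$2"
}

# Example call:
# backup_namespace nginx-backup nginx-example
```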
Verify that the backup is done:
Wait until the status is equal to Completed.
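Instead of checking by hand, the wait can be scripted by polling the phase of the Backup resource, as in this sketch. The helper name wait_for_backup and the POLL_SECONDS variable are hypothetical; the sketch assumes Velero runs in the velero namespace.

```shell
# Hypothetical helper: poll a Velero backup until it completes or fails.
# $1: backup name, $2: maximum number of checks (default 30)
wait_for_backup() {
  name="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # Read the phase straight from the Backup custom resource
    phase="$(kubectl get backups.velero.io "$name" -n velero \
      -o jsonpath='{.status.phase}')"
    echo "backup ${name}: ${phase:-unknown}"
    case "$phase" in
      Completed) return 0 ;;
      Failed|PartiallyFailed|FailedValidation) return 1 ;;
    esac
    tries=$((tries - 1))
    sleep "${POLL_SECONDS:-10}"
  done
  # Still not completed after all checks
  return 1
}

# Example call:
# wait_for_backup nginx-backup
```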
Simulate a disaster:
Restore the deleted namespace:
Verify that the restore is correctly done:
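The last three steps (disaster, restore, verification) can be sketched as three small helpers. The helper names and the backup name in the example calls are assumptions.

```shell
# Hypothetical helper: delete the whole example namespace.
simulate_disaster() {
  kubectl delete namespace nginx-example
}

# Hypothetical helper: restore everything contained in a backup.
# $1: name of the backup to restore from
restore_from_backup() {
  velero restore create --from-backup "$1"
}

# Hypothetical helper: list restores, then the recreated resources.
check_restore() {
  velero restore get &&
    kubectl get all -n nginx-example
}

# Example calls:
# simulate_disaster
# restore_from_backup nginx-backup
# check_restore
```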
You can see that the resources were recreated as expected:
Before continuing, clean the nginx-example namespace:
Verifying Velero is working with Persistent Volumes
Node Agents (Restic): Node agents are mainly used for file-level backups or for Persistent Volumes that are not managed by CSI.
For Persistent Volumes managed via CSI (as with OVHcloud Managed Kubernetes), CSI VolumeSnapshots are the recommended method. In this case, there is no need to deploy Node Agents, as backups and restores are handled natively by the CSI snapshot mechanism.
To verify that Velero is working correctly with Volume Snapshots of Persistent Volumes, let's test with one example deployment:
Copy the following code into a nginx-example-with-pv.yml file:
Pay attention to the deployment part of this manifest: you will see that we have defined a .spec.strategy.type. It specifies the strategy used to replace old Pods with new ones, and we have set it to Recreate, so all existing Pods are killed before new ones are created.
We do so because the Storage Class we are using, csi-cinder-high-speed, only supports the ReadWriteOnce access mode, so only one Pod can write to the Persistent Volume at any given time.
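A manifest matching this description could look like the sketch below. The csi-cinder-high-speed Storage Class, the Recreate strategy and the ReadWriteOnce access mode come from the text above; resource names, volume size, image tag and mount path are assumptions.

```shell
# Write an example application that serves its pages from a Persistent Volume.
cat > nginx-example-with-pv.yml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-html
  namespace: nginx-example
spec:
  storageClassName: csi-cinder-high-speed
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 1
  strategy:
    # Kill the old Pod before starting a new one (volume is ReadWriteOnce)
    type: Recreate
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        ports:
        - containerPort: 80
        volumeMounts:
        - name: html
          mountPath: /usr/share/nginx/html
      volumes:
      - name: html
        persistentVolumeClaim:
          claimName: nginx-html
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: nginx-example
spec:
  type: LoadBalancer
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
EOF
```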
Deploy it to the cluster:
Create an index.html file:
Check that the webserver responds as expected:
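The deploy, write and check steps above can be sketched as follows. The helper names are hypothetical, and the deployment name, service name and page content are assumptions that must match your own manifest. The sketch also assumes curl is available and that the service has received an external IP.

```shell
# Hypothetical helper: deploy the example application with its volume.
deploy_with_pv() {
  kubectl apply -f nginx-example-with-pv.yml
}

# Hypothetical helper: write an index.html file onto the Persistent Volume.
write_index_page() {
  kubectl exec -n nginx-example deploy/nginx-deployment -- \
    sh -c 'echo "Hello from the Persistent Volume" > /usr/share/nginx/html/index.html'
}

# Hypothetical helper: fetch the page through the LoadBalancer service.
fetch_index_page() {
  ip="$(kubectl get service nginx-service -n nginx-example \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
  curl -s "http://${ip}/"
}

# Example calls:
# deploy_with_pv
# write_index_page
# fetch_index_page
```

Running fetch_index_page again after the restore is a simple way to confirm that the file content came back with the volume.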
Now we can ask Velero to back up the namespace:
Reminder: --snapshot-move-data is not needed for CSI-backed volumes. It has been removed from the command below.
Check that the backup has finished successfully:
Describe the backup to confirm that the CSI volumesnapshots were included in the backup:
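The backup, check and describe steps above can be sketched as one helper. The helper name backup_with_volumes and the backup name in the example call are assumptions; --wait makes the client block until the backup has finished, and --details prints the per-volume information.

```shell
# Hypothetical helper: back up the namespace, wait, then show the details.
# $1: backup name
backup_with_volumes() {
  name="$1"
  velero backup create "$name" --include-namespaces nginx-example --wait &&
    velero backup get "$name" &&
    velero backup describe "$name" --details
}

# Example call:
# backup_with_volumes nginx-with-pv-backup
```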
Simulate a disaster:
Restore the deleted namespace:
Verify that the restore is correctly done:
Check that the webserver responds as expected:
The content of the file was restored!
Scheduling backups with Velero
With Velero you can schedule regular backups, which is a good basis for disaster recovery.
In this guide you will create a Velero Schedule resource that will create regular backups.
Copy the following code into a schedule.yml file:
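A Schedule resource could be sketched as below. The schedule name, the cron expression (every 10 minutes, chosen so a backup appears quickly) and the retention period are assumptions to adapt to your needs.

```shell
# Write a Velero Schedule that backs up the example namespace periodically.
cat > schedule.yml <<'EOF'
apiVersion: velero.io/v1
kind: Schedule
metadata:
  # Hypothetical name chosen for this sketch
  name: nginx-example-schedule
  namespace: velero
spec:
  # Standard cron syntax: every 10 minutes
  schedule: "*/10 * * * *"
  template:
    includedNamespaces:
    - nginx-example
    # Keep each backup for 7 days
    ttl: 168h0m0s
EOF
```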
Apply it to the cluster:
Verify that the schedule is correctly created:
Wait several minutes and verify that a backup has been created automatically:
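The apply and verification steps above can be sketched as two helpers. The helper names are hypothetical.

```shell
# Hypothetical helper: create the schedule and confirm Velero sees it.
apply_schedule() {
  kubectl apply -f schedule.yml &&
    velero schedule get
}

# Hypothetical helper: list all backups, including the scheduled ones.
list_backups() {
  velero backup get
}

# Example calls:
# apply_schedule
# list_backups
```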
You should have a result like this:
Cleanup
Clean the nginx-example namespace:
Delete the Velero schedule:
Delete the existing Velero backups:
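The three cleanup steps can be sketched as one helper. The helper name cleanup_velero_demo and the names in the example call are assumptions; --confirm skips the interactive confirmation prompt.

```shell
# Hypothetical helper: remove the example namespace, a schedule and backups.
# $1: schedule name, remaining arguments: backup names to delete
cleanup_velero_demo() {
  schedule="$1"
  shift
  kubectl delete namespace nginx-example
  velero schedule delete "$schedule" --confirm
  for backup in "$@"; do
    velero backup delete "$backup" --confirm
  done
}

# Example call:
# cleanup_velero_demo nginx-example-schedule nginx-backup
```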
Where do we go from here
So now you have a working Velero on your cluster.
Please refer to the official Velero documentation to learn how to use it, including scheduling backups, using pre- and post-backup hooks and other matters.
Go further
If you need training or technical assistance to implement our solutions, contact your sales representative or click on this link to get a quote and ask our Professional Services experts to assist you with your project's specific use case.
Join our community of users.
*: S3 is a trademark of Amazon Technologies, Inc. OVHcloud’s service is not sponsored by, endorsed by, or otherwise affiliated with Amazon Technologies, Inc.