Troubleshooting permission errors when enabling persistence
Objective
This guide explains how to fix, on your own, the permission errors that can occur on an OVHcloud Managed Kubernetes Service cluster when deploying a Helm Chart or creating a deployment.
Issue explanation
Several Helm charts have been updated to follow security hardening best practices.
Running containers as a non-root user, for example, is now a common security requirement.
The main drawback of non-root containers concerns mounting persistent volumes in them.
Processes running inside these containers do not have the privileges needed to change the ownership of the filesystem that already exists in a volume.
A solution is to use the SecurityContext provided by Kubernetes to automatically modify the ownership of the attached volumes and to provide a StorageClass which supports modifying the volume's filesystem.
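As a minimal sketch of the first half of that solution, the snippet below writes a pod manifest that sets `fsGroup` in the pod-level `securityContext`. The pod name, image, UID/GID `1001`, and claim name are illustrative assumptions, not values taken from any particular chart.

```shell
# Write an example pod manifest to the current directory.
# All names and IDs below are hypothetical and only serve as an illustration.
cat > fsgroup-demo-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo
spec:
  securityContext:
    runAsUser: 1001   # run the container processes as a non-root user
    fsGroup: 1001     # ask Kubernetes to make mounted volumes group-owned by this GID
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "touch /data/write-test && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: fsgroup-demo-pvc   # assumed to exist already
EOF
```

If the StorageClass behind the claim does not let Kubernetes apply that ownership change, the `touch` command in this pod would be expected to fail with a permission error, which is the situation this guide addresses.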
However, the StorageClass used by default by the OVHcloud Managed Kubernetes Service did not support modifying the volume's filesystem.
This documentation provides workarounds that you can apply until our service is updated.
Observed behaviors
Some pods may enter the CrashLoopBackOff status a few seconds or minutes after being scheduled, because they lack write access to their persistent volumes.
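To check whether a workload shows this behavior, a sketch along the following lines can help. The two helper functions are hypothetical wrappers around standard `kubectl` commands, and the namespace defaults to `default` as an assumption.

```shell
# List pods with their status; affected pods show CrashLoopBackOff in the STATUS column.
# $1: namespace (optional, defaults to "default")
list_pods() {
  kubectl get pods --namespace "${1:-default}"
}

# Print the logs of the previous (crashed) container instance of a pod.
# $1: pod name, $2: namespace (optional, defaults to "default")
previous_logs() {
  kubectl logs "$1" --namespace "${2:-default}" --previous
}
```

Calling `list_pods`, then `previous_logs <pod-name>` on a pod in CrashLoopBackOff, would print the messages emitted just before the container last exited.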
Example of error logs:
Provided solutions
- We (the OVHcloud Managed Kubernetes Service team) are working on a patch to be released in early 2022. If you are not impacted by the issue, please do not update your Helm Chart deployment (only recent Helm Charts seem to use the security context that triggers this issue) and wait until a new version of your managed service is available through the OVHcloud console.
- If you are using the Bitnami Helm Charts and want to fix this behavior quickly without waiting for our patch, you can follow the instructions described in this documentation: https://docs.bitnami.com/kubernetes/faq/troubleshooting/troubleshooting-helm-chart-issues/
- If you are impacted by this issue, your Helm Chart provider does not offer a proper solution, and you cannot wait for our official patch, you can apply the workaround below. **This solution is not recommended if you don't know what you are doing and only works with clusters above `1.20` version.**
If you are in this case, please follow these instructions at your own risk:
- Verify which `StorageClass` you are using by default (generally `csi-cinder-high-speed`):
If your cluster is deployed in a region that supports LUKS encrypted storage, you will also see the -luks variants of the storage classes listed above.
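A minimal sketch of that check: `kubectl get storageclass` lists every class and appends `(default)` to the name of the default one, so filtering on that marker isolates it. The function name is hypothetical.

```shell
# Print only the StorageClass flagged as default.
# kubectl appends "(default)" to its name in the table output.
default_storage_class() {
  kubectl get storageclass | grep -F '(default)'
}
```

Running `kubectl get storageclass` without the filter shows the full list, which is useful for comparing the available classes.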
- Delete the `StorageClass` that you are using by default:
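The provisioner and parameters of a StorageClass cannot be edited once it exists, which is why the class is deleted and recreated rather than patched. The sketch below first saves the current definition to a file so that it can be restored or compared later. The function name and the backup file name are assumptions.

```shell
# Save the current definition of a StorageClass, then delete the class.
# $1: StorageClass name (optional, defaults to csi-cinder-high-speed)
backup_and_delete_storage_class() {
  name="${1:-csi-cinder-high-speed}"
  # The delete only runs if the export succeeded.
  kubectl get storageclass "$name" -o yaml > "${name}-backup.yaml" &&
    kubectl delete storageclass "$name"
}
```

Calling `backup_and_delete_storage_class` without arguments would act on `csi-cinder-high-speed`. Deleting a StorageClass does not delete the volumes that were already provisioned from it.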
- Create a new `StorageClass` with the required fix:
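The exact manifest is not reproduced here, so the following is only a sketch. It assumes that the fix consists of adding an explicit `fsType` parameter, which is what allows Kubernetes to apply `fsGroup` ownership changes to ReadWriteOnce CSI volumes under the default policy. The `type` value and the other fields are assumptions as well; compare them with the definition of your existing class before applying anything.

```shell
# Write a replacement StorageClass manifest.
# Field values are assumptions: check them against your existing class.
cat > fixed-storage-class.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cinder-high-speed
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # keep it as the default class
provisioner: cinder.csi.openstack.org
parameters:
  type: high-speed   # assumed Cinder volume type
  fsType: ext4       # assumed fix: explicit filesystem type
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
EOF

# Create the StorageClass from the manifest written above.
apply_fixed_storage_class() {
  kubectl apply -f fixed-storage-class.yaml
}
```

After calling `apply_fixed_storage_class`, listing the storage classes again should show the recreated class marked as default.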
- Delete the affected Helm Chart
For example, with the `bitnami/wordpress` Helm Chart, which is affected by this behavior:
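A sketch of this step with Helm 3 follows. The release name is whatever was chosen at install time, so it is passed as an argument; the function names are hypothetical and the namespace defaults to `default` as an assumption.

```shell
# List the installed releases to find the name of the one to remove.
# $1: namespace (optional, defaults to "default")
list_releases() {
  helm list --namespace "${1:-default}"
}

# Uninstall a release.
# $1: release name, $2: namespace (optional, defaults to "default")
uninstall_release() {
  helm uninstall "$1" --namespace "${2:-default}"
}
```

For a WordPress release installed under the hypothetical name `my-wordpress`, the call would be `uninstall_release my-wordpress`.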
Also verify that the related PersistentVolumeClaim and PersistentVolume have been deleted before reinstalling the Helm Chart:
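This check matters because claims created by a StatefulSet are typically left in place when a release is uninstalled, and a reinstalled chart would then reuse the old volume with its old ownership. The sketch below uses hypothetical helper names and assumes the `default` namespace unless another one is given.

```shell
# Show the claims left in a namespace and all remaining volumes.
# $1: namespace (optional, defaults to "default")
list_leftover_volumes() {
  kubectl get persistentvolumeclaims --namespace "${1:-default}"
  kubectl get persistentvolumes
}

# Delete a leftover claim. With a "Delete" reclaim policy, the bound volume
# is removed along with it, so the data it holds is lost.
# $1: claim name, $2: namespace (optional, defaults to "default")
delete_claim() {
  kubectl delete persistentvolumeclaim "$1" --namespace "${2:-default}"
}
```

Only call `delete_claim` on a claim whose data you no longer need or have backed up.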
- Reinstall the affected Helm Chart or deployment
For example, with the `bitnami/wordpress` Helm Chart, which is affected by this behavior:
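A sketch of the reinstallation, assuming Helm 3 and the public Bitnami chart repository. The release name `my-wordpress` and the function name are hypothetical, and any custom values used for the original installation would need to be passed again.

```shell
# Reinstall the WordPress chart and list the resulting pods.
# $1: release name (optional, defaults to "my-wordpress")
# $2: namespace (optional, defaults to "default")
install_wordpress() {
  helm repo add bitnami https://charts.bitnami.com/bitnami
  helm repo update
  helm install "${1:-my-wordpress}" bitnami/wordpress --namespace "${2:-default}"
  # New volumes are provisioned from the recreated StorageClass;
  # the pods may take a few minutes to reach the Running status.
  kubectl get pods --namespace "${2:-default}"
}
```

Running `kubectl get pods` again after a few minutes shows whether the pods stay in the Running status instead of falling back to CrashLoopBackOff.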
The pods should now be up and running, which means that the permission errors related to the PersistentVolumes are fixed.
Go further
To learn more about using your Kubernetes cluster the practical way, we invite you to look at our OVHcloud Managed Kubernetes documentation.
- If you need training or technical assistance to implement our solutions, contact your sales representative or click on this link to get a quote and ask our Professional Services experts to assist you with the specific use case of your project.
- Join our community of users.