
Why Does K3d/K3s/Kubernetes Clean My Images?

Fleeting

Pods evicted due to lack of disk space

Pods go to evicted state after doing X

Related issues: #133 - Pods evicted due to NodeHasDiskPressure (collection of #119 and #130)

Background: somehow docker runs out of space for the k3d node containers, which triggers a hard eviction in the kubelet

Possible fix/workaround by @zer0def:

  • use a docker storage driver which cleans up properly (e.g. overlay2)
  • clean up or expand the docker root filesystem
  • change the kubelet’s eviction thresholds upon cluster creation:

    k3d cluster create \
      --k3s-agent-arg '--kubelet-arg=eviction-hard=imagefs.available<1%,nodefs.available<1%' \
      --k3s-agent-arg '--kubelet-arg=eviction-minimum-reclaim=imagefs.available=1%,nodefs.available=1%'

https://github.com/rancher/k3d/blob/main/docs/faq/faq.md
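To check whether disk pressure is indeed what triggers the cleanup, you can look at the node condition and at the free space inside the node container. A minimal sketch, assuming a cluster named mycluster and the node names k3d generates by default:

    # does the kubelet report disk pressure on the node?
    kubectl describe node k3d-mycluster-agent-0 | grep -i diskpressure

    # how much space is left inside the node container? (k3s keeps its data under /var/lib/rancher)
    docker exec k3d-mycluster-agent-0 df -h /var/lib/rancher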

How the cluster (as in k3s/Kubernetes) acts is mostly rancher/k3s’ concern. What I suspect is:

  • image garbage collection kicks in, because your node(s) is reaching filesystem capacity
  • you’re possibly running on Mac or Windows with Docker for Desktop, where the VM has limited disk space available

https://github.com/rancher/k3d/issues/216
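If Docker for Desktop is the culprit, standard Docker commands show how much space Docker is using and where its data lives:

    # space consumed by images, containers, volumes and build cache
    docker system df

    # where Docker stores its data (inside the Desktop VM on Mac/Windows)
    docker info --format '{{.DockerRootDir}}'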

Importing images works great, but it seems that if the images are not used, they get deleted. How can we prevent this from happening so soon?

https://github.com/rancher/k3d/issues/216
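To verify whether an imported image has already been garbage collected, you can list the images the container runtime still knows about inside the node; crictl ships in the k3s node image (the node name here is an assumption):

    docker exec k3d-mycluster-server-0 crictl images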

You can tune image garbage collection by adjusting the following thresholds with these kubelet flags:

  • image-gc-high-threshold, the percent of disk usage which triggers image garbage collection. Default is 85%.
  • image-gc-low-threshold, the percent of disk usage to which image garbage collection attempts to free. Default is 80%.

You can customize the garbage collection policy through the following kubelet flags:

  • minimum-container-ttl-duration, minimum age for a finished container before it is garbage collected. Default is 0 minutes, which means every finished container will be garbage collected.
  • maximum-dead-containers-per-container, maximum number of old instances to be retained per container. Default is 1.
  • maximum-dead-containers, maximum number of old instances of containers to retain globally. Default is -1, which means there is no global limit.

Containers can potentially be garbage collected before their usefulness has expired, and they can contain logs and other data useful for troubleshooting. A sufficiently large value for maximum-dead-containers-per-container is therefore highly recommended, so that at least one dead container is retained per expected container; a larger value for maximum-dead-containers is recommended for the same reason.

https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/#user-configuration
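Following the same --k3s-agent-arg pattern as the FAQ workaround quoted above, these thresholds could be passed to the kubelet at cluster creation time. A sketch with illustrative values, not recommendations:

    k3d cluster create mycluster \
      --k3s-agent-arg '--kubelet-arg=image-gc-high-threshold=95' \
      --k3s-agent-arg '--kubelet-arg=image-gc-low-threshold=90' \
      --k3s-agent-arg '--kubelet-arg=minimum-container-ttl-duration=2m' \
      --k3s-agent-arg '--kubelet-arg=maximum-dead-containers-per-container=2'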

Yes, same as with k8s; the default is 80%.

https://github.com/k3s-io/k3s/issues/813

--image-gc-high-threshold int32     Default: 85

The percent of disk usage after which image garbage collection is always run. Values must be within the range [0, 100]. To disable image garbage collection, set to 100. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet’s `--config` flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)

--image-gc-low-threshold int32     Default: 80

The percent of disk usage before which image garbage collection is never run; this is the lowest disk usage to garbage collect to. Values must be within the range [0, 100] and should not be larger than that of `--image-gc-high-threshold`. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet’s `--config` flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)

https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
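Since both flags are deprecated in favor of the kubelet config file, the same thresholds can be written there instead. A minimal sketch, assuming you pick the file path yourself and hand it to the kubelet via --config; the field names come from the KubeletConfiguration v1beta1 API:

    cat > kubelet-config.yaml <<'EOF'
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    imageGCHighThresholdPercent: 85
    imageGCLowThresholdPercent: 80
    EOF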

Existing Flag → New Flag (Rationale):

  • --image-gc-high-threshold → --eviction-hard or --eviction-soft (existing eviction signals can trigger image garbage collection)
  • --image-gc-low-threshold → --eviction-minimum-reclaim (eviction reclaims achieve the same behavior)

https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/#user-configuration
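In other words, the image-gc thresholds can be expressed as eviction settings instead. A sketch with example percentages (assumptions, not documented defaults):

    kubelet \
      --eviction-hard='imagefs.available<15%' \
      --eviction-minimum-reclaim='imagefs.available=5%'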

k3s sets this kubelet default in its agent code:

"eviction-minimum-reclaim": "imagefs.available=10%,nodefs.available=10%"

https://github.com/k3s-io/k3s/blob/master/pkg/daemons/agent/agent.go#L67
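So on k3s/k3d, keeping more images around than this built-in default allows means overriding it explicitly, in the same style as the FAQ workaround above; the percentages here are examples only:

    k3d cluster create \
      --k3s-agent-arg '--kubelet-arg=eviction-minimum-reclaim=imagefs.available=5%,nodefs.available=5%' \
      --k3s-agent-arg '--kubelet-arg=eviction-hard=imagefs.available<5%,nodefs.available<5%'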