Kubernetes Best Practices
Fleeting
- make sure Alertmanager runs in HA (several clustered replicas)
- monitor the exposed endpoints from outside the cluster (see the Prometheus blackbox exporter)
- reduce cognitive ease: use a non-default namespace or hard-to-configure clusters, so that people are forced to be aware of the risk of their actions (I still need to think this through. Doesn't it contradict my value of pragmatism?)
- separate dev / testing / staging / prod:
  - dev: you can break everything
  - testing: the team can work on it, but try to keep it stable
  - staging: runs only when you want to simulate prod
  - prod: you never touch it
- add a linter to make sure resource requests and limits are set everywhere, or else you lose the scheduling guarantees (the scheduler places pods on nodes based on their requests)
- regularly check the liveness and readiness probes: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
- add a linter to ensure that network policies are defined
- put the checksum of your ConfigMaps in an annotation on the pods that use them, so that the pods restart when a ConfigMap changes
checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
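To show where the checksum annotation above might live, here is a minimal sketch of a Helm Deployment template. It also carries the resource requests/limits and the probes mentioned earlier in the list. The chart layout, resource names, image, port, `/healthz` path and all numeric values are assumptions made for illustration, not settings taken from a real chart.

```yaml
# templates/deployment.yaml (hypothetical chart; names, image and port are placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: {{ .Release.Name }}-app
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}-app
      annotations:
        # The hash changes whenever the rendered configmap.yaml changes, which
        # modifies the pod template and so triggers a rolling update.
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
    spec:
      containers:
        - name: app
          image: "example/app:1.0.0"  # placeholder image
          ports:
            - containerPort: 8080
          envFrom:
            - configMapRef:
                name: {{ .Release.Name }}-config  # assumed to be defined in configmap.yaml
          resources:
            requests:          # what the scheduler uses to pick a node
              cpu: 100m
              memory: 128Mi
            limits:            # hard caps enforced at runtime
              cpu: 500m
              memory: 256Mi
          readinessProbe:      # gates traffic to the pod
            httpGet:
              path: /healthz   # assumed health endpoint
              port: 8080
            periodSeconds: 10
          livenessProbe:       # restarts the container when it fails
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
```

The annotation sits under `spec.template.metadata`, not under the Deployment's own `metadata`: only a change to the pod template causes the pods to be replaced.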
- use an annotation whose value changes on every release to force the pods to restart
{{- if .Values.recreatePodsOnUpgrade }}
checksum/release-time: {{ now | unixEpoch | quote }}
{{- end }}
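For the network-policy item earlier in this list, one baseline a linter could require in every namespace is a default-deny policy, with more specific policies opening only the traffic that is needed. The sketch below is one such policy; the policy name and the `my-app` namespace are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-app      # hypothetical namespace
spec:
  podSelector: {}        # empty selector: applies to every pod in the namespace
  policyTypes:           # no ingress/egress rules listed, so both are denied
    - Ingress
    - Egress
```

Network policies are only enforced when the cluster's network plugin supports them, so the linter check is worth pairing with a check of the CNI in use.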