How I Debug My K8s Tests Not Running
Sometimes, the tests of clk k8s run forever, with a log like
+test | action: Getting node:None
+test | debug: Given the distribution kind, I inferred the context kind-clk-k8s
+test | action: run: kubectl --context kind-clk-k8s get node --namespace default --output json
+test | warning: Waited 780s for the node to be ready. It's been a long time now, something may be wrong. I'm still waiting for eternity
+test | warning: clk-k8s-control-plane: KubeletHasSufficientMemory, KubeletHasNoDiskPressure, KubeletHasSufficientPID, KubeletNotReady
It waits forever for the cluster to start.
In that case, I want to understand what keeps the cluster from becoming ready.
The layers I have to deal with are
- earthly, which runs the tests using WITH DOCKER. I can enter it using
  - docker exec -ti earthly-buildkitd sh, to enter the earthly builder docker container
  - buildkit-runc list, to find the running earthly job
  - buildkit-runc exec -t o7mtrt511xmf9okk0pcgjbtqv bash, to enter it
- docker, in the WITH DOCKER layer, hence I can use
  - docker ps, to find the running containers
  - docker exec -ti clk-k8s-control-plane bash, to enter the running instance of kind if need be
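The first layer can be scripted so that the job ID does not have to be copied by hand. This is only a sketch: first_job_id and enter_job are hypothetical helper names, and it assumes that buildkit-runc list prints the usual runc list layout (a header line, then one row per container with the ID in the first column) and that the first row is the job I want.

```shell
# Print the ID of the first container listed in `runc list`-style output read
# from stdin: a header line, then one row per container with the ID first.
first_job_id() {
    awk 'NR > 1 && NF > 0 { print $1; exit }'
}

# Open a shell in that job (assumes the builder container is named
# earthly-buildkitd, as in the text). Nothing runs until this is called.
enter_job() {
    job="$(docker exec earthly-buildkitd buildkit-runc list | first_job_id)"
    [ -n "$job" ] || { echo "no running earthly job found" >&2; return 1; }
    docker exec -ti earthly-buildkitd buildkit-runc exec -t "$job" bash
}
```

Calling enter_job from the host then lands directly in the earthly job. If several jobs run at once, the first row may not be the right one.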
In the earthly job, I can use kubectl to query the cluster.
This can be made a one-liner to ease debugging, using a temporary clk alias:
clk alias set run exec -- docker exec earthly-buildkitd buildkit-runc exec o7mtrt511xmf9okk0pcgjbtqv
New global alias for run: exec docker exec earthly-buildkitd buildkit-runc exec o7mtrt511xmf9okk0pcgjbtqv
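Without clk at hand, roughly the same shortcut can be sketched as a plain shell function. The names run_in_job and JOB_ID are hypothetical. JOB_ID is assumed to hold the ID reported by buildkit-runc list.

```shell
# Run any command inside the earthly job, mirroring the clk alias above.
# JOB_ID must be set to the ID printed by `buildkit-runc list`.
run_in_job() {
    docker exec earthly-buildkitd buildkit-runc exec "${JOB_ID:?set JOB_ID first}" "$@"
}
```

For instance, after JOB_ID=o7mtrt511xmf9okk0pcgjbtqv, running run_in_job kubectl get node plays the same role as clk run kubectl get node.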
Then, I can investigate using kubectl.
clk run kubectl get node
NAME STATUS ROLES AGE VERSION
clk-k8s-control-plane NotReady control-plane,master 26m v1.21.1
This confirms the fact that the cluster is not ready.
Now, let’s dig deeper into why it is not ready.
clk run kubectl get node --output json | jq -r '.items[0].status.conditions[]|select(.reason == "KubeletNotReady").message'
container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
The CNI plugin is not initialized, which tells me that I should look into the calico installation.
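A possible next step, sketched under assumptions: the helpers below reuse the clk run alias defined earlier and suppose that calico was installed from its standard manifest, whose node pods carry the label k8s-app=calico-node and live in kube-system. The function names are hypothetical. The selector and namespace may need adjusting for another kind of installation.

```shell
# List the calico node pods, wherever they live, with their node and status.
calico_pods() {
    clk run kubectl get pods --all-namespaces --selector k8s-app=calico-node --output wide
}

# Show the last lines of their logs. The namespace defaults to kube-system
# (an assumption); pass another one as the first argument if needed.
calico_logs() {
    clk run kubectl logs --namespace "${1:-kube-system}" --selector k8s-app=calico-node --tail=50
}
```

If calico_pods returns nothing, the manifest was probably never applied, which would by itself explain the uninitialized CNI plugin.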