Troubleshooting Kubernetes nodes storage space shortage on Aliyun (Alibaba Cloud)

Stephen Cow Chau
FAUN — Developer Community 🐾
7 min read · Aug 17, 2022


Background

While running a Managed Kubernetes Cluster on Aliyun, I regularly received alerts about over 80% disk utilization on a certain node.

Each time I hoped the “garbage collection” would kick in, but that did not happen. Here and there I cleaned up things like logs (at /var/log/), but the problem came back every few months, so I figured I had to do a better investigation and find the root cause (and hopefully a solution).

[update on 2024–04–22 at the end of article]

The Investigation

Interacting with the node

There are several ways to access a Kubernetes node; one of them is the Send Remote Commands function.

I prefer this over other ways like VNC or remote SSH because it is a built-in function of the node and I don’t need any additional setup on either the client or the remote machine. On the other hand, I do not have enough permission to use Cloud Shell.

  1. At Elastic Compute Service > Instances (on the left navigation), click the link of the node
  2. Click Connect on the detail page
  3. In the pop-up, choose Send Remote Commands (Cloud Assistant)
  4. The upper box allows us to enter a script (bash shell script, Python or Perl), while the lower box shows the result (with a limited number of output lines)

Figuring out the large disk space consumption

To start with, I first checked the available disk space with the df command:
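
For reference, the invocation is just plain df; the -h flag for human-readable sizes is my addition, the output columns are the same either way:

df -h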

While scrolling down the result, we can see quite a pattern of mount points like “/run/containerd/io.containerd.runtime.v2.task/k8s.io/<guid>/rootfs”.

The interesting part is that most of these mount points report the same total disk space, used disk space and percentage, which suggests to me they are all mounted on the same disk.

With another command, du, we can check the disk usage per folder. Here is the command I used, which focuses on the parent of the problematic folders and reports each subfolder’s disk usage:

du -m --max-depth=1 /run/containerd/io.containerd.runtime.v2.task/k8s.io/ | sort -rn

The folder sizes aligned with my container image sizes, so I started to figure they might be the root file systems of my containers, and I ran an “ls” on those folders to confirm the finding.
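
For example (with <guid> being one of the folder names reported by du above), a listing showing a full Linux root layout (bin, etc, lib, usr, var and so on) is what confirmed it for me:

ls /run/containerd/io.containerd.runtime.v2.task/k8s.io/<guid>/rootfs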

And that’s just part of the problem…

The major problems — Kubernetes images snapshots

Inspecting more closely with du, the major consumption is from “/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/”.
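
The check is the same du pattern as before, something like this (the path is the one on my node; the head is only there to keep the Cloud Assistant output short):

du -m --max-depth=1 /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/ | sort -rn | head -20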

I have no concrete idea of how those snapshots work; I believe it’s the filesystem mechanism that stores the layers of data of the container images as they are pulled.

And I further believed all these unused container image snapshots should be taken care of by the garbage collection mechanism of the Kubernetes cluster, but they were not.

Checking if Garbage Collection settings are wrong (irrelevant)

Garbage collection is configured at the kubelet level, so if I wanted to change it, I would need to update the configuration and restart the kubelet service (and all containers on the node would be drained and recreated on another node, which is bad for my production environment), so I had better just check it.

What drove me to check that configuration is that the Kubernetes documentation says the default threshold for image garbage collection is 85% disk utilization, which is above the disk space alert threshold (80%), so the alert fires before garbage collection would even start.
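
To see what the node actually uses, we can read the thresholds from the kubelet configuration. This is a sketch: the field names (imageGCHighThresholdPercent and imageGCLowThresholdPercent) are the standard kubelet ones, but the config path below is only the common default and a managed Aliyun node may keep it elsewhere:

grep -iE 'imageGC(High|Low)ThresholdPercent' /var/lib/kubelet/config.yaml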

But that turned out to be a wrong direction, as the documentation describes what garbage collection covers, and it seems the old pulled images in my case were not included.

Managing the old images

Finally, given the focus is the images, I needed to find a way to clean them up, and since they live inside containerd, there are 2 command line interfaces (ctr and crictl) that I can use.

Gotcha for using crictl on an Aliyun node

crictl is already installed on the cluster node, and here is the version information:

Result of “crictl version”

This seems to be an old version, and using it to run any crictl command like “crictl images ls” (to list all images) failed with the error “failed to connect: failed to connect: context deadline exceeded”.

Some search results online mentioned running as root or accessing the master node of the cluster, but neither was the case here.

The root cause is that the Aliyun node setup has configured crictl to use the socket at “unix:///var/run/crio/crio.sock”; this is written in /etc/crictl.yaml. The actual socket that containerd listens on is “/run/containerd/containerd.sock”, so we have to update the yaml.
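
A quick way to confirm this on your own node (the paths are the ones I found on my node and may differ on yours) is to print the current crictl config and check which socket actually exists:

cat /etc/crictl.yaml
ls -l /var/run/crio/crio.sock /run/containerd/containerd.sock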

The send remote command approach comes with a shortcoming: there is no interactive file editor. Given the yaml has only 1 line, we can completely rewrite the file with:

cat > /etc/crictl.yaml <<EOL
runtime-endpoint: unix:///run/containerd/containerd.sock
EOL

After that, crictl connects to the correct socket and works.

The ctr command — unexpected behavior but helpful in my case

Before I fixed the crictl connection issue, I had tried using ctr to remove images.

The first problem I got is that “ctr images ls” is supposed to show all the images I have in containerd, but it showed an empty table.

It turned out ctr uses namespaces, and the Kubernetes namespace is k8s.io, so the command should be like “ctr -n k8s.io images ls”.
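
If in doubt, ctr can list the namespaces it knows about, which is how I would confirm k8s.io is the one holding the cluster's images (a quick read-only check, nothing destructive):

ctr namespaces ls
ctr -n k8s.io images ls | head -5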

I tried to clean up the old images with the following command, which in human language means:

  1. List all images with specific pattern
  2. Sort by the first column (the image full tag, like ‘xxx/yyy:tag’) in reverse order (this is because all my tags are like xxx/yyy:dev_yyyyMMdd)
  3. Use awk to make a string per line as “ctr -n k8s.io images rm <full tag>”
  4. Finally execute it with xargs as bash command
ctr -n k8s.io images ls | grep <what you want as "filter"> | awk '{ print $1 }' | sort -r | awk '{print "ctr -n k8s.io images rm " $1}' | xargs -0 bash -c

A more complicated version is as follows; the middle “sed -n '3~1p'” keeps the first 2 items (3~1p means: start processing from the 3rd line, taking every 1 line after that), so the 2 newest images are retained and only the older ones get removed.

ctr -n k8s.io images ls | grep <what you want as "filter"> | awk '{ print $1 }' | sort -r | sed -n '3~1p' | awk '{print "ctr -n k8s.io images rm " $1}' | xargs -0 bash -c
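
Before letting xargs run anything, it can be worth previewing the generated commands by simply dropping the final stage; this is only a safety habit of mine, not something the cleanup requires:

ctr -n k8s.io images ls | grep <what you want as "filter"> | awk '{ print $1 }' | sort -r | sed -n '3~1p' | awk '{print "ctr -n k8s.io images rm " $1}'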

The result looked legit, but the disk space didn’t go down.

Finally

It turned out other people have also raised issues about crictl and ctr showing different results, like an image imported through ctr not being recognized by crictl.

So I went back to crictl (now that I had fixed it), and saw a lot of images with a <none> tag, so I went ahead and removed them all with the following (simpler) command:

crictl images ls | awk '{ if ($2 == "<none>") { print "crictl rmi " $3} }' | xargs -0 bash -c

After that, the old images are cleaned.
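
To confirm the space actually came back, we can re-check from both sides: crictl imagefsinfo reports the image filesystem usage as containerd sees it, and the df path below is where the snapshots live on my node (yours may differ):

crictl imagefsinfo
df -h /var/lib/containerd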

Summary

In the end, it’s a 2-step solution to clean up the old images. I suspect it would be more robust to check all images used in the cluster (in containerd, to be exact) with the crictl command, then filter out anything that’s not being used and clean it up.
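
As a sketch of that direction: newer crictl releases have a flag that removes every image not referenced by a running container, which would collapse the whole cleanup into one command. I have not verified it on this node, so check crictl rmi --help for your version first:

crictl rmi --prune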

That… would be the next time disk usage runs up to 80% again; I wish it does not come within 5 months.

And I believe (not verified) it might be a very good practice to build container images with well-organized layers, so that the layers underneath do not change much and every image pull only fetches the slim changes in the topmost layer(s).

I hope you enjoyed the article and that it helps someone in the future.

Update on 2024-04-22

It turned out the image cleanup steps using ctr are not necessary: crictl talks to containerd through the CRI, the same interface the kubelet itself uses to manage images, while ctr is containerd's own lower-level client. So the update is to run only the following command:

# sort -r -k2 orders the rows by the tag column so the newest dev_yyyyMMdd images come first
# in the sed 'X~1p' portion, X-1 is the number of images you want to keep; replace X with the number you need
crictl images ls | grep <what you want as "filter"> | sort -r -k2 | sed -n 'X~1p' | awk '{ print "crictl rmi " $3 }' | xargs -0 bash -c

And even better, Aliyun has moved the send remote command function to the “schedule and automated tasks” tab of your node (the ECS node of the Kubernetes cluster), which means you can run the command on a schedule.
