Lately I have been hitting situations where containers overload the system disk on my EKS nodes. Once the root disk maxes out its throughput, the operating system freezes, CPU usage spikes, and the node drops into NotReady. Even worse, the node stops accepting SSH, so I cannot inspect what actually went wrong. Kubernetes will eventually reschedule the Pods, but only after the node has stayed in NotReady for five minutes.
To keep this from happening again, I split the system disk and the container data disk into two separate EBS volumes so that container workloads always land on the second volume.
Why split the volumes?
The second EBS volume exists for one reason: even if a workload drives the disk to its limits, only the container volume suffers while the system disk continues to run smoothly. That layout brings a few perks:
Keep the system disk at the default 20 GiB.
Let the container data disk scale with the workload (for example 200 GiB) so the pressure stays isolated. If a workload needs its own volume, you can still add an Amazon EBS CSI PersistentVolume on top.
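For a workload-specific volume, the EBS CSI driver can provision one dynamically. A minimal sketch, assuming the driver is already installed in the cluster; the StorageClass name `ebs-sc` and claim name `app-data` are placeholders, not part of the original setup:

```yaml
# Hypothetical names: "ebs-sc" and "app-data" are illustrative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-sc
  resources:
    requests:
      storage: 100Gi
```

A Pod that mounts this claim gets its own EBS volume, keeping its I/O off both the system disk and the shared container data disk.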
Create a new Launch Template version
I run Managed Node Groups backed by a Launch Template, so I created a new version with two block devices:
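The template itself is not reproduced here; the new version can be sketched roughly as follows. The device names, volume sizes, and `$LT_ID` variable are assumptions (the sizes follow the 20 GiB / 200 GiB figures above):

```shell
# Sketch: device names, sizes, and $LT_ID are illustrative assumptions.
aws ec2 create-launch-template-version \
  --launch-template-id "$LT_ID" \
  --source-version 1 \
  --launch-template-data '{
    "BlockDeviceMappings": [
      {"DeviceName": "/dev/xvda",
       "Ebs": {"VolumeSize": 20, "VolumeType": "gp3"}},
      {"DeviceName": "/dev/xvdb",
       "Ebs": {"VolumeSize": 200, "VolumeType": "gp3"}}
    ]
  }'
```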
The embedded user data decodes to the following script. It formats the second disk, writes the mount point into fstab, and copies the existing containerd data back into place.
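The decoded script is not shown above; a minimal reconstruction of such a script might look like this, assuming the second volume surfaces as `/dev/nvme1n1` (Nitro instances expose EBS volumes as NVMe devices; the name can vary) and that XFS is used:

```shell
#!/bin/bash
set -euo pipefail

# Assumption: the second EBS volume appears as /dev/nvme1n1;
# adjust for your instance type.
DEVICE=/dev/nvme1n1
MOUNTPOINT=/var/lib/containerd

# Format the data volume
mkfs -t xfs "$DEVICE"

# Preserve whatever containerd has already written during boot
BACKUP=$(mktemp -d)
cp -a "$MOUNTPOINT/." "$BACKUP/" 2>/dev/null || true

# Persist the mount in fstab and activate it
echo "$DEVICE $MOUNTPOINT xfs defaults,nofail 0 2" >> /etc/fstab
mount "$MOUNTPOINT"

# Copy the existing containerd data back into place
cp -a "$BACKUP/." "$MOUNTPOINT/"
rm -rf "$BACKUP"
```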
After the Managed Node Group picks up this launch template, each node gets two EBS volumes. The second volume mounts at /var/lib/containerd, where containerd stores images and container writable layers.
```shell
# When you omit an AMI, Amazon Linux 2023 is used by default
aws eks create-nodegroup \
  --region eu-west-1 \
  --cluster-name <your-cluster-name> \
  --nodegroup-name <your-nodegroup-name> \
  --subnets subnet-xxx subnet-yyy \
  --node-role arn:aws:iam::<account-id>:role/<node-instance-role> \
  --launch-template id=$LT_ID,version=1 \
  --instance-types t3.medium \
  --scaling-config minSize=1,maxSize=3,desiredSize=2
```
Once the nodes finish provisioning, log in and check the block devices:
```
sh-5.2$ lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme1n1       259:0    0  50G  0 disk /var/lib/containerd
nvme0n1       259:1    0  30G  0 disk
├─nvme0n1p1   259:2    0  30G  0 part /
├─nvme0n1p127 259:3    0   1M  0 part
└─nvme0n1p128 259:4    0  10M  0 part /boot/efi
```
Even if a Pod hammers the filesystem, the load stays on nvme1n1, so the root disk (nvme0n1) keeps the node healthy. Other Pods that share the node might still slow down, but at least the node remains reachable for debugging.
Deploy a test Pod
I asked ChatGPT to draft a Pod that pounds the disk. The Pod spins up an Ubuntu container and uses stress-ng to write heavily to disk:
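The exact manifest is not included above; a sketch along those lines, where the Pod name, image tag, and stress-ng parameters are assumptions:

```yaml
# Illustrative manifest: name, image tag, and stress parameters are assumed.
apiVersion: v1
kind: Pod
metadata:
  name: disk-stress
spec:
  restartPolicy: Never
  containers:
    - name: stress
      image: ubuntu:24.04
      command: ["bash", "-c"]
      args:
        - |
          apt-get update && apt-get install -y stress-ng
          # 4 workers, each writing 4 GiB, for 5 minutes
          stress-ng --hdd 4 --hdd-bytes 4G --timeout 300s
```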
Finally, I ran sar (System Activity Reporter) to validate the disk pressure. The throughput lands on nvme1n1, confirming the second disk absorbs the I/O burst.
```
sh-5.2$ sar -d 1 5
Linux 6.12.46-66.121.amzn2023.x86_64 (ip-192-168-94-143.eu-west-1.compute.internal)  10/31/25  _x86_64_  (2 CPU)
```
Splitting the volumes keeps heavy container I/O from knocking the node offline. You can also define multiple node groups tailored to specific workloads, pairing them with the right EBS profile—for example, place high-I/O applications on nodes that ship with io1-based volumes.
Give EKS Nodes a Dedicated EBS Volume for Containers