"docker system df --verbose" tells that a container is eating up a lot of space. But when I "docker exec -it <container> sh" into the container shell and run "du -h -d1 /", I don't find any dir that's taking up space
The most likely reason for this discrepancy is how Docker's layered filesystem works, specifically with deleted files.

When you run `du` inside the container, you are seeing the current, merged filesystem. However, `docker system df` reports the total space used by all the layers of the container's image plus its own writable layer.

If a large file was added in a lower (earlier) layer and then "deleted" in a higher (later) layer, the file isn't actually gone. It's just marked as hidden in the top layer. The data still exists in the lower layer, consuming space. 👻
Think of it like this:
- Layer 1 (Image Base): You add a huge 5 GB video file.
- Layer 2 (Your Dockerfile): You run `rm /path/to/huge/video.mp4`.
Docker's copy-on-write system doesn't erase the file from Layer 1. Instead, it places a "whiteout" file in Layer 2 that says "the huge video file is deleted."
- `du` inside the container: Sees the "whiteout" file and reports that the video is gone. The space appears free.
- `docker system df` on the host: Sees both Layer 1 (with the 5 GB file) and Layer 2 (with the whiteout marker). It correctly reports that the total space used is still over 5 GB.
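You can reproduce this with a throwaway build. The image tag, base image, and file size below are arbitrary choices for the demo; the point is that the image stays roughly 1 GB even though the file is deleted.

```bash
# Demo: a file deleted in a later layer still occupies space in the layer that created it.
cat > Dockerfile.whiteout-demo <<'EOF'
FROM alpine:3.19
RUN dd if=/dev/zero of=/big.bin bs=1M count=1024
RUN rm /big.bin
EOF

docker build -t whiteout-demo -f Dockerfile.whiteout-demo .
docker image ls whiteout-demo    # still ~1 GB despite the rm
docker history whiteout-demo     # the 1 GB layer is plainly visible
```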
Other Common Causes
While deleted files are the most common culprit, here are other possibilities:
1. Container Logs
The container's logs (`stdout`/`stderr`) are typically managed by a logging driver (like `json-file`) and stored on the host machine, not inside the container's filesystem. If the container is very verbose, this log file can grow to be enormous. `du` inside the container will never see this file.
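As a quick host-side check (run on the Docker host, not inside the container), you can list the size of every container's log file. This assumes the default `json-file` logging driver; `LogPath` is empty for some other drivers.

```bash
# Show the on-disk size of each container's log file.
for c in $(docker ps -aq); do
  log=$(docker inspect --format '{{.LogPath}}' "$c")
  [ -n "$log" ] && sudo ls -lh "$log"
done
```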
2. Volumes and Bind Mounts

If you have a volume or bind mount attached to the container, the data resides on the host filesystem. While `du` inside the container might see it, it's easy to overlook, and its space is managed outside the container's writable layer.
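To see what is mounted where, and how much space Docker attributes to named volumes, something like this works:

```bash
# List the container's mounts (volumes and bind mounts) with their host paths...
docker inspect --format '{{ range .Mounts }}{{ .Type }} {{ .Source }} -> {{ .Destination }}{{ "\n" }}{{ end }}' <container_id_or_name>

# ...and show per-volume sizes as Docker accounts for them.
docker system df -v
```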
3. Open File Descriptors

This is a general Linux issue. A process inside your container might have a file open that has since been deleted from the filesystem. The disk space for that file won't be freed until the process closes the file handle (or the process terminates).
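You can usually spot this from inside the container via `/proc` (no extra tools needed), or with `lsof` if the image includes it:

```bash
# Find file descriptors that still point at deleted files inside the container.
docker exec <container_id_or_name> sh -c 'ls -l /proc/[0-9]*/fd 2>/dev/null | grep "(deleted)"'

# Equivalent check with lsof: +L1 lists open files with a link count below one.
docker exec <container_id_or_name> lsof +L1
```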
How to Troubleshoot and Fix It 🕵️
Here’s a step-by-step guide to find the real source of the space usage.
Step 1: Analyze the Image Layers
This will help you find those "hidden" deleted files. The best tool for this is `dive`.

- Install `dive` (if you don't have it) by following its installation instructions.
- Run `dive` on your image:

```bash
dive <your-image-name>:<tag>
```

`dive` gives you a brilliant interactive TUI to explore each layer of your image. You can easily navigate the filesystem and see what files were added, modified, or deleted in each layer. Look for layers with a large size and check for files that were later removed.
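If you can't install `dive`, `docker history` gives a rougher but built-in view of how much each layer adds:

```bash
# Show the size contributed by each layer of the image.
docker history --no-trunc <your-image-name>:<tag>
```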
Step 2: Check the Container's Log File

Find where Docker is storing the container's log and check its size.
- Find the log path:

```bash
docker inspect <container_id_or_name> | grep LogPath
```

- Check its size on the host:

```bash
# The command above will return something like:
# "LogPath": "/var/lib/docker/containers/abc.../abc...-json.log",
#
# Now check its size:
ls -lh /var/lib/docker/containers/abc.../abc...-json.log
```
If this file is huge, you've found a problem. You should configure log rotation for your containers in the Docker daemon's `daemon.json` file.
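For example, with the default `json-file` driver, settings along these lines cap each container at three 10 MB log files. The sizes are just a starting point, and the options only apply to containers created after the daemon restart.

```bash
# Write (or merge into) /etc/docker/daemon.json, then restart the daemon.
# If the file already exists, merge these keys rather than overwriting it.
sudo tee /etc/docker/daemon.json >/dev/null <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "10m", "max-file": "3" }
}
EOF
sudo systemctl restart docker
```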
Step 3: Check for Changes in the Writable Layer

You can use `docker diff` to see what has been added (A), changed (C), or deleted (D) in the container's writable layer since it was created:

```bash
docker diff <container_id_or_name>
```
This can help you spot large temporary files or caches that were created during runtime.
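`docker diff` only lists paths, not sizes. If the host uses the overlay2 storage driver, you can measure the writable layer directly; the `UpperDir` path comes from `docker inspect`:

```bash
# Measure the container's writable layer on the host (overlay2 assumed).
upper=$(docker inspect --format '{{ .GraphDriver.Data.UpperDir }}' <container_id_or_name>)
sudo du -sh "$upper"
sudo du -h -d1 "$upper" | sort -h | tail
```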
Best Practices for Prevention 📦
- Use Multi-Stage Builds: This is the #1 solution for the deleted file problem. You use one stage to build your application (installing dependencies, compiling, etc.) and then copy only the necessary artifacts to a final, clean, minimal base image. The intermediate layers with all the build tools and source code are discarded (see the sketch after this list).
- Combine `RUN` Commands: Chain your `apt-get install` (or equivalent) and cleanup commands in the same `RUN` instruction using `&&`. This ensures the temporary files are removed in the same layer they were created in.

  Bad (creates a bloated layer):

  ```dockerfile
  RUN apt-get update && apt-get install -y build-essential
  RUN rm -rf /var/lib/apt/lists/*
  ```

  Good (cleans up in the same layer):

  ```dockerfile
  RUN apt-get update && apt-get install -y build-essential \
      && rm -rf /var/lib/apt/lists/*
  ```
- Configure Log Rotation: As mentioned above, prevent container logs from growing indefinitely.
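As a sketch of the multi-stage pattern from the first bullet: the Go toolchain, distroless base image, and tag below are hypothetical stand-ins, so substitute your own stack.

```bash
# Hypothetical multi-stage build: the builder stage and its toolchain
# never reach the final image, so their layers don't ship with it.
cat > Dockerfile <<'EOF'
FROM golang:1.22 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/app /app
ENTRYPOINT ["/app"]
EOF

docker build -t myapp:slim .
```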