fix(docker-in-docker): disable containerd erofs snapshotter to fix dockerd startup #1645

Open
Kaniska244 wants to merge 12 commits into devcontainers:main from Kaniska244:d-in-d-moby-containerd-issue

@Kaniska244 Kaniska244 commented May 11, 2026

Summary

Fixes #1642 and #1639.

This PR fixes docker-in-docker startup failures caused by containerd >= 2.3 probing the erofs snapshotter on hosts where the kernel exposes the erofs filesystem.

The main fix is to disable the io.containerd.snapshotter.v1.erofs plugin in containerd config and ensure dockerd actually uses that config by starting containerd explicitly and passing its socket to dockerd.

In addition, this PR includes a few related improvements and test updates in the docker-in-docker feature.


Problem

On some hosts, dockerd fails to start because bundled containerd probes the erofs snapshotter plugin during startup.

That plugin requires mkfs.erofs support that is not available in older distro versions of erofs-utils, especially on:

  • Debian 12 (bookworm)
  • Ubuntu 22.04 (jammy)

When the host kernel exposes erofs, plugin initialization fails, containerd never becomes ready, and dockerd times out.

This is especially visible on newer environments such as:

  • GitHub Actions arm64 runners
  • Codespaces hosts
  • systems where erofs has been loaded explicitly

Since the feature uses overlayfs anyway, the safest fix is to disable the erofs snapshotter entirely.


Why the previous config-only approach was insufficient

A config-only change under /etc/containerd/config.toml was not enough because dockerd, when started normally, launches its own child containerd instance using an auto-generated config under /var/run/docker/containerd/.

That means /etc/containerd/config.toml is ignored unless we explicitly start containerd ourselves and point dockerd to it.

This PR therefore applies a two-part fix:

  1. Write the disabled_plugins entry into /etc/containerd/config.toml
  2. Start containerd explicitly and run dockerd with --containerd /run/containerd/containerd.sock

Main changes

1. Disable the erofs snapshotter in containerd

In src/docker-in-docker/install.sh:

  • ensure /etc/containerd/config.toml exists
  • if missing or empty, generate it from containerd config default
  • add io.containerd.snapshotter.v1.erofs to top-level disabled_plugins
  • handle all config states safely:
    • existing empty list
    • existing populated list
    • no disabled_plugins key yet
  • append an idempotency marker comment so repeated runs do not duplicate entries

This makes the configuration safe for re-runs and image-layer reuse.
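
The three config states listed above can be handled roughly as follows. This is a minimal sketch, not the code from install.sh: the function name, the marker text, and the choice to prepend the key when it is missing are assumptions made for illustration, and it expects GNU sed.

```shell
#!/usr/bin/env bash
# Hypothetical helper; names and marker text are illustrative only.
EROFS_PLUGIN="io.containerd.snapshotter.v1.erofs"
MARKER="# erofs snapshotter disabled by docker-in-docker"

disable_erofs_snapshotter() {
    local config="$1"
    touch "$config"
    # Idempotency: a previous run already left the marker, so do nothing.
    if grep -qF "$MARKER" "$config"; then
        return 0
    fi
    if grep -qE '^disabled_plugins[[:space:]]*=[[:space:]]*\[[[:space:]]*\]' "$config"; then
        # Existing empty list: replace it with a one-element list.
        sed -i -E "s|^disabled_plugins[[:space:]]*=[[:space:]]*\[[[:space:]]*\]|disabled_plugins = [\"${EROFS_PLUGIN}\"]|" "$config"
    elif grep -qE '^disabled_plugins[[:space:]]*=[[:space:]]*\[' "$config"; then
        # Existing populated list: put our entry in front of the others.
        sed -i -E "s|^disabled_plugins[[:space:]]*=[[:space:]]*\[|disabled_plugins = [\"${EROFS_PLUGIN}\", |" "$config"
    else
        # No key yet: top-level TOML keys must come before any [table]
        # header, so prepend the key instead of appending it.
        local tmp
        tmp="$(mktemp)"
        { printf 'disabled_plugins = ["%s"]\n' "$EROFS_PLUGIN"; cat "$config"; } > "$tmp"
        cat "$tmp" > "$config"
        rm -f "$tmp"
    fi
    # Leave the marker so later runs exit early.
    echo "$MARKER" >> "$config"
}
```

Calling `disable_erofs_snapshotter /etc/containerd/config.toml` twice should leave exactly one erofs entry in the file.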


2. Start containerd explicitly before dockerd

Also in src/docker-in-docker/install.sh, the generated init flow now:

  • probes common locations for the containerd binary
  • starts containerd --config /etc/containerd/config.toml
  • waits for /run/containerd/containerd.sock
  • if available, passes --containerd /run/containerd/containerd.sock to dockerd
  • otherwise falls back to the previous behavior where dockerd spawns its own containerd

This keeps the fix additive and avoids breaking hosts where explicit containerd startup is not available.

The retry/cleanup flow also continues to clean up containerd alongside dockerd.
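
The startup order described in this section might look like the sketch below. It is an illustration under assumptions rather than the generated init script itself: the candidate binary paths, the 30-second timeout, the log file locations, and all function names are hypothetical. Only the `containerd --config` and `dockerd --containerd` flags come from the description above.

```shell
#!/usr/bin/env bash
CONTAINERD_SOCK="/run/containerd/containerd.sock"
CONTAINERD_CONFIG="/etc/containerd/config.toml"

# Print the first executable path among the given candidates.
find_containerd() {
    local candidate
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Poll until the socket exists or the timeout (in seconds) elapses.
wait_for_socket() {
    local sock="$1" timeout="$2" waited=0
    while [ ! -S "$sock" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Emit the extra dockerd arguments only when the socket is ready;
# an empty result means dockerd spawns its own containerd (fallback).
dockerd_args() {
    local sock="$1"
    if [ -S "$sock" ]; then
        echo "--containerd $sock"
    fi
}

start_docker() {
    local bin
    # Candidate locations are assumptions for this sketch.
    if bin="$(find_containerd /usr/bin/containerd /usr/local/bin/containerd /usr/sbin/containerd)"; then
        "$bin" --config "$CONTAINERD_CONFIG" > /tmp/containerd.log 2>&1 &
        wait_for_socket "$CONTAINERD_SOCK" 30 \
            || echo "containerd socket not ready; falling back" >&2
    fi
    # Word splitting of the argument string is intentional here.
    # shellcheck disable=SC2046
    dockerd $(dockerd_args "$CONTAINERD_SOCK") > /tmp/dockerd.log 2>&1 &
}
```

Keeping the fallback decision inside `dockerd_args` means a missing binary and a socket that never appears both lead to the same previous behavior.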


3. Add erofs-utils on Debian installs

In the Debian package path of src/docker-in-docker/install.sh, this PR adds:

  • erofs-utils

This ensures mkfs.erofs is available where relevant and documents the dependency clearly, even though the primary fix is still disabling the snapshotter.


4. Fix Docker CE package download architecture handling for RHEL/tdnf path

In src/docker-in-docker/install.sh, the Docker CE RPM download logic now derives the correct repository architecture dynamically instead of hardcoding x86_64.

The derived value covers architectures such as:

  • x86_64
  • aarch64

The RPM lookup and download patterns are updated to use it.

This improves portability for non-x86 environments.
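
As an illustration of deriving the repository architecture instead of hardcoding it, a mapping along these lines would work. The function name is hypothetical, and accepting the `amd64`/`arm64` aliases is an assumption of this sketch rather than something the PR states.

```shell
#!/usr/bin/env bash
# Map a machine name (as printed by `uname -m`) to an RPM repo architecture.
rpm_repo_arch() {
    case "$1" in
        x86_64 | amd64) echo "x86_64" ;;
        aarch64 | arm64) echo "aarch64" ;;
        *)
            echo "unsupported architecture: $1" >&2
            return 1
            ;;
    esac
}

# Example: resolve the architecture of the current machine.
repo_arch="$(rpm_repo_arch "$(uname -m)")" || repo_arch=""
echo "repo arch: ${repo_arch:-unknown}"
```

Failing loudly on an unknown architecture avoids silently downloading x86_64 packages on a machine that cannot run them.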


5. Install Docker Compose v1 in an isolated virtual environment

In src/docker-in-docker/install.sh, the Compose v1 installation path now:

  • creates a dedicated Python virtual environment
  • installs pip, setuptools, and wheel into that venv
  • installs docker-compose and dependencies there
  • symlinks the resulting docker-compose binary into the expected path

This avoids PEP 668 / externally-managed-environment issues on newer distros such as:

  • Debian trixie
  • Ubuntu noble

and avoids modifying distro-managed Python site-packages.
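
The venv-based install steps can be sketched as follows. The venv location, the symlink target, and the `run` dry-run wrapper are assumptions introduced for this example; the actual paths in install.sh may differ.

```shell
#!/usr/bin/env bash
# Hypothetical locations, chosen only for illustration.
COMPOSE_VENV="/usr/local/docker-compose-venv"
COMPOSE_LINK="/usr/local/bin/docker-compose"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_compose_v1() {
    # 1. Dedicated virtual environment, separate from distro Python packages.
    run python3 -m venv "$COMPOSE_VENV"
    # 2. Build tooling inside the venv.
    run "$COMPOSE_VENV/bin/pip" install --upgrade pip setuptools wheel
    # 3. Compose v1 and its dependencies, also inside the venv.
    run "$COMPOSE_VENV/bin/pip" install docker-compose
    # 4. Expose the binary at the path callers expect.
    run ln -sf "$COMPOSE_VENV/bin/docker-compose" "$COMPOSE_LINK"
}
```

Running `DRY_RUN=1 install_compose_v1` prints the four commands without touching the system, which is a convenient way to review the sequence.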


Test updates

Workflow update

In .github/workflows/test-pr-arm64.yaml:

  • include src/docker-in-docker/** and test/docker-in-docker/** in workflow path triggers
  • add a docker-in-docker filter in changed-path detection
  • add a step to explicitly load and verify the erofs kernel module:
    sudo modprobe erofs && grep erofs /proc/filesystems

This makes the failure scenario reproducible in CI.

The test matrix also excludes the following combination:

  • docker-in-docker + mcr.microsoft.com/devcontainers/base:debian

Test script updates

The following test scripts now explicitly verify that Docker is actually usable by adding docker ps checks:

  • test/docker-in-docker/dockerIp6tablesDisabledTest.sh
  • test/docker-in-docker/docker_build_older.sh
  • test/docker-in-docker/pin_docker-ce_version_moby_false.sh

This strengthens validation by confirming that dockerd not only starts, but is usable.
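
A usability check of this kind could be written as below. The `retry` helper and its attempt/delay parameters are hypothetical and not taken from the test scripts, which may simply call `docker ps` once through the test library.

```shell
#!/usr/bin/env bash
# Run a command until it succeeds or the attempts are exhausted.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
    local attempts="$1" delay="$2" i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example use in a test script (needs a running Docker daemon):
#   retry 5 2 docker ps
```

A successful `docker ps` requires a round trip to the daemon, so it catches the case where the dockerd process exists but never became ready.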


Docs / metadata updates

This PR also updates docker-in-docker feature metadata and documentation:

src/docker-in-docker/devcontainer-feature.json

  • bump feature version from 2.17.0 to 3.0.0

src/docker-in-docker/README.md

  • update example reference from:
    • ghcr.io/devcontainers/features/docker-in-docker:2
    • to ghcr.io/devcontainers/features/docker-in-docker:3

src/docker-in-docker/NOTES.md

  • update the sample image tag from:
    • mcr.microsoft.com/devcontainers/typescript-node:16
    • to mcr.microsoft.com/devcontainers/typescript-node:24

Files changed

  • .github/workflows/test-pr-arm64.yaml
  • src/docker-in-docker/NOTES.md
  • src/docker-in-docker/README.md
  • src/docker-in-docker/devcontainer-feature.json
  • src/docker-in-docker/install.sh
  • test/docker-in-docker/dockerIp6tablesDisabledTest.sh
  • test/docker-in-docker/docker_build_older.sh
  • test/docker-in-docker/pin_docker-ce_version_moby_false.sh

Compatibility / risk

  • No behavior change when the erofs snapshotter is not relevant
  • Fallback behavior remains unchanged if explicit containerd startup is unavailable
  • Config modification is idempotent and safe across repeated runs
  • No new external dependency flow is introduced beyond packages already used by the feature
  • Docker continues to use the expected overlayfs path

Result

This PR makes docker-in-docker startup more reliable across affected Debian/Ubuntu environments, especially on arm64 and newer hosts where erofs is exposed, while also improving test coverage, compose installation robustness, and architecture handling.


Related

@Kaniska244 changed the title from "Test case" to "[docker-in-docker] - docker daemon not running with latest moby packages" (May 12, 2026)
@Kaniska244 changed the title from "[docker-in-docker] - docker daemon not running with latest moby packages" to "fix(docker-in-docker): disable containerd erofs snapshotter to fix dockerd startup on Debian 12 / Ubuntu 22.04/ Ubuntu 24.04 (closes #1642)" (May 13, 2026)
@Kaniska244 marked this pull request as ready for review (May 14, 2026 13:52)
@Kaniska244 requested a review from a team as a code owner (May 14, 2026 13:52)
@Kaniska244 changed the title from "fix(docker-in-docker): disable containerd erofs snapshotter to fix dockerd startup on Debian 12 / Ubuntu 22.04/ Ubuntu 24.04 (closes #1642)" to "fix(docker-in-docker): disable containerd erofs snapshotter to fix dockerd startup" (May 14, 2026)
Development

Successfully merging this pull request may close these issues.

docker-in-docker feature fails on macOS ARM64 with containerd 2.3.x unless erofs-utils is installed