Debugging docker run issue on arm64 due to no access to debugfs

Seeing this error on arm64:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused "rootfs_linux.go:70: creating device nodes caused \"open /var/lib/docker/overlay2/65f79f3b00ab0d319ee70214ce263d8e7db5eb1ed0f78d2d493d29458238c5c9/merged/dev/tty: no such device or address\""": unknown.

The same docker command works on x86_64.

Suggestions on how to debug this?

Here’s the reference to the exact same command that works on x86_64: https://travis-ci.org/zlim/bpftrace/jobs/598535854#L2061

@Michal could this be due to arm64 running on LXD?

Hi @zlim

Happy to have you on ARM builds :slight_smile:
I don’t know yet. This is new. We’ll be looking into this. Thanks for bringing that to our attention!

Michał

Hello, there. We have confirmed that with LXD, you cannot access privileged file systems such as debugfs. See https://github.com/lxc/lxd/issues/2661
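To see from inside a build whether debugfs is even listed as mounted, something along these lines could be run as a quick check. This is only a sketch: the function name is made up for illustration, it assumes a Linux environment with /proc/mounts, and a mount entry alone does not guarantee the files under it are readable.

```shell
# Return 0 if the given mounts table (default: /proc/mounts) lists a debugfs mount.
# has_debugfs_mount is a hypothetical helper name, not part of any tool.
has_debugfs_mount() {
  mounts_file="${1:-/proc/mounts}"
  # The third whitespace-separated field of each line is the filesystem type.
  awk '$3 == "debugfs" { found = 1 } END { exit !found }' "$mounts_file"
}

if has_debugfs_mount; then
  echo "debugfs is listed as mounted"
else
  echo "debugfs is not mounted in this environment"
fi
```

The optional argument only exists so the check can be pointed at a different mounts table when testing it.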

Thank you @zlim for pointing out the security constraints; they should be documented.

Documentation update: https://github.com/travis-ci/docs-travis-ci-com/pull/2564

If you’d like to build on ARM outside an LXD container, let us know. I am trying to get a sense of how many use cases, out of all ARM builds, require privileged filesystems.

Happy building!

Thanks @BanzaiMan and @Michal for looking into this.

Yes, I would appreciate a solution for arm64. Right now this capability discrepancy is blocking the inclusion of arm64 in the project’s CI.

Thanks for your feedback @zlim.
I guess bpftrace is one of those projects that actually do need access to debugfs (normally mounted at /sys/kernel/debug) :slight_smile: Please note there’s no established date for ARM builds outside LXD yet. Whenever we get there, there’ll be some kind of notification for sure, so please stay tuned.

Happy building!

Hello, I am trying to run the test suite for the TechEmpower Web framework benchmarks (which uses Docker extensively), and I am experiencing the same issue:
https://travis-ci.org/TechEmpower/FrameworkBenchmarks/builds/606576448

Here’s one particular error message:

libreactor: APIError: 500 Server Error: Internal Server Error ("OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"rootfs_linux.go:70: creating device nodes caused \\\"open /var/lib/docker/overlay2/b9cb175e51f3fb8f4395d25322553bf455b0ae2c04bae1f716e7ce6df1ea2489/merged/dev/tty: no such device or address\\\"\"": unknown")

And another one:

tee: /dev/tty: No such device or address

I tried to work around the second error with sudo to no avail, as visible in another build:

$ sudo bash -c "echo test > /dev/tty"
bash: /dev/tty: No such device or address
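Since sudo does not help here, a build script could probe for a usable /dev/tty before writing to it and fall back to stderr otherwise. A minimal sketch, with a hypothetical function name, assuming a POSIX shell and that /dev/stderr exists:

```shell
# Print /dev/tty if it can be opened for writing, otherwise /dev/stderr.
# tty_or_stderr is a hypothetical helper name.
tty_or_stderr() {
  # The subshell keeps a failed redirection from aborting the calling shell,
  # and its error message is discarded.
  if ( : >/dev/tty ) 2>/dev/null; then
    printf '%s' /dev/tty
  else
    printf '%s' /dev/stderr
  fi
}

# Write to whichever target is available in this environment.
echo "test" > "$(tty_or_stderr)"
```

The same target could be handed to tee in place of a hard-coded /dev/tty.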

There are also several other failures that I suspect have the same underlying cause:

gemini-postgres: APIError: 400 Client Error: Bad Request ("OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"write sysctl key kernel.sem: open /proc/sys/kernel/sem: permission denied\"": unknown")

I believe I’ve just hit the same issue in https://travis-ci.com/github/mstormi/openhabian/jobs/320853567#L470

Have there been any improvements or news on this since October?

I figured out that I must not pass -t to docker run/exec, since the build has no TTY to allocate.
With that change, Balena images work fine for me on native ARM.
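For scripts that run both locally and in CI, one way to apply this is to request a TTY only when stdin actually is one. This is a sketch rather than anything Travis recommends; the function name, image, and command are placeholders.

```shell
# Print the docker flags to use: add -t only when stdin is a terminal.
# docker_tty_flags is a hypothetical helper name.
docker_tty_flags() {
  if [ -t 0 ]; then
    printf '%s' '-it'
  else
    printf '%s' '-i'
  fi
}

# Hypothetical usage (image and command are placeholders):
#   docker run $(docker_tty_flags) --rm some-image some-command
```

In a CI job without a terminal this yields -i only, so docker does not try to set up /dev/tty for the container.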