setsockopt behaves strangely, and process_vm_readv and ptrace don't work in Docker

Nix’s Linux builds (https://www.github.com/nix-rust/nix) have recently begun failing. The last successful build finished on Sep 5, 2019 19:42:37 and the first failing build finished on Sep 6, 2019 4:41:11 (I don’t know what timezone; I can’t find it in Travis’s logs). We’re seeing the exact same failures in both the Trusty and Bionic images. The failures are:

  1. Calling setsockopt on an AF_ALG socket returns ENOPROTOOPT
  2. Calling setsockopt on an AF_INET socket to set TCP_CONGESTION returns ENOENT
  3. Calling process_vm_readv returns ENOSYS
  4. ptrace seems unable to catch SIGTRAP, at least in one particular case
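
For anyone who wants to reproduce failure 1 outside of nix's test suite, here is a minimal sketch. It assumes Python 3.6+ on Linux (for `socket.AF_ALG`). The `cbc(aes)` cipher and the all-zero key are arbitrary choices for illustration, not necessarily what nix's tests use, and `probe_af_alg_setsockopt` is a hypothetical helper name.

```python
import errno
import socket


def probe_af_alg_setsockopt():
    """Return None if setting a key on an AF_ALG socket works, else a reason string."""
    af_alg = getattr(socket, "AF_ALG", None)
    if af_alg is None:
        return "AF_ALG not exposed by this Python build"
    try:
        with socket.socket(af_alg, socket.SOCK_SEQPACKET, 0) as sock:
            # Assumption: any skcipher would do; cbc(aes) is just a common one.
            sock.bind(("skcipher", "cbc(aes)"))
            # This is the setsockopt call that would fail inside the container.
            sock.setsockopt(socket.SOL_ALG, socket.ALG_SET_KEY, bytes(16))
        return None
    except OSError as exc:
        # Report the symbolic errno name, e.g. "ENOPROTOOPT".
        return errno.errorcode.get(exc.errno, str(exc.errno))


if __name__ == "__main__":
    print(probe_af_alg_setsockopt())
```

Running it on the host and again inside the container should show whether the error is specific to the container environment.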

The Travis Changelog didn’t say anything relevant, and the build logs show that the Docker images haven’t been rebuilt. So I’d guess that Travis rolled back to an older kernel for their Docker hosts, but didn’t announce it.

Could somebody please clarify what changed, and what users can do to access the newer kernels again?

Comparing https://travis-ci.org/nix-rust/nix/jobs/581470892 (pass) and https://travis-ci.org/nix-rust/nix/jobs/581609018 (fail) shows that the kernels are the same: 4.4.0-101-generic.

The man page says that process_vm_readv has been supported since kernel 3.2, and yet it still returns ENOSYS. Did Travis disable it somehow?
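
A quick way to check this independently of nix is to have a process read its own memory with process_vm_readv and look at the errno. The sketch below assumes Linux and a libc that exports `process_vm_readv` (glibc 2.15 or later does); `probe_process_vm_readv` and `IoVec` are names made up for this example.

```python
import ctypes
import errno
import os


class IoVec(ctypes.Structure):
    """Mirror of struct iovec from <sys/uio.h>."""
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


def probe_process_vm_readv():
    """Return None if the syscall works on our own process, else a reason string."""
    try:
        fn = ctypes.CDLL(None, use_errno=True).process_vm_readv
    except (AttributeError, OSError):
        return "process_vm_readv not exported by libc"
    fn.restype = ctypes.c_ssize_t
    fn.argtypes = [
        ctypes.c_int,                           # pid
        ctypes.POINTER(IoVec), ctypes.c_ulong,  # local_iov, liovcnt
        ctypes.POINTER(IoVec), ctypes.c_ulong,  # remote_iov, riovcnt
        ctypes.c_ulong,                         # flags
    ]
    src = ctypes.create_string_buffer(b"probe")
    dst = ctypes.create_string_buffer(len(src))
    local = IoVec(ctypes.cast(dst, ctypes.c_void_p), len(dst))
    remote = IoVec(ctypes.cast(src, ctypes.c_void_p), len(src))
    # The "remote" process is ourselves, so no extra privileges are needed.
    n = fn(os.getpid(), ctypes.byref(local), 1, ctypes.byref(remote), 1, 0)
    if n < 0:
        err = ctypes.get_errno()
        return errno.errorcode.get(err, str(err))
    return None


if __name__ == "__main__":
    print(probe_process_vm_readv())
```

If this prints ENOSYS on a kernel newer than 3.2, something between the process and the kernel is rejecting the call.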

Googling “docker process_vm_readv” shows that this is a Docker security feature: https://docs.docker.com/engine/security/seccomp/

I can’t find a docker run command in your build log, so I can’t say whether anything changed there.

@native-api thanks for that pointer; that answers the question about process_vm_readv and ptrace. But that’s far from the only problem.

  1. Digging deeper into the TCP_CONGESTION failure, I see that getsockopt is returning “cubi” (probably should be “cubic”), and setsockopt accepts neither “cubi” nor “cubic”. Do you know anything about that? Google wasn’t helpful.

  2. execveat is failing with ENOSYS, even though other variants, including fexecve, work. Do you know why? The linked document says nothing about it.

  3. I don’t know if you’re affiliated with Travis or not, but why didn’t Travis announce when they turned on seccomp? It’s a major change to the Linux environment.

  4. setsockopt on the AF_ALG socket now returns ENOPROTOOPT. That seems like it’s also seccomp related (from your link: “All socket and socketcall calls are blocked except communication domains AF_UNIX, AF_INET, AF_INET6, AF_NETLINK, and AF_PACKET”). It seems like the wrong error code, but I suppose it is what it is.
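
For the TCP_CONGESTION question in item 1, a small round trip outside of nix can help narrow things down: read the algorithm name with a generously sized buffer, then try to set the same name back. This is only a sketch, assuming Python 3.6+ on Linux. The 16-byte buffer matches the kernel's `TCP_CA_NAME_MAX`; `congestion_roundtrip` is a hypothetical helper, and nothing here confirms where the truncated “cubi” actually comes from.

```python
import errno
import socket


def congestion_roundtrip():
    """Return (name, error): the current algorithm and the errno name from setting it back."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            # Ask for up to 16 bytes so a name like "cubic" cannot be cut short.
            raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
            name = raw.split(b"\0", 1)[0]
            try:
                sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, name)
            except OSError as exc:
                return name.decode(), errno.errorcode.get(exc.errno, str(exc.errno))
            return name.decode(), None
    except (AttributeError, OSError) as exc:
        # Socket creation or getsockopt itself failed, or TCP_CONGESTION is unavailable.
        code = getattr(exc, "errno", None)
        return None, errno.errorcode.get(code, str(exc))


if __name__ == "__main__":
    print(congestion_roundtrip())
```

If the name comes back whole here and setting it succeeds, the problem is more likely in how the caller sizes its buffer than in the environment.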

The TCP_CONGESTION bug has disappeared. Travis must’ve fixed it.

At this point I’ve got checks to disable all of the things that break when seccomp is on. Thanks to @native-api for the help, and -1 to Travis for not announcing a change like this.
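
For anyone who needs a similar check, one option is to read the `Seccomp:` field from /proc/self/status, which Linux has exposed since 3.8 (0 = disabled, 1 = strict, 2 = filter). The sketch below is not the check nix uses; `seccomp_mode` and `current_seccomp_mode` are names invented for this example.

```python
def seccomp_mode(status_text):
    """Parse the Seccomp mode out of the text of /proc/<pid>/status, or return None."""
    for line in status_text.splitlines():
        # The colon keeps this from matching the newer "Seccomp_filters:" line.
        if line.startswith("Seccomp:"):
            return int(line.split(":", 1)[1].strip())
    return None


def current_seccomp_mode(path="/proc/self/status"):
    """Return the calling process's seccomp mode, or None if it cannot be determined."""
    try:
        with open(path) as f:
            return seccomp_mode(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    mode = current_seccomp_mode()
    if mode == 2:
        print("seccomp filter active; skipping tests that need blocked syscalls")
    else:
        print("seccomp mode:", mode)
```

A mode of 2 only says that some filter is installed, not which syscalls it blocks, so probing the individual calls is still the more precise test.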