Builds started to hang for no reason at all, but not in debug mode

About two days ago, Travis ran a build on my private repo. The build succeeded and everything was fine. Yesterday, my builds started to hang mid script phase for seemingly no reason. I ran the build in debug mode to try to figure it out. My .travis.yml only uses the install and script phases, so I ran travis_run_install and travis_run_script. Everything then ran perfectly fine and all tests passed. Puzzled, I restarted the build that had succeeded two days ago and saw it now hang the same way the new builds do. I have no idea what is causing the hang or how to fix it.

Here is where it hangs:
The script phase runs a Python script that, in a loop, calls p = subprocess.Popen() and later p.communicate(stdin). It is always the sixth iteration that fails: the process starts correctly, and p.communicate sends the stdin data and waits for the process to exit, but the process never receives that data and hangs waiting for input.
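For reference, the pattern described above looks roughly like the sketch below. This is not the actual test script: `WORKER_CMD`, `run_worker`, and `run_all` are hypothetical names, and the worker is a stand-in that just echoes its stdin in upper case. The one addition is a `timeout` on `communicate()`, so that a stalled worker raises an error with its captured output instead of hanging the build silently.

```python
import subprocess
import sys

# Hypothetical stand-in for a real worker: reads all of stdin, echoes it upper-cased.
WORKER_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]


def run_worker(payload, timeout=30):
    """Spawn one worker, send payload to its stdin, and wait for it to exit."""
    p = subprocess.Popen(
        WORKER_CMD,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        # communicate() writes the payload, closes stdin, and reads both pipes.
        out, err = p.communicate(payload, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the stuck child and collect whatever it managed to write.
        p.kill()
        out, err = p.communicate()
        raise RuntimeError(
            f"worker hung; partial stdout={out!r} stderr={err!r}"
        )
    return p.returncode, out


def run_all(count=8):
    """Run the workers one after another, as the loop in the post does."""
    results = []
    for i in range(count):
        results.append(run_worker(f"job {i}\n".encode()))
    return results


if __name__ == "__main__":
    for i, (code, out) in enumerate(run_all(), start=1):
        print(f"worker {i}: exit={code} output={out!r}")
```

If one iteration stalls in CI, the timeout turns the stall into a failure that at least shows what the child wrote to stdout and stderr before it was killed.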

If anyone knows what is going on or how to fix it, please let me know. I have been pulling my hair out all night.

If your process is waiting for input from STDIN (or a TTY), then your CI job will not get it, but your debug session will. It is not clear to me why it was working before, but I suspect something in your dependencies changed.
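One way to check whether the CI job and the debug session really differ in this respect is to log what each standard stream is attached to at the start of the script. The snippet below is only a sketch; `describe_stream` and `environment_report` are hypothetical helper names, not part of Travis or the standard library.

```python
import os
import sys


def describe_stream(stream):
    """Report whether a stream is backed by a terminal, something else, or nothing."""
    try:
        fd = stream.fileno()
    except (AttributeError, OSError, ValueError):
        # None, or an in-memory object with no underlying descriptor.
        return "no file descriptor"
    return "tty" if os.isatty(fd) else "not a tty (pipe, file, or /dev/null)"


def environment_report():
    """Describe stdin, stdout, and stderr of the current process."""
    return {
        name: describe_stream(getattr(sys, name))
        for name in ("stdin", "stdout", "stderr")
    }


if __name__ == "__main__":
    for name, kind in environment_report().items():
        print(f"{name}: {kind}")
```

Running this both in the normal build and in the debug session, and comparing the output, would show whether the two environments attach the streams differently.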

The main test script doesn't take any stdin itself. It spawns processes and sends data to their stdin, so it should work regardless. It only uses Python built-in modules, so dependency changes don't affect it. The strangest thing is that it always succeeds in starting five of these processes, sending them stdin, and having them receive it and do their jobs, but the sixth process never gets its stdin. We even tried changing the Travis config to run on bionic instead of xenial and using a different Python version, but the problem persists. It works perfectly on all of our machines locally and in a Travis debug session, but fails when it counts. It's not even an issue with a particular worker process, since we shuffle their order and the sixth always fails.