Builds hang, with output truncated mid-line

It has been happening for the last several days.

This command is problematic:

travis_wait scripts/build-ssl.sh > build-ssl.log 2>&1 || (cat build-ssl.log && exit 1)

It redirects all output to the file, including that of travis_wait itself. The build log therefore receives no output, and Travis will terminate your build if this step takes more than 10 minutes.

To redirect output just from the command, use

travis_wait bash -c 'scripts/build-ssl.sh >build-ssl.log 2>&1' || (cat build-ssl.log && exit 1)
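
To see why the placement of the redirection matters, here is a minimal sketch that can be run locally. Note that fake_wait is a hypothetical stand-in for travis_wait (it is not how travis_wait is actually implemented), and build.sh, outer.log and inner.log are made-up names for illustration.

```shell
# fake_wait is a hypothetical stand-in for travis_wait, NOT its real code:
# it prints one keepalive line to stdout, then runs the given command.
fake_wait() {
  echo "keepalive: still running"
  "$@"
}

# Work in a scratch directory with a made-up build script.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '#!/bin/sh\necho "build output"\n' > build.sh
chmod +x build.sh

# Case 1: the redirection applies to the whole command line, so the
# wrapper's keepalive line ends up in the log and the console gets nothing.
console_outer=$(fake_wait ./build.sh > outer.log 2>&1)

# Case 2: the redirection lives inside bash -c, so only the build script
# is logged and the keepalive line still reaches the console.
console_inner=$(fake_wait bash -c './build.sh > inner.log 2>&1')

printf 'console, redirect outside: [%s]\n' "$console_outer"
printf 'console, redirect inside:  [%s]\n' "$console_inner"
```

In the first case the console stays silent for the whole step, which is what the no-output timeout reacts to.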

I applied the suggested fix; indeed, I had messed up the redirection.

It works in my own fork:

https://travis-ci.com/github/chipitsine/haproxy/jobs/299492771

However, it is still broken in the original repo:

https://travis-ci.com/github/haproxy/haproxy/jobs/299472322

I localized the output truncation problem and reported it in “Output is truncated heavily in ARM64 when a command hangs”.

It is indeed caused by a hanging command and only happens in Multi-CPU architecture environments, but the truncation happens long before the offending line.

So currently, it’s only possible to identify the offending command by elimination.

The difference between your fork and the original repo (assuming you are building exactly the same code) is secret variables and cache. So this is where I would look first.

That build does not depend on variables.

Thanks for the hint, I’ll clear the cache and try again.

They could still matter because when they are present, the output is not a tty – this may be affecting code behavior.
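
One way to check this would be to print, early in the build, whether stdout is a terminal. The following is only a sketch; describe_stdout is a name made up for illustration.

```shell
# Report whether stdout (file descriptor 1) is attached to a terminal.
# [ -t 1 ] is the standard shell test for this.
describe_stdout() {
  if [ -t 1 ]; then
    echo "stdout is a tty"
  else
    echo "stdout is not a tty"
  fi
}

describe_stdout          # result depends on how the script is run
describe_stdout | cat    # piped, so this reports "stdout is not a tty"
```

Comparing the first line between the fork and the original repo would show whether the two builds really differ in this respect.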

Hello, I am also experiencing the issue of my arm64 builds timing out.

The interesting thing I have noticed is that the job completes and passes if the script I run from my travis.yml file does not attempt to build and test a helloworld program. But if I trigger that build and test (manually or through a GitHub commit), the job hangs and fails on the install of a piece of software that is required to run the helloworld service. Again, it seems to install without issue if the job is not attempting to run this service.

The second thing that seems strange is the output of the “Building system information” section when running with the arm64 architecture vs. the amd64 architecture. With arm64 there is no “Build image provisioning date and time” or “Operating System Details” (or anything else), even when the travis.yml completes and passes, which seems to be consistent with jay0lee’s build as well.

I’ll attach some builds for more information.

arm64 fail, attempting to build and test helloworld showing timeout:
https://travis-ci.com/t-fine/examples/builds/137645521

arm64 pass, NOT attempting to build or test helloworld:
https://travis-ci.com/t-fine/examples/builds/137645166

amd64 pass, shown to demonstrate the difference in the “Building system information”:
https://travis-ci.com/t-fine/examples/builds/137490512

Not sure if there’s any new information on this issue, but I just came across it and thought I’d share my experiences here as well.

Thanks!

Hello,

I will bring this up with the internal team.

Montana
Travis CI

In https://travis-ci.org/github/native-api/test_travis/jobs/663728118, I accidentally read stdin with cat -n "${FF[@]}" when $FF happened to be empty.

It caused the build to hang (understandably), but output got truncated mid-line and long before the offending code.
(A sibling job in AMD64 environment, with exact same settings, correctly displayed all output up to the offending line.)

In another build with this defect, https://travis-ci.org/github/native-api/test_travis/jobs/663783018, the output got truncated at exactly the same place – which suggests that the output defect is connected to stream buffering.
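
For anyone hitting the same hang: with no file arguments, cat reads stdin and blocks until EOF. A minimal sketch of two possible guards follows. The function names are made up, and "$@" plays the role of "${FF[@]}" from the post so that the sketch also runs in a plain POSIX shell.

```shell
# Guard 1: only call cat when there is at least one file argument.
number_files() {
  if [ "$#" -gt 0 ]; then
    cat -n "$@"
  else
    echo "no files to show" >&2
  fi
}

# Guard 2: give cat an empty stdin, so that with no arguments it
# hits EOF immediately instead of waiting for input.
number_files_devnull() {
  cat -n "$@" < /dev/null
}

# Both calls return at once even though no files are passed.
number_files
number_files_devnull
```

In bash, the call from the post would become number_files "${FF[@]}".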

The stream buffering issue also seems to cause timeouts for Arm64 jobs.

And since the log is incomplete (due to buffering) it looks like the job hung in the middle of an earlier command, sending the customer down a false trail.

Came here to add a +1 to the buffering issue. I’ve even had lines get cut off in the middle due to buffering, and then the build fails because, I suspect, not enough text gets put in the buffer in time.

https://travis-ci.org/github/Kansattica/msync/jobs/675085943
https://travis-ci.org/github/Kansattica/msync/jobs/675423402

Same problem here: https://travis-ci.org/github/solvespace/solvespace/jobs/679218467

It’s failing all the time, with the last line in the log being truncated. The build then times out because there is no new output for ten minutes.

Any ideas? I’ve fiddled way too much with output rate limiting, different flags for stdout & stderr and whatnot. Nothing helps.

Maybe the buffering issue should be filed separately.

Hi,
Any updates on this issue?
I have met the same problem. I am trying to build and run the tests of Storm on the ARM platform with Travis; the CI job easily hangs for no explicit reason, even after the job has already run successfully. Here are 2 examples:
https://travis-ci.com/github/liusheng/storm/jobs/323797329
https://travis-ci.com/github/liusheng/storm/jobs/323797330

The same CI jobs run OK on x86 platform.

I wish there was… this issue has been reported many times since October 2019; I tried to list all the posts doing so in “Multiarch builds (lxd) always time out on failure instead of exit”.

See also “Travis Build gets stalled for ARM64 job in pandas package” for the workaround that I use (but it’s horrible).