ARM builds started getting stuck 18 February

My latest hypothesis is that the timeout occurs late in the job, but the job log appears to be truncated during an earlier phase. This might be due to stream buffers not getting flushed.

In Ruby project, the latest 7 ARM builds were okay. So, keep watching it a few more days.

Here’s how I’ve worked around the issue:

  • I split my arm64 job into 2 jobs, one for single-precision and one for double-precision.
  • I applied “travis_wait 20” to the longest-running command in my script, to reduce the chance of timeouts. This seems necessary even though the command only runs for 8 minutes.

The key insight: for arm64 jobs that time out, you can’t assume that the last command in the log is the one that triggered the timeout.

I still occasionally get timeouts from my arm64 jobs. Restarting the job usually resolves the issue.