Bogus jobs are started when a build is cancelled and there are pending jobs (2)

native-api · August 7, 2019, 6:58am

The issue "Bogus jobs are started when a build is cancelled and there are pending jobs" has resurfaced.

E.g. in https://travis-ci.org/native-api/opencv-python/builds/568519173, jobs 69 and 70 were running when the build was cancelled. After that, jobs 71 and 73 started for a few minutes.

BanzaiMan · August 7, 2019, 2:07pm

What exactly is the problem? The cancel request was received at 6:52:04 UTC, at which point some jobs in the “Final” stage were already running. Unless a job in the “Final” stage started before a job in the “S1” stage was finished, I don’t see a problem here.

Jobs are placed in parallel queues, and the workers pick them up in parallel. There is no guarantee within a single stage that a job with a smaller number starts before one with a larger number.

If you want the guarantee, jobs will have to be in different stages.

native-api · August 7, 2019, 2:41pm

The problem is that when I cancel the build, the jobs currently running terminate – as they should, but at the same time a few pending jobs following them are started – while they shouldn’t.

BanzaiMan · August 7, 2019, 2:55pm

Some may have already been queued and were no longer cancelable.

native-api · August 7, 2019, 3:06pm

Whatever the case, the end result is visibly incorrect behavior, and the build status is not changed to cancelled for a few minutes (which may delay the following queued builds and such).

BanzaiMan · August 7, 2019, 3:29pm

Another consideration is that your build has 88 jobs. I think the current logic is to fire cancel requests sequentially and it can take some time for the jobs to be actually canceled.

native-api · August 7, 2019, 7:54pm

It’s the same with 16 jobs.

The cause, as I infer from the behavior, is that the machinery that manages the jobs is not told to stop starting new jobs for this build before the UI begins cancelling them.

The simplest solution would be to cancel jobs in backwards order or to cancel pending jobs first.
The robust one would be to somehow dequeue the build or pending jobs atomically.
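
To illustrate the suspected race and the two simple fixes, here is a toy model. It is only a sketch: it assumes a fixed number of worker slots and a scheduler that starts the next pending job whenever a slot frees up, without knowing the whole build is being cancelled. `Build`, `cancel_build`, and the ordering functions are hypothetical names and do not reflect how Travis CI is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Build:
    """Toy build: `capacity` worker slots; pending jobs start in number order."""
    n_jobs: int
    capacity: int
    state: dict = field(default_factory=dict)
    started_after_cancel: list = field(default_factory=list)
    cancelling: bool = False

    def __post_init__(self):
        self.state = {j: "pending" for j in range(1, self.n_jobs + 1)}
        self._fill_slots()

    def _fill_slots(self):
        # Start pending jobs until every worker slot is busy.
        running = sum(s == "running" for s in self.state.values())
        for job, s in self.state.items():
            if running >= self.capacity:
                break
            if s == "pending":
                self.state[job] = "running"
                running += 1
                if self.cancelling:
                    # A job started although the build is being cancelled.
                    self.started_after_cancel.append(job)

    def cancel_job(self, job):
        self.state[job] = "cancelled"
        # Assumption: the scheduler reacts to the freed slot immediately and
        # is unaware that the rest of the build is about to be cancelled.
        self._fill_slots()


def cancel_build(build, order):
    """Cancel jobs one by one in the given order; return the bogus starts."""
    build.cancelling = True
    for job in order(build):
        if build.state[job] != "cancelled":
            build.cancel_job(job)
    return build.started_after_cancel


def forward(build):
    return sorted(build.state)


def backwards(build):
    return sorted(build.state, reverse=True)


def pending_first(build):
    # False sorts before True, so pending jobs come first.
    return sorted(build.state, key=lambda j: build.state[j] != "pending")
```

In this model, with 6 jobs and 2 worker slots, `cancel_build(Build(6, 2), forward)` returns `[3, 4, 5, 6]`: each cancelled running job frees a slot that a pending job briefly occupies. With `backwards` or `pending_first` the result is `[]`, because no pending job is left by the time a slot frees up. An atomic dequeue would make the ordering irrelevant, but that needs support in the scheduler itself.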
