Feature Request: Limit concurrent builds by branch / tag / PR

It would be great to be able to limit concurrent builds by branch, tag, and pull request.

For example:

.travis.yml:

concurrent_builds:
  pull_request: 0 # unlimited
  branch: 0  # unlimited
  tags: 1

Naming specific branches:

concurrent_builds:
  pull_request: 0 # unlimited
  branch:
    - master: 1  # Limit to 1 for the master branch only
    - "*": 4  # All other branches limited to 4
  tags: 1
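To make the proposed semantics concrete, here is a minimal Python sketch of how such a section might be resolved into a limit for one build. Everything in it is an assumption for illustration: `concurrent_builds` is the option being requested (not an existing Travis CI setting), `resolve_limit` is a hypothetical helper, and "first matching pattern wins" is only one possible reading of the list form.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory form of the proposed `concurrent_builds` section.
# This key is the feature being requested, not an existing Travis CI option.
CONFIG = {
    "pull_request": 0,                    # 0 means unlimited
    "branch": [{"master": 1}, {"*": 4}],  # assumed: first matching pattern wins
    "tags": 1,
}


def resolve_limit(config, kind, name=""):
    """Return the concurrency limit for a build, or None for unlimited.

    `kind` is "pull_request", "branch" or "tags". `name` is the branch name
    and is only consulted when the rule is a list of patterns.
    """
    rule = config.get(kind, 0)
    if isinstance(rule, int):
        return rule or None
    for entry in rule:
        for pattern, limit in entry.items():
            if fnmatchcase(name, pattern):
                return limit or None
    return None
```

With the second example above, `resolve_limit(CONFIG, "branch", "master")` gives 1, any other branch gives 4, and pull requests stay unlimited.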

A specific use case would be continuous deployment for commits to master, or deployment on tag/release.

Ref.: https://github.com/travis-ci/travis-ci/issues/6547
Posting here because: https://github.com/travis-ci/travis-ci/issues/10026#issuecomment-443386988

Ideally it would allow “locking” stages. We have three stages:

  • build and run unit tests (runs everywhere, takes ~ 5 minutes)
  • deploy to testing and run integration tests in the actual server environment (tags & master, takes ~ 1h)
  • deploy to production (master only, takes ~ 10 minutes)

Ideally we would have the 2nd and 3rd stages locked to only one instance, but keep the 1st unlimited. That would allow us to build PRs even when someone has just merged to master and a build of at least an hour is running.

What you actually want is to queue deploy requests for your test environment. That need is not limited to a fixed number of homogeneous environments, so the requested feature is not a very flexible solution.

A flexible solution would be to make jobs wait for an external event before starting – e.g. the same way GitHub’s check status reports work. (Travis would send a build request to a webhook in your environment which would queue it and report back when it’s ready.)

I’m not sure how good of a fit this is for Travis CI. Here, you are supposed to run all the functionality on the CI’s machine. Maybe an on-site installation of Travis CI with a custom worker type representing your test environment would be a better fit for your workflow.

Not really. Even if I were to queue the actual deploy, I would still need to sync between my deployment service and Travis. That would be a source of bugs and unnecessary complexity.

  • build started for [1]
  • “1st stage” of [1] finished
  • deploy [1] to testing queued and running
  • build started for [2]
  • “1st stage” of [2] finished
  • deploy [2] to testing queued
  • deploy [1] finished
  • < here it must not start deploying to testing for [2] >
  • notify travis that testing env is finished and it’s time to run “2nd stage” for [1]
  • run test on deployed testing server for [1]
  • finished “2nd stage” for [1]
  • only now can deployment to testing for [2] start

Building a notification framework to talk between Travis and the deploy process would be a big undertaking, whereas a way to mark something as a “critical section for the repo” (allowing it to run in only one thread for the whole repo) is a relatively contained feature that affects only Travis and does not require any synchronization with outside services.
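As a rough illustration of the “critical section for the repo” idea, here is a small Python sketch in which threads stand in for builds. It is only a model under stated assumptions: `repo_lock` is a hypothetical per-repository lock held by the CI service, and the stage functions are placeholders rather than anything Travis provides.

```python
import threading
import time

# Hypothetical stand-in for a per-repository lock held by the CI service.
repo_lock = threading.Lock()

counter_guard = threading.Lock()
active = 0  # builds currently inside the locked stage
peak = 0    # highest value `active` ever reached


def locked_stage(build_id):
    """Placeholder for 'deploy to testing and run integration tests'."""
    global active, peak
    with counter_guard:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)  # pretend to deploy and test
    with counter_guard:
        active -= 1


def run_build(build_id):
    time.sleep(0.001)  # 1st stage: unit tests, never serialized
    with repo_lock:    # 2nd stage: one build at a time per repo
        locked_stage(build_id)


def run_all(count):
    """Run `count` builds concurrently and return the peak overlap seen."""
    threads = [threading.Thread(target=run_build, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return peak
```

However many builds are started, the locked stage never overlaps with itself, while the first stage runs freely. Note that a plain lock like this does not guarantee that waiting builds proceed in merge order; a real implementation would probably want a FIFO queue.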

I would agree that this might not be part of continuous integration. But it certainly is a part of continuous deployment :slight_smile: And I think that it’s actually a great fit for Travis!

Still not possible AFAIK, after half a year. Don’t you hate it when you google a problem and your own question pops up?!

The closest thing you can get within Travis’ execution model is to make a bunch of Docker images representing your servers and bring them up on the build machine in cohesion using docker-compose.

Unfortunately, that does not work for me. For release tests we need some shared cloud resources that can’t be represented reliably locally in Docker.

I have precisely the same problem as OP. It’s immediately caused problems around our newly minted CD setup and is forcing us to rework this using other providers (exploring codepipeline, for example).

Can you elaborate on why this is not a fit for Travis? It’s unclear how you could implement continuous deployment without the ability to ensure that commits do not race each other to prod.

It’s the job of a deployment provider, not CI, to enforce atomicity so that “commits do not race each other to prod”. All current stock deployment providers do that, one way or another.

But this is not a case of a deploy race condition.

Let’s explore a scenario where deployment race conditions are nonexistent and the inability to set up Travis correctly is still a problem:

Travis setup: a) run unit tests b) deploy to testing c) run black box integration tests on testing d) deploy to production

  • PR1 is merged.
  • PR1-a) tests
  • PR1-b) deploy of PR1 to testing
  • PR2 is merged
  • PR2-a) tests
  • PR2-b) deploy of PR2 to testing (blocked by still running deploy to testing)
  • PR1-b) finished, unblocking PR2-b)
  • PR2-b) deploy of PR2 to testing started
  • PR1-c) integration of PR1 on testing started
  • PR2-b) deploy of PR2 to testing finished - testing is now in the PR2 state even though the tests for PR1 are still running, and some of them may have failed because of the deploy

I hope it’s obvious from this that even with atomic deploys, it’s the responsibility of the CI to ensure that two builds don’t run over each other. Why? Because it owns the pipeline. The deploy process has no knowledge of the CI. How could it manage its locks based on the current Travis status? That makes no sense.

An interesting suggestion. This may be put on the product roadmap in the future, but I cannot say with any certainty at this time.

Generally speaking, limiting the concurrency across the builds is a repository-level concern, and not a build-level (and therefore commit-level) concern. Therefore, the information about them should not be put in .travis.yml. If we are to implement this, the configuration will be in the Settings page.


Now, for @tomasfejfar’s use case, there may be a workaround. Not one out of the box, but hopefully a workable one.

What you need is something that acts as a traffic controller that knows more about your build pipeline logic than what we currently offer. If I were to implement your workflow, I would try something like this:

  • Split the logic into two parts:
    1. build and run unit tests
    2. deploy (to testing, to production)
  • The first will be executed by push and pull_request event types, whereas the latter will be triggered by the API.
  • Write a web app that acts as a liaison between the first and the second parts: a controller that receives the webhook notification and decides whether or not to trigger the second part. (This app can send an API query to find out whether any build of the second type is running, to aid in deciding.)
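The decision logic of such a controller could be sketched roughly as follows, in Python. This is an illustration under assumptions, not a ready-made integration: `DeployController`, `is_deploy_running` and `trigger_deploy` are hypothetical names, and the two callables are supplied by the caller (in practice they would wrap whatever API requests the web app makes to the CI service).

```python
from collections import deque


class DeployController:
    """Decides when to trigger the API-only deploy build.

    `is_deploy_running` and `trigger_deploy` are hypothetical callables
    supplied by the caller; they are not part of any real client library.
    """

    def __init__(self, is_deploy_running, trigger_deploy):
        self._is_deploy_running = is_deploy_running
        self._trigger_deploy = trigger_deploy
        self._pending = deque()  # commits waiting for a deploy slot, oldest first

    def on_build_passed(self, commit):
        """Webhook handler: the first part (unit tests) passed for `commit`."""
        self._pending.append(commit)
        self._drain()

    def on_deploy_finished(self):
        """Webhook handler: a build of the second type has finished."""
        self._drain()

    def _drain(self):
        # Start the oldest waiting commit only if no deploy build is running.
        if self._pending and not self._is_deploy_running():
            self._trigger_deploy(self._pending.popleft())
```

Keeping the waiting commits in a FIFO queue means deploys are triggered in the order the first part finished. The queue here lives in memory, so a real service would need to persist it to survive a restart.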

I hope this helps.

@BanzaiMan Thanks for the suggestion. What you’re describing makes sense, I just wish it came out of the box with Travis rather than having to implement it. We’re looking to migrate off travis since we want to avoid the liability of maintaining a bespoke deploy orchestrator.

@native-api

Could you give us the benefit of the doubt? I have years of experience working with internal and 3rd party CI/CD systems and maintained deploys for far more complex workflows than the simple one I’m trying to develop with travis. It just doesn’t have the necessary feature set to ensure safe deploys, at least for my provider.

Tomas has also repeatedly explained the issue with concrete, clear examples. Even if I weren’t running into the same problem, I’d find his requirements compelling. This is my first time posting here and it’s unclear if you’re an employee—I really don’t think being dismissive of requirements is appropriate if you are.

We reviewed the feature set of our providers carefully before seeking support. What you’re describing isn’t possible, at least with AWS codedeploy, our provider. Codedeploy only knows what revision you’ve asked a group to deploy, not the state of other deployment groups, nor any concept of revision order. Since both Travis and Codedeploy support running arbitrary code, we can certainly solve this problem by doing so. What I’m looking for is an abstraction that solves it without writing code, which is why I’m paying for the service.

Unfortunately, what you’re suggesting is “you should build your own Travis”. If we used Travis only for unit tests and builds, we could build our own CI. Our build is completely in Docker anyway. So if we were to build the deploy + integration tests part ourselves, then doing the rest would be rather simple.

We’re actually looking at different CI/CD solutions instead. That’s much simpler for us than building anything in house.

Update: Oh, sorry, I misread. Building the logic outside of Travis is a workaround. But still, it would be a single point of failure, it would need to run somewhere, and we’d need to support it.

@tomasfejfar I’m curious what else you’re looking at? Would help guide my search.

If we were about to change anything, we’d like to consolidate, so we’re exploring Azure Pipelines and AWS CodeBuild. We had Jenkins in the past, but hated it. And we played with Github Actions, but that did not work for us so far (for different reasons from what we miss at Travis).

AWS CodePipeline supports serialized execution:

https://docs.aws.amazon.com/codepipeline/latest/userguide/concepts-how-it-works.html#concepts-how-it-works-executions

If it helps, the features described above are precisely the features we’re looking for in a CD system.

FWIW I looked at Github actions for this as well, but it doesn’t seem to support serialized execution.

I have the exact same requirements as @tomasfejfar and @clin88. We are bringing up real needs. The feature seems to be quite simple to implement for a CI/CD provider. For us, the lack of this relatively simple feature disqualifies Travis and we are forced to look at different options.

Apart from Travis, we also looked at other CI/CD providers (CircleCI, GitHub Actions), but unfortunately neither of them seems to have native support for the feature as of today.

We found that CodeFresh and GitLab CI do support this natively and will be looking at these options.

This is honestly just an expression of frustration with the travis deploy tooling, as my team has already decided to migrate off. It would be reassuring to hear that y’all recognize these as problems and have a plan to address them, however.

My team adopted the deploy feature to deploy a fairly simple application that consists of a.) running migrations b.) firing off two codedeploy deployment groups, in that order.

We’ve run into countless problems since:

  1. There is no way to limit concurrent builds on master. As a result, we frequently run into issues where two people merge around the same time and the deploy fails, since the first deploy won’t have finished before the second one starts, and dpl errors out. We have to manually intervene and retry in this case. This happens about once a day at 5-10 deploys a day.

  2. Further, because two builds can race to production, it’s possible for an older commit to deploy after a newer commit, simply because the newer commit finishes faster. If we manage to detect this, it requires manual intervention. This happens about once a week at 5-10 deploys a day.

  3. An error at one stage of deploy doesn’t break the build, but allows the next stage to proceed. This is still an outstanding issue in the codedeploy plugin for dplv2, although my team is looking at making a PR to fix this since it caused a major outage (but would appreciate if your team could address it first).

  4. DPLv2’s error handling is still flawed. Besides the issue in #3, there were several times during development where the codedeploy plugin simply failed without explanation. In one case, I believe I misspecified the github repository.

  5. AFAICT there is no support for common deploy workflows, like canarying or human approval.
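Items 1 and 2 in the list above both come down to missing ordering guarantees. As a rough Python sketch of the kind of guard that would cover them: `DeployGate` is a hypothetical helper, and the assumption that build numbers increase in merge order is mine, not something taken from dpl or CodeDeploy.

```python
import threading


class DeployGate:
    """Serializes deploys and refuses to roll back to an older build.

    Assumes build numbers increase in merge order (an assumption made for
    this sketch only).
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.last_deployed = 0

    def try_deploy(self, build_number, deploy):
        """Run `deploy(build_number)` unless a newer build is already out."""
        with self._lock:  # only one deploy at a time (item 1)
            if build_number <= self.last_deployed:
                return False  # stale build: skip it (item 2)
            deploy(build_number)
            self.last_deployed = build_number
            return True
```

If build 2 reaches the gate before build 1, build 2 is deployed and build 1 is skipped when it arrives later, instead of overwriting the newer release.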

Our interactions with support and this forum have also been really discouraging.

First, one support staff told me I shouldn’t use YAML anchors, even though that’s recommended in the docs and was unrelated to our problem.

Another person seemed to miss the context of what I’d posted them and asked us to tell them what we’d changed in the config—nothing had changed, the system failed after weeks of not failing in that way, which was apparent from the support message.

The feature request thread to fix the issue with concurrent deploys didn’t promise to roadmap anything, but told us to build our own tooling instead: Feature Request: Limit concurrent builds by branch / tag / PR - #11 by BanzaiMan

Y’all, there is no company on earth that wants continuous deploy tooling that doesn’t enforce order. Ensuring that deploys go out in the correct order is a minimally viable feature of any deploy orchestrator. Except for teams that never deploy more than once a day, practically nobody wants a tool that doesn’t support this.

Yet this basic functionality is literally impossible in travis.

P.S. To be clear, we are paying customers to the tune of several k a year and projected 5-8k spend by EOY, if we were to stick around.

Don’t even bother writing. Since they were bought by Idera, they practically gave up on any non-trivial support and are just waiting out the inevitable death of the cash-cow. We’re migrating all our repos away as well.

I repeat, it’s the job of a deployment provider to enforce atomicity.

There are any number of builds of different kinds running at the same time and reaching the deployment step at arbitrary moments. How are you expecting things to proceed here?

@wadim Could you link to some info on how CodeFresh is enforcing deployment in order? I couldn’t find anything relevant skimming through Deploy · Codefresh | Docs.