Using unified cache / Control cache identity

ladar · December 20, 2018, 8:08am

I’d like to comment on GH issue #7590 … with my builds, installing dependencies can take between 2 and 15 minutes. Because my actual build process takes around 40 minutes, that variance is the difference between a pass and a failure. So I worked around this issue by creating a cache job, which runs first. Because my build matrix is dictated by environment variables, I need to create two identical jobs for every set of environment variables. The first warms the dependency cache, the second performs the test build. Because the dependencies are shared, they could, and probably should, all use a single cache job.

I understand the cache corruption issue discussed in GH issue #7590 and would like to propose several possible solutions.

First, the ability to dictate a cache identifier through an environment variable (TRAVIS_CACHE_IDENTIFIER) or a YAML value (cache_id) should be implemented. To avoid corruption, though, I would add some safeguards. Namely, if multiple jobs share a cache identifier, they run single file by default. This would avoid the corruption, at the expense of performance.

The single file build behaviour should be used UNLESS jobs with a shared identifier are in multiple stages. In this scenario, only the first stage with a particular identifier would run single file. Subsequent stages wouldn’t start until the first was completed, and would be run in parallel by default.

I would also propose that if the identifier is controlled by the YAML file (preferred), a boolean sub key could be created (read_only). With this approach Travis would build jobs lacking a “read_only: true” in single file mode, but know that it was safe to build any of the “read_only: true” jobs in parallel. Naturally, read only jobs would still pull/use the cache, but would not update the cache when they complete. In addition, you could optionally dictate that if “read_only: false” is set explicitly, jobs would run in single file, even if they span build stages.
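To make the proposed rules concrete, here is a rough Python sketch of how a scheduler might classify jobs. It is only an illustration of the idea above, not anything Travis implements: the `Job` class and `plan` function are hypothetical names, `cache_id` and `read_only` are the proposed (not existing) keys, and I'm assuming that writers in later stages fall under the "parallel by default" rule. The optional "explicit read_only: false forces single file across stages" variant is left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    stage: int                # build stage index, 0 = first stage
    cache_id: str             # proposed shared cache identifier (hypothetical key)
    read_only: bool = False   # proposed sub key: job pulls the cache but never updates it


def plan(jobs):
    """Map each job name to 'serial' (single file) or 'parallel'."""
    sharers = Counter(job.cache_id for job in jobs)

    # Earliest stage in which each cache identifier appears.
    first_stage = {}
    for job in jobs:
        current = first_stage.get(job.cache_id, job.stage)
        first_stage[job.cache_id] = min(current, job.stage)

    modes = {}
    for job in jobs:
        shared = sharers[job.cache_id] > 1
        in_first_stage = job.stage == first_stage[job.cache_id]
        # Only jobs that can write a shared cache in its first stage are serialized.
        if shared and in_first_stage and not job.read_only:
            modes[job.name] = "serial"
        else:
            modes[job.name] = "parallel"
    return modes


example = [
    Job("warm-cache", stage=0, cache_id="deps"),
    Job("gcc-8-O2", stage=1, cache_id="deps", read_only=True),
    Job("clang-6-O2", stage=1, cache_id="deps", read_only=True),
    Job("docs", stage=0, cache_id="docs-only"),
]
result = plan(example)
```

With this layout, one cache job warms the shared cache in the first stage and every read only build in the next stage is free to run in parallel.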

With my libcore project, I use Travis to ensure compatibility. To do that I’m building this library with 7 different compilers (GCC 4.8, 4.9, 5, 6, 7, 8 and Clang 6.0). For each compiler I’m verifying builds work with all 4 optimization flags, -O0, -O1, -O2, -O3. I’m also ensuring that both pedantic and production configurations build properly. The result is 56 build jobs, which could, in theory, share a single cache job. Unfortunately, because I currently can’t control the cache identifier, I have to create 56 additional cache jobs. That’s a lot of extra processing that could and should be avoided.

With my larger magma project, I have a smaller build matrix, because each entry takes ~40 minutes per run, but I’m still building 28 different variants. That means I need 28 cache jobs, for my 28 build jobs.

So in my case, implementing this feature means I could literally eliminate 82 build jobs.
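For anyone checking the numbers, this is one way to arrive at them. The sketch assumes that one shared cache job per project would remain after the change, which is my reading of the figures rather than something stated explicitly.

```python
# libcore matrix: compilers x optimization flags x configurations
compilers = ["GCC 4.8", "GCC 4.9", "GCC 5", "GCC 6", "GCC 7", "GCC 8", "Clang 6.0"]
opt_flags = ["-O0", "-O1", "-O2", "-O3"]
configs = ["pedantic", "production"]

libcore_builds = len(compilers) * len(opt_flags) * len(configs)
magma_builds = 28  # variants of the larger magma project

# Today every build job needs its own cache-warming job.
cache_jobs_now = libcore_builds + magma_builds

# Assumption: with a shared cache identifier, one cache job per project remains.
cache_jobs_shared = 2

eliminated = cache_jobs_now - cache_jobs_shared
print(libcore_builds, cache_jobs_now, eliminated)
```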

Please. Pretty please.

vladimiry · April 29, 2019, 1:32pm

See here an idea of sharing the public dynamic data between the jobs: “Is it guaranteed that jobs ids of the same build are sequential (share data between the jobs purpose)?”

native-api · April 29, 2019, 1:55pm

Explicitly specifying cache identity conflicts with the principle of the build matrix.

Related: Allow a next-stage job read-only access to the cache of a previous-stage one; or make exported/imported artifacts

vladimiry · April 29, 2019, 2:03pm

@native-api, of course, that would solve the need, but we need to live somehow until such a feature is implemented.

native-api · July 18, 2019, 2:02am

should be solving this.
