A Rust job somehow uses wrong compiler version installed by another job

Something is going very wrong with this job: it says that the rustc version is rustc 1.46.0-nightly (4fb54ed48 2020-06-14), which is a week out of date, but at the same time rustup update says

  nightly-x86_64-unknown-linux-gnu unchanged - rustc 1.46.0-nightly (a8cf39911 2020-06-21)

To me it looks like whatever caching travis is doing when I say - rust: nightly has gone seriously wrong.

Wow, even after adding

      # Make sure we really have the latest nightly.
      - rustup toolchain uninstall nightly
      - rustup toolchain install nightly
      - rustc -Vv

I still get a nightly from a week ago. How is that even possible?

Oh that is odd, I somehow have two toolchains:

$ rustup show

Default host: x86_64-unknown-linux-gnu

rustup home:  /home/travis/.rustup

installed toolchains

--------------------

nightly-2020-06-15-x86_64-unknown-linux-gnu (default)

nightly-x86_64-unknown-linux-gnu

active toolchain

----------------

nightly-2020-06-15-x86_64-unknown-linux-gnu (default)

rustc 1.46.0-nightly (4fb54ed48 2020-06-14)

(these extra newlines are just how travis copy-and-paste works unfortunately…)

I ended up with this work-around:

      # Make sure we really have the latest nightly.
      # For some reason Travis puts us on a snapshot by default...
      - rustup default nightly
      - rustc -Vv
      - cargo -V

It looks like this default was set by your script, ci/miri.sh, in an earlier build (Travis CI - Test and Deploy with Confidence).

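Rustup installs its binaries into ~/.cargo/ and keeps toolchains and the default setting in ~/.rustup/ (the rustup home in your output above); both are cached, so everything that you change there persists between builds.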

As such, clearing your cache and not overriding Travis’ stock functionality unless you know exactly what it does should fix the problem.

That’s a different job though? It’s the Miri job, not the all-features job.
Are caches shared across all jobs…?

Travis doesn’t provide a good way to compute the nightly version we need, so there’s no avoiding messing with rustup unfortunately. I know pretty well what I am doing with rustup, but I didn’t expect travis to confuse the caches of different jobs.

I didn’t check if that was the exact job that poisoned the cache or not (I’ll look into that now). This is just very strong evidence that you are doing something funny and as a result, are yourself responsible for diagnosing the resulting errors rather than (or at least before) reporting them as Travis bugs. A build log reports cache IDs that jobs save and restore so you are able to pull the thread back from a faulty job.

If you don’t want Travis’ language installation logic for whatever reason, use language: generic and install all the necessary tooling yourself (the generic image doesn’t have Rustup, but you can install and cache it). nightly by definition means “rolling release”, so it’s logical that Travis doesn’t allow selecting a specific build.
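A minimal sketch of such a language: generic setup might look like the following. The cached directories and the choice of nightly as the default toolchain are assumptions for illustration, not taken from the repository discussed here. Because ~/.rustup is cached, a changed default would still persist between builds in this setup too.

```yaml
# Sketch only: install and cache rustup yourself on the generic image.
language: generic
cache:
  directories:
    - $HOME/.cargo    # rustup and cargo binaries, registry
    - $HOME/.rustup   # installed toolchains and the default setting
install:
  # Official rustup installer; -y skips the interactive prompt.
  - curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain nightly
  - export PATH="$HOME/.cargo/bin:$PATH"
  - rustc -Vv
```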

I don’t have any idea how to go about debugging this, but my theory for now is that indeed state somehow got mixed up between jobs, so I changed the other job to set the default back to nightly before the cache gets created. Thanks for pointing that out!
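Resetting the default before the cache is created could be expressed with Travis' before_cache phase, which runs before the cache is uploaded. This is only a sketch; whether it belongs at the top level of .travis.yml or inside the one job is an assumption about the configuration.

```yaml
before_cache:
  # Reset the rustup default so a pinned nightly is not saved
  # into the cache as the default toolchain.
  - rustup default nightly
```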

There are various good reasons for wanting to test not always the latest nightly but some specific nightly, determined e.g. by the contents of some file.
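One way to pin a nightly from a file without touching the cached rustup default is the RUSTUP_TOOLCHAIN environment variable, which overrides the default for the current shell only. This is a swapped-in technique, not what ci/miri.sh does; the file name nightly-date.txt (assumed to contain a date such as 2020-06-21) and the cargo test step are hypothetical.

```yaml
install:
  # nightly-date.txt is a hypothetical file holding e.g. 2020-06-21.
  - NIGHTLY="nightly-$(cat nightly-date.txt)"
  - rustup toolchain install "$NIGHTLY"
  # Select the toolchain for this job's shell only; the default
  # stored in ~/.rustup (and therefore in the cache) is unchanged.
  - export RUSTUP_TOOLCHAIN="$NIGHTLY"
script:
  - rustc -Vv
  - cargo test
```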

Yeah I did that elsewhere, also because Travis tends to cache a bit too much for Rust. But that always feels like a hack.

Here’s the thread pull:


So jobs 161.1 and 161.8 use the same cache ID!

That’s because their configuration only differs in job name and script. Cache ID is computed from OS image, language version and environment variables as specified in .travis.yml (see “What is the cache ID for a job generated from? How can staged jobs use the same cache?” for source code links). Job name and script are not considered!


It looks like whoever was using the MIRI= env var in Travis CI - Test and Deploy with Confidence (huh, that was actually you) was doing that for a reason! But they forgot to leave a comment in the yml explaining their decision, so that knowledge got lost!

I added it because the other jobs did, and I think it was just a hack to get the name to display, but I might be wrong. Anyway the MIRI env var actually became meaningful for the Miri tool, so we had to stop using it.

Thanks a lot! That is very helpful. So I’ll add some dummy env var then to ensure the cache is separate.
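A sketch of that separation, assuming the two jobs are declared under jobs.include. The variable name CACHE_NAME, its values, and the all-features script line are placeholders; any public env var that differs between the jobs should give them distinct cache IDs.

```yaml
jobs:
  include:
    - name: "all features"
      rust: nightly
      # Dummy variable: its only purpose is to change the cache ID.
      env: CACHE_NAME=all-features
      script:
        - cargo test --all-features   # placeholder script
    - name: "miri"
      rust: nightly
      env: CACHE_NAME=miri
      script:
        - sh ci/miri.sh
```

A comment like the one above also records why the variable exists, so it is not removed later as unused.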

I created a feature request to eliminate this source of hard-to-catch errors:

You may wish to upvote it if you are interested in seeing it implemented.