Specify CPU type

Is it possible to specify the CPU used in a Travis build? I am aware of the settings described at
Building on Multiple CPU Architectures - Travis CI, but they seem too broad for my use case (happy to be corrected).
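For reference, those settings select an instruction-set architecture per job rather than a specific CPU model. A minimal sketch of what that documented multi-arch matrix looks like (the arch values are from the Travis docs; the language and dist lines are illustrative):

language: python
dist: bionic
arch:
  - amd64    # runs on whatever x86-64 host Travis assigns; the CPU model is not selectable
  - arm64

As far as I can tell there is no key that pins a job to a particular CPU model or feature set within an architecture.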

In particular, I have a branch whose unit tests sometimes pass and sometimes fail, and I have narrowed the cause down to the CPU being used, based on messages produced by TensorFlow.

For example, the passing build has these messages:

2019-12-17 05:57:57.694007: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-17 05:57:57.713689: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-12-17 05:57:57.714934: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x27ac2b20 executing computations on platform Host. Devices:
2019-12-17 05:57:57.714960: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
2019-12-17 05:58:08.328999: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 17067520 exceeds 10% of system memory.
2019-12-17 05:58:08.341346: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 17067520 exceeds 10% of system memory.
2019-12-17 05:58:08.355016: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 17067520 exceeds 10% of system memory.
2019-12-17 05:58:08.373354: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 17067520 exceeds 10% of system memory.
2019-12-17 05:58:08.391213: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 17067520 exceeds 10% of system memory.

However, the failing build has these messages:

2019-12-19 03:42:03.611822: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-12-19 03:42:03.623020: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2800180000 Hz
2019-12-19 03:42:03.623234: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x28340c00 executing computations on platform Host. Devices:
2019-12-19 03:42:03.623259: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version

From the messages it seems the 2.3GHz CPU produces a pass and the 2.8GHz CPU produces a failure.

I would like to debug a previously passing build for reproducibility.
Currently, all the debug builds I have launched seem to run on a 2.8GHz system/Docker container. Is it possible to force the 2.3GHz configuration? I have tried 20 repeated debug builds recently, and they have all landed on a 2.8GHz system.
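For reference, I can at least confirm which host class a given job landed on by logging the CPU details at the start of the build. A minimal sketch for a Linux build, reading the standard /proc/cpuinfo interface (the before_script placement is just one option):

grep -m1 'model name' /proc/cpuinfo
grep -qw avx512f /proc/cpuinfo && echo 'AVX512F present' || echo 'AVX512F absent'

That makes it obvious from the job log whether a run hit a 2.3GHz or a 2.8GHz host.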


Hello jrwishart,

Do you think it’s possible to link the logs here? Have you considered adding Bazel to your .travis.yml file?

Here’s a good example repository on GitHub on how to do that: Bazel repository.

-Montana

Hi Montana, sorry for the late reply. I stopped actively checking the thread after the new year; I thought there would be email notifications, but there were none. Unfortunately I can’t link to the logs since it’s a private repo. I could try adding Bazel. I’m unfamiliar with it, but it seems powerful based on the wiki page.

For those reading this thread who face a similar issue: the problem was due to the numerical precision of the TensorFlow output. The different hardware caused the TensorFlow output to differ (in the 6th decimal place). That difference is small, but aggregated over the models we were fitting it produced noticeably different output (to 1 decimal place). The solution was to round the TensorFlow output before it was used in follow-up models.
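A minimal sketch of that workaround, assuming the outputs are NumPy arrays (the values and the choice of 4 decimal places are illustrative, not our actual numbers):

import numpy as np

# Two outputs of the same model that differ only around the 6th decimal place,
# as happened when the same TensorFlow graph ran on the two CPU types.
output_cpu_a = np.array([0.1234561, 2.7182811])
output_cpu_b = np.array([0.1234567, 2.7182818])

# Rounding before the values feed any follow-up model makes them identical.
assert np.array_equal(np.round(output_cpu_a, 4), np.round(output_cpu_b, 4))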


Here’s a good example repository on GitHub on how to do that: Bazel repository.

I’m not a Bazel expert, but I believe Bazel uses -march=native by default. That builds for the native machine and can cause problems if the binary is moved to another machine. I also seem to recall it is difficult to get Bazel to stop using -march=native.

Also see How to cross-compile tensorflow-serving on docker build image with bazel when copts contain spaces on Stack Overflow.

If you know how to modify a Bazel configuration, you might try -mno-avx2 to disable AVX2. But it has been my experience that disabling all of the obscure features with -mno-xxx can be tricky. I’ve found it best to avoid -march=native and explicitly specify what you want, like -mavx (for AVX but not AVX2).
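For example, when building TensorFlow from source, instruction-set flags can be passed to Bazel on the command line via --copt (the target label here is the usual TensorFlow pip-package target, shown for illustration):

bazel build --config=opt --copt=-mavx --copt=-mno-avx2 //tensorflow/tools/pip_package:build_pip_package

That should produce a binary that uses AVX but never emits AVX2 instructions, regardless of what the build machine supports.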