Python 3.6 - sqlite3.OperationalError: disk I/O error

As of today, only my Python 3.6 builds are failing (source). All other Python builds work fine (source).

Any advice would be fantastic! :slight_smile:

Edit: Merged with master and build is still failing :slightly_frowning_face:

It appears to be an issue on the Travis CI end (specifically the stable.xenial Python v3.6.7 AMD64 test container); see the failing master build (source).

Some builds are successful (e.g., https://travis-ci.org/caronc/apprise-api/jobs/632762260), so I tend to think this is some sort of timing issue. If you restart them, do they all fail? I’m happy to enable the debug feature for this repository so you can troubleshoot further.


@BanzaiMan, thank you for getting back to me! :slightly_smiling_face:

I’d definitely appreciate it if you could enable debugging. If it’s a timing issue, do I just need to add a sleep 5s at the end of the .travis.yml and/or tox.ini file for the Python 3.6 calls?

It seems weird that this worked perfectly before (2 weeks ago). Also, Python 3.5, 3.7, and 3.8 build without any problems at all. Here is the master build I just tried to restart (it failed again). Maybe stick debugging here?

Thanks again for your speedy response and help!

Chris

The debug feature has been enabled for this repo.

Nothing has changed for this image (xenial, Python 3.6) in recent weeks. There might have been some underlying changes to your dependencies, so I’d suggest examining those as well. (For example, if you restart a previously successful build and it now fails, then changes in your dependencies are very likely to blame.)

I took your advice and went back to an old build that passed (link).

When I rebuilt just the Python 3.6 job in question, it failed for the same reason:

Thoughts?

Please check your dependencies. There should be some changes that triggered this failure.

The only evident change is in coverage. The passing build shows coverage==5.0.1, and my builds do not pin a specific version (screenshot taken from here):

Otherwise everything is identical; the failing build shows coverage==5.0.2 (screenshot taken from here)

When I get home, I’ll create a branch that explicitly forces coverage to version 5.0.1 and see what happens. Thanks again for all your help so far! I really hope you’re right and the problem is that easy!
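Comparing two installed-package listings by eye is error-prone, so here is a minimal sketch of how the passing and failing builds could be diffed automatically. It assumes both listings are in `pip freeze` format; the helper names `parse_freeze` and `changed_pins` are hypothetical, and the pytest-cov version in the sample data is only a placeholder (the coverage versions are the ones from the screenshots).

```python
def parse_freeze(text):
    """Turn `pip freeze`-style text into a {package: version} dict."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a plain name==version pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def changed_pins(old_text, new_text):
    """Return {package: (old_version, new_version)} for every pin that differs."""
    old, new = parse_freeze(old_text), parse_freeze(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


# Sample listings: coverage versions are from the thread,
# the pytest-cov version is a made-up placeholder.
passing_build = "coverage==5.0.1\npytest-cov==2.8.1\n"
failing_build = "coverage==5.0.2\npytest-cov==2.8.1\n"

diff = changed_pins(passing_build, failing_build)
print(diff)
```

In practice the two inputs would be the package listings copied from the passing and failing job logs.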

Edit: Opened up this ticket here with coverage

@BanzaiMan: Just out of curiosity, to help with the open github ticket with coverage, are the containers you guys host internally available to us? The developer isn’t able to reproduce the issue. Alternatively, can you stick an strace on the call?

You can use https://hub.docker.com/repository/docker/travisci/ci-sardonyx. I’ve reproduced the issue with travisci/ci-sardonyx:packer-1542104228-d128723 in particular. (This image may not be exactly the same as one that is in use, but should be similar enough to troubleshoot.)

cid=$(docker run -dti --privileged=true --entrypoint=/sbin/init -v /sys/fs/cgroup:/sys/fs/cgroup:ro travisci/ci-sardonyx:packer-1542104228-d128723)
docker exec -it $cid /bin/bash

You’d be root. Switch to travis (su - travis) and run the essential commands:

git clone --depth=50 --branch=python38-ci-testing https://github.com/caronc/apprise-api.git caronc/apprise-api
cd caronc/apprise-api/
export TOXENV=py36
source ~/virtualenv/python3.6/bin/activate
python --version
pip --version
pip install codecov
pip install -r dev-requirements.txt
pip install -r requirements.txt
tox

The developer ended up determining that it’s not the specific version of his software that is doing it. It’s tied to pytest-cov instead, which is the same version in all of my past passing builds and the new failing ones. Forcing coverage==5.0.1 proved to be just a red herring after all.

He also observed that it is tied to an alpha version of SQLite (5.0a2) that is installed. This hints that the container has changed between then and now. Is there any way we can roll back from the alpha build and use the previous stable version instead?

Perhaps you have some suggestions? :slight_smile:

Without seeing the underlying error, it’s impossible to say what exactly is wrong.

If you build Python from source with https://github.com/python/cpython/pull/1108 applied, you should be able to see it.

Alternatively, strace should show the OS-level error (but you’ll need to filter it out from tons of system calls and, due to the trace size, will probably have to upload it somewhere for examination).
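To cut the trace down to something reviewable, one option is to keep only the system calls that failed. The sketch below assumes the trace was captured to a file with something like `strace -f -o trace.log tox` and that each line follows strace's usual `syscall(args) = -1 ERRNO (message)` layout; the helper `failed_calls` is hypothetical, and the sample lines are invented for illustration, not taken from the failing build.

```python
import re

# Optional PID prefix (as written by `strace -f -o FILE`), then
# syscall(args) = -1 ERRNO ...
FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(.*\)\s+=\s+-1\s+(E[A-Z0-9]+)"
)


def failed_calls(lines, errnos=None):
    """Return (syscall, errno) pairs for calls that returned -1.

    If `errnos` is given, keep only failures whose errno name is in it.
    """
    results = []
    for line in lines:
        match = FAILED_CALL.match(line)
        if not match:
            continue
        syscall, errno_name = match.groups()
        if errnos is None or errno_name in errnos:
            results.append((syscall, errno_name))
    return results


# Invented sample lines, only to show the expected input format.
sample = [
    '1234 openat(AT_FDCWD, "/tmp/missing", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '1234 write(3, "data", 4) = 4',
    '1235 fsync(5) = -1 EIO (Input/output error)',
]

failures = failed_calls(sample)
print(failures)
```

For a real trace, pass `open("trace.log")` instead of `sample`; many failures (such as ENOENT during module lookup) are routine noise, so it may help to pass an `errnos` set to narrow the list.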

SO questions with this error suggest that this can happen if you call commit() in a tight loop (should rather call it outside the loop) or if something else is using the .db-journal file at the same time (unlikely in Travis, but auditing should show this).
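As an illustration of the first suggestion, here is a minimal sketch of committing once after the loop rather than on every iteration. The table and the `insert_rows` helper are hypothetical, it uses an in-memory database so it runs anywhere, and nothing here is confirmed to be the cause of the failure in this thread.

```python
import sqlite3


def insert_rows(conn, rows):
    """Insert every row inside one transaction and commit a single time."""
    cur = conn.cursor()
    for name, value in rows:
        cur.execute(
            "INSERT INTO samples (name, value) VALUES (?, ?)", (name, value)
        )
        # No conn.commit() here: committing per row forces a journal
        # write for every iteration.
    conn.commit()  # One commit after the loop.


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (name TEXT, value INTEGER)")
insert_rows(conn, [("a", 1), ("b", 2), ("c", 3)])

count = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
print(count)
```

With an on-disk database the same pattern applies; only the argument to `sqlite3.connect` changes.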

I had been searching for this error for almost an hour. Thanks for the comment. I tried it 2-3 times, and it worked fine the 3rd time. Thank you!