Python 3.6 UnicodeDecodeError: Invalid continuation byte - Ubuntu Trusty

My System:

Ubuntu 14.04
Python 3.6.7

.travis.yml

language: python
python: '3.6'
cache: pip
before_install:
- openssl aes-256-cbc -K $encrypted_721a23b33185_key -iv $encrypted_721a23b33185_iv
  -in keyfile.enc -out keyfile -d
install:
- pip install -r requirements.txt
- sudo bash install_gitcrypt.sh

jobs:
  include:
  - stage: test
    name: Tests
    script: python manage.py test

before_deploy: git-crypt unlock keyfile
deploy:
  provider: script
  script: "./zappa_deploy.sh"
  on:
    all_branches: true
    condition: $TRAVIS_BRANCH =~ ^develop|master|travis-test$

Error

$ python manage.py test
Traceback (most recent call last):
  File "manage.py", line 22, in <module>
...
File "/home/travis/build/Tarteel-io/tarteel-api/tarteel/settings.py", line 18, in <module>
    env.read_env(str(ROOT.path('tarteel/.env')))
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/environ/environ.py", line 635, in read_env
    content = f.read()
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xee in position 15: invalid continuation byte

Which can be seen in this build.

Description

I’m running into an issue with the Django test suite for my repo that I can only reproduce in the live build on Travis CI: python manage.py test fails in the test stage of the jobs matrix.

This issue is similar to this GitHub issue. I am able to run python manage.py test on my local machine in a virtualenv, and also inside the travis-ci-garnet-trusty-1512502259-986baf0 Docker image, as suggested by the docs.

I also confirmed with the isutf8 tool that the .env file the error points to is valid UTF-8.

Any suggestions on how to proceed?

https://github.com/Tarteel-io/tarteel-api/blob/8c26a67b517136b4ae449e9d2167b44c5b620730/tarteel/.env:

GITCRYPT JпѓѓоЬ>6юў¤7щ? ЙaѓbHжбЮ¬ьџФ¬—ЋњЉg-‘(Џ «М6E‘рDђ¤z 7V„;ЭUHБ#єа^IN?ОНтєб
‚ї‰вЬx”а"’™ Qй Я‰ •'ЄЧf`ѕ СџЧјl]ЦЧЬ±Z=Ѕ“зz>уO№іЏ}О€Ь¬ї‹И¤Є‚Ю№EаЊEювWкnжР¬*‡1ф$*µ˜µр©ьBУ—Є§“ђ˜Щ:п6:µ˜ќщЇa5с4’uЪєжq)2ќйhFD ‡
(etc)

This doesn’t look like valid UTF-8 to me. It looks like this file is encrypted, so you need to decrypt it before feeding it to your program.
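
One way to catch this early is to look at the first bytes of the file before parsing it. The sketch below assumes that git-crypt marks encrypted files with the 10-byte header NUL + "GITCRYPT" + NUL, which matches the dump above but is worth checking against your git-crypt version. The function name looks_git_crypted is hypothetical.

```python
# Assumed git-crypt header: NUL, "GITCRYPT", NUL (an assumption, not
# taken from the thread; verify against your git-crypt version).
GITCRYPT_MAGIC = b"\x00GITCRYPT\x00"


def looks_git_crypted(path):
    """Return True if the file starts with the assumed git-crypt header."""
    # Open in binary mode so no decoding is attempted.
    with open(path, "rb") as f:
        return f.read(len(GITCRYPT_MAGIC)) == GITCRYPT_MAGIC
```

Calling something like this in settings.py before env.read_env(...) and raising a clear error ("file is still encrypted, run git-crypt unlock") would probably have made the failure easier to diagnose than a UnicodeDecodeError.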

I guess this is why we need pair programming… Thanks for pointing that out! I changed my travis build config so it now looks like this:

language: python
python: '3.6'
cache: pip
before_install:
- openssl aes-256-cbc -K $encrypted_721a23b33185_key -iv $encrypted_721a23b33185_iv
  -in keyfile.enc -out keyfile -d
install:
- pip install -r requirements.txt
- sudo bash install_gitcrypt.sh
script:
  - git-crypt unlock keyfile
  - python manage.py test

deploy:
  provider: script
  script: "./zappa_deploy.sh"
  on:
    all_branches: true
    condition: $TRAVIS_BRANCH =~ ^develop|master$

The decryption now happens before the Python test suite runs.
Thanks @native-api!

In Python 3, str holds Unicode text and bytes holds raw data. Reading a file in text mode decodes its bytes into a str, here using UTF-8. The decode fails when it meets a byte sequence that UTF-8 does not allow, in this case 0xee at position 15: 0xee opens a three-byte sequence, and the byte after it is not a valid continuation byte. Calling encode() on a str does not avoid this error, because the failure happens while decoding bytes, before any str exists. If the bytes really are text in another encoding, decode them with that encoding instead (if raw is the bytes object):

raw.decode('iso-8859-1')

Or

Pass errors='replace' to decode() to substitute U+FFFD for undecodable bytes. ISO-8859-1 maps every byte to a character, so it never raises, but for an encrypted file like the one above both approaches only hide the problem. The file has to be decrypted first.
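
A minimal sketch of how Python reacts to such bytes, using a made-up byte string rather than the real file: strict UTF-8 decoding raises, errors='replace' substitutes U+FFFD, and ISO-8859-1 accepts every byte. None of these recovers the content of an encrypted file; they only change how the failure shows up.

```python
# Hypothetical sample: ASCII text followed by 0xee and a byte that is
# not a valid UTF-8 continuation byte.
raw = b"KEY=" + bytes([0xEE, 0x41])


def try_utf8(data):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        return None, exc


text, err = try_utf8(raw)
# err.start is the offset of the offending byte, err.reason says why.

# Substitute U+FFFD for anything that cannot be decoded.
replaced = raw.decode("utf-8", errors="replace")

# ISO-8859-1 maps each byte to one character, so this never raises,
# but the result is only meaningful if the data really is Latin-1 text.
latin1 = raw.decode("iso-8859-1")
```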