� characters in Travis console log


#1

I am comparing two multiline JavaScript strings that should be equal. However, on Travis one of the strings shows up with multiple replacement characters (�).

I’m guessing it might have something to do with UTF-8 encoding. Table characters like https://graphemica.com/┘ take 3 bytes in UTF-8, whereas plain ASCII characters take only 1 byte (8 bits).
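As a quick check of that guess, here is a minimal Node.js sketch that counts the UTF-8 bytes in a few characters. It relies only on the built-in `Buffer.byteLength`; the helper name `utf8Length` is made up for illustration.

```javascript
// Count how many bytes a string occupies when encoded as UTF-8.
// `utf8Length` is a hypothetical helper name, not part of any library.
function utf8Length(text) {
  return Buffer.byteLength(text, 'utf8');
}

console.log(utf8Length('a')); // ASCII letter: 1
console.log(utf8Length('┘')); // box-drawing character U+2518: 3
console.log(utf8Length('✓')); // check mark U+2713: 3
```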

Here is an example of what the string looks like:

    Listing open pull requests on protoEvangelion/gh
    protoEvangelion/gh
    master (1)
    ┌─────────┬───────────────────────────────────────────────┬─────────────────────┬───────────────┬────────┐
    │    #    │ Details                                       │ Author              │ Opened        │ Status │
    ├─────────┼───────────────────────────────────────────────┼─────────────────────┼───────────────┼────────┤
    │   #55   │ Title:                                        │ @protoEvangelion    │ a year ago    │   ✓    │
    │         │                                               │                     │               │        │
    │         │ pr title                                      │                     │               │        │
    │         │                                               │                     │               │        │
    │         │ Body:                                         │                     │               │        │
    │         │                                               │                     │               │        │
    │         │ pr description                                │                     │               │        │
    │         │                                               │                     │               │        │
    │         │ https://github.com/protoEvangelion/gh/pull/55 │                     │               │        │
    └─────────┴───────────────────────────────────────────────┴─────────────────────┴───────────────┴────────┘

On my local Mac the tests pass. I have spent days trying everything I could think of, but have not been able to solve it.

This issue was also reported in 2016 and is still occurring: https://github.com/travis-ci/travis-ci/issues/7024

Here is an example job where you can see a failing test whose output includes the replacement character: https://travis-ci.org/node-gh/gh/jobs/486625312


#2

Hi Ryan, thanks for re-raising this issue. Those � characters are still disturbing our test run trees: https://travis-ci.org/junit-team/junit5/jobs/486539761#L819

Hoping that some Go expert will take a look at: https://github.com/travis-ci/worker/blob/master/amqp_log_writer.go … as I think some flush() is happening within the boundary of encoded UTF-8 chars that span more than a single byte.
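To illustrate the suspected mechanism (not the worker's actual code), here is a small Node.js sketch. It cuts a UTF-8 byte stream in the middle of a 3-byte character and decodes each piece on its own. The split position is an arbitrary assumption chosen to land inside a character.

```javascript
// Encode a run of 3-byte characters, then cut the byte stream in the
// middle of one character, as an ill-timed flush might.
const bytes = Buffer.from('✔✔✔', 'utf8'); // 9 bytes, 3 per character

// Hypothetical split point: 4 bytes in, i.e. one byte into the second ✔.
const first = bytes.subarray(0, 4);
const second = bytes.subarray(4);

// Decoding each chunk on its own loses the character that was cut in half
// and yields U+FFFD replacement characters in its place.
const decodedSeparately = first.toString('utf8') + second.toString('utf8');

// Decoding the rejoined bytes gives the original text back.
const decodedTogether = Buffer.concat([first, second]).toString('utf8');

console.log(decodedSeparately);
console.log(decodedTogether);
```

The bytes themselves are never lost here; only the decoding step goes wrong when it is applied to a chunk that ends partway through a character.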

/cc @BanzaiMan


#3

If the flush() call is chopping the multibyte characters into unrecognizable chunks, I am afraid there is not very much we can do here. Our logs are streamed over an SSH connection, and if that can’t handle such streams, we are unable to capture the original character.


#4

So I think the issue is happening with certain characters that I am using to build a table. Here is one such character which is 3 bytes: https://graphemica.com/┘

This is a problem for me because I have a cli tool that outputs a table of information. My tests are asserting that the table is formatted properly. But nothing I am doing is getting the test to pass.

Any ideas or workarounds or hacks?


#5

Perhaps try to flush() after \n characters?
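A rough sketch of that idea in Node.js, assuming the log arrives as arbitrary byte chunks: hold bytes back until a newline is seen and only release whole lines. The newline byte 0x0A never occurs inside a multi-byte UTF-8 sequence, so a chunk that ends on it always ends on a character boundary. `LineFlusher` is a hypothetical name, and this is not how the Travis worker is written.

```javascript
// Collect incoming byte chunks and only release whole lines.
// `LineFlusher` is a hypothetical name used for this sketch.
class LineFlusher {
  constructor() {
    this.pending = Buffer.alloc(0);
  }

  // Add a chunk; return the bytes that are safe to flush (possibly empty).
  push(chunk) {
    this.pending = Buffer.concat([this.pending, chunk]);
    const lastNewline = this.pending.lastIndexOf(0x0a);
    if (lastNewline === -1) {
      return Buffer.alloc(0); // no complete line yet, keep buffering
    }
    const ready = this.pending.subarray(0, lastNewline + 1);
    this.pending = this.pending.subarray(lastNewline + 1);
    return ready;
  }
}

// Usage: the first chunk ends one byte into the ✓ character.
const flusher = new LineFlusher();
const line = Buffer.from('│ ✓ │\n', 'utf8');
const firstFlush = flusher.push(line.subarray(0, 5)); // nothing released yet
const secondFlush = flusher.push(line.subarray(5));   // whole line released

console.log(firstFlush.length);
console.log(secondFlush.toString('utf8'));
```

The trade-off is latency: output that does not end in a newline (a progress spinner, for example) stays buffered until the next newline arrives.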


#6

At the same time, if flush() is truly to blame, then I expect this to happen on your local machine, too, with some frequency. There might be something else at play, but I currently have no idea what that might be.


#7

I have tried about a hundred different combinations to get the characters to render correctly. They always work locally but never on the Linux or Mac Travis VMs.


#8

The problem can’t have anything to do with the way Travis processes logs because that’s not what the test sees. You need to inspect the values that the tests see, and track them back along the code logic to get to the source of the discrepancy. Running Build in Debug Mode and using an interactive debugger should help here. (Without that, you’re stuck with debug printing.)

E.g. in the original message, judging by https://travis-ci.org/node-gh/gh/jobs/486625312#L597 , the problem has nothing to do with replacement characters and is rather in the “pr description” line – likely some whitespace character (so you need to see the hex to make sense of it).
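One way to "see the hex" from inside a Node.js test is sketched below. It prints both strings with every character outside printable ASCII escaped, so invisible differences stand out. The helper name `reveal` and the sample strings are made up for illustration.

```javascript
// Render a string so that invisible characters become visible escapes.
// `reveal` is a hypothetical helper name.
function reveal(text) {
  return Array.from(text)
    .map((ch) => {
      const code = ch.codePointAt(0);
      // Keep printable ASCII as-is; escape everything else as \u{hex}.
      if (code >= 0x20 && code <= 0x7e) return ch;
      return `\\u{${code.toString(16)}}`;
    })
    .join('');
}

const expected = 'pr description';
const actual = 'pr\u00a0description'; // made-up value with a no-break space

console.log(reveal(expected)); // pr description
console.log(reveal(actual));   // pr\u{a0}description
```

Note that this also escapes legitimate non-ASCII characters such as the table borders, which is usually what you want when hunting for a byte-level difference.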


#9

Thanks @native-api. The problem was that the output contained hidden ANSI escape codes. I now run the output through a function that strips those codes.
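For anyone with the same problem, a stripping function can look roughly like this. It is a simplified sketch, not necessarily the exact function used here: the pattern only covers common CSI sequences (ESC, `[`, optional digits and semicolons, one letter), and the name `stripAnsiCodes` is made up.

```javascript
// Remove ANSI CSI escape sequences such as colour codes ("\u001b[32m").
// Simplified pattern: ESC, "[", optional digits/semicolons, one letter.
// It does not attempt to cover every escape sequence a terminal accepts.
const ANSI_CSI = /\u001b\[[0-9;]*[A-Za-z]/g;

// `stripAnsiCodes` is a hypothetical helper name.
function stripAnsiCodes(text) {
  return text.replace(ANSI_CSI, '');
}

// Made-up sample: a green check mark followed by plain text.
const coloured = '\u001b[32m✓\u001b[39m pr description';
console.log(stripAnsiCodes(coloured) === '✓ pr description'); // true
```

Applying the same function to both the expected and the actual string before comparing keeps the assertion focused on the visible text.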

I assumed the problem was with the replacement character because it appeared to be the reason for the failure.

So you are right that the issue is not the replacement character itself. My tests pass now, so I am VERY happy. Something is still strange, though, because those ANSI codes are not present locally.


#11

Try this very simple bash “program” on Travis CI, @native-api

    #!/bin/bash

    for run in {1..1000}
    do
      echo " ▓ ✔✔✔✔✔✔✔✔✔✔ a b c d e f  a b c d e f ✔✔✔✔✔✔✔✔✔✔ ▒ "
    done

Scroll through the log and you’ll find replacement characters like in:

 ▓ ✔✔✔✔✔✔✔✔✔✔ a b c d e f  a b c d e f ✔✔✔✔✔✔✔✔✔✔ ▒ 
 ▓ ✔✔✔✔✔✔✔✔✔✔ a b c d e f  a b c d e f ✔✔✔✔✔✔✔✔✔✔ ▒ 
 ▓ ✔✔���✔✔✔✔✔✔✔ a b c d e f  a b c d e f ✔✔✔✔✔✔✔✔✔✔ ▒ 
 ▓ ✔✔✔✔✔✔✔✔✔✔ a b c d e f  a b c d e f ✔✔✔✔✔✔✔✔✔✔ ▒ 
 ▓ ✔✔✔✔✔✔✔✔✔✔ a b c d e f  a b c d e f ✔✔✔✔✔✔✔✔✔✔ ▒

Source: https://github.com/travis-ci/travis-ci/issues/7024#issue-195231635


#12

@sormuras These are Travis’ glitches (according to @BanzaiMan, these are incomplete characters, so the problem must be that Travis doesn’t wait to receive a complete character from the socket stream before writing it to the log). But they don’t affect the build in any way because this happens after the build has produced the output and fed it to Travis.
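For illustration only, this is what "waiting for a complete character" looks like in Node.js using the built-in `StringDecoder`, which holds back a trailing partial character until the rest arrives. The chunk boundary is an assumption chosen to fall inside a character; Travis's own log pipeline is written in Go and may work quite differently.

```javascript
const { StringDecoder } = require('string_decoder');

const decoder = new StringDecoder('utf8');
const bytes = Buffer.from('▓ ✔ ▒', 'utf8'); // 11 bytes in total

// Hypothetical chunk boundary that falls one byte into the ✔ character.
const chunks = [bytes.subarray(0, 5), bytes.subarray(5)];

let text = '';
for (const chunk of chunks) {
  // write() returns only complete characters; an incomplete trailing
  // sequence is kept inside the decoder until the next chunk.
  text += decoder.write(chunk);
}
text += decoder.end(); // emit anything still buffered at end of stream

console.log(text === '▓ ✔ ▒'); // true
```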


#13

I fully agree. They are just glitches disturbing the view of the log, a build does not fail because of those.


#14

Correct, but in my case it wasn’t just an issue of aesthetics. Because those replacement characters were in my failing test diff, they obscured the real reason for the failure.