I’ve read that most commands are (under the hood) called with travis_retry. Could a similar strategy be applied to dpl?
If I restart the build, the deployment succeeds.
A 500 means a problem at the remote server. Thus retrying immediately isn’t going to solve it in the general case, only mitigate it in some select cases. So a retry won’t really be a solution that will let you rest easy.
A real solution would be something that retries after a significant delay, by which time whatever problem there was at the server might have been resolved.
This, however, creates a perverse incentive to retry regardless of the problem, even if it’s on your side, thus wasting CI’s resources.
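To make the distinction concrete, here is a rough sketch of a retry that waits between attempts and gives up immediately when the failure looks like it is on the client side. This is not how dpl or travis_retry work; `deploy_with_retry`, `attempt_deploy`, and `is_retryable` are hypothetical names, and it assumes the deploy step can report an HTTP status code.

```python
import time
from typing import Callable


def is_retryable(status: int) -> bool:
    """Only 5xx (server-side) responses are worth retrying.

    A 4xx response means the request itself was wrong, so repeating it
    would just burn CI time.
    """
    return 500 <= status <= 599


def deploy_with_retry(
    attempt_deploy: Callable[[], int],  # hypothetical: returns an HTTP status
    max_attempts: int = 3,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call attempt_deploy until it returns a non-5xx status or attempts run out.

    Returns the last status code seen.
    """
    status = attempt_deploy()
    for _ in range(max_attempts - 1):
        if not is_retryable(status):
            break
        # Give the remote server some time to recover before trying again.
        sleep(delay_seconds)
        status = attempt_deploy()
    return status
```

The `sleep` parameter is only there so the waiting can be swapped out in tests; the attempt cap and delay are arbitrary defaults, not recommendations.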
I believe the current situation is worse than automatic retrying: what I currently do is restart the whole build, which takes 20+ minutes of full CPU usage. That is not exactly a good way to save CI resources or shorten the time to delivery.
I believe a retry after a minute would mitigate many cases. Also, being able to restart a single phase of the build (possibly deploy only) would be great.
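As a rough illustration of the "retry after a minute" idea, a wrapper script could rerun just the deploy command instead of the whole build. This is only a sketch: `run_with_delayed_retry` is a hypothetical helper, not part of dpl or Travis, and because it only sees the exit code it cannot tell a server-side failure from a local one.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_delayed_retry(
    command: Sequence[str],
    retries: int = 1,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run command; on a non-zero exit code, wait and run it again.

    Returns the exit code of the last run.
    """
    code = subprocess.run(command).returncode
    for _ in range(retries):
        if code == 0:
            break
        # Wait before the next attempt rather than retrying immediately.
        sleep(delay_seconds)
        code = subprocess.run(command).returncode
    return code
```

With the defaults this costs at most one extra minute plus one extra deploy attempt, compared with 20+ minutes for a full rebuild.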