I had a server that kept failing a few seconds after starting (and after passing the health check), so it got stuck in an endless loop of rollbacks that never worked. What is the solution for this case?
Once your healthcheck endpoint shows success, we consider the deploy to be successful and will terminate your previous deploy. If your new deploy fails later, we will not roll back to an old deploy and will instead just restart the existing deploy, as you’re seeing.
If your application has a more complex startup process that can fail within the first couple of seconds of starting, I’d recommend adding a custom healthcheck endpoint (like /health) that will wait to return a 200 status code until your application has verified that it has fully started, is ready to start serving traffic, and won’t restart before being able to serve.
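As a rough illustration of that idea, here is a minimal sketch using only the Python standard library. The names `ready`, `health_status`, and `run_startup_checks`, as well as port 8080, are assumptions made up for this example and are not part of any platform API. The endpoint answers 503 until the startup work has finished, and only then switches to 200.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Set only after every startup step has completed successfully.
ready = threading.Event()


def health_status() -> int:
    """Return 200 once startup has finished, 503 until then."""
    return 200 if ready.is_set() else 503


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /health is handled in this sketch; everything else is 404.
        status = health_status() if self.path == "/health" else 404
        self.send_response(status)
        self.end_headers()


def run_startup_checks() -> None:
    """Hypothetical placeholder for real startup work
    (connecting to the database, loading config, warming caches)."""
    # If anything above this line raises, `ready` is never set
    # and /health keeps returning 503.
    ready.set()


def main() -> None:
    # Port 8080 is an assumption; use whatever port your service listens on.
    server = ThreadingHTTPServer(("0.0.0.0", 8080), Handler)
    # Run startup in the background so /health can answer 503 in the meantime.
    threading.Thread(target=run_startup_checks, daemon=True).start()
    server.serve_forever()


if __name__ == "__main__":
    main()
```

With something like this in place, a startup step that crashes leaves /health at 503, so the healthcheck should never report success for a deploy that cannot actually serve traffic.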