Some percentage of requests are returning 502

Hello,

On my server https://dashboard.render.com/web/srv-bthsgsqpp1jog4onotp0 a small percentage of requests return a 502 error. We notice this most on a page that polls every 10 seconds.

I don’t know what’s wrong!

Thanks,
Brandon

Hi @flybayer,

I took a look at your service, and the 502s that I’ve found are all “connection reset by peer” errors, which implies to me that your process is either terminating the connections or restarting.

I noticed that once you deployed a new version of your service, the 502s stopped. Do you think there might have been an issue with the process that had been running before then?

Hmm, not that I know of. And I don’t know why that would be happening. :thinking:

@dan can you do some more digging? I’m unable to find any issues with my code. It works perfectly locally, but on Render we are having this constant 502 problem.

I don’t know how to debug any further. I’m happy to provide any info that can help.

I’ve taken another look at the service, and in its event log I see a few “Server failed” events with the “Out of memory (used over 512MB)” reason. If you look at your service’s metrics page, you can see a memory spike corresponding with the most recent “out of memory” event.

When your service instance exceeds the memory limit of the service’s plan, it is terminated and restarted automatically. Since your service has 1 instance, it will be unavailable for a short period of time while it is restarted. I recommend one of two options:

  • Choose a larger plan for your service so it has enough available memory to avoid restarts
  • Add one or more additional instances so your service is still available when one of the instances is restarting
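
To see how close the process gets to the limit before a restart, it can help to log memory use from inside the process. The following is a minimal sketch, assuming a Python process on a Unix host (the thread doesn’t say what the service is written in). `LIMIT_BYTES`, `peak_rss_bytes`, and `memory_report` are hypothetical names, and the 512MB figure from the event log is assumed to mean 512 MiB.

```python
import resource
import sys

# Assumption: the "512MB" plan limit from the event log, read as 512 MiB.
LIMIT_BYTES = 512 * 1024 * 1024


def peak_rss_bytes() -> int:
    """Return the peak resident set size of this process (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


def memory_report(used_bytes: int, limit_bytes: int = LIMIT_BYTES,
                  warn_ratio: float = 0.8) -> str:
    """Format a one-line log message; flag usage at or above warn_ratio."""
    ratio = used_bytes / limit_bytes
    level = "WARN" if ratio >= warn_ratio else "OK"
    used_mb = used_bytes / 2**20
    limit_mb = limit_bytes / 2**20
    return f"{level} memory {used_mb:.0f}MB of {limit_mb:.0f}MB ({ratio:.0%})"


if __name__ == "__main__":
    print(memory_report(peak_rss_bytes()))
```

Logging this line periodically would show whether memory climbs steadily toward the limit or jumps on particular requests.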

Yes, I understand that, but the 502 issue I’m having is not the restarts. The 502 error happens quite frequently while the app is running.

Got it, yeah, I see that the 502s are happening much more frequently than the restarts.

I just tried bypassing our routing layer and making HTTP requests directly to your process, and I’m seeing connection timeouts and 500s. This matches the “connection reset by peer” errors in our routing layer’s logs that I mentioned above.
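
A similar check can be run from any client to measure how often requests fail. The following is a minimal sketch using only the Python standard library. `classify` and `probe` are hypothetical helper names, the URL in the usage line is a placeholder, and this is not the tooling used for the test described above.

```python
import collections
import socket
import time
import urllib.error
import urllib.request


def classify(url: str, timeout: float = 5.0) -> str:
    """Send one GET request and return a short label for the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return str(resp.status)
    except urllib.error.HTTPError as err:
        # Error responses such as 500 or 502 arrive as HTTPError.
        code = err.code
        err.close()
        return str(code)
    except OSError as err:
        # urllib wraps failures while sending the request in URLError;
        # unwrap to the underlying cause when one is attached.
        cause = getattr(err, "reason", err)
        if isinstance(cause, (socket.timeout, TimeoutError)):
            return "timeout"
        if isinstance(cause, ConnectionResetError):
            return "reset"
        return "error"


def probe(url: str, attempts: int = 20, interval: float = 0.0,
          timeout: float = 5.0) -> collections.Counter:
    """Request the URL repeatedly and tally the outcome labels."""
    outcomes = collections.Counter()
    for _ in range(attempts):
        outcomes[classify(url, timeout)] += 1
        time.sleep(interval)
    return outcomes


if __name__ == "__main__":
    # Placeholder URL; substitute the endpoint that the polling page calls.
    print(probe("http://localhost:3000/", attempts=50, interval=0.2))
```

Running it against both the local server and the deployed one gives comparable failure rates, which may help show whether the errors cluster around particular times or endpoints.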

I’ve also taken a look at your service’s logs and do see TimeoutErrors, but I can’t say for certain that they are the same error that shows up as a connection timeout when I make a request to your process.

I’m not certain what difference between your server running on Render and your local server would result in these timeouts, but I can narrow the source of the timeouts down to your server. It may be valuable to add extra logging to your server to get additional information about each connection.
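
As a rough sketch of what that extra logging could look like, here is a small middleware that records the method, path, status, and duration of every request. It assumes a Python WSGI server, which is only an illustration since the thread doesn’t say what the service is written in. `RequestLogMiddleware` and its `slow_seconds` threshold are hypothetical names.

```python
import logging
import time

logger = logging.getLogger("request_log")


class RequestLogMiddleware:
    """Wrap a WSGI app and log one line per request."""

    def __init__(self, app, slow_seconds: float = 1.0):
        self.app = app
        self.slow_seconds = slow_seconds

    def __call__(self, environ, start_response):
        started = time.monotonic()
        method = environ.get("REQUEST_METHOD", "?")
        path = environ.get("PATH_INFO", "?")
        seen = {"status": "no response started"}

        def recording_start_response(status, headers, exc_info=None):
            # Remember the status line so it can be logged afterwards.
            seen["status"] = status
            return start_response(status, headers, exc_info)

        try:
            return self.app(environ, recording_start_response)
        except Exception:
            logger.exception("unhandled error for %s %s", method, path)
            raise
        finally:
            # Measures time until the app returns its response iterable,
            # not the time spent streaming the body to the client.
            elapsed = time.monotonic() - started
            level = (logging.WARNING if elapsed >= self.slow_seconds
                     else logging.INFO)
            logger.log(level, "%s %s -> %s in %.3fs",
                       method, path, seen["status"], elapsed)
```

Requests that log “no response started” or an unusually long duration would be the first ones to compare against the timestamps of the 502s.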