Some percentage of requests are returning 502

Hello,

On my server https://dashboard.render.com/web/srv-bthsgsqpp1jog4onotp0 a small percentage of requests return a 502 error. We notice this especially on a page that polls every 10 seconds.

I don’t know what’s wrong!

Thanks,
Brandon

Hi @flybayer,

I took a look at your service, and the 502s that I’ve found are all “connection reset by peer” errors, which implies to me that either your process is terminating the connections or the process itself is restarting.

I noticed that once you deployed a new version of your service, the 502s stopped. Do you think there might have been an issue with the process that had been running before then?

Hmm, not that I know of. And I don’t know why that would be happening. :thinking:

@dan can you do some more digging? I’m unable to find any issues with my code. My code works perfectly locally, but on Render we are having this constant 502 problem.

I don’t know how to debug any further. I’m happy to provide any info that can help.

I’ve taken another look at the service, and in its event log I see a few “Server failed” events with the “Out of memory (used over 512MB)” reason. If you look at your service’s metrics page, you can see a memory spike corresponding with the most recent “out of memory” event.

When your service instance exceeds the memory limit of the service’s plan, it is terminated and restarted automatically. Since your service has 1 instance, it will be unavailable for a short period of time while it is restarted. I recommend one of two options:

  • Choose a larger plan for your service so it has enough available memory to avoid restarts
  • Add one or more additional instances so your service is still available when one of the instances is restarting
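To check whether memory is creeping toward the plan limit between restarts, the process can log its own usage. This is only a rough sketch assuming a Node.js process: `memoryReport`, `startMemoryLogging`, and the 80% warning threshold are hypothetical names and values, and the resident set size reported by Node may not match exactly what the platform measures.

```typescript
// Hypothetical helper: compares this process's resident set size (RSS)
// with a plan limit. 512MB is the limit quoted in the event log above.
const PLAN_LIMIT_BYTES = 512 * 1024 * 1024;

interface MemoryReport {
  rssBytes: number;
  limitBytes: number;
  fractionUsed: number;
  nearLimit: boolean;
}

function memoryReport(
  limitBytes: number = PLAN_LIMIT_BYTES,
  warnFraction: number = 0.8, // assumed threshold, tune as needed
): MemoryReport {
  const rssBytes = process.memoryUsage().rss;
  const fractionUsed = rssBytes / limitBytes;
  return { rssBytes, limitBytes, fractionUsed, nearLimit: fractionUsed >= warnFraction };
}

// Logs one line per interval; warns once usage crosses the threshold.
function startMemoryLogging(intervalMs: number = 30_000) {
  const timer = setInterval(() => {
    const r = memoryReport();
    const mb = (r.rssBytes / (1024 * 1024)).toFixed(1);
    const pct = (r.fractionUsed * 100).toFixed(0);
    const line = `memory rss=${mb}MB (${pct}% of limit)`;
    if (r.nearLimit) {
      console.warn(line);
    } else {
      console.log(line);
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}
```

Calling `startMemoryLogging()` once at startup would leave a trail in the service logs that can be lined up with the spikes on the metrics page.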

Yes, I understand that, but the 502 issue I’m having is not the restarts. The 502 error happens quite frequently while the app is running.

Got it, yeah, I see that the 502s are happening much more frequently than the restarts.

I just tried removing our routing layer and making HTTP requests to your process directly, and am seeing connection timeouts and 500s. Revisiting our routing layer’s logs, this matches the “connection reset by peer” errors I had mentioned above.

I’ve also taken a look at your service’s logs, and I do see TimeoutErrors, but I don’t know for certain that this is the same error that presents as a connection timeout when I make a request to your process.

I’m not certain what difference between your server running on Render and your local server would result in these timeouts, but I can narrow the timeouts down to your server. It may be valuable to add extra logging to your server to get additional information about each connection.
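As a rough illustration of the per-connection logging suggested above, here is a minimal sketch that assumes the server is a Node.js `http.Server` (or any `net.Server`); the thread does not confirm the stack at this point. `attachConnectionLogging` and the log format are hypothetical.

```typescript
import * as net from "node:net";

// Hypothetical helper: logs the lifecycle of every TCP connection the
// server accepts. http.Server extends net.Server, so it can be passed in.
function attachConnectionLogging(
  server: net.Server,
  log: (line: string) => void = console.log,
): void {
  let nextId = 1;
  server.on("connection", (socket) => {
    const id = nextId++;
    const openedAt = Date.now();
    log(`conn ${id} opened from ${socket.remoteAddress}:${socket.remotePort}`);

    // Fired when the socket has been idle for its configured timeout.
    socket.on("timeout", () => log(`conn ${id} idle timeout`));

    // Fired on socket-level failures such as ECONNRESET.
    socket.on("error", (err) => log(`conn ${id} error: ${err.message}`));

    // hadError is true when the close was caused by a transmission error.
    socket.on("close", (hadError) => {
      log(`conn ${id} closed after ${Date.now() - openedAt}ms hadError=${hadError}`);
    });
  });
}
```

Calling `attachConnectionLogging(server)` before `server.listen(...)` should show how long each connection lived and whether it ended on a timeout or an error, which can then be compared with the timestamps of the 502s.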

Just wanted to document for others encountering this issue.
We were encountering a similar issue with our Node.js server hosted on Render. What helped in our case was increasing server.keepAliveTimeout and server.headersTimeout to 120 seconds, as recommended by Render.
The idea is that the Render proxy reuses a connection to the server for up to 120 seconds, but the Node HTTP server closes a connection after 5 seconds of inactivity by default. Once those 5 seconds pass, the Node server closes the connection, but Render may still forward a request through it. That request fails with a 502, since Render sees the connection forcibly closed (with a TCP RST).
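For anyone wanting to try this, a minimal sketch of the change, assuming a plain Node.js `http` server. `createTunedServer` is a hypothetical name; the two properties are the ones named above, and both take milliseconds.

```typescript
import * as http from "node:http";

// 120 seconds, the value mentioned in this post, expressed in milliseconds.
const KEEP_ALIVE_MS = 120 * 1000;

// Hypothetical helper: creates an HTTP server whose idle keep-alive
// connections stay open for 120 seconds instead of Node's 5-second default.
function createTunedServer(handler: http.RequestListener): http.Server {
  const server = http.createServer(handler);
  server.keepAliveTimeout = KEEP_ALIVE_MS;
  server.headersTimeout = KEEP_ALIVE_MS;
  return server;
}
```

Frameworks that wrap the HTTP server usually hand back the same `http.Server` object from their `listen` call, so the two assignments can be applied there as well. Whether that holds for a given framework is an assumption worth checking.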

Hi Hannes,

It’s so helpful when someone shares what worked for them with the community like this.

Thanks for the contribution!

Regards,
Mike


Render Support Engineer, MT (UTC-7)