What's next after multiple outages in 2024?

Brendan · March 26, 2024, 5:52pm

We are hosting in the Oregon region and today’s outage is the third major outage we’ve observed in 2024. It appears that this one is the second involving upstream issues with Cloudflare.

I have been a big fan and advocate of Render for the past year but we are losing patience quickly. What, if anything, will be done to prevent these moving forward?

Paul_Tannenbaum1 · March 26, 2024, 6:01pm

Start looking at a new hosting provider. We are at least.

We have enjoyed Render's pricing, but the outages have cost us more than if we had just gone with a more stable provider (Heroku comes to mind). At the very least, we are looking into having some kind of redundancy in place so we can switch over to a different provider if this happens again.

Maz_Dk · March 26, 2024, 6:09pm

Same for us. We are moving to Heroku, which seems to be a better option.

jansedlon · March 26, 2024, 6:10pm

Does Heroku offer zero-downtime deployment and automatic deployments?
PS: Aaah… Heroku seems extremely pricey

John_B · March 26, 2024, 6:13pm

Folks, we’re working as fast as we can to get everyone back up - we can’t apologize enough.

This incident is around services that make use of disks. Typically, this impacts our Postgres and Redis the most.

Given the size and impact of this, a thorough post mortem will be done and the findings shared publicly.

John B
Render Support, UTC

jansedlon · March 26, 2024, 6:18pm

John, I completely understand. I love Render by heart and we (at least me) don’t blame you for this. These things happen. I joined this discussion not to leave Render but to have some kind of backup. I have not found a better DX-focused service than Render. There’s certainly a difference between talking to each other as people and as someone who needs to protect their app and business.

pokeminers · March 26, 2024, 6:28pm

I look forward to the post-mortem and appreciate the transparency and communication. I love Render for its features, DX, and how it makes dev ops so easy, but the reliability has been concerning me more lately, especially as I am a few months away from another product launch and ideally I would like to stay with Render. Thank you for the update and looking forward to hearing more.

John_B · March 27, 2024, 11:41am

You will absolutely get that.

Personally, I’ve been here for nearly 3 years, and this is the first incident of this magnitude I’ve seen - we have regional incidents but nothing like this, which was across all our regions and was on us to resolve. The timing of the Cloudflare incident was coincidental and not the cause here. The initial recovery of stateless services (i.e., services without disks) was fast. However, services that made use of disks, whether via mounted disks, managed Postgres, or managed Redis, were the most impacted and took the longest to recover.

John B
Render Support, UTC

Paul_Tannenbaum1 · April 12, 2024, 7:47pm

Was a post mortem ever created for the outage? Looking around but have not seen it anywhere.

John_B · April 12, 2024, 8:09pm

Yes, it is posted on the original status incident.

system · May 12, 2024, 8:09pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
