Two days ago we enabled autoscaling for our production service (service name is pyvott-api-prod), with target CPU utilization and target memory utilization both set to 60%.
For about 36 hours the service ran on a single instance, but last night at 07:48:38 PM MDT that instance's CPU spiked to 83.2%, so the service scaled up to 2 instances.
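For what it's worth, the scale-up itself looks expected to us. Here is a rough sketch of the arithmetic, assuming the autoscaler uses a proportional rule of the form ceil(current_instances * current_utilization / target_utilization). That rule and the function name `desired_instances` are our assumptions for illustration, not something we have confirmed against Render's implementation.

```python
import math


def desired_instances(current_instances: int,
                      current_utilization: float,
                      target_utilization: float) -> int:
    """Hypothetical proportional scaling rule (an assumption, not Render's confirmed logic).

    Scales the instance count by the ratio of observed to target utilization
    and rounds up to the next whole instance.
    """
    return math.ceil(current_instances * current_utilization / target_utilization)


# Our numbers: 1 instance, 83.2% observed CPU, 60% target CPU.
print(desired_instances(1, 83.2, 60.0))
```

Under that assumption, 1 * 83.2 / 60 is roughly 1.39, which rounds up to 2 instances, matching what we observed. So the question is less about why the service scaled and more about why requests failed once the second instance was running.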
After the scale-up, however, most requests to our service began timing out. Occasionally a request would succeed, but from our side the app was effectively down. Strangely, nothing in our logs indicated any problems, and our health check kept returning 200 OK.
Since we disabled autoscaling about 30 minutes ago, all requests have been succeeding again. Can anyone from Render investigate this issue and help us understand what went wrong?