Downloading data to disk on server start

downard-levity · September 12, 2023, 12:22pm

I’ve been enjoying using Render for my projects, but I’ve run into a bit of a challenge, and I’m hoping some of you might have faced something similar or can offer some guidance.

Here’s the situation:

1. I have a dataset that’s around 2-5GB in size that I need to download to disk just once.
2. When I try to fetch this at server startup, I’m running into timeouts, which is causing my service to fail to start properly.
3. Currently, the workaround is to attach a disk, modify my server code, SSH into the Render instance, manually download the data, and then revert my code back to its standard operations. It’s a cumbersome process, and I’d rather not go through the back-and-forth each time I need to set things up.

al_ps · September 12, 2023, 3:29pm

Hi,

The complication here is that a disk can only be attached to one instance, so the download currently has to take place on the runtime instance, as you mentioned.

If the download only has to be done once, does your workaround in point 3 suffice, or are you intending to set this project up often? Maybe your code could be updated to be more resilient to a missing dataset?

However, if you do need something more reproducible, then maybe you can key this off an environment variable, e.g. DOWNLOAD_DATASET. If that is present (or set to “true”, etc.), then your code reacts to it, e.g. by not looking for data that isn’t yet present and kicking off the download instead.
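A minimal sketch of that check, in Python. The DOWNLOAD_DATASET name comes from the suggestion above; the accepted truthy values, the function names, and the three mode names are assumptions made up for illustration, not anything Render defines.

```python
import os
from pathlib import Path

# Assumption: which values count as "on" is an arbitrary choice for this sketch.
TRUTHY = {"1", "true", "yes", "on"}


def download_requested(env=os.environ) -> bool:
    """Return True when DOWNLOAD_DATASET is set to a truthy value."""
    return env.get("DOWNLOAD_DATASET", "").strip().lower() in TRUTHY


def startup_mode(dataset_path: Path, env=os.environ) -> str:
    """Decide what the service should do at startup.

    "serve"    : the dataset is already on disk, start normally
    "download" : the dataset is missing and the env var asks for a download
    "degraded" : the dataset is missing and no download was requested
    """
    if dataset_path.exists():
        return "serve"
    if download_requested(env):
        return "download"
    return "degraded"
```

Checking for the file before the flag means a leftover DOWNLOAD_DATASET=true does not trigger a second download once the data is on the disk. That ordering is a design choice of this sketch; your code may want the flag to win instead.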

I’ve not tried this, but if you want to get super-fancy, it feels feasible to create a shell script to use as the Start Command. It could detect the env var and start a basic HTTP server with a holding/maintenance (“Downloading…”) page, so the deploy succeeds before the timeout, while also kicking off the download process. Once the download is complete, the script could then use the Render API to update the env var and trigger a new deploy, so the service comes back up with the dataset present.
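The "kick off the download" part could look roughly like this, written in Python rather than shell. It is an untested-on-Render sketch: the function names, the ".part" suffix, and the chunk size are all assumptions. The file is streamed to a temporary name and renamed only when complete, so a half-finished download is never mistaken for the real dataset.

```python
import os
import shutil
import threading
import urllib.request
from pathlib import Path


def download_to_disk(url: str, dest: Path, chunk_size: int = 1024 * 1024) -> None:
    """Stream url to dest without holding the whole file in memory."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")  # hypothetical temp-file naming
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    # Rename only after the full body was written.
    os.replace(tmp, dest)


def start_background_download(url: str, dest: Path) -> threading.Event:
    """Start the download in a daemon thread and return an Event.

    The Event is set once the file is in place. If the download raises,
    the Event is never set, so callers should wait with a timeout.
    """
    done = threading.Event()

    def worker() -> None:
        download_to_disk(url, dest)
        done.set()

    threading.Thread(target=worker, daemon=True).start()
    return done
```

The returned Event is what the holding page can check to decide whether to keep showing “Downloading…”. The follow-up step of calling the Render API to flip the env var and redeploy is left out here, since I have not verified the exact calls.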

Alan

downard-levity · September 12, 2023, 5:08pm

Thanks Alan! Is there a way to increase the timeout on the server so that it doesn’t time out during the download process?

Jason-Render · September 12, 2023, 8:52pm

You need to implement a “please wait” step into your application, such that the service itself starts up enough for Render to report it is ok, but the application itself is not fully accessible to users yet.

Like the developer/publisher/etc. screens when you boot a game? Those mask an immense amount of loading under the hood (usually). Get creative with the mechanism to ask users to wait because your service just started. The service is up enough to receive and respond to requests even if it can’t do all the exciting stuff because it hasn’t downloaded the prerequisite data yet.
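One way to sketch such a “please wait” step with only the Python standard library is below. The "/healthz" path, the response texts, and the 30-second Retry-After value are assumptions for illustration; the `ready` Event is expected to be set by whatever code finishes the download.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(ready: threading.Event):
    """Build a handler class that answers differently before and after `ready`."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path == "/healthz":
                # Hypothetical health path: always OK, so the deploy can succeed.
                self._reply(200, "ok")
            elif ready.is_set():
                self._reply(200, "dataset ready")
            else:
                self._reply(503, "Downloading dataset, please wait...", retry_after=30)

        def _reply(self, status, text, retry_after=None):
            body = text.encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            if retry_after is not None:
                self.send_header("Retry-After", str(retry_after))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet; real code would log requests

    return Handler


def make_server(ready: threading.Event, host: str, port: int) -> ThreadingHTTPServer:
    """Create (but do not start) the server; call serve_forever() on the result."""
    return ThreadingHTTPServer((host, port), make_handler(ready))
```

To use it, create the Event, start the download in the background, then run something like `make_server(ready, "0.0.0.0", port).serve_forever()` with the port your service is configured to listen on. In a real app the "dataset ready" branch would hand off to your normal request handling.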

downard-levity · September 13, 2023, 1:01am

This is great. Thank you, Jason. This will decrease the deploy overhead significantly.

system · October 13, 2023, 1:01am

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
