Timeout when preloading PyTorch model using Flask on Render.com

In my app.py I have a function that uses a pretrained PyTorch model to generate keywords:

def get_keywords():
    generated_keywords = ml_controller.generate_keywords()
    return jsonify(keywords=generated_keywords)

and in ml_controller.py I have

def generate_keywords():
	model = load_keywords_model()
	output = model.generate()
	return output 

This works fine: calls to /get_keywords correctly return the generated keywords. However, it is quite slow because the model is loaded on every call. So I tried to load the model just once by moving the load outside my function:

model = load_keywords_model()

def generate_keywords():
	output = model.generate()
	return output 

But now all calls to /get_keywords time out when I deploy my app to Render.com (locally it works). Strangely, the problem is not that the model fails to load. When I write

model = load_keywords_model()
testOutput = model.generate()

def generate_keywords():
	output = model.generate()
	return output 

a bunch of keywords are generated when I boot gunicorn. Also, all other endpoints that don’t call ml_controller.generate_keywords() work without problems.

For testing purposes I also added a dummy function to ml_controller.py that I can call without problems

def dummy_string():
    return "dummy string"

Based on answers to similar problems I found, I’m starting Gunicorn with

gunicorn app:app --timeout 740 --preload --log-level debug

and in app.py I’m using

if __name__ == '__main__':
    app.run(debug=False, threaded=False)

However, the problem still persists.

Hi @Jakob_Greenfeld ,

If your other endpoints are working, then that does point to an issue internal to your app, and we’re unfortunately unable to give much guidance on that. I’d recommend seeing what additional logging you can set up, or log levels you can add, that let you trace what your application is doing. Can you add extra logging inside the model.generate() function?
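As a rough illustration of the kind of logging meant here, the sketch below wraps the model.generate() call with debug messages that record the process ID and the elapsed time. StubModel is a hypothetical stand-in so the snippet runs on its own; in the real ml_controller.py the model would come from load_keywords_model().

```python
import logging
import os
import time

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("ml_controller")


class StubModel:
    """Hypothetical stand-in for the real model (assumption, not from the app)."""

    def generate(self):
        return ["keyword-a", "keyword-b"]


model = StubModel()
# Logging the PID at import time shows which process loaded the model.
logger.debug("model loaded in pid=%s", os.getpid())


def generate_keywords():
    # If the "calling" line appears in the logs but the "returned" line never
    # does, the request is stuck inside model.generate() itself.
    logger.debug("pid=%s calling model.generate()", os.getpid())
    start = time.perf_counter()
    output = model.generate()
    elapsed = time.perf_counter() - start
    logger.debug("pid=%s model.generate() returned after %.3fs", os.getpid(), elapsed)
    return output
```

If the PID that loaded the model differs from the PID that handles the request, the model was loaded in one process and used in another.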

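Found the issue. PyTorch models appear to hang when Gunicorn is started with the --preload flag. With --preload, the application code (including the module-level load_keywords_model() call) is loaded before the worker processes are forked, and the model.generate() calls made in the workers then never return.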

Render silently adds this flag and doesn’t show it in the settings, which is why it took me days to figure this out. You can see all the settings Render adds by running printenv in the console.
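The same check can be done from Python, for example in a startup log line. This is only a sketch: the helper names gunicorn_env_args and preload_enabled are made up for illustration, and it assumes the flag arrives through the GUNICORN_CMD_ARGS environment variable.

```python
import os
import shlex


def gunicorn_env_args(environ=None):
    """Return the GUNICORN_CMD_ARGS value split into shell-style tokens."""
    environ = os.environ if environ is None else environ
    return shlex.split(environ.get("GUNICORN_CMD_ARGS", ""))


def preload_enabled(environ=None):
    """True if --preload is among the arguments passed via the environment."""
    return "--preload" in gunicorn_env_args(environ)


if __name__ == "__main__":
    print("GUNICORN_CMD_ARGS tokens:", gunicorn_env_args())
    print("--preload set via environment:", preload_enabled())
```

This only inspects the environment variable; flags given directly on the gunicorn command line are not visible this way.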

To resolve the issue, add a new environment variable

  GUNICORN_CMD_ARGS: '--access-logfile - --bind='

which overrides Render.com’s standard setting

  GUNICORN_CMD_ARGS: '--preload --access-logfile - --bind='
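A different approach, not the fix used above and not tested on Render, is to load the model lazily on the first request, so the load happens inside the worker process rather than at import time. This assumes the hang comes from loading the model before the workers are forked. The stub load_keywords_model below is a hypothetical stand-in for the real loader, and get_model is a name introduced just for this sketch.

```python
import functools


def load_keywords_model():
    """Hypothetical stand-in for the real loader in ml_controller.py."""

    class StubModel:
        def generate(self):
            return ["keyword-a", "keyword-b"]

    return StubModel()


@functools.lru_cache(maxsize=1)
def get_model():
    # The first call in a process loads the model; later calls reuse the
    # cached instance, so the load cost is paid once per worker.
    return load_keywords_model()


def generate_keywords():
    return get_model().generate()
```

The trade-off is that the first request handled by each worker pays the full load time, so a generous --timeout is still needed.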

I’m glad you were able to figure out the issue, and I’m sure it was frustrating to have to track this down. I looked into our docs on this, and see that we have a docs change in progress that documents these settings, so hopefully that will go live soon. In the meantime, I snagged a screenshot of the env vars we set for Python apps, in case this is helpful to you: