In my app.py I have a function that uses a pretrained PyTorch model to generate keywords:
@app.route('/get_keywords')
def get_keywords():
    generated_keywords = ml_controller.generate_keywords()
    return jsonify(keywords=generated_keywords)
and in ml_controller.py I have:
def generate_keywords():
    model = load_keywords_model()
    output = model.generate()
    return output
This works fine: calls to /get_keywords correctly return the generated keywords. However, this approach is slow because the model is loaded on every call. So I tried to load the model only once by moving it outside the function:
model = load_keywords_model()

def generate_keywords():
    output = model.generate()
    return output
But now all calls to /get_keywords time out when I deploy the app to Render.com (locally it still works). Strangely, the problem is not that the model fails to load: when I write
model = load_keywords_model()
testOutput = model.generate()
print(testOutput)

def generate_keywords():
    output = model.generate()
    return output
a bunch of keywords is printed when Gunicorn boots. Also, all other endpoints that don't call ml_controller.generate_keywords() work without problems.
For testing purposes, I also added a dummy function to ml_controller.py, which I can call without problems:
def dummy_string():
    return "dummy string"
Based on answers to similar problems I found, I'm starting Gunicorn with:
gunicorn app:app --timeout 740 --preload --log-level debug
and in app.py I'm using:
if __name__ == '__main__':
    app.run(debug=False, threaded=False)
However, the problem still persists.