I’m running into a memory-related issue while transcribing audio files with Flask and the whisper_timestamped library (linto-ai/whisper-timestamped on GitHub: multilingual automatic speech recognition with word-level timestamps and confidence). I’ve deployed my app on Render and use a /transcribe endpoint to handle transcription requests. However, I consistently get the following error:
[ERROR] Worker (pid:67) was sent SIGKILL! Perhaps out of memory?
I’ve provided the relevant code snippet below:
from flask import Flask, request, jsonify
import whisper_timestamped
import tempfile
import os

app = Flask(__name__)


@app.route('/')
def hello():
    name = "Hello World"
    return name


def transcribe_audio(audio_file_path):
    model = whisper_timestamped.load_model("base")
    audio = whisper_timestamped.load_audio(audio_file_path)
    result = whisper_timestamped.transcribe(model, audio, language="en")
    transcribed_text = result["text"]
    word_timestamps = []
    for segment in result["segments"]:
        for word in segment["words"]:
            word_timestamps.append({"word": word["text"], "startTime": word["start"], "endTime": word["end"]})
    return {"transcribedText": transcribed_text, "wordTimestamps": word_timestamps}


@app.route('/transcribe', methods=['POST'])
def transcribe():
    if 'audio' not in request.files:
        return jsonify({'error': 'No audio file provided.'}), 400
    audio_file = request.files['audio']
    # Create a temporary directory to store the audio file
    temp_dir = tempfile.mkdtemp()
    # Save the audio file temporarily to the temporary directory
    temp_audio_path = os.path.join(temp_dir, "temp_audio.wav")
    audio_file.save(temp_audio_path)
    result = transcribe_audio(temp_audio_path)
    # Remove the temporary audio file and directory after transcription
    os.remove(temp_audio_path)
    os.rmdir(temp_dir)
    return jsonify(result), 200


if __name__ == "__main__":
    app.run(debug=True)
I’m using the whisper_timestamped library to transcribe audio and obtain word-level timestamps. The worker process handling the request is killed, apparently for running out of memory. I suspect that the library or the model is memory-intensive enough to push the app past its memory allocation.
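To check the suspicion about memory use, one option is to log the process's peak resident memory around each expensive step. The following is a minimal sketch, assuming a Linux or macOS host (the standard-library `resource` module is not available on Windows); `peak_rss_mb` and `measure_peak` are hypothetical helper names, not part of Flask or whisper_timestamped.

```python
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_peak(label, func, *args, **kwargs):
    """Call func and report how much the peak RSS grew during the call."""
    before = peak_rss_mb()
    result = func(*args, **kwargs)
    after = peak_rss_mb()
    print(f"{label}: peak RSS {before:.1f} -> {after:.1f} MiB")
    return result, after - before


if __name__ == "__main__":
    # Stand-in workload; replace with the real call being investigated.
    data, growth = measure_peak("allocate", lambda: [0] * 5_000_000)
    print(f"peak grew by {growth:.1f} MiB")
```

Wrapping the `load_model`, `load_audio`, and `transcribe` calls from the snippet above in `measure_peak` would show which step raises the peak, and the numbers can then be compared with the memory limit of the instance. Peak RSS only ever increases, so the growth figure says nothing about memory released afterwards.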
Can anyone offer suggestions on how to diagnose and resolve this memory-related error? Are there any strategies or best practices I can follow to prevent my app from running out of memory during audio transcription?
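One thing visible in the snippet is that `transcribe_audio` calls `load_model` on every request, so the model is loaded again for each transcription. A common way to avoid that is to load it once per worker process and reuse it. The sketch below shows the caching pattern only; `make_cached_loader` is a hypothetical helper, and the stand-in loader replaces `whisper_timestamped.load_model` so the example runs without the library. Whether this alone keeps the worker under its memory limit is not established here, since a single loaded model plus the audio may still be too large for the instance.

```python
from functools import lru_cache


def make_cached_loader(load_fn):
    """Wrap a model-loading function so each model name is loaded only once."""

    @lru_cache(maxsize=1)
    def get_model(name):
        # Runs load_fn on the first call; later calls with the same name
        # return the cached object instead of loading again.
        return load_fn(name)

    return get_model


if __name__ == "__main__":
    load_count = 0

    def fake_load_model(name):
        # Stand-in for an expensive loader such as load_model("base").
        global load_count
        load_count += 1
        return {"name": name}

    get_model = make_cached_loader(fake_load_model)
    first = get_model("base")
    second = get_model("base")
    print(first is second, load_count)
```

In the app this would mean building the loader once at module level and calling `get_model("base")` inside `transcribe_audio`. The "Worker ... was sent SIGKILL" message comes from Gunicorn, and each Gunicorn worker process holds its own copy of the model, so the number of workers also affects total memory use.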
Any help or insights would be greatly appreciated.