Help Needed: Out of Memory Errors with Flask and Whisper Timestamped Library

I’m running into a memory issue while transcribing audio files with Flask and the whisper_timestamped library (linto-ai/whisper-timestamped on GitHub: multilingual automatic speech recognition with word-level timestamps and confidence). The app is deployed on Render and exposes a /transcribe endpoint that handles transcription requests. I consistently get the following error:

[ERROR] Worker (pid:67) was sent SIGKILL! Perhaps out of memory?

I’ve provided the relevant code snippet below:

from flask import Flask, request, jsonify
import whisper_timestamped
import tempfile
import os

app = Flask(__name__)

def hello():
    name = "Hello World"
    return name

def transcribe_audio(audio_file_path):
    model = whisper_timestamped.load_model("base")
    audio = whisper_timestamped.load_audio(audio_file_path)
    result = whisper_timestamped.transcribe(model, audio, language="en")
    transcribed_text = result["text"]
    word_timestamps = []
    for segment in result["segments"]:
        for word in segment["words"]:
            word_timestamps.append({"word": word["text"], "startTime": word["start"], "endTime": word["end"]})

    return {"transcribedText": transcribed_text, "wordTimestamps": word_timestamps}

@app.route('/transcribe', methods=['POST'])
def transcribe():
    if 'audio' not in request.files:
        return jsonify({'error': 'No audio file provided.'}), 400

    audio_file = request.files['audio']

    # Create a temporary directory to store the audio file
    temp_dir = tempfile.mkdtemp()

    # Save the audio file temporarily to the temporary directory
    temp_audio_path = os.path.join(temp_dir, "temp_audio.wav")
    audio_file.save(temp_audio_path)

    result = transcribe_audio(temp_audio_path)

    # Remove the temporary audio file and directory after transcription
    os.remove(temp_audio_path)
    os.rmdir(temp_dir)

    return jsonify(result), 200

if __name__ == "__main__":
    app.run()

I’m using the whisper_timestamped library to transcribe audio and obtain word timestamps. The error occurs when the worker process handling the request is terminated due to memory issues. I suspect that the library or model might be memory-intensive, causing the app to exceed its memory allocation.
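
One way to check that suspicion is to log the process's peak memory around the transcription step. The sketch below uses only the standard library resource module, so it is Unix only. peak_rss_mb and log_peak_memory are hypothetical helper names introduced for illustration, not part of Flask or whisper_timestamped.

```python
import functools
import resource
import sys


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_peak_memory(func):
    """Hypothetical decorator: log peak memory before and after a call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"{func.__name__} start: peak RSS {peak_rss_mb():.0f} MB",
              file=sys.stderr, flush=True)
        try:
            return func(*args, **kwargs)
        finally:
            # This line never appears if the process is killed mid-call.
            print(f"{func.__name__} end: peak RSS {peak_rss_mb():.0f} MB",
                  file=sys.stderr, flush=True)
    return wrapper
```

Decorating transcribe_audio with @log_peak_memory would write both lines to the service logs. If a "start" line shows up without a matching "end" line right before the SIGKILL message, the worker was probably killed inside that call, and the logged figure can be compared with the instance's memory limit.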

Can anyone offer suggestions on how to diagnose and resolve this memory-related error? Are there any strategies or best practices I can follow to prevent my app from running out of memory during audio transcription?
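
One strategy that may lower peak memory is to load the model once per process instead of on every request, since transcribe_audio above calls load_model each time it runs. This is a minimal sketch, assuming a lazily initialised module-level cache. get_model and its loader parameter are hypothetical names introduced here. "tiny" is the smallest standard Whisper model size, chosen on the assumption that some loss of accuracy is acceptable.

```python
import threading

_model = None
_model_lock = threading.Lock()


def get_model(name="tiny", loader=None):
    """Return a process-wide model, loading it on first use only.

    `loader` is a hypothetical hook so the function can be tried without
    the real library; by default it uses whisper_timestamped.load_model.
    """
    global _model
    if _model is None:
        with _model_lock:
            # Re-check inside the lock so two threads cannot both load.
            if _model is None:
                if loader is None:
                    import whisper_timestamped
                    loader = whisper_timestamped.load_model
                _model = loader(name)
    return _model
```

Inside transcribe_audio, model = get_model() would then replace the load_model call. Each worker process still keeps its own copy of the model, so total memory use grows with the number of workers the server starts.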

Any help or insights would be greatly appreciated.


