SpeakEasy · 2 min read

Async Transcription API with Callback URLs

Learn how to transcribe long audio files asynchronously using a callback URL. Fire-and-forget transcription for podcasts, meetings, and large video files.

speech-to-text · async · tutorial

Short answer: Pass a callback_url on any SpeakEasy transcription request and the API accepts the file immediately, processes it in the background, and POSTs the finished transcript to your URL. No queue service to run, no polling loop to write — one parameter, one webhook.

When you're transcribing a 90-minute podcast or a 2-hour meeting recording, you don't want your server sitting idle waiting for the response. That's what the callback_url parameter is for: fire the request and get the result delivered to your endpoint when it's ready.

How async transcription works

Pass a callback_url in your transcription request. SpeakEasy will accept the file immediately (HTTP 200), process it in the background, and POST the completed transcript to your URL when done.

curl -X POST https://www.tryspeakeasy.io/api/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@long-meeting.mp3" \
  -F "callback_url=https://yourapp.com/webhooks/transcript"

The immediate response is a 200 with a job acknowledgement. Your webhook receives the full result later:

{
  "text": "Welcome everyone, let's get started...",
  "duration": 5432.1,
  "language": "en",
  "segments": [
    { "id": 0, "start": 0.0, "end": 4.2, "text": "Welcome everyone, let's get started." },
    ...
  ]
}
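Once the payload arrives, the segments array is usually the most useful part. The following is a minimal Python sketch that turns it into a timestamped transcript. It assumes only the fields shown in the sample payload above (duration, segments, start, text); the helper names hms and timestamped_lines are illustrative and not part of the SpeakEasy API.

```python
def hms(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def timestamped_lines(payload: dict) -> list[str]:
    """Return one '[HH:MM:SS] text' line per segment in the payload."""
    return [
        f"[{hms(seg['start'])}] {seg['text'].strip()}"
        for seg in payload.get("segments", [])
    ]


# Sample mirroring the webhook payload shown above (one segment only).
sample = {
    "text": "Welcome everyone, let's get started...",
    "duration": 5432.1,
    "language": "en",
    "segments": [
        {"id": 0, "start": 0.0, "end": 4.2,
         "text": "Welcome everyone, let's get started."},
    ],
}

print("Total length:", hms(sample["duration"]))
for line in timestamped_lines(sample):
    print(line)
```

Using payload.get("segments", []) keeps the helper from failing if a response format without segments is requested.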

Why use async instead of sync?

Synchronous transcription holds an open HTTP connection for the full duration of processing. For long files this means:

  • Connection timeouts on your load balancer or reverse proxy (most default to 60–120 seconds)
  • Blocking a worker thread in your server for minutes at a time
  • No retry path if the connection drops mid-way

Async solves all three. Your server is free immediately, and the result arrives as a standard webhook POST.

URL-based transcription (no upload needed)

For files already hosted somewhere, pass a URL instead of uploading:

curl -X POST https://www.tryspeakeasy.io/api/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "url=https://your-bucket.s3.amazonaws.com/recording.mp4" \
  -F "callback_url=https://yourapp.com/webhooks/transcript"

SpeakEasy fetches the file directly, so there is no need to download it first and re-upload. Files up to 1 GB are supported via URL.

Implementing the webhook receiver

Here's a minimal Express handler for the callback:

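import express from "express";

const app = express();
// express.json() rejects bodies over 100 KB by default; long transcripts
// with segments can exceed that, so raise the limit.
app.use(express.json({ limit: "10mb" }));

app.post("/webhooks/transcript", (req, res) => {
  // Default text to "" so a malformed POST doesn't throw and cause a 500
  const { text = "", duration, segments, language } = req.body ?? {};

  // Store in your database, trigger downstream tasks, etc.
  console.log(`Transcription complete: ${duration}s, language: ${language}`);
  console.log(text.slice(0, 100));

  // Acknowledge quickly: SpeakEasy won't retry on 2xx
  res.sendStatus(200);
});

// Example port; use whatever your deployment expects
app.listen(3000);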

Return a 2xx within a few seconds. A 5xx response or a timeout makes SpeakEasy retry the delivery with exponential backoff.
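Because a timeout triggers a retry, the same transcript can reach your endpoint more than once (for example, when your handler finished the work but answered too slowly). A receiver that tolerates duplicates avoids storing it twice. Here is a minimal Python sketch of that idea. The sample payload shown earlier carries no job identifier, so this sketch fingerprints the payload body instead; that is an assumption, and if your payloads do include a unique ID, keying on it would be simpler. The names payload_key, handle_callback, and the in-memory _seen set are hypothetical.

```python
import hashlib
import json

# Hypothetical in-memory store of processed payloads.
# A real service would use a database or cache shared across workers.
_seen: set[str] = set()


def payload_key(payload: dict) -> str:
    """Derive a stable fingerprint that ignores key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def handle_callback(payload: dict, process) -> int:
    """Return the HTTP status the webhook route should send back."""
    key = payload_key(payload)
    if key in _seen:
        return 200  # duplicate delivery: acknowledge without reprocessing
    try:
        process(payload)  # keep this fast, e.g. enqueue or a single insert
    except Exception:
        return 500  # a 5xx asks SpeakEasy to retry later
    _seen.add(key)
    return 200


stored = []
first = handle_callback({"text": "hello", "duration": 1.5}, stored.append)
again = handle_callback({"text": "hello", "duration": 1.5}, stored.append)
print(first, again, len(stored))  # 200 200 1
```

The key is recorded only after process succeeds, so a failed attempt stays eligible for the retry.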

Combining async with other parameters

Async works with every transcription parameter:

curl -X POST https://www.tryspeakeasy.io/api/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@interview.mp3" \
  -F "callback_url=https://yourapp.com/webhooks/transcript" \
  -F "diarize=true" \
  -F "response_format=srt" \
  -F "language=en"

The webhook payload will include the speaker labels (when diarize=true) or the full SRT file content (when response_format=srt).

Python example with async/await

import httpx
import asyncio

async def submit_transcription(file_path: str, callback_url: str):
    async with httpx.AsyncClient() as client:
        with open(file_path, "rb") as f:
            response = await client.post(
                "https://www.tryspeakeasy.io/api/v1/audio/transcriptions",
                headers={"Authorization": "Bearer YOUR_API_KEY"},
                data={"callback_url": callback_url},
                files={"file": f},
            )
        response.raise_for_status()
        return response.json()

# Returns once the upload is accepted; the transcript arrives later via the webhook
result = asyncio.run(submit_transcription(
    "long-recording.mp3",
    "https://yourapp.com/webhooks/transcript"
))
print("Job submitted:", result)
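For a folder of recordings, the same coroutine can be fanned out with a cap on how many uploads run at once. The sketch below is one way to do that with asyncio.Semaphore. The limit of 4 is an arbitrary example, not a documented SpeakEasy rate limit, and submit_all is a hypothetical helper. It takes the submit coroutine as an argument, so the demonstration here uses a stand-in that makes no network calls.

```python
import asyncio


async def submit_all(paths, submit, limit=4):
    """Run submit(path) for every path, at most `limit` at a time.

    Returns (path, result_or_exception) pairs in input order, so one
    failed upload does not cancel the rest of the batch.
    """
    semaphore = asyncio.Semaphore(limit)

    async def one(path):
        async with semaphore:
            try:
                return path, await submit(path)
            except Exception as exc:
                return path, exc

    return await asyncio.gather(*(one(p) for p in paths))


async def fake_submit(path):
    """Stand-in for a real upload; performs no network I/O."""
    await asyncio.sleep(0)
    return {"accepted": path}


results = asyncio.run(submit_all(["a.mp3", "b.mp3", "c.mp3"], fake_submit, limit=2))
for path, outcome in results:
    print(path, outcome)
```

With the function above, the real call would pass something like lambda p: submit_transcription(p, "https://yourapp.com/webhooks/transcript") as submit.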

When to use sync vs async

Scenario                                    | Use
Short clips (< 2 min) in a user-facing flow | Sync (simpler)
Long recordings (podcast, meeting, lecture) | Async + callback
Batch processing a folder of files          | Async + callback
Background job queue                        | Async + callback
Serverless function with 30s timeout        | Async + callback

The rule of thumb: if your file is longer than a few minutes or your infrastructure has tight timeout limits, use async.
