SpeakEasy

How to Use the Prompt Parameter to Improve Whisper Transcription Accuracy

The prompt parameter lets you feed Whisper context about your audio — brand names, speaker names, technical vocabulary. Here's how to use it effectively.

speech-to-text, whisper, accuracy, tutorial

Whisper is excellent out of the box, but it occasionally stumbles on domain-specific vocabulary: product names, technical terms, acronyms, and proper nouns. The prompt parameter gives you a direct line to Whisper's context window — use it to prime the model with the vocabulary it's about to encounter.

What the prompt parameter does

When you pass a prompt, Whisper treats it as preceding context for the audio. It effectively says: "the audio you're about to transcribe is related to this." The model uses it to resolve ambiguous words toward the vocabulary in your prompt.

curl -X POST https://api.tryspeakeasy.io/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@meeting.mp3" \
  -F "prompt=SpeakEasy API, transcriptions endpoint, Whisper large-v3, STT, TTS"

When it makes a real difference

Brand and product names

Without a prompt, Whisper might transcribe "SpeakEasy" as "Speak Easy" or "speakeasy". With the prompt, it sees the correct spelling and uses it.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SPEAKEASY_KEY",
    base_url="https://api.tryspeakeasy.io/v1"
)

with open("product-demo.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=f,
        prompt="SpeakEasy, tryspeakeasy.io, Lemonfox, Deepgram, AssemblyAI, OpenAI Whisper"
    )

print(result.text)

Technical vocabulary

Medical, legal, and engineering domains use specialized vocabulary that appears rarely in Whisper's training data. Priming with domain terms improves accuracy:

# Medical transcription
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("consultation.mp3", "rb"),
    prompt="metformin, HbA1c, eGFR, creatinine, SGLT2 inhibitor, GLP-1, insulin resistance"
)

# Software engineering standup
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("standup.mp3", "rb"),
    prompt="Kubernetes, kubectl, Helm chart, Terraform, CI/CD pipeline, GitHub Actions, Datadog"
)

Proper nouns and names

Speaker names are a common source of errors. If you know who's speaking, put them in the prompt:

result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("podcast.mp3", "rb"),
    prompt="Hosts: Lex Fridman, Sam Altman. Topics: AGI, Claude, GPT-5, alignment research"
)

Formatting and punctuation style

The prompt can influence how Whisper formats output. If your prompt uses certain punctuation patterns, the model tends to follow them:

# Encourage sentence-per-line format
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("lecture.mp3", "rb"),
    prompt="This is a lecture transcript. Each sentence is on its own line."
)

Prompt length guidelines

The prompt is fed into Whisper's 448-token context window, and only its final 224 tokens are kept — roughly 170 English words. The prompt doesn't need to be a complete sentence; a comma-separated list of key terms works well:

# Good — concise, term-focused
prompt="React, Next.js, Vercel, Tailwind CSS, shadcn/ui, Supabase, Prisma"

# Less effective — too long, dilutes the signal
prompt="This audio recording is from a meeting about web development where we discuss React, Next.js..."
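If you build prompts programmatically from a large glossary, it's worth enforcing the budget before sending the request. Here's a minimal sketch using word count as a rough proxy for tokens (a precise check would use Whisper's own tokenizer); the term list and `trim_prompt` helper are illustrative, not part of the API:

```python
# Rough heuristic: Whisper only keeps the final ~224 prompt tokens,
# which is roughly 170 English words. Word count is a good-enough
# proxy for comma-separated term lists.

def trim_prompt(terms: list[str], max_words: int = 170) -> str:
    """Join terms into a prompt string, dropping trailing terms once
    the word budget is exhausted (put high-priority terms first)."""
    kept, words = [], 0
    for term in terms:
        n = len(term.split())
        if words + n > max_words:
            break
        kept.append(term)
        words += n
    return ", ".join(kept)

prompt = trim_prompt(["SpeakEasy API", "Whisper large-v3", "STT", "TTS"])
print(prompt)  # → SpeakEasy API, Whisper large-v3, STT, TTS
```

Because terms are dropped from the end, put the names you care about most at the front of the list.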

Combining prompt with language

The prompt works alongside language when you want both accuracy and consistent output language:

curl -X POST https://api.tryspeakeasy.io/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@earnings-call.mp3" \
  -F "language=en" \
  -F "prompt=Q4 earnings, EBITDA, ARR, churn rate, NRR, YoY growth, guidance"

What the prompt cannot do

  • It won't force Whisper to transcribe words it doesn't hear in the audio
  • It can't correct audio quality issues (background noise, low bitrate)
  • Long prompts may crowd out useful information — keep it tight

The prompt is a hint, not a command. It biases the model toward specific vocabulary; it doesn't override what Whisper hears.
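Because the prompt only biases the model, a pragmatic complement (not a SpeakEasy feature — just a post-processing sketch) is a deterministic pass that enforces exact spellings of known terms after transcription. The term list below is an example:

```python
# Sketch: normalize case/spacing variants of known brand names to
# their canonical spelling, e.g. "speak easy" -> "SpeakEasy".
import re

CANONICAL_TERMS = ["SpeakEasy", "Whisper", "Kubernetes"]

def fix_spellings(text: str) -> str:
    for term in CANONICAL_TERMS:
        # Split CamelCase terms into parts, then allow an optional
        # space between parts and match case-insensitively.
        parts = re.findall(r"[A-Z][a-z]+|[A-Z]+(?![a-z])", term) or [term]
        pattern = r"\b" + r"\s?".join(map(re.escape, parts)) + r"\b"
        text = re.sub(pattern, term, text, flags=re.IGNORECASE)
    return text

print(fix_spellings("We tried speak easy on KUBERNETES."))
# → We tried SpeakEasy on Kubernetes.
```

This catches the "Speak Easy" / "speakeasy" variants from earlier without relying on the model getting it right every time.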

Full example: technical meeting transcription

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SPEAKEASY_KEY",
    base_url="https://api.tryspeakeasy.io/v1"
)

TECH_TERMS = (
    "API, REST, GraphQL, WebSocket, OAuth, JWT, "
    "PostgreSQL, Redis, S3, Lambda, CloudFront, "
    "CI/CD, Docker, Kubernetes, Terraform, Datadog"
)

def transcribe_meeting(audio_path: str) -> str:
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(
            model="whisper-large-v3",
            file=f,
            prompt=TECH_TERMS,
            response_format="text",
        )
    return result

transcript = transcribe_meeting("sprint-planning.mp3")
print(transcript)

Try it yourself — $1 for your first month, 50 hours included.
