SpeakEasy

How to Use the Prompt Parameter to Improve Whisper Transcription Accuracy

The prompt parameter lets you feed Whisper context about your audio — brand names, speaker names, technical vocabulary. Here's how to use it effectively.

speech-to-text, whisper, accuracy, tutorial

Whisper is excellent out of the box, but it occasionally stumbles on domain-specific vocabulary: product names, technical terms, acronyms, and proper nouns. The prompt parameter gives you a direct line to Whisper's context window — use it to prime the model with the vocabulary it's about to encounter.

What the prompt parameter does

When you pass a prompt, Whisper treats it as preceding context for the audio. It effectively says: "the audio you're about to transcribe is related to this." The model uses it to resolve ambiguous words toward the vocabulary in your prompt.

curl -X POST https://api.tryspeakeasy.io/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@meeting.mp3" \
  -F "prompt=SpeakEasy API, transcriptions endpoint, Whisper large-v3, STT, TTS"

When it makes a real difference

Brand and product names

Without a prompt, Whisper might transcribe "SpeakEasy" as "Speak Easy" or "speakeasy". With the prompt, it sees the correct spelling and uses it.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SPEAKEASY_KEY",
    base_url="https://api.tryspeakeasy.io/v1"
)

with open("product-demo.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=f,
        prompt="SpeakEasy, tryspeakeasy.io, Lemonfox, Deepgram, AssemblyAI, OpenAI Whisper"
    )

print(result.text)

Technical vocabulary

Medical, legal, and engineering domains use specialized vocabulary that appears rarely in Whisper's training data. Priming with domain terms improves accuracy:

# Medical transcription
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("consultation.mp3", "rb"),
    prompt="metformin, HbA1c, eGFR, creatinine, SGLT2 inhibitor, GLP-1, insulin resistance"
)

# Software engineering standup
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("standup.mp3", "rb"),
    prompt="Kubernetes, kubectl, Helm chart, Terraform, CI/CD pipeline, GitHub Actions, Datadog"
)

Proper nouns and names

Speaker names are a common source of errors. If you know who's speaking, put them in the prompt:

result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("podcast.mp3", "rb"),
    prompt="Hosts: Lex Fridman, Sam Altman. Topics: AGI, Claude, GPT-5, alignment research"
)

Formatting and punctuation style

The prompt can influence how Whisper formats output. If your prompt uses certain punctuation patterns, the model tends to follow them:

# Encourage sentence-per-line format
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("lecture.mp3", "rb"),
    prompt="This is a lecture transcript. Each sentence is on its own line."
)

Prompt length guidelines

The prompt is fed into Whisper's 448-token context window, and only its final 224 tokens are kept — roughly 170 English words. The prompt doesn't need to be a complete sentence; a comma-separated list of key terms works well:

# Good — concise, term-focused
prompt="React, Next.js, Vercel, Tailwind CSS, shadcn/ui, Supabase, Prisma"

# Less effective — too long, dilutes the signal
prompt="This audio recording is from a meeting about web development where we discuss React, Next.js..."
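If you build prompts programmatically from a large glossary, it's worth enforcing the budget before sending the request. Here's a minimal sketch using word count as a rough proxy for tokens (a precise check would use Whisper's own tokenizer); the term list and `trim_prompt` helper are illustrative, not part of the API:

```python
# Rough heuristic: Whisper only keeps the final ~224 prompt tokens,
# which is roughly 170 English words. Word count is a good-enough
# proxy for comma-separated term lists.

def trim_prompt(terms: list[str], max_words: int = 170) -> str:
    """Join terms into a prompt string, dropping trailing terms once
    the word budget is exhausted (put high-priority terms first)."""
    kept, words = [], 0
    for term in terms:
        n = len(term.split())
        if words + n > max_words:
            break
        kept.append(term)
        words += n
    return ", ".join(kept)

prompt = trim_prompt(["SpeakEasy API", "Whisper large-v3", "STT", "TTS"])
print(prompt)  # → SpeakEasy API, Whisper large-v3, STT, TTS
```

Because terms are dropped from the end, put the names you care about most at the front of the list.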

Combining prompt with language

The prompt works alongside language when you want both accuracy and consistent output language:

curl -X POST https://api.tryspeakeasy.io/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@earnings-call.mp3" \
  -F "language=en" \
  -F "prompt=Q4 earnings, EBITDA, ARR, churn rate, NRR, YoY growth, guidance"

What the prompt cannot do

  • It won't force Whisper to transcribe words it doesn't hear in the audio
  • It can't correct audio quality issues (background noise, low bitrate)
  • Long prompts may crowd out useful information — keep it tight

The prompt is a hint, not a command. It biases the model toward specific vocabulary; it doesn't override what Whisper hears.
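Because the prompt only biases the model, a pragmatic complement (not a SpeakEasy feature — just a post-processing sketch) is a deterministic pass that enforces exact spellings of known terms after transcription. The term list below is an example:

```python
# Sketch: normalize case/spacing variants of known brand names to
# their canonical spelling, e.g. "speak easy" -> "SpeakEasy".
import re

CANONICAL_TERMS = ["SpeakEasy", "Whisper", "Kubernetes"]

def fix_spellings(text: str) -> str:
    for term in CANONICAL_TERMS:
        # Split CamelCase terms into parts, then allow an optional
        # space between parts and match case-insensitively.
        parts = re.findall(r"[A-Z][a-z]+|[A-Z]+(?![a-z])", term) or [term]
        pattern = r"\b" + r"\s?".join(map(re.escape, parts)) + r"\b"
        text = re.sub(pattern, term, text, flags=re.IGNORECASE)
    return text

print(fix_spellings("We tried speak easy on KUBERNETES."))
# → We tried SpeakEasy on Kubernetes.
```

This catches the "Speak Easy" / "speakeasy" variants from earlier without relying on the model getting it right every time.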

Full example: technical meeting transcription

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SPEAKEASY_KEY",
    base_url="https://api.tryspeakeasy.io/v1"
)

TECH_TERMS = (
    "API, REST, GraphQL, WebSocket, OAuth, JWT, "
    "PostgreSQL, Redis, S3, Lambda, CloudFront, "
    "CI/CD, Docker, Kubernetes, Terraform, Datadog"
)

def transcribe_meeting(audio_path: str) -> str:
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(
            model="whisper-large-v3",
            file=f,
            prompt=TECH_TERMS,
            response_format="text",
        )
    return result

transcript = transcribe_meeting("sprint-planning.mp3")
print(transcript)

Try it yourself — $1 for your first month, 50 hours included.
