How to Use the Prompt Parameter to Improve Whisper Transcription Accuracy
The prompt parameter lets you feed Whisper context about your audio — brand names, speaker names, technical vocabulary. Here's how to use it effectively.
Whisper is excellent out of the box, but it occasionally stumbles on domain-specific vocabulary: product names, technical terms, acronyms, and proper nouns. The prompt parameter gives you a direct line to Whisper's context window — use it to prime the model with the vocabulary it's about to encounter.
What the prompt parameter does
When you pass a prompt, Whisper treats it as preceding context for the audio. It effectively says: "the audio you're about to transcribe is related to this." The model uses it to resolve ambiguous words toward the vocabulary in your prompt.
curl -X POST https://api.tryspeakeasy.io/v1/audio/transcriptions \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@meeting.mp3" \
-F "prompt=SpeakEasy API, transcriptions endpoint, Whisper large-v3, STT, TTS"
When it makes a real difference
Brand and product names
Without a prompt, Whisper might transcribe "SpeakEasy" as "Speak Easy" or "speakeasy". With the prompt, it sees the correct spelling and uses it.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SPEAKEASY_KEY",
    base_url="https://api.tryspeakeasy.io/v1"
)

with open("product-demo.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-large-v3",
        file=f,
        prompt="SpeakEasy, tryspeakeasy.io, Lemonfox, Deepgram, AssemblyAI, OpenAI Whisper"
    )

print(result.text)
Technical vocabulary
Medical, legal, and engineering domains use specialized vocabulary that Whisper rarely encountered in training. Priming with domain terms improves accuracy:
# Medical transcription
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("consultation.mp3", "rb"),
    prompt="metformin, HbA1c, eGFR, creatinine, SGLT2 inhibitor, GLP-1, insulin resistance"
)

# Software engineering standup
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("standup.mp3", "rb"),
    prompt="Kubernetes, kubectl, Helm chart, Terraform, CI/CD pipeline, GitHub Actions, Datadog"
)
Proper nouns and names
Speaker names are a common source of errors. If you know who's speaking, put their names in the prompt:
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("podcast.mp3", "rb"),
    prompt="Hosts: Lex Fridman, Sam Altman. Topics: AGI, Claude, GPT-5, alignment research"
)
Formatting and punctuation style
The prompt can influence how Whisper formats output. If your prompt uses certain punctuation patterns, the model tends to follow them:
# Encourage sentence-per-line format
result = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("lecture.mp3", "rb"),
    prompt="This is a lecture transcript. Each sentence is on its own line."
)
Prompt length guidelines
The prompt is fed into Whisper's 448-token decoder context, and in practice only about the last 224 tokens of the prompt are considered; anything earlier is truncated away. Keep it under roughly 200 words. It doesn't need to be a complete sentence; a comma-separated list of key terms works well:
# Good — concise, term-focused
prompt="React, Next.js, Vercel, Tailwind CSS, shadcn/ui, Supabase, Prisma"
# Less effective — too long, dilutes the signal
prompt="This audio recording is from a meeting about web development where we discuss React, Next.js..."
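Since the limit is token-based but prompts are usually authored as term lists, a simple word-count cap is a pragmatic guard against over-long prompts. A rough sketch; the cap_prompt helper is illustrative, and word count is only a proxy for the real token limit:

```python
def cap_prompt(prompt: str, max_words: int = 200) -> str:
    """Truncate a prompt to at most max_words whitespace-separated words.

    Word count approximates token count; the actual limit Whisper
    enforces is measured in tokens, not words.
    """
    words = prompt.split()
    if len(words) <= max_words:
        return prompt
    return " ".join(words[:max_words])

# A 500-term prompt gets cut down to the first 200 words
long_prompt = ", ".join(f"term{i}" for i in range(500))
print(len(cap_prompt(long_prompt).split()))
# 200
```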
Combining prompt with language
The prompt works alongside the language parameter when you want both vocabulary accuracy and a consistent output language:
curl -X POST https://api.tryspeakeasy.io/v1/audio/transcriptions \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@earnings-call.mp3" \
-F "language=en" \
-F "prompt=Q4 earnings, EBITDA, ARR, churn rate, NRR, YoY growth, guidance"
What the prompt cannot do
- It won't force Whisper to transcribe words it doesn't hear in the audio
- It can't correct audio quality issues (background noise, low bitrate)
- Long prompts may crowd out useful information — keep it tight
The prompt is a hint, not a command. It biases the model toward specific vocabulary; it doesn't override what Whisper hears.
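Because the prompt only biases the model, a post-processing pass is a useful backstop for spellings that must be exact every time. A minimal sketch; the correction table below is a hypothetical example, not a built-in feature:

```python
import re

# Known misrecognitions -> canonical spellings (hypothetical examples)
CORRECTIONS = {
    r"\bSpeak Easy\b": "SpeakEasy",
    r"\bspeakeasy\b": "SpeakEasy",
    r"\bcube control\b": "kubectl",
}

def fix_spellings(text: str) -> str:
    """Apply regex-based corrections for vocabulary the prompt may miss."""
    for pattern, replacement in CORRECTIONS.items():
        text = re.sub(pattern, replacement, text)
    return text

print(fix_spellings("We deployed Speak Easy with cube control."))
# We deployed SpeakEasy with kubectl.
```

Run this over the transcript after the API call; the prompt raises the odds of a correct transcription, and the correction table catches the cases where the hint wasn't enough.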
Full example: technical meeting transcription
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_SPEAKEASY_KEY",
    base_url="https://api.tryspeakeasy.io/v1"
)

TECH_TERMS = (
    "API, REST, GraphQL, WebSocket, OAuth, JWT, "
    "PostgreSQL, Redis, S3, Lambda, CloudFront, "
    "CI/CD, Docker, Kubernetes, Terraform, Datadog"
)

def transcribe_meeting(audio_path: str) -> str:
    with open(audio_path, "rb") as f:
        result = client.audio.transcriptions.create(
            model="whisper-large-v3",
            file=f,
            prompt=TECH_TERMS,
            response_format="text",  # returns the transcript as a plain string
        )
    return result
transcript = transcribe_meeting("sprint-planning.mp3")
print(transcript)
Try it yourself — $1 for your first month, 50 hours included.