Blog

Tutorials, guides, and product updates from the SpeakEasy team.

Apr 16, 2026·Python

Voice Activity Detection in Python: The Complete Guide

Learn how to implement voice activity detection in Python using webrtcvad, Silero VAD, and pyannote.audio — then pipe detected speech straight into a transcription API.

Apr 14, 2026·Text-to-Speech

Best Open Source TTS Models in 2026: Kokoro, Chatterbox, Fish Audio Compared

Kokoro, Chatterbox, Fish Audio, Dia2, and VibeVoice — tested and ranked. Find out which open source TTS model fits your use case and when to reach for a cloud API instead.

Apr 9, 2026·Voice Agents

How to Build an AI Voice Agent with a Speech-to-Text API

Voice agents need real-time STT, an LLM, and TTS working together. Here's how the architecture works, what to watch out for, and how to build one without overpaying.

Apr 8, 2026·Pricing

Speech-to-Text API Pricing: The Hidden Costs Most Guides Skip (2026)

Streaming, diarization, and concurrency can triple your speech-to-text bill. See the real per-minute costs across OpenAI, Deepgram, AssemblyAI, and more — with SpeakEasy's transparent comparison.

Apr 1, 2026·Speech-to-Text

Translate Audio to English in One API Call

Pass audio in any of 99+ languages, get back English text. No intermediate translation step, no extra API, no extra cost. Just set translate=true.

Mar 28, 2026·Speech-to-Text

Generate SRT and VTT Subtitle Files Directly from Audio

SpeakEasy is the only affordable speech API that returns SRT and VTT subtitle files natively. No post-processing, no custom formatter — one API call.

Mar 25, 2026·Speech-to-Text

How to Use the Prompt Parameter to Improve Whisper Transcription Accuracy

The prompt parameter lets you feed Whisper context about your audio — brand names, speaker names, technical vocabulary. Here's how to use it effectively.

Mar 20, 2026·Speech-to-Text

Async Transcription API with Callback URLs

Learn how to transcribe long audio files asynchronously using a callback URL. Fire-and-forget transcription for podcasts, meetings, and large video files.

Mar 15, 2026·Python

Python Speech-to-Text API: Transcribe Audio in 5 Lines of Code

Transcribe audio in Python using the OpenAI SDK pointed at SpeakEasy. Get transcripts with speaker labels, timestamps, and async processing in under 5 lines.

Mar 10, 2026·JavaScript

Speech-to-Text API in JavaScript: Complete Guide (2026)

The complete speech-to-text JavaScript guide — Node, browser, streaming, diarization, error handling, and production patterns. OpenAI SDK compatible via SpeakEasy.

Mar 8, 2026·Comparison

Best Speech-to-Text APIs in 2026 Compared

We tested 8 leading speech-to-text APIs on accuracy, price, latency, and developer experience. Here's exactly what we found — with real benchmark data.

Mar 5, 2026·Comparison

Whisper API Alternative: The Shopping Guide (2026)

A buyer's guide for developers replacing OpenAI Whisper. Migration checklists for Docker, CI/CD, serverless, and batch pipelines. Real cost math, not marketing.

Mar 1, 2026·Speech-to-Text

Speaker Diarization: Identify Who Said What in Audio

Learn what speaker diarization is, why it matters, and how to use the SpeakEasy API to automatically identify speakers in audio recordings with code examples.

Feb 25, 2026·Text-to-Speech

Build a Text-to-Speech App in 5 Minutes

Generate natural TTS audio in under 5 lines of Python via the OpenAI SDK pointed at SpeakEasy. 6 voices, streaming, HD mode, and 5 formats at $0.20/hour.

$1. 50 hours. Both STT and TTS.

Your current speech API provider is charging you too much. Switch in one line of code.
