Speech-to-Text API in JavaScript: Complete Guide
Complete guide to using a speech-to-text API in JavaScript and Node.js. Learn file uploads, URL-based transcription, and streaming with the SpeakEasy API.
Building a speech-to-text API integration in JavaScript? This guide covers everything from basic file transcription to streaming, using both the native fetch API and the OpenAI Node SDK with SpeakEasy.
Prerequisites
- Node.js 18+ (for native fetch support) or any modern browser
- A SpeakEasy API key (get one here)
Option 1: Using the OpenAI Node SDK
The simplest path. SpeakEasy is a drop-in replacement for the OpenAI API, so the official SDK works out of the box.
npm install openai
import OpenAI from "openai";
import fs from "fs";
const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://api.tryspeakeasy.io/v1",
});
const transcript = await client.audio.transcriptions.create({
  model: "whisper-large-v3",
  file: fs.createReadStream("audio.mp3"),
});
console.log(transcript.text);
That's it. Three lines of meaningful code and you have a working transcription.
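The example above hardcodes the key for brevity. In a real project the key would more likely come from the environment. A minimal sketch, assuming a hypothetical variable name `SPEAKEASY_API_KEY` and a hypothetical helper `loadApiKey` (neither is prescribed by SpeakEasy or the SDK):

```javascript
// Hypothetical variable name; pick whatever fits your deployment.
const API_KEY_VAR = "SPEAKEASY_API_KEY";

// Read the API key from an environment-like object and fail fast if it is missing.
function loadApiKey(env = process.env) {
  const key = env[API_KEY_VAR];
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error(`Missing ${API_KEY_VAR}; set it before starting the app.`);
  }
  return key.trim();
}
```

The result can then be passed as `apiKey: loadApiKey()` when constructing the client, which keeps the secret out of source control.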
Option 2: Using Fetch
If you prefer not to install a dependency, use the native fetch API directly.
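import fs from "fs";
// Wrap the file bytes in a Blob and pass a filename so the part is sent as a named file.
const audio = new Blob([await fs.promises.readFile("audio.mp3")]);
const formData = new FormData();
formData.append("model", "whisper-large-v3");
formData.append("file", audio, "audio.mp3");
// Do not set Content-Type yourself; fetch adds the multipart boundary automatically.
const response = await fetch(
  "https://api.tryspeakeasy.io/v1/audio/transcriptions",
  {
    method: "POST",
    headers: {
      Authorization: "Bearer YOUR_API_KEY",
    },
    body: formData,
  }
);
const result = await response.json();
console.log(result.text);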
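Unlike the SDK, fetch does not reject on HTTP error statuses, so the snippet above would try to parse an error body as if it were a transcript. One way to guard against that is a small wrapper around the response. This is a sketch only: `readTranscription` is a hypothetical helper name, and the shape of SpeakEasy's error body is not assumed here, so it is passed through as raw text.

```javascript
// Turn a fetch Response into parsed JSON, or throw an Error carrying the HTTP status.
async function readTranscription(response) {
  const body = await response.text();
  if (!response.ok) {
    const error = new Error(
      `Transcription request failed (${response.status}): ${body.slice(0, 200)}`
    );
    error.status = response.status; // lets callers branch on the status code
    throw error;
  }
  return JSON.parse(body);
}
```

With this in place, `const result = await readTranscription(response);` would replace the direct `response.json()` call.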
Getting Timestamps
For subtitle generation or audio search, request verbose JSON output with timestamps:
const transcript = await client.audio.transcriptions.create({
  model: "whisper-large-v3",
  file: fs.createReadStream("interview.mp3"),
  response_format: "verbose_json",
  timestamp_granularities: ["segment"],
});
for (const segment of transcript.segments) {
  console.log(`[${segment.start}s] ${segment.text}`);
}
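For the subtitle use case, the segments can be converted to SRT text. The sketch below assumes each segment also carries an `end` time in seconds alongside `start` and `text`, as in OpenAI's verbose JSON format; the helper names `toSrtTime` and `segmentsToSrt` are illustrative.

```javascript
// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms, 3)}`;
}

// Build an SRT document: numbered cues separated by blank lines.
function segmentsToSrt(segments) {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}\n${seg.text.trim()}\n`
    )
    .join("\n");
}
```

Calling `segmentsToSrt(transcript.segments)` returns a string that can be written to a `.srt` file.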
URL-Based Transcription
Already have audio hosted somewhere? Skip the upload and pass a URL instead:
const response = await fetch(
  "https://api.tryspeakeasy.io/v1/audio/transcriptions",
  {
    method: "POST",
    headers: {
      Authorization: "Bearer YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "whisper-large-v3",
      url: "https://example.com/recording.mp3",
    }),
  }
);
const result = await response.json();
console.log(result.text);
This is particularly useful when transcribing files stored in S3, GCS, or at any other publicly reachable URL.
Streaming Transcription
For real-time use cases like live captioning, SpeakEasy supports streaming responses. Results are returned as server-sent events:
const response = await fetch(
  "https://api.tryspeakeasy.io/v1/audio/transcriptions",
  {
    method: "POST",
    headers: {
      Authorization: "Bearer YOUR_API_KEY",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "whisper-large-v3",
      url: "https://example.com/recording.mp3",
      stream: true,
    }),
  }
);
// Read the response body chunk by chunk as it arrives.
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // { stream: true } keeps multi-byte characters intact when they span two chunks.
  process.stdout.write(decoder.decode(value, { stream: true }));
}
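The loop above prints the raw event stream. To work with individual events, the decoded text has to be split on the blank lines that separate server-sent events, bearing in mind that a network chunk can end in the middle of an event. A minimal sketch of such a parser follows. `createSseParser` is a hypothetical helper, and what each `data:` payload contains (plain text, JSON, an end-of-stream marker) is not specified in this guide, so payloads are returned as raw strings.

```javascript
// Incremental SSE parser: feed it decoded text chunks, get back complete `data:` payloads.
function createSseParser() {
  let buffer = "";
  return function feed(chunk) {
    // Append the new chunk and normalize CRLF line endings.
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    const events = buffer.split("\n\n");
    buffer = events.pop(); // the last piece is incomplete (or empty); keep it for later
    const payloads = [];
    for (const event of events) {
      const data = event
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop "data:" and one optional space
        .join("\n");
      if (data !== "") payloads.push(data);
    }
    return payloads;
  };
}
```

Inside the read loop, `for (const payload of feed(decoder.decode(value, { stream: true })))` would then yield one string per event, with `feed` created once before the loop.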
Error Handling
Always handle API errors gracefully in production:
try {
  const transcript = await client.audio.transcriptions.create({
    model: "whisper-large-v3",
    file: fs.createReadStream("audio.mp3"),
  });
  console.log(transcript.text);
} catch (error) {
  if (error.status === 401) {
    console.error("Invalid API key. Check your credentials.");
  } else if (error.status === 413) {
    console.error("File too large. Maximum size is 25 MB.");
  } else {
    console.error("Transcription failed:", error.message);
  }
}
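Some failures are transient and may succeed on a second attempt. The sketch below retries with exponential backoff, assuming that errors expose a numeric `status` and that 429 and 5xx responses are worth retrying; whether SpeakEasy returns those codes is an assumption, and `withRetry` is a hypothetical helper. The OpenAI Node SDK already retries some failures on its own (see its `maxRetries` option), so this is mainly relevant to fetch-based calls.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: rate limits (429) and server errors (5xx) are transient; everything else is not.
function isRetryable(error) {
  const status = error?.status;
  return status === 429 || (typeof status === "number" && status >= 500);
}

// Run `task`, retrying retryable failures with delays of baseDelayMs, 2x, 4x, ...
async function withRetry(task, { attempts = 3, baseDelayMs = 500, wait = sleep } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= attempts || !isRetryable(error)) throw error;
      await wait(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

Non-retryable errors such as 401 or 413 are rethrown immediately, so the status checks shown above still apply around the `withRetry` call.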
What's Next?
You're now equipped to integrate speech-to-text into any JavaScript application, whether it's a Node.js backend, a serverless function, or a browser-based tool. Explore speaker diarization to identify individual speakers, or check out text-to-speech to generate audio from your content.
Sign up for SpeakEasy and start building with a generous free tier today.