Whisper V3 Large API & Playground

Whisper V3 Large API Features

Serverless

Whisper V3 Large is available via Fireworks' serverless API, where you pay per minute of audio processed. There are several ways to call the Fireworks API, including Fireworks' Python client, the REST API, and OpenAI's Python client.
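As a rough illustration of the OpenAI-client route, the sketch below sends one audio file for transcription. This page does not give the endpoint URL or the model identifier, so both are read from environment variables; the variable names (FIREWORKS_API_KEY, FIREWORKS_AUDIO_BASE_URL, FIREWORKS_WHISPER_MODEL) and the helper names are hypothetical, and the real values should be taken from the Fireworks documentation.

```python
import os


def load_settings(env=None):
    """Collect connection settings from the environment.

    The three variable names are assumptions made for this sketch,
    not names defined by Fireworks.
    """
    env = os.environ if env is None else env
    names = {
        "api_key": "FIREWORKS_API_KEY",
        "base_url": "FIREWORKS_AUDIO_BASE_URL",
        "model": "FIREWORKS_WHISPER_MODEL",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}


def transcribe(audio_path, settings):
    """Send one audio file for transcription and return the text."""
    # Imported lazily so load_settings() works without the package installed.
    from openai import OpenAI

    client = OpenAI(api_key=settings["api_key"], base_url=settings["base_url"])
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=settings["model"],
            file=audio_file,
        )
    return result.text


# Example call (requires the openai package and valid settings):
# print(transcribe("meeting.wav", load_settings()))
```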

Available on serverless

Run queries immediately and pay only for usage: $0.0015 per audio minute, billed per second.
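To make the per-second billing concrete, here is a minimal cost estimate based on the listed price. It assumes partial seconds are rounded up to the next whole second, which this page does not state; the function name is illustrative.

```python
import math
from decimal import Decimal

# Listed serverless price in USD per audio minute.
PRICE_PER_MINUTE = Decimal("0.0015")


def estimate_cost(duration_seconds):
    """Estimate the USD cost of transcribing one clip.

    Assumption: a partial second is billed as a full second.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    billed_seconds = math.ceil(duration_seconds)
    return PRICE_PER_MINUTE / 60 * billed_seconds
```

Under these assumptions, a one-hour recording (3,600 seconds) comes to about $0.09.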

Whisper V3 Large FAQs

What is Whisper V3 Large and who developed it?

Whisper V3 Large is a multilingual, Transformer-based automatic-speech-recognition (ASR) and speech-translation model created by OpenAI and hosted on Fireworks AI.

What applications and use cases does Whisper V3 Large excel at?

Whisper V3 Large is best suited for:

• High-accuracy speech transcription
• Zero-shot speech-to-English translation across 99 languages

What is the maximum context length for Whisper V3 Large?

The model's receptive field is 30 seconds of audio per inference window.

What is the usable context window?

Fireworks recommends chunking longer audio into 30-second segments (with optional overlap) for stable performance.
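The chunking recommendation can be sketched as a helper that computes window boundaries for a recording of known length. The function name, the default values, and the choice to let the final window be shorter than 30 seconds are assumptions of this sketch, not part of the Fireworks API.

```python
def chunk_windows(total_seconds, window=30.0, overlap=0.0):
    """Return (start, end) times in seconds that cover the recording.

    Consecutive windows share `overlap` seconds; the last window may
    be shorter than `window`.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must be in [0, window)")

    step = window - overlap
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + window, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break  # the recording is fully covered
        start += step
    return windows
```

For a 70-second clip with a 5-second overlap this yields the windows 0-30, 25-55, and 50-70; each window would then be cut from the audio and transcribed separately.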

Does Whisper V3 Large support quantized formats (4-bit/8-bit)?

Yes. Whisper V3 Large supports 16 quantized variants, including 4-bit and 8-bit.

What are known failure modes of Whisper V3 Large?

Known limitations of Whisper V3 Large include:

• Possible hallucinated text
• Uneven accuracy on low-resource languages or certain accents
• Occasional repetitive outputs

How many parameters does Whisper V3 Large have?

Whisper V3 Large has approximately 1.54 billion parameters.

What rate limits apply on the shared endpoint?

On-demand deployments of Whisper V3 Large run on dedicated GPUs with no rate limits.
