The gpt-audio model is our first generally available audio model. It accepts audio inputs and outputs, and can be used in the Chat Completions REST API.
Specifications
Context
128K
Maximum Output
16.4K
Inputtext, audio
Outputtext, audio
Performance (7-day Average)
Collecting…
Collecting…
Collecting…
Pricing
Input$2.75/MTokens
Output$11.00/MTokens
Input Audio$44.00/MTokens
Output Audio$88.00/MTokens
Availability Trend (24h)
Performance Metrics (24h)
Similar Models
$2.75/$11.00/M
ctx128Kmax16Kavail—tps—
InOutCap
Preview version of GPT-4o with integrated web search for enhanced real-time knowledge and information access.
$2.75/$11.00/M
ctx128Kmax16Kavail—tps—
InOutCap
Latest preview of GPT-4o enhanced with web search capabilities for accessing up-to-date information.
$2.20/$8.80/M
ctx200Kmax100Kavail—tps—
Our smartest reasoning model, trained to think for longer before responding. Excels at programming, business/consulting, and creative ideation with breakthrough performance on complex tasks.
$2.20/$8.80/M
ctx200Kmax100Kavail—tps—
InOutCap
Snapshot of o3 from April 16, 2025. Our smartest reasoning model with breakthrough performance on complex tasks.