Real-time AI models, a price comparison

I was updating my pricing information for real-time AI models (low-latency speech-to-speech interaction with the model, without intermediate STT or TTS conversion) and comparing them so I have up-to-date data. Sample: a 1-hour call. Created with Grok 4 Fast. Here it is:

| Model | Economic Version | Cost, 1 Hour (USD) | Notes |
| --- | --- | --- | --- |
| Google Gemini | 2.5 Flash Live | ~0.68 | Based on 45k audio tokens in/out (25/sec); $3/1M input audio, $12/1M output audio. Source: https://cloud.google.com/vertex-ai/generative-ai/pricing |
| OpenAI Realtime | gpt-realtime-mini | ~1.35 | Based on 45k audio tokens in/out (25/sec); $10/1M input audio, $20/1M output audio. Source: https://openai.com/api/pricing/ |
| Microsoft Azure Speech | Standard Real-time | ~1.92 | STT $1.20/h + TTS ~$0.72/h. Source: https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/ |
| Hume.ai EVI | Starter | ~4.40 | $3 for 40 min + $0.07/min additional. Source: https://www.hume.ai/pricing |
| ElevenLabs | Starter Agents | ~6.00 | $5 for 50 min, equivalent $0.10/min. Source: https://elevenlabs.io/pricing |
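
For anyone who wants to re-check the numbers, here is a minimal Python sketch of the arithmetic behind the table. It uses only the rates quoted in the notes. The one assumption I am adding is how the 45k figure is reached: 25 audio tokens/sec over 3,600 seconds gives 90k tokens, split evenly into 45k input and 45k output. The function names (`token_cost`, `plan_cost`) are just illustrative helpers, not part of any provider's API.

```python
# Rough hourly cost check using only the rates quoted in the table.
SECONDS_PER_HOUR = 3600
TOKENS_PER_SECOND = 25  # audio token rate quoted in the notes

# Assumption: the hour is split evenly between input and output audio,
# which yields the 45k tokens each way used in the table.
TOKENS_EACH_WAY = TOKENS_PER_SECOND * SECONDS_PER_HOUR // 2


def token_cost(input_per_million, output_per_million, tokens=TOKENS_EACH_WAY):
    """USD cost for `tokens` audio tokens in plus the same number out."""
    return tokens * (input_per_million + output_per_million) / 1_000_000


def plan_cost(base_price, included_minutes, overage_per_minute, minutes=60):
    """USD cost of a plan with included minutes plus per-minute overage."""
    extra_minutes = max(0, minutes - included_minutes)
    return base_price + extra_minutes * overage_per_minute


hourly_costs = {
    "Google Gemini 2.5 Flash Live": token_cost(3, 12),
    "OpenAI gpt-realtime-mini": token_cost(10, 20),
    "Microsoft Azure Speech": 1.20 + 0.72,  # STT + TTS hourly rates
    "Hume.ai EVI Starter": plan_cost(3, 40, 0.07),
    "ElevenLabs Starter Agents": plan_cost(5, 50, 0.10),
}

# Print from cheapest to most expensive.
for name, cost in sorted(hourly_costs.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.2f}")
```

If your calls are not split 50/50 between listening and speaking, pass a different `tokens` value (or compute input and output separately); since output audio is priced higher than input for both token-based models, a talkative agent will cost more than the table suggests.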


As of today, October 11, 2025, Google Gemini 2.5 Flash Live could be the most affordable option for real-time agents. Good to know!

Greetings!