Why We Replaced Whisper 2.0 with Deepgram 2.0 and Cut Voice Transcription Costs by 45%

Original article
DEV Community

ANKUSH CHOUDHARY JOHAL
Posted on Apr 28 • Originally published at johal.in

#replaced #whisper #deepgram #voice

After processing 12 million minutes of voice transcription across 14 global regions in Q3 2024, our team cut monthly infrastructure costs by 45% by migrating from OpenAI Whisper 2.0 to Deepgram 2.0, with a 12% improvement in WER (Word Error Rate) and 60% lower p99 latency.

Key Insights

- Deepgram 2.0 delivered a 12% lower WER than Whisper 2.0 on our internal 4-language test suite (English, Spanish, Mandarin, Arabic).
- Migration required zero changes to our existing S3-based audio ingestion pipeline, using the Deepgram Python SDK v2.4.1.
- Monthly transcription spend dropped from $42k to $23.1k, a 45% reduction, with no increase in support tickets.
- By 2025, 70% of mid-market voice-first apps will migrate from self-hosted Whisper to managed ASR providers like Deepgram to reduce ops overhead.

Why We Migrated Away From Whisper 2.0

We adopted OpenAI Whisper 2.0 in Q1 2023 when it launched, replacing our previous Google Cloud Speech-to-Text integration. At the time, Whisper's open-source license (MIT), support for 90+ languages, and industry-leading Word Error Rate (WER) made it the obvious choice for our global voice-first task management app. Our initial volume was 200,000 minutes of transcription monthly, which Whisper handled easily on two g4dn.xlarge EC2 instances (each with 1 NVIDIA T4 GPU) at a cost of ~$7k/month.

As our user base grew to 1.2 million monthly active users by Q2 2024, our transcription volume spiked to 1.2 million minutes monthly. This is where Whisper's limitations became impossible to ignore. First, latency: p99 latency for 1-minute audio files grew from 4.2s at 200k minutes to 11.2s at 1.2M minutes, even after scaling to 8 EC2 instances. Second, maintenance overhead: we needed a dedicated senior engineer spending 10% of their time patching Whisper, updating models, and handling instance failures. Third, cost: our monthly spend grew to $42k, as we had to add 6 additional EC2 instances to handle peak traffic, most of which sat idle 60% of the time. Finally, accuracy: we saw a 30% quarter-over-quarter increase in support tickets related to incorrect transcriptions, particularly for accented English and Spanish speakers.

We evaluated three options: scale Whisper further, migrate to another open-source ASR model, or move to a managed ASR provider. Scaling Whisper would have required 12+ EC2 instances, pushing monthly costs to ~$65k. Other open-source models like FasterWhisper had similar latency and maintenance issues. Managed providers offered linear pricing, no ops overhead, and guaranteed SLAs. We narrowed our evaluation to Deepgram 2.0 and…
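The cost figures quoted in this excerpt can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the dollar amounts and the 1.2 million minutes per month come from the text, while the function names are hypothetical and the assumption that the post-migration spend of $23.1k covers the same monthly volume is ours, not the author's. The excerpt does not state Deepgram's actual per-minute pricing, so the per-minute values are blended effective rates, not list prices.

```python
# Figures quoted in the article excerpt.
WHISPER_MONTHLY_USD = 42_000    # self-hosted Whisper spend at 1.2M minutes/month
DEEPGRAM_MONTHLY_USD = 23_100   # monthly spend reported after the migration
SCALED_WHISPER_USD = 65_000     # projected spend for scaling Whisper to 12+ instances

# Assumption (not stated in the excerpt): volume is unchanged after migration.
MONTHLY_MINUTES = 1_200_000


def percent_reduction(before: float, after: float) -> float:
    """Relative saving, expressed as a percentage of the original spend."""
    return (before - after) / before * 100


def cost_per_minute(monthly_usd: float, minutes: int) -> float:
    """Blended effective cost of one transcribed minute."""
    return monthly_usd / minutes


reduction = percent_reduction(WHISPER_MONTHLY_USD, DEEPGRAM_MONTHLY_USD)
whisper_rate = cost_per_minute(WHISPER_MONTHLY_USD, MONTHLY_MINUTES)
deepgram_rate = cost_per_minute(DEEPGRAM_MONTHLY_USD, MONTHLY_MINUTES)
avoided_vs_scaling = SCALED_WHISPER_USD - DEEPGRAM_MONTHLY_USD

print(f"Reduction vs. current Whisper spend: {reduction:.1f}%")
print(f"Effective Whisper rate:  ${whisper_rate:.5f}/min")
print(f"Effective Deepgram rate: ${deepgram_rate:.5f}/min")
print(f"Monthly gap vs. scaled Whisper:  ${avoided_vs_scaling:,}")
```

Under these assumptions the headline 45% figure follows directly from the two monthly totals; the comparison against the ~$65k scaling scenario is a separate, larger gap that the excerpt does not quote as a percentage.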

This excerpt is published under fair use for community discussion. Read the full article at DEV Community.
