
How I use an LLM as a translation judge

#translation #ai #llm #quality #automation
⚡ TL;DR · AI summary

The article describes using GEMBA-MQM v2 to evaluate translation quality in a live speech-to-speech translation pipeline. An LLM automates the annotation step and returns a structured error breakdown similar to what a human reviewer would produce. Because LLM scores vary from run to run, the article suggests running multiple passes to get more reliable results.
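
The multi-pass idea can be sketched roughly as follows. This is a minimal illustration, not the author's code: `judge` is a hypothetical callable standing in for whatever LLM call produces a numeric score, and summarizing with the median is an assumption (the summary only says that multiple passes give more reliable results).

```python
from statistics import median, pstdev
from typing import Callable, Sequence


def aggregate_passes(scores: Sequence[float]) -> dict:
    """Summarize scores from repeated judge passes over the same segment."""
    if not scores:
        raise ValueError("need at least one score")
    return {
        "median": median(scores),             # robust to one outlier pass
        "spread": max(scores) - min(scores),  # how much the judge disagrees with itself
        "stdev": pstdev(scores),
        "passes": len(scores),
    }


def judge_repeatedly(
    judge: Callable[[str, str], float],  # hypothetical: (source, translation) -> score
    source: str,
    translation: str,
    passes: int = 5,
) -> dict:
    """Call the judge several times on one segment and aggregate the scores."""
    scores = [judge(source, translation) for _ in range(passes)]
    return aggregate_passes(scores)
```

A large `spread` on a segment is a hint that a single pass would not have been trustworthy for it.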

Original article: DEV.to (Top)

Opening excerpt (first ~120 words)

Yahya Saleh · Posted on May 22 · Originally published at voicefrom.ai · #opensource #ai #llm

I use GEMBA-MQM v2 to evaluate translation quality in my live speech-to-speech translation pipeline. MQM (Multidimensional Quality Metrics) is an open industry standard for grading translations. Instead of a single score, it classifies every error by type (mistranslation, omission, hallucination, grammar, etc.) and severity (critical, major, minor).
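
To make the type-and-severity classification concrete, here is a small sketch of how such an error list could be represented and tallied. The class name, field names, and severity weights are assumptions for illustration only; the excerpt does not give the weights or the output format that GEMBA-MQM v2 actually uses.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable

# Assumed weights for illustration; the article excerpt does not state any.
SEVERITY_WEIGHTS = {"critical": 25, "major": 5, "minor": 1}


@dataclass(frozen=True)
class MqmError:
    category: str   # e.g. "mistranslation", "omission", "hallucination", "grammar"
    severity: str   # "critical", "major", or "minor"
    span: str = ""  # offending text, if the judge reported one


def penalty(errors: Iterable[MqmError]) -> int:
    """Sum the severity weights; 0 means no errors were reported."""
    return sum(SEVERITY_WEIGHTS[e.severity] for e in errors)


def breakdown(errors: Iterable[MqmError]) -> Counter:
    """Count errors per (category, severity) pair."""
    return Counter((e.category, e.severity) for e in errors)


errors = [
    MqmError("omission", "major", span="last Tuesday"),
    MqmError("grammar", "minor"),
    MqmError("mistranslation", "critical", span="do not"),
]
print(penalty(errors))    # total weighted penalty for this segment
print(breakdown(errors))  # per-type, per-severity counts
```

Keeping the individual errors rather than only the total is what lets a reviewer see why a segment scored badly, which is the point of MQM over a single score.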

Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to (Top).
