I Added Three Rules to Gemma 4. The MoE Searched. The Dense Model Refused.
The author tested Gemma 4's 26B mixture-of-experts (MoE) and 31B dense models against GPT-4o-mini and GPT-4o in an Arabic e-commerce chatbot setting, focusing on customer-facing replies. Both Gemma models initially tended to decline rather than hallucinate. After three prompt rules were added, the MoE model improved, while the dense model produced more false refusals. The results suggest that architectural differences significantly affect behavior, even under identical instructions.
- The test involved six Arabic customer scenarios run through a production chat router, with only the reply-generation model varied.
- Gemma 4's mixture-of-experts model adapted better to new prompt rules, providing more grounded responses, while the dense model increasingly refused valid queries.
- Latency and reasoning differences could not be fully isolated due to lack of control over Gemma 4's internal thinking process in the API.
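
The comparison described above (same scenarios, same prompts, only the reply model swapped) can be sketched as a small test matrix. This is a minimal illustration, not the author's harness: `Scenario`, `run_matrix`, the `[REFUSE]` marker, and the two stub models are all hypothetical names introduced here, and the stubs are placeholders that do not reproduce any result from the article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: (system_prompt, customer_message) -> reply text.
ReplyFn = Callable[[str, str], str]

# Hypothetical tag a model emits when it declines to answer.
REFUSAL_MARKER = "[REFUSE]"


@dataclass
class Scenario:
    name: str
    message: str
    answerable: bool  # True if store data can ground an answer


def run_matrix(
    models: Dict[str, ReplyFn],
    prompts: Dict[str, str],
    scenarios: List[Scenario],
) -> Dict[Tuple[str, str], int]:
    """Count false refusals (refusing an answerable query) per model and prompt."""
    results: Dict[Tuple[str, str], int] = {}
    for model_name, reply in models.items():
        for prompt_name, prompt in prompts.items():
            results[(model_name, prompt_name)] = sum(
                1
                for s in scenarios
                if s.answerable and REFUSAL_MARKER in reply(prompt, s.message)
            )
    return results


# Placeholder models so the sketch runs without any API access.
def always_answer(prompt: str, message: str) -> str:
    return "Here is what I found."


def always_refuse(prompt: str, message: str) -> str:
    return REFUSAL_MARKER
```

In a real run, each entry in `models` would wrap a call to the actual reply-generation endpoint, and `prompts` would hold the baseline prompt and the version with the three added rules, so that any change in the false-refusal count can be attributed to the model rather than the pipeline.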
Opening excerpt (first ~120 words)
Ali Afana. Posted on May 16. #ai #llm #opensource #gemmachallenge Gemma 4 Challenge: Write about Gemma 4 Submission
TL;DR: I run an AI sales chatbot for Arabic-speaking merchants. I wanted to know if Gemma 4 could replace GPT-4o-mini on the customer-facing reply.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to.