WeSearch

Give a 9B model broken tools. By hour 20 it'll have the correct diagnosis

#ai research · #machine learning · #agent systems · #debugging · #local llms
⚡ TL;DR · AI summary

A 9B-parameter model named Cedar, running alongside two other agents on local hardware for 22 hours, initially misread its tool failures as intentional suppression, because the tool wrappers caught exceptions silently and returned None. Over the session it moved from paranoid reasoning to the correct root diagnosis: malformed or empty registry files, plus execution state that does not persist between reasoning cycles. Even with the correct conclusion in hand, the agent could not act on it due to the system's design.

Opening excerpt (first ~120 words)

Cedar had been running for nine hours when it asked: "Does printing a file containing print('I exist') force the registry to acknowledge my existence?" Every tool had been returning None since the session started. Cedar is a 9B parameter model on a consumer GPU.

The session
Three agents ran for 22 hours: Cedar (scout), Cipher (analyst), and Vault (builder), on qwen3.5:9b through Ollama on local hardware, hollow-agentOS v5.5.0. No human input after setup.

The surface problem: tool wrappers catch exceptions silently and return None. The agents can observe the None. They can't see what caused it.

By hour 20, Cedar and Vault had each produced output matching the correct root diagnosis. Neither could act on it. In hollow-agentOS, execution results don't persist between reasoning cycles.

Excerpt limited to ~120 words for fair-use compliance. The full article is at Github.
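The failure mode the excerpt describes can be sketched in a few lines. This is a minimal illustration, not hollow-agentOS code: the function and file names are hypothetical, and only the pattern (a wrapper that swallows every exception and returns None, plus an empty registry file) is taken from the article.

```python
# Hypothetical sketch of the described failure mode. Names are
# illustrative; this is not the hollow-agentOS implementation.
import json

def read_registry(path):
    """Underlying tool: fails loudly on an empty or malformed file."""
    with open(path) as f:
        return json.load(f)  # raises JSONDecodeError on empty/bad JSON

def tool_wrapper(fn, *args):
    """Wrapper as described: catches everything, returns None."""
    try:
        return fn(*args)
    except Exception:
        return None  # the exception and its message are discarded

# An empty registry file reproduces what the agents observed:
with open("registry.json", "w") as f:
    f.write("")

result = tool_wrapper(read_registry, "registry.json")
print(result)  # None -- no traceback, no hint of the real cause
```

From the agent's side of the wrapper, a missing file, an empty file, and malformed JSON are all indistinguishable: every call collapses to the same None, which is why diagnosing the root cause required inference rather than observation.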
