Used a Claude Code skill to fine-tune Qwen3-1.7B from 327 noisy traces, matches GLM-5


I had 327 production traces from a restaurant-reservation agent that I wanted to retrain. The plan was to fine-tune a smaller, self-hostable model so I could ditch the frontier-API bill. The traces were a mess. Trace 1 alone: "Hello" answered with "Have an enjoyable rest of the day!"; "Breakfast in Fairfield" answered with "they do not serve alcohol"; one assistant message about checking into a room at 45 Park Lane in London (this is supposed to be a restaurant agent); FindRestaurants(Pleasanton, Italian)

Original article
Reddit