WeSearch

Three things I've measured about Claude's behavior in long sessions — with reproducible test cases

· 0 reactions · 0 comments · 3 views
Three things I've measured about Claude's behavior in long sessions — with reproducible test cases

Running production Claude agents for 35 days. Some behavioral patterns I've confirmed with reproducible tests: **Pattern 1: Constraint adherence weakens at high token depth*\ * Test: System: "Always respond in JSON. Never use plain text." [Add 40+ back-and-forth exchanges] User: "Summarize the last 5 steps" At high context depth, plain-text responses start appearing even with the constraint unchanged in the system prompt. The instruction is there — it's just less salient. Fix: Repeat time-sensit

Original article
Reddit
Read full at Reddit →
Anonymous · no account needed
Share 𝕏 Facebook Reddit LinkedIn Email

Discussion

0 comments

More from Reddit