
Instruction Decay — Forgetting Rules in Long Conversations

NOTE

In brief: LLMs' adherence to initial instructions degrades throughout long conversations. Performance in multi-turn conversations drops by an average of 39%. This results from the seven preceding structural problems compounding over time.

What Is Instruction Decay?

Instruction Decay is the phenomenon where LLMs gradually lose adherence to initial instructions as conversations become longer. According to 2025 research from Microsoft and Salesforce, LLM performance in multi-turn conversations drops by an average of 39%.

The Nature of Degradation

Importantly, this manifests as reliability collapse, not capability loss. The model doesn't become incapable; it enters a state where performance is highly variable, succeeding at some moments and failing at others.

Difficulty of Recovery

A critical finding: once an LLM drifts in a conversation, recovery becomes nearly impossible. Flawed assumptions accumulate and continuously degrade the quality of subsequent responses.

Compound Causes

Instruction Decay is not an isolated phenomenon but the result of the seven preceding structural problems compounding along the time axis:

| Problem | Impact Over Time |
| --- | --- |
| Context Rot | Longer conversations increase context volume, degrading quality |
| Lost in the Middle | Initial instructions get pushed into the middle of the context and ignored |
| Priority Saturation | New instructions added during the conversation lower the priority of initial instructions |
| Hallucination | Erroneous outputs accumulate, degrading the foundation for subsequent reasoning |
| Sycophancy | Continuous agreement with user direction makes course correction difficult |
| Knowledge Boundary | Answers beyond knowledge limits accumulate |
| Prompt Sensitivity | Conversational drift shifts the prompt context away from the original intent |
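A toy model can illustrate why compounding matters: even a small per-turn loss of adherence, multiplied across turns, erodes most of the initial instruction's influence. The retention value below is a made-up assumption for illustration, not a measured figure from the cited research.

```python
# Illustrative only: models instruction adherence as a product of
# per-turn retention factors. The 0.93 value (7% loss per turn) is
# an assumed number chosen to show compounding, not empirical data.

def adherence_after(turns: int, retention_per_turn: float = 0.93) -> float:
    """Fraction of initial instruction adherence remaining after `turns` turns."""
    return retention_per_turn ** turns

# Under this assumption, adherence falls below half by around turn 10,
# even though no single turn loses more than 7%.
```

The point of the sketch is that no individual problem needs to be severe; multiplicative accumulation across turns is enough to produce the observed degradation.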

Impact on Coding

  • Architectural decisions made early in session are ignored later
  • Test approaches (TDD, coverage targets) gradually get omitted
  • Adherence to coding conventions (naming rules, error handling patterns) declines
  • Git commits grow coarser (larger and less focused) as sessions progress

Mitigation in Claude Code

| Strategy | Mechanism | Why It Works |
| --- | --- | --- |
| `/compact` (preventive compression) | Compress conversation history before 50% usage | Prevents accumulation of Context Rot and Lost in the Middle |
| `/clear` (session segmentation) | Reset the session | Resets all accumulated degradation |
| Hooks | Mechanical validation outside the context | Independent of LLM instruction adherence |
| Agents | Execute in independent contexts | Tasks run with fresh context |
| Small, frequent Git commits | Commit changes frequently | Simplifies rollback of degraded outputs |
| Session log via Stop hook | Record a log at session end | Ensures handoff to the next session |
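The preventive-compaction row above can be sketched as a simple threshold check. This is not Claude Code's internal logic; `estimate_tokens` is a crude stand-in (roughly 4 characters per token), and the 200k window size is an assumption — a real tool should use a proper tokenizer and the actual model limit.

```python
# Sketch, assuming a 200k-token window and a ~4 chars/token heuristic:
# decide when to compact a conversation *before* usage crosses 50%,
# rather than waiting for the context to fill up.

CONTEXT_LIMIT = 200_000   # assumed window size, not an official figure
COMPACT_AT = 0.5          # compact before 50% usage

def estimate_tokens(messages: list[str]) -> int:
    """Rough token estimate: ~4 characters per token."""
    return sum(len(m) // 4 for m in messages)

def should_compact(messages: list[str]) -> bool:
    """True once estimated usage reaches the preventive threshold."""
    return estimate_tokens(messages) >= CONTEXT_LIMIT * COMPACT_AT
```

The design choice mirrored here is that compaction is preventive: it fires well before the hard limit, so the problems in the table above never get the chance to accumulate.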

Session Design Principles

Principle 1: Keep sessions short
  → 1 session = 1 task (or a set of related small tasks)

Principle 2: Place validation outside context
  → Hooks, tests, CI/CD do not depend on LLM instruction adherence

Principle 3: Persist state externally
  → Carry over to next session via Git commits, CLAUDE.md, memory tools
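Principle 3 can be sketched as a Stop-hook handler. Claude Code hooks receive a JSON payload on stdin; the field name `session_id` and the log filename below are assumptions for illustration — consult the hooks documentation for the actual payload shape.

```python
# Hypothetical Stop-hook handler: appends a one-line JSON record at
# session end so the next session can pick up where this one left off.
# The "session_id" field and log path are assumptions, not a spec.

import json
import sys
from datetime import datetime, timezone

def record_session(payload: dict, log_path: str = "session-log.jsonl") -> dict:
    """Append a session-end entry to a JSONL log and return it."""
    entry = {
        "ended_at": datetime.now(timezone.utc).isoformat(),
        "session_id": payload.get("session_id", "unknown"),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

if __name__ == "__main__":
    record_session(json.load(sys.stdin))
```

Because the log lives on disk rather than in the context window, it survives `/clear` and session resets — exactly the property Principle 3 calls for.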

Relationship to Other Structural Problems

Instruction Decay is the temporal consolidation of all seven preceding problems: each contributes a small, per-turn degradation, and their accumulation over the course of a conversation produces the overall drift described above.

References

  • Laban, P., Hayashi, H., Zhou, Y., & Neville, J. (2025). "LLMs Get Lost In Multi-Turn Conversation." Microsoft Research & Salesforce Research. arXiv:2505.06120 — Validation across 200,000+ simulated conversations. Measured average 39% performance degradation and 112% increase in instability.


Released under the CC BY 4.0 License.