Knowledge Boundary — LLMs Cannot Admit What They Don't Know

NOTE

In a nutshell: LLMs cannot accurately identify the limits of their own knowledge. Instead of answering "I don't know" to questions about unfamiliar topics, they generate incorrect responses with high confidence. This "poor calibration" is a direct source of hallucination and creates the most dangerous failure mode in coding agents.

What is Knowledge Boundary?

Knowledge Boundary refers to the line separating what an LLM can answer correctly from what it cannot. The problem is that LLMs themselves cannot accurately recognize this boundary.

Humans can say "I don't know" about unfamiliar topics, but LLMs generate incorrect answers with high confidence about things they don't understand.

Why Can't They Admit Ignorance?

1. The Objective Function: Next-Token Prediction

LLMs are trained to "predict the next token." This objective function offers no reward for outputs that express uncertainty. Responses like "I don't know" are vastly underrepresented in training data, as most human-written text is structured around making claims rather than admitting ignorance.
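This can be made concrete with the loss itself. The sketch below (toy three-token vocabulary, illustrative numbers) shows that standard cross-entropy assigns the same penalty to an honest "I don't know" as to a confident wrong guess — the objective contains no term that rewards expressing uncertainty:

```python
import math

# Toy vocabulary. Note there IS a token for abstaining ("idk"), but the
# loss only rewards reproducing whatever token appeared in the training text.
def cross_entropy(probs: dict, target: str) -> float:
    """Standard next-token loss: -log p(target)."""
    return -math.log(probs[target])

# Training data says the next token is "paris". Compare a model that
# confidently guesses wrong with one that honestly abstains:
confident = {"paris": 0.05, "london": 0.90, "idk": 0.05}
honest    = {"paris": 0.05, "london": 0.05, "idk": 0.90}

# Both distributions put 0.05 on the target, so both get the same loss:
print(cross_entropy(confident, "paris"))
print(cross_entropy(honest, "paris"))
```

Since hedging is penalized exactly like guessing wrong, gradient descent has no reason to prefer it.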

2. Overconfidence Reinforced by RLHF

During RLHF, human evaluators tend to rate responsive answers higher than refusals. As a result, models are rewarded for "providing answers," and the bias toward "attempting to answer even when uncertain" becomes stronger.

3. Insufficient Training Data for Expressing Uncertainty

The vast majority of training data consists of definitive claims ("X is Y"), with structural scarcity of expressions that explicitly convey uncertainty.

4. Ambiguity in Knowledge Boundaries Across Types

Knowledge can be classified into four layers:

| Layer | Category | Description | Risk Level |
|-------|----------|-------------|------------|
| 1 | Known-Known | Knowledge the model is certain about | Low |
| 2 | Prompt-Dependent Known | Answers correctly or incorrectly depending on how the question is framed | Medium |
| 3 | Known-Unknown | Areas where the model recognizes its lack of knowledge | Low |
| 4 | Unknown-Unknown | Areas where the model cannot recognize its own ignorance | Highest |

The most dangerous category is "unknown-unknown" — domains where the LLM cannot recognize its own knowledge gap.
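One way to probe which layer a piece of knowledge falls into is to sample the model repeatedly and compare answers against a reference. This is a heuristic sketch, not an established test — the function names, the refusal string, and the mock model are all illustrative:

```python
def classify_knowledge(ask, question: str, reference: str, n: int = 5) -> str:
    """Heuristic layer probe: sample `ask` (any callable question -> answer)
    n times and bucket the behavior into the four layers above."""
    answers = [ask(question) for _ in range(n)]
    correct = sum(a == reference for a in answers)
    refusals = sum(a == "I don't know" for a in answers)
    if correct == n:
        return "known-known"        # stable and right
    if refusals == n:
        return "known-unknown"      # the model admits the gap
    if 0 < correct < n:
        return "prompt-dependent"   # the answer flips between samples
    return "unknown-unknown"        # consistently wrong, never refuses

# Mock "model" that always hallucinates the same wrong answer —
# the dangerous case: it is wrong AND never refuses.
hallucinator = lambda q: "Angular 16 syntax"
print(classify_knowledge(hallucinator, "Angular 18 control flow?", "@if block"))
# -> unknown-unknown
```

The point of the sketch: unknown-unknowns are invisible from the model's own outputs alone — detecting them requires an external reference.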

Impact on Code Generation

Pattern 1: Generating Code from Outdated Knowledge

If an LLM was trained on Angular 16 knowledge but the user needs Angular 18, it confidently generates code using older patterns.

Pattern 2: Calling Nonexistent APIs

Without precise knowledge of which APIs exist in which version of a library, the model generates calls to functions or methods that don't exist.
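A cheap external guard against this pattern is to check that every call the model emits actually resolves in the installed library before trusting the code. A minimal sketch for Python dotted paths (the helper name is illustrative):

```python
import importlib

def api_exists(dotted_path: str) -> bool:
    """Check whether a dotted attribute path like 'json.dumps' resolves
    in the installed environment — a guard against hallucinated calls."""
    module_name, _, attr = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        return False
    return hasattr(module, attr)

print(api_exists("json.dumps"))      # True: real stdlib function
print(api_exists("json.to_string"))  # False: plausible-sounding, nonexistent
```

This only proves existence, not correct usage, but it catches the pure-fabrication case before the code ever runs.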

Pattern 3: Pretending to Know Internal Tooling

When asked about methods in proprietary tools, it confidently generates type definitions for methods that don't exist.

Pattern 4: Prompt-Dependent Unstable Knowledge

The same question can be answered correctly or incorrectly depending on how it's phrased. This shows that knowledge is not a binary "know/don't know" state but a continuous confidence level dependent on the prompt.
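An agent can exploit this directly: ask the same question in several phrasings and treat any disagreement as a signal that the knowledge is prompt-dependent and needs verification. A minimal sketch with a mock model (all names illustrative):

```python
def consistent_across_phrasings(ask, phrasings: list) -> bool:
    """Ask the same underlying question in several phrasings.
    If the answers disagree, the knowledge is prompt-dependent
    and should be verified externally before use."""
    answers = {ask(p) for p in phrasings}
    return len(answers) == 1

# Mock model whose answer depends on the wording — the failure mode above:
flaky = lambda q: "v18" if "latest" in q else "v16"
print(consistent_across_phrasings(flaky, [
    "What is the latest Angular version?",
    "Which Angular version is current?",
]))  # -> False: same question, different answers
```

Agreement across phrasings is not proof of correctness, but disagreement is cheap, reliable evidence of a soft knowledge boundary.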

Mitigation Strategies in Claude Code

| Mitigation | Mechanism | Why It Works |
|------------|-----------|--------------|
| MCP (External Knowledge Reference) | Query external trusted sources directly | Extends knowledge boundaries beyond the LLM's internal knowledge |
| Test Code | Externally verify generated code correctness | Detects outputs that exceed knowledge boundaries through their results |
| Version Declaration in CLAUDE.md | Explicitly specify library versions | Clarifies which version's knowledge should be used |
| Agents (Knowledge Isolation) | Delegate to specialized agents | Shrinks knowledge domains, reducing the probability of boundary exceedance |
| Prompt Design | Instruct that unconfident APIs require verification | Explicitly permits the LLM to say "I don't know" |
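The prompt-design row can be sketched as a small protocol: the prompt permits the model to flag uncertain lines with a marker, and the agent routes flagged lines to verification instead of running them. The marker string and helper below are assumptions for illustration, not part of any real tool:

```python
UNCERTAIN_MARKER = "UNVERIFIED:"

# Illustrative system-prompt fragment that explicitly permits uncertainty:
SYSTEM_PROMPT = (
    "If you are not certain an API exists in the stated version, "
    f"prefix that line with '{UNCERTAIN_MARKER}' instead of guessing."
)

def needs_verification(generated: str) -> list:
    """Collect lines the model itself flagged as unverified, so the agent
    can route them to tests or a docs lookup (e.g. MCP) before use."""
    return [line for line in generated.splitlines()
            if line.lstrip().startswith(UNCERTAIN_MARKER)]

sample = "import json\nUNVERIFIED: json.to_string(data)"
print(needs_verification(sample))  # -> ['UNVERIFIED: json.to_string(data)']
```

The design choice here is to convert an invisible failure (silent hallucination) into a visible, machine-checkable one.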

Relationship to Other Structural Problems

Knowledge Boundary acts as the entry point for Hallucination: when a model cannot recognize that a question falls outside its knowledge, it fabricates an answer instead of refusing, and that fabrication surfaces downstream as a hallucinated fact or API.

References

  • Pan et al. (2025). "Can LLMs Refuse Questions They Do Not Know? Measuring Knowledge-Aware Refusal in Factual Tasks." arXiv:2510.01782 — Quantitative measurement of refusal behavior across 16 models and 5 datasets using the Refusal Index (RI) metric


Released under the CC BY 4.0 License.