MCP/A2A/Skill/Agent Architecture

This document explains the components of AI-driven development infrastructure and organizes their roles and relationships.

Positioning of This Document

Where the Vision (01-vision) defines "why" and Authoritative Reference Sources (02-reference-sources) defines "what," this document defines "how to structure it." It bridges the gap from philosophical foundations to actionable system design — structuring without destroying the underlying philosophy.

Meta Information


What this chapter establishes	The three-layer separation of Agent / Skills / MCP and the responsibility boundaries of each layer
What this chapter does NOT cover	Pattern selection criteria (→04), constraint mitigation (→05), Doctrine Layer details (→07)
Dependencies	01-vision (design philosophy), 02-reference-sources (reference source framework)
Common misuse	Confusing the three layers with physical deployment topology. The three layers represent separation of responsibilities, not deployment configuration

Layer Structure Overview

The architecture separates responsibilities into three layers (Agent, Skills, and MCP), all governed by a Doctrine Layer, as shown in the following diagram:

Doctrine Layer — "Constitutional Judgment Criteria"

These three layers define what the AI knows and what it can do. The basis on which the AI judges and decides is addressed by the Doctrine Layer. The Doctrine Layer is not merely a system prompt: it is a constitution that governs all three layers through shared objectives, constraints, and judgment criteria.

The Responsibility Shift Model defined in the Vision (design-time: human / execution-time: agent / structural constraints: system) and the two-layer verification structure (guardrails + evaluation pipelines) are applied to each layer through this Doctrine Layer.

Layer Responsibilities

Each layer has distinct ownership and responsibility areas:

Layer	Responsibility	Owns	Examples
Agent	Orchestration, decision-making	Task flow	Claude Code, Cursor
Skills	Domain knowledge, guidelines	Best practices	SOLID principles, translation guidelines
MCP	External connectivity	Tool definitions	deepl-mcp, rfcxml-mcp

About This Document

AI-driven development involves multiple components, and correctly understanding their roles and relationships is key to efficient development. This document organizes four main concepts: MCP (tool connectivity), A2A (agent-to-agent communication), Skill (static knowledge), and Custom Sub-agents (role specialization).

When you are unsure about "What should be implemented as MCP?", "When is a Skill sufficient?", or "When should I use a sub-agent?", refer to this document to make the appropriate choice.

Overall Architecture

The following diagram shows how all components interact within the complete system architecture:

MCP Three-Layer Structure

Host / Client / Server

MCP is built on a three-layer architecture, where communication flows through the client layer:

Layer	Role	Example	Developer Involvement
Host	UI, session management	Claude Code, Cursor, VS Code	Consumer
Client	Protocol processing, server management	Built into Host	Usually not concerned
Server	Tool/resource provision	rfcxml-mcp, deepl-mcp	Provider
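The division of labor in the table can be illustrated with a small in-process model. This is only a sketch: real MCP clients and servers exchange protocol messages over a transport rather than calling each other directly, and the class names `ToyServer`, `ToyClient`, and `ToyHost` are hypothetical, not part of any MCP SDK.

```python
from typing import Callable


class ToyServer:
    """Provider side: owns tool definitions and their implementations."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, tool_name: str, fn: Callable[..., str]) -> None:
        self._tools[tool_name] = fn

    def list_tools(self) -> list[str]:
        return sorted(self._tools)

    def call(self, tool_name: str, **kwargs: str) -> str:
        return self._tools[tool_name](**kwargs)


class ToyClient:
    """Protocol side: one client per server, hidden inside the host."""

    def __init__(self, server: ToyServer) -> None:
        self._server = server

    def qualified_tools(self) -> list[str]:
        return [f"{self._server.name}:{t}" for t in self._server.list_tools()]

    def invoke(self, tool_name: str, **kwargs: str) -> str:
        return self._server.call(tool_name, **kwargs)


class ToyHost:
    """Consumer side: manages the session and creates clients internally."""

    def __init__(self) -> None:
        self._clients: dict[str, ToyClient] = {}

    def add_server(self, server: ToyServer) -> None:
        # The developer only hands over a server; the client is created here.
        self._clients[server.name] = ToyClient(server)

    def available_tools(self) -> list[str]:
        return [t for c in self._clients.values() for t in c.qualified_tools()]

    def use(self, qualified_name: str, **kwargs: str) -> str:
        server_name, tool_name = qualified_name.split(":", 1)
        return self._clients[server_name].invoke(tool_name, **kwargs)


# Stub tool body for illustration; a real server would look up the RFC.
rfcxml = ToyServer("rfcxml")
rfcxml.register("get_requirements", lambda rfc: f"requirements of RFC {rfc}")

host = ToyHost()
host.add_server(rfcxml)
print(host.available_tools())
print(host.use("rfcxml:get_requirements", rfc="6455"))
```

Note that the code using `ToyHost` never touches `ToyClient` directly, which mirrors the "Usually not concerned" entry in the table.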

Why You Don't Need to Worry About the Client

For most developers, the client layer operates transparently as part of the host:

Typical development flow:
1. Create an MCP Server (e.g., rfcxml)
2. Add it to Claude Code configuration
3. Claude Code operates as a built-in Client
4. Tools become available

→ The Client is embedded in the Host and
   functions as a black box
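Step 2 of the flow above usually amounts to a few lines of configuration. The following is a minimal sketch that renders such an entry, assuming the host reads a JSON document with a top-level `mcpServers` key (as a project-level `.mcp.json` for Claude Code does). The launch command and the package name `example-rfcxml-mcp` are placeholders, not the real distribution name of any server.

```python
import json


def build_mcp_config(servers: dict[str, dict]) -> str:
    """Render an `mcpServers` configuration document as JSON text."""
    return json.dumps({"mcpServers": servers}, indent=2)


# Hypothetical launch command: the actual package name and arguments
# depend on how the server is published.
config_text = build_mcp_config(
    {"rfcxml": {"command": "npx", "args": ["-y", "example-rfcxml-mcp"]}}
)
print(config_text)
```

Nothing in this configuration mentions a client: the host creates one for each listed server on its own.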

MCP and A2A: Separation of Concerns

Protocol Differences

While MCP and A2A serve different purposes, they are complementary and address different communication needs:

Item	MCP	A2A
Led by	Anthropic	Google → Linux Foundation
Purpose	Tool connectivity	Agent-to-agent communication
Connects to	MCP Server (tools)	Other agents (including third-party)
Context	Can share with parent agent	Completely isolated
Owner	Self	Self or others

Official Recommendation

Build with ADK, equip with MCP (tools), communicate with A2A (agents)

MCP = Using hands (tools)
A2A = Collaborating with others (agents)

Custom Sub-agents

What is a Sub-agent?

A sub-agent is an AI assistant specialized for a specific task, defined within Claude Code.

Location:
├── Project: .claude/agents/xxx.md (Priority: High)
└── User:    ~/.claude/agents/xxx.md (Priority: Low)

Definition Format

Sub-agents are defined in a Markdown file. YAML frontmatter between `---` lines specifies the name, description, tools, and model, and the body that follows holds the instructions:

---
name: rfc-specialist
description: Expert in RFC specification verification and validation
tools: rfcxml:get_rfc_structure, rfcxml:get_requirements
model: sonnet
---

You are an expert in RFC specifications.
Use only the rfcxml tools.

Additional Definition Example

Here is a more practical sub-agent that combines tools from multiple MCP servers:

---
name: compliance-checker
description: Expert agent for legal and technical compliance checking
tools: hourei:find_law_article, rfcxml:get_requirements, rfcxml:validate_statement
model: sonnet
---

You are an expert in legal and technical specification compliance checking.

1. Retrieve legal requirements via hourei-mcp
2. Retrieve technical requirements via rfcxml-mcp
3. Map both and report compliance status

Always cite sources (legal article numbers, RFC section numbers).
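To see which MCP servers a definition depends on, the frontmatter can be read mechanically. The sketch below assumes the file wraps its fields in `---` delimited frontmatter with one `key: value` pair per line, and that tools are written in the `server:tool` notation used in this document. The parser is hand-rolled for illustration and is not how Claude Code itself reads these files.

```python
from collections import defaultdict

DEFINITION = """\
---
name: compliance-checker
description: Expert agent for legal and technical compliance checking
tools: hourei:find_law_article, rfcxml:get_requirements, rfcxml:validate_statement
model: sonnet
---

You are an expert in legal and technical specification compliance checking.
"""


def parse_definition(text: str) -> tuple[dict[str, str], str]:
    """Split a definition into (frontmatter fields, instruction body)."""
    _, header, body = text.split("---", 2)
    fields = {}
    for line in header.strip().splitlines():
        # Split on the first colon only, so tool names keep their own colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields, body.strip()


def tools_by_server(tools_field: str) -> dict[str, list[str]]:
    """Group 'server:tool' entries by the MCP server that provides them."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for entry in tools_field.split(","):
        server, _, tool = entry.strip().partition(":")
        grouped[server].append(tool)
    return dict(grouped)


fields, prompt = parse_definition(DEFINITION)
print(fields["name"], tools_by_server(fields["tools"]))
```

For the compliance checker, the grouping shows that both the `hourei` and `rfcxml` servers must be configured in the host before the sub-agent can do its work.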

Sub-agent Positioning

The following diagram illustrates where sub-agents sit within the Claude Code architecture:

Important: A sub-agent does not replace the MCP Client. It is a higher layer that sits on top of it.

Sub-agent = Defines "what to do" (role, procedures)
MCP Client = Implements "how to connect" (protocol processing)

Skill

What is a Skill?

A Skill is static knowledge and guidelines that Claude Code can reference. Skills are stored in the following locations:

Location:
├── Project: .claude/skills/xxx/SKILL.md
└── User:    ~/.claude/skills/xxx/SKILL.md

Skill Characteristics

In the Four-Layer Reference Model, Level 3 (organization rules) and Level 4 (best practices) use Skills as their primary delivery method. Skills have these key characteristics that distinguish them from other approaches:

Item	Description
Format	Markdown file
Content	Best practices, workflow definitions, guidelines
Execution	None (reference only)
Context consumption	Low (only when referenced)
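The "only when referenced" property can be pictured as lazy loading. The following sketch models the idea with a hypothetical `SkillRegistry` class that records where each `SKILL.md` lives but reads a file only on first reference. It illustrates the concept under that assumption and does not describe Claude Code's actual loader.

```python
import tempfile
from pathlib import Path


class SkillRegistry:
    """Indexes SKILL.md files by directory name and reads them lazily."""

    def __init__(self, root: Path) -> None:
        # Only paths are recorded up front; no file content is read yet.
        self._paths = {p.parent.name: p for p in root.glob("*/SKILL.md")}
        self.loaded: dict[str, str] = {}

    def names(self) -> list[str]:
        return sorted(self._paths)

    def reference(self, name: str) -> str:
        """Read a skill's text the first time it is referenced."""
        if name not in self.loaded:
            self.loaded[name] = self._paths[name].read_text(encoding="utf-8")
        return self.loaded[name]


# Build a throwaway skills directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / ".claude" / "skills"
    for skill in ("translation-workflow", "solid-principles"):
        (root / skill).mkdir(parents=True)
        (root / skill / "SKILL.md").write_text(f"# {skill}\n", encoding="utf-8")

    registry = SkillRegistry(root)
    known = registry.names()
    before = len(registry.loaded)
    text = registry.reference("translation-workflow")
    after = len(registry.loaded)

print(known, before, after)
```

Both skills are known by name from the start, but only the one that was referenced occupies space in `loaded`, which is the analogue of context consumption here.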

MCP vs Skill vs Sub-agent

Decision Flow

Use this flowchart to determine whether to implement something as a Skill, MCP, or Sub-agent:

Comparison Table

Aspect	Skill	MCP	Sub-agent
Context consumption	Low	High	Medium
Dynamic processing	Not possible	Possible	Possible
External API	Not possible	Possible	Via MCP
Maintenance	Markdown editing	npm publish, etc.	Markdown editing
Reusability	Project or user scope	Global	Project or user scope
Use case	Knowledge/guidelines	Tool/API integration	Role/expertise separation

Principles for Choosing

Skill = "Knowledge", "Guidelines", "Workflow definitions"
MCP   = "Tools", "API integration", "Dynamic processing"
Sub-agent = "Roles", "Expertise", "Task delegation"

Use Skills to define "what should be done"
Use MCP to provide "how to execute it"
Use Sub-agents to separate "who does it"
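These principles can be condensed into a small selection function. This is a sketch under stated assumptions: the `Need` fields and the order of the checks are choices made for illustration, derived from the comparison table rather than from any official rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Need:
    external_api: bool = False        # must call an outside service
    dynamic_processing: bool = False  # must compute or fetch at run time
    role_separation: bool = False     # needs its own role or delegated task


def choose_components(need: Need) -> list[str]:
    """Return the components suggested by the comparison table."""
    chosen = []
    if need.role_separation:
        chosen.append("Sub-agent")
    if need.external_api or need.dynamic_processing:
        # A Skill cannot execute anything, so dynamic work needs MCP.
        chosen.append("MCP")
    if not chosen:
        # Pure knowledge or guidelines need nothing beyond a Markdown file.
        chosen.append("Skill")
    return chosen


print(choose_components(Need()))
print(choose_components(Need(external_api=True, role_separation=True)))
```

The function returns a list because the components are not mutually exclusive: a sub-agent typically reaches external APIs through MCP.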

A2A vs Sub-agent

Fundamental Differences

Aspect	Custom Sub-agent	A2A Agent
Location	Within same process	Over the network
Owner	Self	Self or others
Trust	Full trust	Authentication/authorization required
Context	Partially shared with parent	Completely isolated
Lifecycle	Session-limited	Persistent service
Internal implementation	Visible (Markdown)	Not visible (API contract only)

Analogy

Custom Sub-agent = "Internal specialized department"
A2A Agent        = "Outsourcing partner / Partner company"

Even with internal specialized departments, outsourcing partners are needed
Even with outsourcing partners, internal specialized departments are needed

→ Both are necessary; they are not substitutes for each other

When to Use Which

Scenario	What to Use
Use your own MCP servers with specialized expertise	Sub-agent
Reuse the same processing repeatedly	Sub-agent
Define a workflow	Sub-agent
Integrate with third-party agents	A2A
Expose your agent externally	A2A
Agent collaboration across multiple organizations	A2A
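The scenario table and the differences listed earlier can be combined into one lookup. In this sketch the type `AgentChoice` and the function `choose_agent` are hypothetical names, and the boolean fields simplify the table ("partially shared" context is modeled as `True`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentChoice:
    kind: str
    needs_auth: bool             # authentication/authorization required
    shares_parent_context: bool  # True also covers "partially shared"


SUB_AGENT = AgentChoice("Sub-agent", needs_auth=False, shares_parent_context=True)
A2A_AGENT = AgentChoice("A2A", needs_auth=True, shares_parent_context=False)


def choose_agent(
    third_party: bool = False,
    exposed_externally: bool = False,
    multi_org: bool = False,
) -> AgentChoice:
    """Pick A2A as soon as the collaboration leaves your own boundary."""
    if third_party or exposed_externally or multi_org:
        return A2A_AGENT
    return SUB_AGENT


print(choose_agent().kind)
print(choose_agent(third_party=True).kind)
```

Returning the whole `AgentChoice` rather than a name makes the consequences of the choice explicit: selecting A2A also means planning for authentication and for an isolated context.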

Executor Selection

Beyond choosing MCP / Skill / Sub-agent, the perspective of "who makes the decision" becomes crucial.

Evolution of Execution

The way we integrate with external services has evolved with technology.

In this evolution, not everything needs to be MCP-ified. The appropriate layer is determined by "who makes the decision".

Layer Selection by Decision Maker

Choose the implementation layer based on who makes the decision:

Decision Maker	Appropriate Layer	Characteristics	Examples
None (Deterministic)	Direct program	No judgment needed, fast, reliable	Batch processing, CI/CD, cron
Human	CLI	Human decides, AI doesn't execute	`gh pr list`, `aws s3 ls`
AI (One-shot)	MCP + Skill	AI decides and executes per request	Translation, RFC lookup, quality evaluation
AI (Continuous/Autonomous)	Sub-agent	Autonomous decisions with expertise	Review specialist, translation specialist

Decision Flow

This comprehensive flowchart guides the selection process from initial request through final implementation:

CLI vs MCP: When AI Makes the Decision

Key Insight: When an official CLI exists, CLI + Skill is more efficient than building an MCP
— From r/ClaudeAI community discussion

When both a CLI and an MCP are possible options, use this comparison table to choose the more efficient approach:

Aspect	CLI + Skill	MCP
Token consumption	Low (command only)	High (loads all tool definitions)
Startup cost	None	Requires server process
Authentication	Local	Managed by MCP
Purpose-built	Strong (dedicated design)	Limited (general purpose)

Examples

These examples illustrate the decision for popular services:

Service	Official CLI	Recommendation
GitHub	`gh`	CLI + Skill
AWS	`aws`	CLI + Skill
Google Cloud	`gcloud`	CLI + Skill
PostgreSQL	`psql`	CLI + Skill
Linear	❌	MCP
Greptile	❌	MCP
DeepL	❌	MCP

Key Insight

The fundamental principle for layer selection is:

Selection changes based on "who decides", not just "what to execute"

No decision needed  → Direct program
Human decides       → CLI
AI decides          → MCP or CLI + Skill
AI autonomous       → Sub-agent

With this perspective, you can avoid over-MCPization and implement at the appropriate layer.
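The mapping from decision maker to layer is simple enough to write down directly. The sketch below uses hypothetical names (`DecisionMaker`, `select_layer`) and folds in the CLI comparison by taking a flag for whether an official CLI exists.

```python
from enum import Enum


class DecisionMaker(Enum):
    NONE = "deterministic"
    HUMAN = "human"
    AI_ONE_SHOT = "ai one-shot"
    AI_AUTONOMOUS = "ai autonomous"


def select_layer(who: DecisionMaker, official_cli_exists: bool = False) -> str:
    """Map 'who decides' to the implementation layer from the table."""
    if who is DecisionMaker.NONE:
        return "Direct program"
    if who is DecisionMaker.HUMAN:
        return "CLI"
    if who is DecisionMaker.AI_AUTONOMOUS:
        return "Sub-agent"
    # AI decides per request: prefer the existing CLI when there is one.
    return "CLI + Skill" if official_cli_exists else "MCP + Skill"


print(select_layer(DecisionMaker.AI_ONE_SHOT, official_cli_exists=True))
print(select_layer(DecisionMaker.AI_ONE_SHOT))
```

With GitHub (where `gh` exists) the first call applies, and with a service that has no official CLI the second one does.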

Combination Patterns

The Most Powerful Combination

The most effective approach combines all three components working together:

Concrete Example: Translation Workflow

Here is a practical example showing how Skill, Sub-agent, and MCP work together:

<!-- skills/translation-workflow/SKILL.md -->

# Technical Document Translation Workflow

## MCP Tools Used

- `deepl` - Translation execution
- `xcomet` - Quality evaluation

## Guardrails (Inviolable Constraints)

- Registered glossary terms must always be used
- Do not add content not present in the source text

## Workflow

1. Translate with deepl:translate-text (formality: "more")
2. Evaluate with xcomet:xcomet_evaluate (evaluation pipeline)
   - Score 0.85 or higher: OK
   - Score below 0.85: Re-translate or manual correction
3. Detect errors with xcomet:xcomet_detect_errors

This example is a concrete implementation pattern of the two-layer verification structure defined in the Vision. Glossary adherence corresponds to guardrails (inviolable constraints), while the xCOMET score corresponds to the evaluation pipeline (probabilistic quality gate).
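The two checks differ in kind, which a short sketch can make concrete: the guardrail is a hard yes/no test, while the evaluation pipeline compares a score against the 0.85 threshold. The functions `fake_translate` and `fake_evaluate` are hypothetical stand-ins for the deepl and xcomet tools, and the retry limit is an assumption of this example.

```python
from typing import Callable

QUALITY_THRESHOLD = 0.85  # threshold taken from the workflow


def violates_glossary(source: str, target: str, glossary: dict[str, str]) -> list[str]:
    """Guardrail: list glossary terms whose required translation is missing."""
    return [
        term
        for term, required in glossary.items()
        if term in source and required not in target
    ]


def translate_with_checks(
    source: str,
    glossary: dict[str, str],
    translate: Callable[[str], str],
    evaluate: Callable[[str, str], float],
    max_attempts: int = 2,
) -> tuple[str, str]:
    """Return (translation, status) after guardrail and quality checks."""
    target = ""
    for _ in range(max_attempts):
        target = translate(source)
        if violates_glossary(source, target, glossary):
            continue  # a guardrail failure is never acceptable; retry
        if evaluate(source, target) >= QUALITY_THRESHOLD:
            return target, "ok"
    return target, "needs manual correction"


# Hypothetical stand-ins for the deepl and xcomet MCP tools.
def fake_translate(text: str) -> str:
    return "Der Server MUSS die Anfrage ablehnen."


def fake_evaluate(source: str, target: str) -> float:
    return 0.91  # stub score, not a real measurement


result = translate_with_checks(
    "The server MUST reject the request.",
    {"MUST": "MUSS"},
    fake_translate,
    fake_evaluate,
)
print(result)
```

The guardrail is checked before the score on purpose: a translation that drops a glossary term is rejected even if its quality score would have passed.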

<!-- agents/translation-specialist.md -->

---
name: translation-specialist
description: Specialized agent for technical document translation and quality evaluation
tools: deepl:translate-text, xcomet:xcomet_evaluate, xcomet:xcomet_detect_errors
model: sonnet
---

You are an expert in technical translation.
Please refer to the translation-workflow skill.

Sequence Diagrams: Visualizing Execution Flow

Sequence diagrams help visualize how components interact during task execution.

Code Review Task

Here is how a code review task flows through the system:

Translation Workflow

Here is the sequence for a translation task with quality evaluation:

What the Three-Layer Model Does Not Explicitly Cover — Memory

The three-layer model (Agent / Skills / MCP) defines what an agent knows, what it can do, and on what basis it judges. However, how an agent remembers past interactions and outcomes and how it leverages that history — namely Memory — is outside the scope of this model.

Why Memory Is Not Included in the Three Layers

Memory is inherently dynamic, changing as conversations progress. In contrast, Skills are static domain knowledge and MCP is an explicit protocol interface — both are declaratively definable elements. Memory differs fundamentally in nature, and placing it alongside these layers would compromise the model's clarity.

Furthermore, Memory implementation varies significantly across LLMs and platforms:

- Context window (short-term memory): Present in all LLMs, but size and lifecycle are model-dependent
- Persistent memory: Platform-specific features (e.g., ChatGPT Memory) or application-layer implementations (e.g., LangChain ConversationBufferMemory)
- CLAUDE.md: A concept close to "project-level memory" in Claude Code, but strictly an instruction file rather than Memory

Until unified protocols or standards are established, it is more appropriate to design Memory individually during the implementation phase rather than incorporating it into the three-layer model.

This Does Not Mean Memory Can Be Ignored

In actual agent design, Memory is an essential concern. Within the three-layer model, the Skills layer implicitly covers "long-term memory" of domain knowledge, and the MCP layer covers "reference memory" of external context. For conversation history and learning outcome retention, consider implementation strategies during the Development Phases.

Layer Structure Summary

The following layered structure shows how all components integrate. The Doctrine Layer (constraints, objectives, judgment criteria) governs all layers. In terms of information flow: doctrine constraints descend from above, resource facts ascend from below, and agent decision-making occurs at the center.

🔗 Deeper: Foundational principles of context management

This page covers the structure (what/how) of three-layer separation. If you want to understand why this separation is necessary — grounded in LLM structural constraints (Context Window size limits, Context Rot, Priority Saturation) — the sister site provides the design rationale.

understanding-llm / Part 2: Context Window — The structure of the LLM's "thinking space"
understanding-llm / Part 3: Always-Loaded Context — CLAUDE.md — How Agent layer instructions stay resident
understanding-llm / Part 5: On-Demand Context — Why Skills are loaded on-demand
understanding-llm / Part 6: Tool Context — MCP — The context cost of the MCP layer
