
Vision for AI-Driven Development

This document outlines the philosophy underlying AI agent architecture (MCP, Skills, and Agent integration) and the fundamental approach to AI-driven development.

Audience: Engineers interested in AI-driven development. Whether you're a practitioner evaluating MCP/Skills adoption or a decision-maker considering team-wide integration, this document provides a foundational perspective.

Position of This Page

👉 This page (WHY — Why we need authoritative references)

Meta Information
| Item | Content |
|---|---|
| What this chapter establishes | AI's four fundamental limitations (accuracy, currency, authority, accountability) and the need for "authoritative references" |
| What this chapter does NOT cover | Specific reference source selection (→02), architecture design (→03), implementation techniques (→04) |
| Dependencies | None (starting point of the Concepts section) |
| Common misuse | Concluding "AI is unusable." This chapter's argument is "recognize the constraints, then structurally compensate" |

Core Understanding

AI is "Not Omnipotent"

While AI capabilities are rapidly advancing, it is crucial to correctly recognize their limitations. To avoid over-reliance on AI and use it appropriately, we need to understand the following constraints.

AI generates outputs probabilistically from training data, but cannot guarantee the following:

| AI Limitation | Description |
|---|---|
| Accuracy | Hallucination: may generate information that differs from facts |
| Currency | Does not have information beyond the training data cutoff |
| Authority | Cannot guarantee official interpretation of specifications |
| Accountability | Cannot provide grounds for legal or ethical judgments |

Therefore, we need to connect to reliable sources.

The Essence of AI-Driven Development

AI-driven development ≠ Having AI write code
AI-driven development = Utilizing AI throughout all processes while humans retain judgment and creativity

What "All Processes" Means

This is not limited to coding. It refers to the broad range of management domains involved in software development — including project management, product management, SRE, security, and data management — as outlined in Management of Software Systems and Services.

However, realizing this ideal requires a prerequisite. For AI to be truly useful "across all processes," it must be placed in an environment where it can make correct judgments.

The Reality During This Transitional Period

The era where AI autonomously completes every process has not yet arrived. AI excels at "generating plausible output," but it cannot judge on its own whether that output is correct.

At the same time, the code AI generates today depends on abstraction layers — frameworks and libraries — whose foundations are the standards and specifications that humanity has accumulated over time.

Although the code AI generates is built on top of these standards, AI cannot directly reference them. This is the core problem.

For AI to function correctly, it must be able to reference the same standards and specifications that its generated code ultimately depends on. This is why "unwavering reference sources" are necessary.

The Importance of "Unwavering Reference Sources"

Why Reference Sources Are Needed

| AI Challenge | What Reference Sources Solve |
|---|---|
| Fixed point-in-time training data | Access to authoritative up-to-date sources |
| Hallucination | Provision of verifiable evidence |
| Interpretation variance by context | Consistent decision criteria |
| Lack of latest information | Retrieval of current specifications |

Two Means to Achieve "Unwavering Reference Sources"

MCP and Skills serve as means to provide AI with "unwavering reference sources."

| Means | Role | Examples |
|---|---|---|
| MCP | Dynamic access to external authoritative sources | RFC, legislation, W3C standards |
| Skills | Systematization of domain knowledge and best practices | Design principles, workflows, coding standards |

A note on "Skills" terminology

In this document, "Skills" refers to Markdown-based systematization of domain knowledge, following the format defined by vercel-labs/skills. Unlike OpenAI's "Actions" or LangChain's "Tools," Skills are not executable code — they are structured knowledge and judgment criteria that AI references.
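As a concrete illustration, a Skill in this sense is just structured Markdown that an agent loads as reference knowledge rather than executes. The sketch below is hypothetical: the front-matter layout and the `load_skill` helper are assumptions for illustration, not the vercel-labs/skills API.

```python
from pathlib import Path


def load_skill(path: str) -> dict:
    """Parse a Markdown skill file into metadata and body.

    Assumes a simple '---'-delimited front-matter block of
    'key: value' lines, followed by the Markdown body that the
    AI receives as reference knowledge (not executable code).
    Intentionally naive: a body containing '---' would need a
    real front-matter parser.
    """
    text = Path(path).read_text(encoding="utf-8")
    meta: dict = {}
    body = text
    if text.startswith("---"):
        _, header, body = text.split("---", 2)
        for line in header.strip().splitlines():
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return {"meta": meta, "body": body.strip()}
```

The point of the shape is that the agent consumes the body as judgment criteria, while the metadata makes the skill discoverable and versionable.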

Essential Definition of "Unwavering Reference Sources"

An "unwavering reference source" is a fact retrieved from a verifiable information source, not an LLM's speculation.

Based on this definition, reference sources can be classified into two types:

| Type | Characteristics | Examples | Verification Method |
|---|---|---|---|
| Static Reference | Content is fixed and immutable | RFCs, legislation, W3C specs | Version / section number |
| Dynamic Reference | Values change, but are factual at the time of retrieval | Sensors, APIs, real-time data | Timestamp + data source ID |

Both share the property of "not being speculatively generated by an LLM." Dynamic references require separate verification of the data source's authenticity, but retrieving them via MCP ensures clear provenance.
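The two reference types can be modeled in code. The following sketch is illustrative only: the `Reference` record and its field names are assumptions, not a defined MCP schema. It shows the shared property (content plus provenance, never speculation) and the differing verification anchors: version/section for static references, timestamp plus data-source ID for dynamic ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reference:
    """A fact retrieved from a verifiable source, not LLM speculation.

    Static references (RFCs, legislation) are anchored by a
    version/section number; dynamic references (sensors, APIs) are
    anchored by a retrieval timestamp and the data source's ID.
    """
    source_id: str                      # e.g. "rfc3161" or a sensor ID
    content: str                        # the retrieved fact itself
    section: Optional[str] = None       # static anchor: version/section
    retrieved_at: Optional[str] = None  # dynamic anchor: ISO-8601 time

    @property
    def kind(self) -> str:
        return "static" if self.section else "dynamic"

    def provenance(self) -> str:
        """A citable anchor so the reference can be re-verified."""
        anchor = self.section or self.retrieved_at or "unverified"
        return f"{self.source_id}#{anchor}"
```

Carrying provenance with every retrieved fact is what lets downstream consumers (humans or audit tooling) re-check an AI output against its source.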

Value of Reference MCP/Skills

  1. AI decisions become verifiable - Can demonstrate the basis for outputs
  2. Consistent quality is supported - Standards-aligned outputs
  3. Vendor lock-in is avoided - Based on open standards
  4. Access to knowledge is democratized - Reach accurate information without being an expert
  5. Domain knowledge becomes reusable - Formalize team know-how as Skills

Democratization of Knowledge

Problems with the Traditional Approach

  • High cost of accessing expert knowledge
  • One-way flow of information, from experts to practitioners
  • Language barriers to primary sources

The World MCP/Skills Enables

Development based on accurate information becomes possible without relying on expensive consultants or specialists.

For how to distinguish between MCP and Skills, see skills/vs-mcp.md.

Three Axes of Knowledge Transformation

Knowledge transformation in AI-driven development is not one-directional. This architecture defines the following three transformation axes.

| Axis | Direction | Purpose | Example |
|---|---|---|---|
| ① Structuring | Human → AI | Transform authoritative sources into AI-accessible formats | RFC → rfcxml-mcp |
| ② Comprehension | AI → Human | Transform complex information into understandable formats | RFC 3161 → Checklist |
| ③ Verification | Spec → Test | Convert specifications into verifiable criteria | EPUB 3.3 requirements → JSON test suite |

Axis ③ "Verification" differs from ① and ② in that it forms a quality closed loop. Simply passing specifications to AI does not make its output verifiable — converting specifications into tests is what makes "driving" possible.
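To make the closed loop concrete, here is a minimal sketch of Spec-to-Test conversion. The requirement IDs, citation strings, and checks are hypothetical stand-ins for what a real JSON test suite derived from a specification would contain; no actual EPUB section numbers are cited.

```python
# Each requirement extracted from a specification becomes a
# machine-checkable record: the citation anchors it back to the
# source, and the check turns "plausible output" into a pass/fail
# fact. IDs and checks below are illustrative, not real EPUB rules.
REQUIREMENTS = [
    {
        "id": "REQ-001",
        "citation": "EPUB 3.3 (section omitted; illustrative)",
        "check": lambda doc: doc.get("mimetype") == "application/epub+zip",
    },
    {
        "id": "REQ-002",
        "citation": "EPUB 3.3 (section omitted; illustrative)",
        "check": lambda doc: "nav" in doc.get("documents", []),
    },
]


def verify(doc: dict) -> list[str]:
    """Return the IDs of requirements the document fails."""
    return [r["id"] for r in REQUIREMENTS if not r["check"](doc)]
```

This is the "closed loop" of axis ③: the same specification that informed generation also judges the result, so failures point back to a citable requirement rather than to an opinion.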

Human → AI (Structuring) Knowledge Transformation

Enable AI to access "unwavering reference sources."

Structuring External Information Sources via MCP

| Human Knowledge | Structured Format | AI-Usable Form |
|---|---|---|
| Legal text | e-Gov API | hourei-mcp |
| Technical specifications | RFC XML | rfcxml-mcp |
| Web standards | W3C/WHATWG | w3c-mcp |
| Translation rules | Glossary | DeepL Glossary |

Systematizing Domain Knowledge via Skills

| Team Knowledge | Format | AI-Usable Form |
|---|---|---|
| Design principles | Markdown | frontend-design skill |
| Coding standards | Markdown | coding-standards skill |
| Workflows | Markdown | doc-coauthoring skill |

AI → Human (Comprehension Support) Knowledge Transformation

Enable humans to access accurate knowledge even without being specialists.

| Complex Information Source | AI Processing | Human-Understandable Form |
|---|---|---|
| RFC 3161 (135 requirements) | Extraction/Classification | Checklist |
| Digital Signature Law + RFC | Mapping | Correspondence table |
| Technical specifications | Visualization | Mermaid diagrams |
| English RFCs | Translation | Explanations in local language |

Division of Roles Between Humans and AI

The Responsibility Shift Model

As abstraction rises, the responsibilities of accuracy, reliability, and judgment do not disappear — they shift who holds them.

| Responsibility Phase | Owner | Scope | Verification Mechanism |
|---|---|---|---|
| Design-time | Human | Selection of reference sources, structural design, defining judgment criteria | Spec-to-Test conversion |
| Execution-time | Agent | Reasoning and task execution based on reference sources | Evaluation pipeline (probabilistic quality gate) |
| Structural constraints | System | Consistency of references, access control, audit trails | Guardrails (inviolable constraints) |

If these responsibility boundaries remain ambiguous as abstraction increases, a situation arises where "no one is accountable." This architecture aims to make these boundaries explicit at the design level, using two verification layers:

| Verification Layer | Nature | Example Criteria | Role |
|---|---|---|---|
| Guardrails | Inviolable (boundary-based) | ESLint errors = 0, type checks pass | Defines "lines that must not be crossed" |
| Evaluation Pipeline | Probabilistic (threshold-based) | xCOMET >= 0.85, test coverage >= 80% | Defines "acceptable ranges" |
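The two layers can be sketched as a single gate function. The following Python is illustrative: the metric names mirror the example criteria (ESLint errors, type checks, xCOMET, coverage) and are not a prescribed interface.

```python
def quality_gate(metrics: dict) -> tuple[bool, list[str]]:
    """Two-layer verification: guardrails first, then thresholds.

    Guardrails are boolean and inviolable; the evaluation pipeline
    is threshold-based and probabilistic. Metric keys are assumed
    names for illustration only.
    """
    failures: list[str] = []

    # Layer 1: guardrails -- lines that must not be crossed.
    if metrics.get("eslint_errors", 1) != 0:
        failures.append("guardrail: eslint_errors must be 0")
    if not metrics.get("typecheck_passed", False):
        failures.append("guardrail: type checks must pass")

    # Layer 2: evaluation pipeline -- acceptable ranges.
    if metrics.get("xcomet", 0.0) < 0.85:
        failures.append("eval: xCOMET below 0.85")
    if metrics.get("coverage", 0.0) < 0.80:
        failures.append("eval: test coverage below 80%")

    return (not failures, failures)
```

Keeping the two layers in one gate makes the distinction auditable: a guardrail failure is a hard stop regardless of scores, while evaluation failures describe how far an output sits outside the acceptable range.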

Traditional TDD (Red-Green-Refactor) does not apply directly, but test-first thinking — converting specifications into verifiable forms before implementation — remains fundamentally valuable in AI-driven development.

Complementary Structure

The diagram below illustrates how human capabilities and AI capabilities complement each other to achieve better development outcomes.

Basic flow of MCP, Skills, and Agent

This fundamental flow shows how user input passes through the agent core and tool integrations to produce results. For the governance layer (judgment criteria, constraints, and objectives) that governs all these layers, see the Doctrine Layer.

Positioning of This Repository

This repository is a place to organize the design philosophy, architecture, and practical know-how of AI agent architecture (MCP, Skills, and Agent integration), and to document strategies for building "unwavering reference sources" as the foundation of AI-driven development.

What This Document Does Not Guarantee (Non-goals)

To clarify the scope of this document, we state the following explicitly.

| What this document claims | What this document does NOT guarantee | What it provides instead |
|---|---|---|
| Providing verifiable reference sources improves AI judgment quality | Factual correctness of all AI outputs | Spec-to-Test verification pipeline |
| MCP/Skills provide structural constraints | Elimination of human review | Two-layer structure: guardrails (inviolable constraints) + evaluation pipeline (probabilistic quality gate) |
| The design aims to clarify responsibility allocation | That the system assumes legal or ethical liability | Design-time responsibility boundaries + runtime audit trails |

Reader contract: This document presents a design philosophy for "how to place AI in a trustworthy environment." It does not promise specific quality levels or safety guarantees. However, rather than leaving outputs unverifiable, it adopts an approach of converting specifications into verifiable tests and ensuring quality through a two-layer system of guardrails and evaluation pipelines. Final judgment and accountability always remain with humans.

Core Messages

  1. AI-driven development is not just code generation - Utilize AI throughout all processes
  2. AI needs guidelines for decision-making - The importance of unwavering reference sources
  3. Systematize human engineering knowledge - Formalize as MCP/Skills
  4. Standards-based MCPs are the foundation - Democratize access to RFC, W3C, legislation, etc.
  5. Share domain knowledge via Skills - Make team know-how reusable
  6. Bidirectional knowledge transformation - Human→AI (structuring), AI→Human (comprehension support)
  7. Explicit judgment criteria - Define constraints, objectives, and judgment criteria via the Doctrine Layer to enable autonomous AI decision-making

Released under the MIT License.