@ehzawad
Created March 15, 2026 01:46

Findings: Vibe Engineer Codebase

What this repo is

This repository is a Python CLI tool called ve for documentation-driven development. It turns the prompts you'd normally give an AI agent into persistent, discoverable documentation — creating a self-building institutional memory.

  • The installed ve command is defined in pyproject.toml (ve = "ve:cli").
  • The entrypoint is src/ve.py.
  • CLI command groups are assembled in src/cli/__init__.py (Click framework).
  • Data models use Pydantic for YAML frontmatter validation.
  • Templates use Jinja2 for rendering agent commands and artifact scaffolds.
  • Dependencies: click, jinja2, pydantic, pyyaml, scikit-learn (TF-IDF clustering), httpx, uvicorn+starlette, claude-agent-sdk. Python >= 3.12, managed with UV.

The main workflow is:

  1. Run ve init — scaffold/refresh all managed files
  2. Create a chunk — /chunk-create captures the "why"
  3. Plan the chunk — /chunk-plan captures the "how"
  4. Implement it — /chunk-implement executes the plan
  5. Complete it — /chunk-complete updates code references and marks done

The project is built around the idea that:

  • GOAL.md stores the durable "why" and "why not" (including Rejected Ideas)
  • PLAN.md stores the implementation "how" with ordered steps
  • code gets backreferences to those docs so future agents can rediscover intent
  • the documentation overhead is zero because it captures work already being done (prompting)

How the repo works

1. Initialization

ve init scaffolds and refreshes the repo's managed documentation system.

It renders:

  • docs/trunk/*
  • .claude/commands/*
  • docs/reviewers/baseline/*
  • CLAUDE.md

The main logic is in:

  • src/project.py
  • src/template_system.py

2. Chunk creation

Chunk creation starts in src/cli/chunk.py (1201 lines), then delegates to Chunks.create_chunk() in src/chunks.py (1079 lines).

That function:

  • creates docs/chunks/<name>/
  • renders GOAL.md from src/templates/chunk/GOAL.md.jinja2 (~300 lines of embedded agent instructions)
  • renders PLAN.md from src/templates/chunk/PLAN.md.jinja2
  • seeds metadata such as created_after (auto-populated DAG tips)
  • supports batch creation: ve chunk create chunk_a chunk_b chunk_c --future
  • legacy mode: ve chunk create my_feature VE-001 (single chunk with ticket)

The chunk metadata model is in src/models/chunk.py:

status: IMPLEMENTING    # ChunkStatus enum
ticket: null            # External ticket reference
parent_chunk: null      # For corrections to previous chunks
code_paths: []          # Files expected to be created/modified (populated at plan time)
code_references: []     # Symbolic references file.py#Symbol::method (populated after impl)
narrative: null         # Parent narrative directory name
investigation: null     # Parent investigation directory name
subsystems: []          # [{subsystem_id, relationship: implements|uses}]
friction_entries: []    # [{entry_id, scope: full|partial}]
bug_type: null          # semantic | implementation | null
depends_on: []          # Orchestrator scheduling deps (null = ask oracle, [] = no deps)
created_after: []       # Auto-populated DAG tips at creation time — DO NOT edit

ChunkStatus values: FUTURE → IMPLEMENTING → ACTIVE → SUPERSEDED → HISTORICAL (only 1 IMPLEMENTING per worktree)
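The lifecycle above can be sketched as a small state machine (the enum name matches src/models/chunk.py; the transition table and helper are hypothetical, with the real rules in src/state_machine.py):

```python
from enum import Enum

class ChunkStatus(str, Enum):
    FUTURE = "FUTURE"
    IMPLEMENTING = "IMPLEMENTING"
    ACTIVE = "ACTIVE"
    SUPERSEDED = "SUPERSEDED"
    HISTORICAL = "HISTORICAL"

# Hypothetical transition table mirroring the arrow chain above
ALLOWED_TRANSITIONS = {
    ChunkStatus.FUTURE: {ChunkStatus.IMPLEMENTING},
    ChunkStatus.IMPLEMENTING: {ChunkStatus.ACTIVE},
    ChunkStatus.ACTIVE: {ChunkStatus.SUPERSEDED},
    ChunkStatus.SUPERSEDED: {ChunkStatus.HISTORICAL},
    ChunkStatus.HISTORICAL: set(),
}

def can_transition(src: ChunkStatus, dst: ChunkStatus) -> bool:
    return dst in ALLOWED_TRANSITIONS[src]
```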

3. Planning and implementation

The plan phase uses the chunk's GOAL.md, trunk docs, subsystem docs, and codebase context to produce PLAN.md.

Before planning, ve chunk suggest-prefix runs TF-IDF similarity (scikit-learn, in src/cluster_analysis.py) against existing chunk names to suggest better clustering. If a cluster hits 5+ chunks (configurable via .ve-config.yaml's cluster_subsystem_threshold) without subsystem docs, it suggests /subsystem-discover.
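A minimal sketch of the TF-IDF similarity idea (function name and tokenization are assumptions; the real logic lives in src/cluster_analysis.py):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def suggest_similar(new_name: str, existing: list[str], top_k: int = 3) -> list[str]:
    """Rank existing chunk names by TF-IDF cosine similarity to a new name."""
    # Underscore-separated names become tiny whitespace-tokenized documents
    docs = [n.replace("_", " ") for n in existing + [new_name]]
    matrix = TfidfVectorizer().fit_transform(docs)
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = sorted(zip(existing, scores), key=lambda pair: -pair[1])
    return [name for name, score in ranked[:top_k] if score > 0]
```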

The PLAN.md template pushes the agent to think about:

  • Approach — strategy, patterns, existing code to build on, DECISIONS.md to respect
  • Subsystem Considerations — which subsystems are relevant, implement vs use, DOCUMENTED vs REFACTORING status
  • Sequence — ordered steps, each small enough to hand to an agent
  • Backreference comments — the # Chunk: and # Subsystem: comments to add in code
  • Dependencies — what must exist before implementation
  • Risks and open questions — uncertainty that needs careful handling
  • Deviations — populated during implementation when reality diverges from plan

4. Completion and compounding memory

On completion, the chunk should gain code_references, and implementation code should contain backreference comments like:

# Chunk: docs/chunks/some_chunk
# Subsystem: docs/subsystems/some_subsystem

Those references are documented in docs/trunk/ARTIFACTS.md and scanned by src/backreferences.py.
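The scanning side can be sketched with a regex (pattern and function name are assumptions; the real implementation is src/backreferences.py):

```python
import re

# Matches backreference comments like "# Chunk: docs/chunks/auth_refactor"
BACKREF_RE = re.compile(r"#\s*(Chunk|Subsystem):\s*(docs/\S+)")

def scan_backreferences(source: str) -> list[tuple[str, str]]:
    """Return (kind, docs path) pairs for every backreference comment in a file."""
    return BACKREF_RE.findall(source)
```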

This creates the institutional-memory loop:

  • code points back to docs
  • docs point to intent and related artifacts
  • future agents can reconstruct why something exists, not just what it does

Where the code lives

Core code:

  • src/ve.py — thin entry point
  • src/cli/__init__.py — command registration (chunk, narrative, task, subsystem, investigation, external, artifact, orch, friction, migration, reviewer)
  • src/cli/chunk.py — chunk commands (1201 lines): create, plan, implement, complete, list, validate, suggest-prefix, cluster-list, backrefs, overlap, status, cluster-rename
  • src/chunks.py — Chunks business logic (1079 lines): ArtifactManager subclass
  • src/project.py — Project-level registry with lazy-loaded managers, magic marker handling for CLAUDE.md
  • src/template_system.py — Jinja2 rendering engine: VeConfig, TemplateContext, ActiveArtifact, RenderResult
  • src/models/ — Pydantic schemas: chunk.py, shared.py, references.py, narrative.py, subsystem.py, investigation.py, friction.py, reviewer.py
  • src/artifact_manager.py — Abstract base class: ArtifactManager[FrontmatterT, StatusT] with enumerate, parse, validate, transition
  • src/frontmatter.py — YAML frontmatter parsing: parse, parse_with_errors, extract, update_field
  • src/state_machine.py — Status transition validation
  • src/backreferences.py — Code backreference extraction and validation
  • src/symbols.py — Python AST symbol extraction and code reference parsing (file.py#Symbol::method)
  • src/cluster_analysis.py — TF-IDF chunk clustering and prefix suggestion
  • src/artifact_ordering.py — ArtifactIndex for causal ordering via created_after
  • src/validation.py — Validation framework
  • src/integrity.py — Cross-artifact integrity checking
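The symbolic code-reference format handled by src/symbols.py can be illustrated with a small parser (the function name is an assumption):

```python
def parse_code_reference(ref: str) -> tuple[str, list[str]]:
    """Split a reference like 'file.py#Symbol::method' into (path, symbol chain)."""
    path, _, symbol_part = ref.partition("#")
    return path, symbol_part.split("::") if symbol_part else []
```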

Cross-repo task mode:

  • src/task_init.py
  • src/task/ — 10 files for multi-project operations

Orchestrator / parallel execution (28 files):

  • src/cli/orch.py — CLI: start, stop, status, url, ps, work-unit (create/status/show/list)
  • src/orchestrator/models.py — WorkUnit, WorkUnitPhase (GOAL→PLAN→IMPLEMENT→REBASE→REVIEW→COMPLETE), WorkUnitStatus (READY/RUNNING/BLOCKED/NEEDS_ATTENTION/DONE), OrchestratorConfig (max_agents: 4)
  • src/orchestrator/daemon.py — Unix daemonization (double-fork), PID/socket/log management
  • src/orchestrator/state.py — SQLite state store (WAL mode, 15 schema versions, optimistic locking via StaleWriteError)
  • src/orchestrator/scheduler.py — Dispatch loop, conflict detection, agent spawning, dependent unblocking
  • src/orchestrator/worktree.py — Git worktree create/remove/merge at .ve/chunks/<chunk>/worktree/
  • src/orchestrator/oracle.py — Conflict analysis and dependency detection
  • src/orchestrator/agent.py — AgentRunner for phase execution
  • src/orchestrator/client.py — HTTP client via Unix domain socket
  • src/orchestrator/merge.py — Merge/rebase logic without checkout
  • src/orchestrator/api/ — FastAPI dashboard with WebSocket streaming
  • docs/trunk/ORCHESTRATOR.md
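The optimistic-locking idea behind StaleWriteError can be sketched with plain sqlite3 (the schema and function names are illustrative, not the real state store's):

```python
import sqlite3

class StaleWriteError(Exception):
    """Another writer bumped the row's version first."""

def update_phase(conn: sqlite3.Connection, unit_id: int, phase: str, expected_version: int) -> None:
    # The UPDATE only matches while the stored version is what we read,
    # so a concurrent write yields rowcount == 0 instead of silently clobbering
    cur = conn.execute(
        "UPDATE work_units SET phase = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (phase, unit_id, expected_version),
    )
    if cur.rowcount == 0:
        raise StaleWriteError(f"work unit {unit_id} changed concurrently")
```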

Tests:

  • tests/ — pytest, pytest-asyncio, pytest-cov

Where the prompts live

The real prompt sources are Jinja templates.

Source templates:

  • src/templates/commands/*.md.jinja2
  • src/templates/chunk/*.jinja2
  • src/templates/claude/CLAUDE.md.jinja2
  • src/templates/reviewers/baseline/*.jinja2
  • src/templates/task/CLAUDE.md.jinja2

Rendered agent-facing files:

  • .claude/commands/*.md
  • CLAUDE.md
  • docs/reviewers/baseline/PROMPT.md

Important note:

  • .claude/commands/* are generated outputs, not the true source
  • if you want to edit the workflow, edit src/templates/* and rerender with ve init
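The rerender step can be sketched with Jinja2 directly (the in-memory template and context names are illustrative; the real engine is src/template_system.py):

```python
from jinja2 import Environment, DictLoader

# In-memory stand-in for src/templates/commands/*.md.jinja2
env = Environment(loader=DictLoader({
    "chunk-create.md.jinja2": "# /chunk-create for {{ project_name }}",
}))

def render_command(name: str, **context) -> str:
    return env.get_template(name).render(**context)
```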

Where the "skill files" are

There are no repo-local SKILL.md files in this project.

What this repo uses instead:

  • CLAUDE.md for top-level agent guidance
  • .claude/commands/*.md as slash-command prompts
  • reviewer prompt files in docs/reviewers/baseline/

Inside the orchestrator, these command files effectively act like phase skills. That mapping is in src/orchestrator/agent.py:

  • GOAL -> chunk-create.md
  • PLAN -> chunk-plan.md
  • IMPLEMENT -> chunk-implement.md
  • REBASE -> chunk-rebase.md
  • REVIEW -> chunk-review.md
  • COMPLETE -> chunk-complete.md
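That mapping, restated as code (the dict and function names are assumptions; the phase-to-file pairs come straight from the list above):

```python
# Mirrors the phase -> slash-command mapping in src/orchestrator/agent.py
PHASE_COMMANDS = {
    "GOAL": "chunk-create.md",
    "PLAN": "chunk-plan.md",
    "IMPLEMENT": "chunk-implement.md",
    "REBASE": "chunk-rebase.md",
    "REVIEW": "chunk-review.md",
    "COMPLETE": "chunk-complete.md",
}

def command_for(phase: str) -> str:
    return f".claude/commands/{PHASE_COMMANDS[phase]}"
```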

What /chunk-create really is

/chunk-create is not the low-level creation engine. It is a prompt that instructs an LLM agent how to create and refine a chunk.

Source:

  • src/templates/commands/chunk-create.md.jinja2

Rendered output:

  • .claude/commands/chunk-create.md

What it tells the agent to do (9 steps):

  1. Name the work — pick a short name (<32 chars, underscore-separated) using initiative nouns as prefixes. Good: auth_refactor, ordering_causal. Bad: fix_auth, cli_cleanup, chunk_new.
  2. Extract ticket ID (optional) — only affects ticket: frontmatter field, not directory name.
  3. Run ve chunk create <shortname> — creates docs/chunks/<name>/GOAL.md + PLAN.md.
  4. Refine GOAL.md with the operator — ask clarifying questions, fill in Minor Goal, Success Criteria, Rejected Ideas.
  5. Wire up narrative/investigation dependencies — find matching proposed_chunks entry, resolve depends_on indices to chunk directory names, update cross-references bidirectionally.
  6. Handle depends_on semantics carefully:
    • depends_on: [] (empty) = "I explicitly have no dependencies" → bypasses orchestrator's conflict oracle
    • depends_on: null = "I don't know" → triggers oracle consultation
    • depends_on: ["chunk_a"] = explicit deps → bypasses oracle
  7. Check for bug fix signals — note in success criteria for verification.
  8. Warn about existing IMPLEMENTING chunk — suggest completing or using --future.
  9. Commit the whole chunk directory (both GOAL.md + PLAN.md) — prevents worktree merge conflicts.
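The three depends_on cases in step 6, shown as frontmatter fragments (illustrative):

```yaml
# Illustrative GOAL.md frontmatter fragments for the three depends_on cases
depends_on: []                # explicitly no dependencies; conflict oracle bypassed
# depends_on: null            # unknown; triggers oracle consultation
# depends_on: ["chunk_a"]     # explicit dependency list; oracle bypassed
```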

This is why /chunk-create feels like "you give rough direction, the agent investigates, asks questions, and produces a rich GOAL.md."

That behavior is deliberate. The GOAL.md template itself is ~300 lines of embedded agent instructions covering every frontmatter field's semantics.

What /chunk-plan really is

/chunk-plan is the next prompt in the chain.

Source:

  • src/templates/commands/chunk-plan.md.jinja2

Rendered output:

  • .claude/commands/chunk-plan.md

What it tells the agent to do (4 steps):

  1. Find the current chunk — ve chunk list --current

  2. Cluster analysis — ve chunk suggest-prefix <chunk_name> uses TF-IDF similarity to suggest better naming. If operator accepts rename, check cluster size. At 5+ chunks without subsystem docs → suggest /subsystem-discover.

  3. Study GOAL.md thoroughly.

  4. Complete PLAN.md by synthesizing:

    • docs/trunk/GOAL.md — project objective
    • docs/trunk/DECISIONS.md — architectural decisions to respect
    • The existing codebase — current patterns
    • Narrative OVERVIEW.md — if chunk is part of a larger initiative

    Filling in these sections:

    • Approach — strategy, patterns, existing code to build on
    • Subsystem Considerations — relevant subsystems, implement vs use, DOCUMENTED vs REFACTORING status
    • Sequence — ordered implementation steps (the agent's contract with itself)
    • Backreference Comments — the # Chunk: and # Subsystem: comments to add in code
    • Dependencies — what must exist before implementation
    • Risks and Open Questions — explicit uncertainty
    • Deviations — populated during implementation, not at plan time

This is why /chunk-plan feels like "deep dive into existing code and back-referenced chunks, then build a PLAN.md that respects prior judgments."

Why this compounds over time

This repo's design is intentionally about compounding context.

The key loop is:

  1. A chunk captures the assignment as GOAL.md and PLAN.md
  2. Code later gains backreferences to the chunk and subsystem docs
  3. Future agents exploring code can follow those links back to intent
  4. New chunks can build on old chunks without re-explaining everything

This is the "tent-poles" idea described in the repo's own article:

  • chunk GOALs are anchors of judgment scattered across the codebase
  • agents mostly read GOAL first, then PLAN if needed
  • planning is where they gather context
  • implementation is where they stay focused and execute
  • the Rejected Ideas section prevents future agents from re-proposing already-rejected approaches

Five compounding mechanisms:

  1. Backreference discoverability — code has # Chunk: docs/chunks/auth_refactor comments. Future agents reading code follow the reference to understand the full decision context.
  2. Subsystem emergence — TF-IDF detects clusters at 5+ chunks. The workflow suggests documenting the pattern as a subsystem. Future agents know to follow those patterns.
  3. DAG ordering — created_after tracks causal ordering (what work existed at creation time). depends_on tracks implementation dependencies. Together they form a navigable graph.
  4. Orchestrator parallelism — ve orch runs multiple chunks in parallel across git worktrees using the dependency DAG for scheduling and a conflict oracle for interference detection.
  5. Zero overhead — the prompts you'd give the agent anyway become GOAL.md. The planning conversation becomes PLAN.md. VE saves it instead of discarding it.
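The DAG-ordering mechanism can be illustrated with a hypothetical tip-finding helper (the real ordering logic is src/artifact_ordering.py):

```python
def dag_tips(created_after: dict[str, list[str]]) -> list[str]:
    """Chunks that no later chunk lists in created_after: the current frontier."""
    referenced = {parent for parents in created_after.values() for parent in parents}
    return sorted(set(created_after) - referenced)
```

A new chunk would record the current tips as its own created_after, extending the graph.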

The article and findings docs that explain this are:

  • docs/articles/introduction_to_vibe_engineering/article_2_chunks/ARTICLE.md
  • docs/articles/introduction_to_vibe_engineering/article_2_chunks/findings.md

Best mental model for the repo

Think of this project as four stacked layers:

Layer 1: Artifact model

Defines chunk lifecycle, metadata, relationships, and validation.

Main files:

  • src/models/chunk.py
  • src/integrity.py
  • src/artifact_ordering.py

Layer 2: Template system

Generates the docs and prompts that agents operate on.

Main files:

  • src/template_system.py
  • src/templates/

Layer 3: CLI operations

Turns human requests into artifact creation, validation, listing, and orchestration.

Main files:

  • src/cli/
  • src/chunks.py
  • src/project.py

Layer 4: Agent execution/orchestration

Runs chunk phases in worktrees and uses the generated command prompts as phase instructions.

Main files:

  • src/orchestrator/
  • docs/trunk/ORCHESTRATOR.md

All available slash commands

| Command | Purpose |
| --- | --- |
| /chunk-create | Create a new chunk and refine its goal |
| /chunk-plan | Create technical plan for current chunk |
| /chunk-implement | Implement the current chunk |
| /chunk-review | Review chunk implementation |
| /chunk-complete | Mark chunk complete, update references |
| /chunk-commit | Create a git commit for chunk work |
| /chunk-rebase | Merge trunk into worktree, resolve conflicts |
| /chunk-update-references | Update code references in GOAL.md |
| /chunks-resolve-references | Resolve references across all chunks |
| /narrative-create | Create multi-chunk narrative |
| /narrative-compact | Consolidate chunks into a narrative |
| /investigation-create | Start exploratory investigation |
| /subsystem-discover | Document emergent architectural pattern |
| /discover-subsystems | Discover subsystems from code analysis |
| /cluster-rename | Batch-rename chunks by prefix |
| /friction-log | Capture a friction point |
| /decision-create | Add architectural decision record |
| /validate-fix | Iteratively fix validation errors |
| /orchestrator-submit-future | Submit FUTURE chunks for parallel execution |
| /orchestrator-investigate | Investigate stuck orchestrator work units |
| /migrate-managed-claude-md | Migrate CLAUDE.md to use magic markers |

Scale of this project

The project demonstrates its own workflow: 268 chunks, 15 narratives, 20 investigations, 8 subsystems — all created through this workflow, forming a navigable graph of every decision made.

Short conclusion

This is not just a docs repo and not just a CLI.

It is a workflow system where:

  • prompts generate structured docs
  • docs guide implementation
  • implementation points back to docs
  • the whole loop creates reusable institutional memory

That is why chunk-create and chunk-plan matter so much: they are the conversion point where temporary prompting becomes permanent, discoverable engineering context.
