Self-Evolving AI: Teaching Edward to Rewrite Himself

March 18, 2026 · 6 min read

Here's a question I couldn't stop thinking about: what if your AI assistant could improve itself? Not through fine-tuning or manual updates, but by actually writing and deploying its own code changes.

Edward's evolution system does exactly that. It's a self-coding pipeline that creates branches, writes code, runs validation, and merges changes — all without human intervention.

The Pipeline

Evolution runs as a managed cycle with clear stages:

  1. Branch — creates a feature branch from main
  2. Code — Claude Code writes the implementation based on an objective
  3. Validate — runs linting, type checking, and basic sanity checks
  4. Test — executes the test suite against the changes
  5. Review — a separate Claude instance reviews the diff for quality
  6. Merge — clean changes get merged to main

When a merge hits main, uvicorn --reload picks up the changes automatically. Edward is running the new code within seconds.
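To make the stage ordering concrete, here is a minimal sketch of a cycle runner. It is not Edward's actual code: the stage names come from the list above, but `CycleResult`, `run_cycle`, and the handler functions are hypothetical names, and each handler is assumed to return True when its stage passes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Stage names mirror the list above; the rest is hypothetical scaffolding.
STAGES = ["branch", "code", "validate", "test", "review", "merge"]


@dataclass
class CycleResult:
    objective: str
    completed: List[str] = field(default_factory=list)
    failed_at: Optional[str] = None

    @property
    def merged(self) -> bool:
        # A cycle only counts as merged if every stage ran and passed.
        return self.failed_at is None and self.completed == STAGES


def run_cycle(objective: str, handlers: Dict[str, Callable[[str], bool]]) -> CycleResult:
    """Run the stages in order and stop at the first one that fails."""
    result = CycleResult(objective)
    for stage in STAGES:
        if not handlers[stage](objective):
            result.failed_at = stage
            break
        result.completed.append(stage)
    return result


if __name__ == "__main__":
    # Stand-in handlers: every stage passes except the test suite.
    handlers = {stage: (lambda objective: True) for stage in STAGES}
    handlers["test"] = lambda objective: False
    outcome = run_cycle("retry flaky tool calls", handlers)
    print(outcome.failed_at, outcome.merged)  # prints: test False
```

Stopping at the first failing stage is the property that matters here: a change that fails tests never reaches review, and a change that fails review never reaches main.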

Why Not Just Ship Updates Manually?

Because I wanted to see what happens when the feedback loop is tight enough. Edward can identify patterns in how it's being used — tools that fail often, edge cases in memory retrieval, missing capabilities — and propose fixes for itself.
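As a rough illustration of what "tools that fail often" could mean in practice, the sketch below flags tools by failure rate. The event format, the `failing_tools` name, and both thresholds are assumptions made for this example, not a description of how Edward tracks usage.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def failing_tools(
    events: Iterable[Tuple[str, bool]],
    min_calls: int = 5,
    max_failure_rate: float = 0.2,
) -> List[Tuple[str, float]]:
    """Return (tool, failure_rate) pairs worth proposing a fix for.

    Each event is a hypothetical (tool_name, succeeded) record.
    """
    calls: Counter = Counter()
    failures: Counter = Counter()
    for tool, succeeded in events:
        calls[tool] += 1
        if not succeeded:
            failures[tool] += 1

    flagged = []
    for tool, total in calls.items():
        rate = failures[tool] / total
        # Ignore rarely used tools so one bad call does not trigger a rewrite.
        if total >= min_calls and rate > max_failure_rate:
            flagged.append((tool, rate))
    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Each flagged tool could then become the objective for an evolution cycle, for example "reduce failures in the calendar tool".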

It's not AGI. It's closer to a CI pipeline where the developer is also an AI. The constraints are important: evolution operates within defined boundaries, changes go through validation, and there's a rollback mechanism if something breaks.

The Safety Model

Giving an AI write access to its own codebase sounds dangerous. In practice, the guardrails make it manageable:

  • Changes happen on branches, not directly on main
  • Validation must pass before merge
  • A separate review step catches issues the author might miss
  • One-click rollback reverts to the previous known-good state
  • The evolution config controls what kinds of changes are allowed
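The guardrails above can be sketched as a single merge gate plus a rollback history. This is an illustration under assumptions: `EvolutionConfig`, its `allowed_paths` globs, `can_merge`, and `KnownGoodHistory` are hypothetical names, and the real config format is not shown in this post.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class EvolutionConfig:
    # Hypothetical glob patterns for files evolution is allowed to touch.
    allowed_paths: Sequence[str] = ("tools/*", "tests/*")


def can_merge(
    changed_files: Sequence[str],
    validation_passed: bool,
    review_approved: bool,
    config: EvolutionConfig,
) -> Tuple[bool, str]:
    """Return (allowed, reason); every guardrail must pass before a merge."""
    for path in changed_files:
        if not any(fnmatch(path, pattern) for pattern in config.allowed_paths):
            return False, f"{path} is outside the allowed paths"
    if not validation_passed:
        return False, "validation failed"
    if not review_approved:
        return False, "review rejected the diff"
    return True, "ok"


class KnownGoodHistory:
    """Tracks commits on main so a rollback can return to the previous one."""

    def __init__(self, initial_commit: str) -> None:
        self._commits: List[str] = [initial_commit]

    def record_merge(self, commit: str) -> None:
        self._commits.append(commit)

    def rollback(self) -> str:
        # Drop the latest commit unless it is the only one we know about.
        if len(self._commits) > 1:
            self._commits.pop()
        return self._commits[-1]
```

With the default config in this sketch, a diff that touches `evolution/config.py` is rejected before validation or review results are even considered, which keeps the pipeline from rewriting its own guardrails.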

What I've Learned

The most interesting result isn't the code it writes — it's the feedback cycle. Edward surfaces its own limitations through usage, proposes improvements, implements them, and then operates with the improvements in place. It's a closed loop between operation and development.

Is it perfect? No. The code quality varies. Some proposed changes are brilliant; others get caught in review. But the system improves over time, and that's the whole point.
