Self-Evolving AI: Teaching Edward to Rewrite Himself
Here's a question I couldn't stop thinking about: what if your AI assistant could improve itself? Not through fine-tuning or manual updates, but by actually writing and deploying its own code changes.
Edward's evolution system does exactly that. It's a self-coding pipeline that creates branches, writes code, runs validation, and merges changes — all without human intervention.
The Pipeline
Evolution runs as a managed cycle with clear stages:
- Branch — creates a feature branch from main
- Code — Claude Code writes the implementation based on an objective
- Validate — runs linting, type checking, and basic sanity checks
- Test — executes the test suite against the changes
- Review — a separate Claude instance reviews the diff for quality
- Merge — clean changes get merged to main
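The staged cycle above can be sketched as an ordered pipeline that aborts at the first failing stage, so a broken change never reaches the merge step. This is a minimal illustration, not Edward's actual implementation; the stage functions here are stand-in stubs.

```python
# Sketch of the evolution cycle: run stages in order, abort on first failure.
# Stage names mirror the post; the callables are hypothetical stubs standing
# in for the real branch/code/validate/test/review/merge steps.
from typing import Callable

Stage = tuple[str, Callable[[], bool]]

def run_cycle(stages: list[Stage]) -> tuple[bool, list[str]]:
    """Run each stage; stop at the first failure so nothing merges."""
    completed: list[str] = []
    for name, step in stages:
        if not step():
            return False, completed  # abort: merge never runs
        completed.append(name)
    return True, completed

stages: list[Stage] = [
    ("branch",   lambda: True),
    ("code",     lambda: True),
    ("validate", lambda: True),
    ("test",     lambda: False),  # simulate a failing test suite
    ("review",   lambda: True),
    ("merge",    lambda: True),
]

ok, completed = run_cycle(stages)
print(ok, completed)  # → False ['branch', 'code', 'validate']
```

The early-abort shape is the important part: review and merge simply never execute when validation or tests fail.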
When a merge hits `main`, `uvicorn --reload` picks up the changes automatically. Edward is running the new code within seconds.
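The hot-reload step is stock uvicorn behavior: the `--reload` flag watches the source tree and restarts the worker process when a file changes. The module and app names below are assumptions for illustration, not Edward's real entry point.

```shell
# --reload restarts the server whenever watched source files change,
# so a merged commit goes live without a manual restart.
uvicorn edward.app:app --reload --host 127.0.0.1 --port 8000
```

Note that `--reload` is intended for development; in a hardened deployment you would typically pair the same idea with a process supervisor instead.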
Why Not Just Ship Updates Manually?
Because I wanted to see what happens when the feedback loop is tight enough. Edward can identify patterns in how it's being used — tools that fail often, edge cases in memory retrieval, missing capabilities — and propose fixes for itself.
It's not AGI. It's closer to a CI pipeline where the developer is also an AI. The constraints are important: evolution operates within defined boundaries, changes go through validation, and there's a rollback mechanism if something breaks.
The Safety Model
Giving an AI write access to its own codebase sounds dangerous. In practice, the guardrails make it manageable:
- Changes happen on branches, not directly on main
- Validation must pass before merge
- A separate review step catches issues the author might miss
- One-click rollback reverts to the previous known-good state
- The evolution config controls what kinds of changes are allowed
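One way the last guardrail might look in practice is a diff gate that rejects changes touching protected paths or exceeding a size budget. The field names and limits below are assumptions; the post only says the config controls what kinds of changes are allowed.

```python
# Hypothetical evolution-config gate: reject a proposed diff if it is too
# large, touches a forbidden path (e.g. the evolution system itself), or
# strays outside the allowlisted areas. All paths/limits are illustrative.
EVOLUTION_CONFIG = {
    "allowed_paths": ["edward/tools/", "edward/memory/"],
    "forbidden_paths": ["edward/evolution/", "deploy/"],
    "max_files_changed": 10,
}

def change_allowed(changed_files: list[str], config: dict) -> bool:
    """Return True only if every changed file passes the config's rules."""
    if len(changed_files) > config["max_files_changed"]:
        return False
    for path in changed_files:
        if any(path.startswith(p) for p in config["forbidden_paths"]):
            return False
        if not any(path.startswith(p) for p in config["allowed_paths"]):
            return False
    return True

print(change_allowed(["edward/tools/search.py"], EVOLUTION_CONFIG))    # → True
print(change_allowed(["edward/evolution/core.py"], EVOLUTION_CONFIG))  # → False
```

Keeping the evolution machinery itself on the forbidden list is the design choice that matters: the AI can rewrite its capabilities, but not the guardrails that constrain the rewriting.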
What I've Learned
The most interesting result isn't the code it writes — it's the feedback cycle. Edward surfaces its own limitations through usage, proposes improvements, implements them, and then operates with the improvements in place. It's a closed loop between operation and development.
Is it perfect? No. The code quality varies. Some proposed changes are brilliant; others get caught in review. But the system improves over time, and that's the whole point.