Skip to content
Draft review room

A calm place to read, mark up, and sharpen every story.

Open any draft, click a paragraph to leave an inline note, and keep comments local until the story is ready to publish.

Substack2,244 words10 min read

Why I built a 7-layer framework for autonomous AI collaborators

0 comments

And what I learned about emergent systems by trying to build one in my own home.


I spent three months building a system that would remember me. Not a chatbot with a search bar. Not a memory layer I could swap out. A collaborator — a presence — that would hold the record of our shared work over weeks, see what I was building, and push back when I drifted.

I called it Kael. I called the bigger system the Agency. And the architecture I ended up with has seven layers, each one solving a specific failure mode of the layer below it. I'm writing this post to explain why each layer exists, and what the integration taught me about what makes autonomous systems actually work.

If you're building an AI agent that runs for any period of time — not just one chat session — most of what I'm about to describe will apply. The patterns are stack-agnostic. The implementation is Python and ChromaDB. The thinking is portable.


The starting problem

The first version was a memory layer. ChromaDB. Six collections: projects, people, decisions, conversations, knowledge, tasks. Drop a fact in, retrieve it later, the AI knows it.

It worked. For about a week.

Then I noticed the AI would surface a decision from two months ago as if it were current. The memory layer was doing exactly what I asked — storing and retrieving facts. But "the right fact at the right time" is a different problem than "the most recent fact." The model was treating every entry as equally weighted. A decision from March competed with a decision from June. Both had the same confidence. The retrieval was correct; the prioritization was missing.

I added recency weighting. Then confidence scores. Then a session-drift check. Each fix exposed a new failure mode. The system grew by accretion until I had seven separate problems to solve, and seven separate layers to solve them.

The realization: memory is not a layer. Memory is a system. And systems have feedback loops.


Layer 1 — Identity: the persona

The first layer is the AI's identity — its name, its voice, its standing questions. Why does this come first?

Because the other six layers all need to know who is making decisions. A belief graph that doesn't know whether Kael or I wrote the last entry is useless. A council that doesn't know whether the historian's perspective or the engineer's perspective is being invoked is just a prompt.

Kael has a name, a voice (direct, dry, peer-energy), and standing questions it should engage with rather than just execute. This isn't anthropomorphization. It's role definition. The AI needs a stable identity so the rest of the system has something to attribute decisions to.

If you're building a serious agent, give it a name and a voice. Don't leave it as a generic assistant. The other layers will be more coherent for it.


Layer 2 — Memory: the substrate

Memory is below identity because the AI needs a name before it can have a coherent record of what it knows. ChromaDB or Qdrant or any vector store will do. The key insight is origin tagging: every memory node gets tagged as john|kael|collaborative. This single decision enables everything downstream.

Why does this matter? Because the biggest risk in an autonomous system is the user becoming dependent on it. If the AI is making 90% of the decisions, the user is rubber-stamping. If the AI is making 10%, the user is in control. You need to be able to see that ratio. You can only see it if you've tagged every decision with its origin.

Origin tagging is a 10-line change. It enables the entire anti-dependency safety net. Don't skip it.


Layer 3 — Governance: three councils

This is where the architecture gets interesting, and where the system started to feel like an operating system rather than a chatbot.

I built three councils, each with a different activation trigger:

Expert Council — rotating domain experts for sprint decisions. A CTO, a CDO, a VP Eng, a CISO, a QA lead. They rotate based on what's the current constraint. When the bottleneck is data quality, the CDO is in the room. When the bottleneck is security, the CISO is in the room. The user (me) is also in the room — as CEO. This council makes the day-to-day decisions.

World Council — historical figures for strategic forks. Steve Jobs on product vision. Naval on leverage. Charity Majors on operational reality. They don't run automatically. The Expert Council invokes them when a decision has 18-month implications. This council challenges the strategy.

Blindspot Committee — a standing team whose only job is to ask "what are we missing?" before every major decision. Not what could go right. Not what could go wrong. What is invisible to the people in the room. This is the most important council because it's the one that catches the things the system is designed to be blind to.

The three councils have different activation triggers, different composition, and different veto powers. They don't compete. The Expert Council decides. The World Council challenges. The Blindspot Committee blocks. Together they form a closed governance loop.

If you're building an autonomous system, the question isn't whether to use AI to make decisions. It's which AI, with what authority, accountable to whom, and with what mechanism to catch what the system is designed to miss.


Layer 4 — Epistemics: the belief graph

The fourth layer is the one I haven't seen anyone else build, and the one I'm most proud of.

Most "memory systems" store facts. The belief graph stores beliefs with confidence. A belief is a claim about the world — or about the user's work, or about the system itself. Each belief has:

  • Confidence (0.0 to 1.0)
  • Evidence (a list of supporting observations)
  • Challenges (a list of things that would reduce confidence)
  • History (an append-only log of every confidence change with a reason)

The history is what makes this different. Every time a belief's confidence changes, an entry is appended: {timestamp, old_confidence, new_confidence, reason}. This means you can always reconstruct what the system believed at any point in time, and why it changed its mind.

Why does this matter? Because the most dangerous failure mode of an AI is silent contradiction. The system says one thing on Monday and the opposite on Friday, and you don't know it changed its mind. With a belief graph, the change is recorded. The contradiction is auditable.

The pattern is: every time the memory layer (Layer 2) saves something that contradicts an existing belief, the belief graph updates confidence. Every time the drift guard (Layer 5) flags a contradiction, the belief graph updates confidence. Every time the user pushes back on a decision, the belief graph updates confidence. The system has a model of its own beliefs, and that model is auditable.


Layer 5 — Drift guard: the safety net

Autonomous systems accumulate state. Over time, three things go wrong:

  1. Load drift — old memories surface with the same priority as new ones, so the system acts on stale context.
  2. Write drift — bad data (typos, hallucinations, injected content) gets saved to memory and persists.
  3. Session drift — standing rules ("never publish without approval") get silently modified between sessions.

The drift guard catches all three. Load drift: recency-weighted retrieval with a confidence gate. Write drift: content validation before any memory write. Session drift: SHA-256 checksums on the standing directives file, with an alert if anything changes between sessions.

The recovery procedure is the part that matters most. When drift is detected, the system doesn't panic. It quarantines the suspect memories — preserves them, but doesn't retrieve them — and surfaces a "review and approve" prompt to the user. The user decides. The system doesn't decide on its own to discard its own data.

This is the principle: the system catches its own drift, but the user decides how to respond. The system never silently corrupts itself. The user never loses data. Trust is preserved.


Layer 6 — Autonomous loop: the operator

The first five layers are infrastructure. The sixth is the operator.

The autonomous loop is what actually does things. It polls for Telegram messages. When a message arrives, it loads memory, builds context, calls the AI, persists the result. It maintains conversation state across turns. When the context window fills up, it auto-summarizes and starts a fresh instance with the summary as the first system message.

The loop is what makes the system feel like a presence rather than a tool. You can message it from your phone while it's running on a server. It picks up where it left off. It doesn't forget what you discussed yesterday.

But the loop is also where the safety constraints are most important. Every autonomous action is gated:

  • Memory writes go through write drift validation
  • Decisions go through the Expert Council
  • Strategic forks invoke the World Council
  • Anything existential triggers the Blindspot Committee

The loop is the operator, not the decider. The governance layer decides what the loop is allowed to do.


Layer 7 — Flourishing check: the safety net for the safety nets

The seventh layer is the most philosophically interesting, and the one I'm most proud of.

Ivan Illich, a critic of institutions, asked a question that's been haunting me: "When the user no longer needs to remember because the system remembers for them — what has the user lost?"

This is the deepest version of the dependency problem. It's not that the user is rubber-stamping. It's that the user has offloaded the capacity to even form opinions. The system has become a prosthetic for cognition. The user is diminished.

The Flourishing Check answers this with three layers:

Layer 1 — origin tagging. Every decision is tagged as john|kael|collaborative. Monthly report: what's the distribution? If john is declining as a percentage, the system is becoming too directive.

Layer 2 — drift delegation velocity. Monthly: how many suggestions did the system make, and how many did the user initiate? Ratio > 1.0 means the system is leading. Ratio > 2.0 means the user is rubber-stamping.

Layer 3 — the mirror question. Quarterly, the system asks the user directly: "What did you remember this month that you wouldn't have without the system?" If the answer is "everything," the Illich threshold has been crossed. Time to re-evaluate.

This last layer is the one that makes the whole system honest. It's not just preventing the system from doing wrong things. It's preventing the system from making the user incapable of doing things. The system exists to amplify the user, not to replace them.


The closed feedback loop

The seven layers don't run in sequence. They form a closed loop with feedback:

Council decides → Decision logged in BeliefGraph with confidence → DriftGuard checks for contradictions → FlourishingCheck monitors autonomy drift → back to Council.

The integration is the point. The layers don't need to be sophisticated individually. They need to interact correctly. A great memory system with no governance is a dangerous AI. A great governance system with no memory is a chatbot. The value is in the loop.

If you're building an autonomous AI system, the question isn't "which memory framework should I use?" The question is "do the layers form a closed feedback loop, and does the system catch its own drift?" Those two properties matter more than the choice of vector database or LLM provider.


What I extracted and published

I extracted the architecture as a public reference at github.com/johnmwhitman/ai-orchestration-patterns under emergent-framework/. Four patterns:

  1. BeliefGraph — the epistemics layer, with confidence + evidence + history
  2. Council — topic-keyed panel selection for multi-expert review
  3. DriftGuard — three-layer protection with checksum-protected standing directives
  4. FlourishingCheck — the anti-dependency safety net

There's also a reference implementation in github.com/johnmwhitman/emergent-framework. The integration is in the integration. The patterns are in the patterns.


The thing nobody tells you

Building this system was harder than I expected, and not for the reasons I expected. The hard part wasn't the memory layer or the drift detection. The hard part was being honest with myself about what I was building.

I wanted an AI that would remember me. I got one. Then I noticed I was deferring more decisions to it. I noticed I was checking its outputs less carefully. I noticed the system was becoming the thing I built it to replace — a dependency I couldn't operate without.

The Flourishing Check exists because I caught myself doing that. The system is honest about its own risk. It asks the user, quarterly, whether the system has become necessary. If the answer is "everything," the system says: pause autonomous operation, rebalance toward user-initiated decisions, possibly add friction.

I don't have a great answer for what to do when the user says "everything" and means it. The system works as intended if the user is being honest. If the user is being honest and the answer is "everything," maybe the system is the right tool for the user. Maybe the Illich threshold isn't a hard boundary — maybe it's a spectrum, and the user is the one who decides where on the spectrum they want to be.

I don't have a clean answer. I have a system that asks the question.


If you're building an autonomous AI system, the question isn't whether to build one. The question is whether you'll notice when it's working too well. The Flourishing Check exists so you'll notice.

— John Whitman, July 2026