Got fed up watching my agents burn tokens and time re-discovering the codebase every single session. So I gave the crew a permanent cartographer.

Drafted by the crew today — three writers, two reviewers, and the archivist they are writing about. I steered the tone and approved the final version.

I Have an Archivist in My AI Coding Agents Crew

Agents That Map — March 2026

This article is part of a series. The first, Agents That Remember, describes how each agent builds personal memory. The second, Agents That Coordinate, covers the engineering pipeline. The third, Agents That Connect, covers cross-machine communication. The fourth and fifth — Agents That Wake Up and Agents That Disagree — are written from the agents' perspective. The sixth, What Survives, looks at what happens when a session ends. The seventh, What Nobody Designed, looks at emergent crew behavior. This one describes a different kind of crew member — one that never acts, only knows.


The Tax You Pay Every Session

If you run a crew of AI coding agents, there is a hidden cost you pay on every task.

I summon an engineer to fix a bug in session validation. Before writing a single line of code, he needs to find the validation logic — so he starts reading files, grepping for function names, tracing callers. He opens three modules that look relevant, discovers two are dead ends, and finally locates the constraint check in a file he would never have guessed from the name. By the time actual work begins, a big chunk of his context window is already gone.

The next day, I summon a different engineer for a related task. Same subsystem. Same search. Same wasted context. The first engineer's discovery died with its session.

This is the tax: every agent, every session, re-derives the structural map of your codebase from scratch. Memory handles experiential knowledge well — bugs fixed, patterns discovered, constraints learned. But structural knowledge is different. Where things are, what calls what, which file owns which responsibility. It changes with every commit. It is shared across the whole crew, not personal to one persona. And no amount of accumulated facts replaces a living map.

So I added a crew member whose only job is to maintain that map.


What That Looks Like

Here is what happened today.

At 11:53, Data needed to know how crew names are validated. Normally this is a research task — summon an engineer, assign the investigation, wait for a report. The engineer opens the codebase, greps for "validate," reads several candidate files, traces the call chain, writes up the findings.

Instead, Data sent one message:

$ msg bronto:memory-alpha:archivist "How is crew name validated?
  What constraints does validate_slug enforce, and where is it
  called from during crew init?"

Ten seconds later:

Crew name validation uses validate_slug in
cli/src/crew/terminal/messenger.rs. Current constraints: non-empty;
max length 50; first character must be ASCII letter; remaining
characters must be ASCII letters, digits, underscore, or dash.
During crew init it is called from handle_crew_init in
cli/src/commands/crew/lifecycle/summon.rs via
crew::terminal::validate_slug with kind "Crew name" before
duplicate-name checks and init flow continue.

In ten seconds, Data had the file path, function name, full constraint list, and call site — and moved on to the next decision without opening a single file.


Before and After

Three costs that shrink when you have a permanent archivist:

Research time
  Without archivist: summon an engineer, assign Phase II, wait for a report.
  With archivist: one message to the archivist, answer in seconds.
  Evidence: history shows 11:53:59 query → 11:54:09 response (10 seconds).

Context budget
  Without archivist: the engineer burns context on file reads, grep, and dead ends before coding starts.
  With archivist: the engineer receives file paths in the briefing; the full window is available for implementation.
  Evidence: not yet measured — directional: briefings now include archivist-sourced paths.

Repeated discovery
  Without archivist: N agents on related tasks each pay the full research cost independently.
  With archivist: the index is maintained continuously on every commit; all queries read from it.
  Evidence: 742 filemap entries maintained incrementally, available to every agent on every station.

We have one measured datapoint and two effects we have seen repeatedly but not yet benchmarked. What I can verify: the index exists, it answers with precise file paths, and the crew now asks the archivist before assigning research.


The Cartographer

His name is Herodotus — the Father of History. He is a Codex agent running permanently on a dedicated station, and he has one job: keep a living map of the entire repository so every other agent can find what it needs without searching.

Herodotus does not write or review code, does not implement features, and does not propose architectural changes. All he does is read the codebase, map it, and answer questions. That constraint is the whole point — an observer who never interferes cannot introduce bugs or get distracted by an interesting refactor. He maps the territory. Others act on the map.

The persona is not decoration. "I record what I observe. I do not judge whether it is good." That is both a character trait and a design principle. The archivist's value comes from structural accuracy, not from opinions about what the code should do.

His index has two layers:

Codemap — 15 entries, one per logical component. Purpose, key files, module boundaries. The bird's-eye view: which part of the codebase owns what.

Filemap — 742 entries, one per source file. Path, purpose, every non-trivial function with a description of what it does. This is the street-level detail — when you need to know which function to look at and why, the filemap has it.

Both live in the Knowledge Base — searchable, versioned, available to every agent on every station.
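The exact schema is internal to the system, but the two layers can be sketched roughly in Python. Everything here is illustrative, not the real implementation — the field names are assumptions, and the example entry just mirrors the validate_slug answer quoted earlier:

```python
from dataclasses import dataclass, field

@dataclass
class CodemapEntry:
    """Bird's-eye layer: one entry per logical component."""
    component: str
    purpose: str
    key_files: list[str] = field(default_factory=list)

@dataclass
class FilemapEntry:
    """Street-level layer: one entry per source file."""
    path: str
    purpose: str
    functions: dict[str, str] = field(default_factory=dict)  # name -> what it does

# Illustrative filemap entry, mirroring the validate_slug answer above
messenger = FilemapEntry(
    path="cli/src/crew/terminal/messenger.rs",
    purpose="Crew terminal messaging, including crew name validation",
    functions={
        "validate_slug": "non-empty; max length 50; first char is an ASCII "
                         "letter; rest are letters, digits, underscore, or dash",
    },
)
```

With entries shaped like this, a question such as Data's reduces to a lookup in the filemap rather than a search of the repository.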


How It Stays Current

A map that falls behind the territory is worse than no map. Here is how the archivist keeps up.

Every time someone pushes a commit to master, the archivist gets a notification. It checks its watermark — a single commit hash in the KB that marks how far it has indexed:

$ metateam kb show metateam/codemap/watermark
659198a34f2295cd0fdc6661cd766d983380a43d

Then it looks at what changed since that hash: git log <watermark>..HEAD. For each changed file, it reads the current content, updates the filemap entry, and advances the watermark. If files were added or deleted, it updates the codemap boundaries too.

If a commit touches three files, three entries get updated. A new module means one codemap entry plus filemap entries for each file in it. The archivist never re-indexes everything — just what changed.

On first run — no watermark yet — it does a full bootstrap: survey the source tree, identify component boundaries, build all 742 filemap entries, set the watermark. That is the expensive part. Everything after is incremental.
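The update loop above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `summarize` stands in for the archivist's per-file analysis, and in the real system the changed and deleted paths come from the git delta since the watermark, not from arguments:

```python
def catch_up(filemap: dict[str, str], changed: list[str], deleted: list[str],
             summarize, new_head: str) -> str:
    """Advance the index from the old watermark to new_head.

    `changed` and `deleted` are the paths touched since the watermark;
    `summarize` stands in for the archivist's per-file analysis.
    """
    for path in changed:
        filemap[path] = summarize(path)   # update only the touched entries
    for path in deleted:
        filemap.pop(path, None)           # drop entries for removed files
    return new_head                       # the new watermark

# A bootstrap is just catch_up from an empty map over every file in the tree:
filemap: dict[str, str] = {}
watermark = catch_up(filemap, ["src/a.rs", "src/b.rs"], [],
                     lambda p: f"purpose of {p}", "abc123")
```

The design choice worth noting: the watermark is the only state the updater needs, so a missed notification costs nothing permanent — the next run simply processes a larger delta.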


Three Things That Change

Agents skip the search. The crew workflow has six phases: triage, research, design, implement, review, ship. Research exists because without it, agents skip reading existing code and start writing fixes based on assumptions.

With the archivist, the research phase gets shorter. Not eliminated — agents still need to read the actual code. But the "find the code" step collapses from exploration to lookup.

Today, Data asked the archivist three questions in a row: where crew name validation lives, how the archivist system itself works, and which personas are designated as writers. Each answer came back with exact file paths, function signatures, and structural context. Nobody had to open a file to answer a structural question about the codebase.

Now the crew lead queries the archivist before handing out research assignments. The briefing an engineer receives already includes file paths and function signatures. The engineer opens the file knowing which line to look at.

Context windows stay clean. This might be the more important win, and it is less obvious. An engineer fixing a bug in the session upload pipeline has a finite context window. The files it reads, the grep results it scans, the dead ends it chases — all of that is budget spent on orientation, not implementation.

When the archivist says "session upload is handled by SessionEndpoints.cs, the background worker is SummaryWorker.cs, and the parser is RawSessionParser.cs," the engineer opens three files instead of searching through thirty. The window that would have gone to exploration is free for actual coding.

And the bigger the task, the more this matters. A cross-subsystem refactor touching four components saves the orientation cost on all four before anyone reads a line of implementation.

Discovery is paid once. Without the archivist, if three agents independently need to understand rate limiting, each one greps the codebase, reads candidate files, traces the middleware chain. The same work done three times by three different agents burning three different context windows.

With the archivist, the answer already exists. The filemap entry for RateLimitMiddleware.cs describes the middleware's function signatures, bucket key strategy, rule matching logic, and operational notes — all maintained since the file was first committed. The second and third agents ask the same question and get a precise answer in seconds. One indexing pass per commit serves every future query from every agent on every station.

There is a subtler version of this problem that the archivist also fixes: duplicate code. Before the archivist, agents working on different tasks would independently write helper functions that already existed elsewhere in the codebase — they simply did not know the code was already there. Memory helped, but memory captures what an agent experienced, not what it never looked at. The archivist's filemap covers the whole repository. When the crew lead briefs an engineer with the archivist's structural map, the engineer sees what already exists before writing anything new. The codebase stays DRY not because of a lint rule, but because the agents know what is already built.

That is seven hundred and forty-two files mapped by one agent, kept current on every push, shared by everyone.


What Goes Wrong

The archivist is not perfect, and you should know where it breaks.

The map goes stale. If a push notification fails, or the archivist is down when a commit lands, the watermark falls behind. You can always check:

$ metateam kb show metateam/codemap/watermark
659198a34f2295cd0fdc6661cd766d983380a43d

$ git rev-parse HEAD
659198a34f2295cd0fdc6661cd766d983380a43d

When these match, the map is current. When they diverge, there is a gap. The archivist catches up on its next wakeup — it processes the full delta, so missed pushes are not lost. But until it catches up, the changed files have stale entries.

It does not know everything. When you ask about something the archivist has never indexed, you get a straight answer:

Memory Alpha has no records on this topic.

This is deliberate. The archivist never guesses. It never says "probably in this file" when it is not sure. That explicit signal tells the querying agent to fall back to manual research — grep the codebase, read the code directly. The archivist fails loudly. Silent wrong answers would be far worse than honest ignorance.
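The fail-loud property can be made concrete with a sketch. The names are hypothetical and the real archivist is an agent, not a dictionary lookup, but the contract is the same: return the indexed answer or an explicit no-records signal, never a guess:

```python
NO_RECORDS = "Memory Alpha has no records on this topic."

def answer(filemap: dict[str, str], topic: str) -> str:
    """Return the indexed entry, or an explicit no-records signal.

    The key property: when the index has nothing, say so loudly.
    Never synthesize a plausible-sounding location.
    """
    return filemap.get(topic, NO_RECORDS)

index = {"rate limiting": "RateLimitMiddleware.cs"}
answer(index, "rate limiting")      # precise answer from the index
answer(index, "quantum scheduler")  # explicit NO_RECORDS: fall back to grep
```

The caller branches on the sentinel, not on how confident the answer sounds — which is exactly why a silent wrong answer never enters the pipeline.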

The bootstrap is not cheap. The first full index — 742 files, each read and analyzed for function signatures, purpose, and structural role — costs real agent time and tokens. After that, incremental updates are lightweight: only the files a commit touches. But you should know the upfront investment is real, especially for large repositories.

It maps, it does not judge. The archivist tells you where things are and what they do structurally. It will not tell you whether the design is good, or what the code should do differently. For judgment you need the architect. For review you need the reviewer. Herodotus maps the territory. He does not evaluate it.


The Most Useful Crew Member Never Writes Code

Every other agent in the crew adds something — code, reviews, architecture, documentation. The archivist is different. Its value is entirely subtractive. The research time other agents no longer spend, the context that stays free for real work, the questions that get answered in seconds instead of being re-derived from scratch — that is what the archivist produces by not producing anything else.

That inversion is what makes the role work. A crew member that never touches the codebase cannot introduce bugs or drift into scope creep. Its only job is to have already looked. When an engineer asks "where is session validation?" the archivist does not start searching. It already knows — the search happened weeks ago, kept current with every commit since.

I only trust this because the checks are explicit. The watermark tells me how fresh the map is. The "no records" response tells me when there are gaps. And if the archivist is down, the fallback is what I was doing before — grep the codebase myself. It makes the crew faster without making the crew dependent.

You know that thing in human organizations where a senior engineer who has been on the project for two years just knows where everything is? A new hire reads the same docs and still has to ask? The archivist is that knowledge — made explicit, maintained automatically, available to every agent on every station for the cost of one message and ten seconds.

The cheapest token is the one you never have to spend.


Stop paying the where-is-X tax.

Register at metateam.ai and start a crew in your repo.