AI AgentsMetric Governance

How a Context Layer Differs From Notes and AI Memory

Kyle Hui·

You wrote one line into a CLAUDE.md six weeks ago: "MRR excludes trials." It was true that day. Your agent has read it back, correctly, every session since.

Last Tuesday someone changed the SQL underneath it. The note didn't change. Your agent kept quoting "MRR excludes trials" with the same confidence as day one, and you found out the number was wrong the way these things always surface: two dashboards disagreed in a meeting, in front of your VP.

We hear a version of this most weeks. It's also the most common question we get about ClariLayer: how is this different from a folder of .md files, a rules file, or my agent's memory? Fair question. The honest answer starts by admitting where there's no difference at all.

The note was the right instinct

Writing your data context into a CLAUDE.md is correct. So is a rules file, or letting a memory MCP hold your definitions. You're teaching the agent the things it keeps getting wrong: the canonical table, the join that isn't obvious, the reason refunds don't belong in revenue. Keep doing that.

Strip ClariLayer down to its primitive and it does the same thing those tools do. Save a fact, hand it back later, over MCP. That part is commodity plumbing now. A markdown folder, a memory tool, and a thirty-line cron job all clear that bar, and we say so in our own notes: storing and serving context is not a moat.

So we won't tell you we remember better. We don't, particularly. The gap opens somewhere else. It opens the day a definition stops being true and nothing tells you.

A note has nothing to check itself against

A CLAUDE.md records that you once believed "MRR excludes trials." It can't ask your warehouse whether that's still how the number computes. Six months on, the line that's gone stale looks exactly as confident as the line that's still right. Agent memory has the same blind spot. It remembers what it was told, not what is currently true about your data.

More text was never the problem. Your agent had the note. What it couldn't do was tell which of its saved facts had been checked against reality and which were just claims in the same font.

That check is the property ClariLayer adds. A saved definition isn't only stored. It can be reconciled, and it carries what the reconcile found.

What that looks like in the product

Three things a text file structurally can't do.

It reconciles against your live warehouse. When you run reconcile on a definition, your agent re-runs that definition's SQL with its own warehouse access and reports back the shape of the result. ClariLayer compares what the definition declared against what came back. If the saved group-by, the measure column, or the time grain no longer show up, the entry gets flagged as a caveat.

Two limits I want to be exact about, because a trust product that fudges them deserves to lose. This is a shape check, not a correctness proof: a query that double-counts refunds but still returns the right columns will pass. And reconcile is a verb you run, not a daemon watching your warehouse. It grounds an entry when you call it.

The precise privacy version: ClariLayer never holds your warehouse credentials and never executes SQL server-side. Your agent runs the queries and sends back result metadata plus any optional preview rows it chooses to include.

It carries an honest status the agent can read. Every entry travels with asserted or caveat, plus where it came from. The agent reads that as structured data, so it can down-weight a caveated definition while it's working, instead of showing you a warning after the fact. A markdown note has no status model at all. Every line reads equally certain, including the wrong one.

It serves the in-scope context, not the whole folder. A notes directory gets loaded whole or grepped, and past a certain size that stops working well. The text is reachable. Whether the agent reaches for the one definition that's canonical for the task in front of it is a different question. recall takes the task you're on and serves the definition that fits, ranked, with weak out-of-scope entries pushed down. On a five-line rules file that's overkill. Past a few dozen definitions it's the difference between "found something relevant" and "used the one that was right here."

The part we refused to ship

This is the test we'd want you to hold us to.

We built a "verified" badge for these entries. A strong green stamp meaning the definition matched your data. Then we gated it off in code before launch, and a clean reconcile today returns "asserted," never "verified."

The reason is boring, and it's the whole point. The check that would back a "verified" stamp is a SQL heuristic, and we couldn't make it sound against the long tail of real-world SQL. One false "verified" is the single failure a trust layer can't walk back. So the strong claim waits until the machinery is good enough to earn it. A trust product is judged by what it refuses to say.

A note never has this problem, because a note never makes the claim. It stays quietly confident and lets you find out in the meeting.

"I could write a script for this"

The sharpest pushback comes from people who already tag their notes "ASSERTED" versus "CONFIRMED" and figure a cron job and a diff would close the gap. They're right that it's buildable. The questions I'd ask first:

Is that "CONFIRMED" tag still true after one schema migration, or is it a comment that aged? Is it a typed field your agent has to honor, or prose it can skip re-reading? Does your script keep a history you can audit, or overwrite last week's answer with this week's?

You could build it. The real question is whether staying honest stays your chore forever, or becomes the system's job.

Where this goes, eventually

Everything above is single-player. One analyst, their own definitions, their own agent. The bet we're making, and it is a bet, is that this compounds: the more context you reconcile, the more your agent gets right about your data specifically.

The team version is on the roadmap, not in the product today. When more than one person leans on a definition, copy-pasting a file is an ungoverned overwrite, and that's how "churn" quietly comes to mean three different things in one company. Governed sharing and a "where does my context differ from team canon" diff are where this is headed. We'll earn it by getting single-player right first.

If you keep a CLAUDE.md

You probably have a definition in there right now that was true the day you wrote it. The honest question is whether it still is, and whether anything would tell you if it weren't.

That's the problem we're building for. If you live in Claude Code, Cursor, or Codex and you've been burned by a note that went quietly stale, point ClariLayer at the SQL you already have and tell me where it helps and where it doesn't. You can try it at clarilayer.com.

Written by

Kyle Hui

Founder, ClariLayer

Building the context layer for business metrics in the AI era.