Use ClariLayer with Databricks
Import your Databricks Metric Views into ClariLayer as canonical metric definitions, link your local context to them, and reconcile each against your own warehouse access.
If your metric definitions already live in a Databricks semantic layer, you don't have to re-type them into ClariLayer. Your agent reads a Databricks Metric View locally, normalizes it into structured models, and imports each one as a canonical metric definition — so your agent recalls the same governed metric your Metric View defines, in-flow, the next time you ask a data question. This guide walks the full loop: connect, import, link, and reconcile.
Throughout, the posture is the same as every other ClariLayer flow: your agent keeps its own Databricks access; ClariLayer never holds your Databricks credentials and never runs SQL server-side. ClariLayer stores the context your agent sends and the result metadata it reports back, plus any optional preview rows your agent chooses to send.
Connect: two separate accesses
This flow uses two connections, and ClariLayer is only one of them.
- Your agent's Databricks access (yours, not ClariLayer's). Your agent reaches Databricks with your own access — for example through Databricks' managed MCP servers, the Databricks CLI, or a SQL connector. Unity Catalog governs what that access can see. ClariLayer is never in this path: it holds no Databricks credentials and never connects to your workspace.
- ClariLayer over MCP. Install the clarilayer MCP server into your agent and mint a context key (see the Quickstart). This is where your agent saves, recalls, and reconciles context.
With both connected, your agent reads from Databricks with your access and writes the resulting context to ClariLayer. The two never share credentials.
Import a Metric View with semantic_model
A Databricks Metric View is defined in YAML and managed in Unity Catalog. ClariLayer does not parse that YAML server-side — there is no vendor parser. Instead, your agent reads the Metric View, normalizes it into vendor-neutral models, and sends them to bootstrap under the semantic_model source kind. Each model becomes one canonical metric definition.
A worked example
Say your Metric View defines monthly net revenue, something like:
version: 0.1
source: prod.finance.fct_orders
filter: status = 'settled'
dimensions:
  - name: order_month
    expr: date_trunc('month', ordered_at)
measures:
  - name: net_revenue
    expr: SUM(amount_usd)
Ask your agent to import it:
Read my net_revenue Databricks Metric View and bootstrap it into ClariLayer as canonical context.
Your agent normalizes that artifact into a semantic_model source and calls bootstrap:
{
  "kind": "semantic_model",
  "name": "Finance metric views",
  "dialect": "databricks_metric_view",
  "source_ref": "prod.finance.metric_views/net_revenue",
  "dataset": "finance",
  "models": [
    {
      "name": "net_revenue",
      "description": "Monthly settled net revenue, in USD.",
      "measure_column": "amount_usd",
      "aggregation": "SUM",
      "grain": "month",
      "grain_expression": "date_trunc('month', ordered_at)",
      "filters": [{ "expression": "status = 'settled'" }],
      "unit": "USD",
      "canonical_table": "prod.finance.fct_orders"
    }
  ]
}
The dialect is a free-form lowercase slug naming the source layer (databricks_metric_view); source_ref is an identifier for the artifact so provenance is never lost. Each model needs only a name; everything else enriches the contract.
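The normalization step your agent performs can be pictured roughly as follows. This is a minimal sketch, not ClariLayer's or Databricks' code: normalize_metric_view is a hypothetical helper, it assumes the Metric View YAML has already been parsed into a plain dict, and it only recognizes simple AGG(column) measures and date_trunc('<grain>', ...) dimensions. Fields the Metric View does not carry (description, unit) are simply left out.

```python
import json
import re

# Hypothetical helper, for illustration only. It recognizes just two shapes:
# a measure written as AGG(column) and a dimension using date_trunc('<grain>', ...).
_MEASURE = re.compile(r"^\s*(\w+)\(\s*(\w+)\s*\)\s*$")
_GRAIN = re.compile(r"date_trunc\(\s*'(\w+)'", re.IGNORECASE)


def normalize_metric_view(view, *, source_name, source_ref, dataset=None):
    """Map a parsed Metric View dict onto a semantic_model source payload."""
    # Take the grain from the first date_trunc dimension, if there is one.
    grain = grain_expr = None
    for dim in view.get("dimensions", []):
        match = _GRAIN.search(dim.get("expr", ""))
        if match:
            grain, grain_expr = match.group(1).lower(), dim["expr"]
            break

    models = []
    for measure in view.get("measures", []):
        model = {"name": measure["name"]}  # a name is the only required field
        match = _MEASURE.match(measure.get("expr", ""))
        if match:
            model["aggregation"] = match.group(1).upper()
            model["measure_column"] = match.group(2)
        if grain:  # never invent a grain the Metric View does not declare
            model["grain"] = grain
            model["grain_expression"] = grain_expr
        if view.get("filter"):
            model["filters"] = [{"expression": view["filter"]}]
        if view.get("source"):
            model["canonical_table"] = view["source"]
        models.append(model)

    source = {
        "kind": "semantic_model",
        "name": source_name,
        "dialect": "databricks_metric_view",
        "source_ref": source_ref,
        "models": models,
    }
    if dataset:
        source["dataset"] = dataset
    return source


view = {
    "version": 0.1,
    "source": "prod.finance.fct_orders",
    "filter": "status = 'settled'",
    "dimensions": [
        {"name": "order_month", "expr": "date_trunc('month', ordered_at)"}
    ],
    "measures": [{"name": "net_revenue", "expr": "SUM(amount_usd)"}],
}
payload = normalize_metric_view(
    view,
    source_name="Finance metric views",
    source_ref="prod.finance.metric_views/net_revenue",
    dataset="finance",
)
print(json.dumps(payload, indent=2))
```

In practice your agent does this mapping itself and can enrich each model with a description and unit that the raw YAML lacks.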
After this one call, ClariLayer creates an entry with:
- the name finance.net_revenue, in the form {dataset}.{model} (the source name is used as the prefix when you omit dataset);
- provenance semantic_model and canonical_status: canonical, which together mark it as imported canon: the governing definition of its concept, imported from your semantic layer rather than asserted by hand;
- a structured metric contract (grain month, aggregation SUM, the status = 'settled' filter, the source-of-truth table), built because the model carries a valid grain. A model without a valid grain still imports, but as a content-only definition with no contract (an honest no-contract degradation, never a fabricated grain);
- a metric.<slug> semantic_key (here metric.net_revenue): a concept-level identity key that lets this imported canon pair with your own local entry for the same concept.
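To make the metric.<slug> key concrete, here is a minimal sketch of how keys could be derived for the models in one call. The slug rule (lowercase, non-alphanumeric runs collapsed to an underscore) is an assumption made for illustration; ClariLayer's actual rule is not documented here. The two reason codes are the ones bootstrap reports for models that import without a pairing key.

```python
import re


def derive_slug(model_name):
    # Assumed slug rule, for illustration only: lowercase the name and
    # collapse every run of non-alphanumeric characters to "_".
    slug = re.sub(r"[^a-z0-9]+", "_", model_name.lower()).strip("_")
    return slug or None


def assign_semantic_keys(model_names):
    """Return (model_name, semantic_key, reason) for each model in one call."""
    claimed = set()
    results = []
    for name in model_names:
        slug = derive_slug(name)
        if slug is None:
            # The name did not yield a usable concept slug.
            results.append((name, None, "semantic_key_unavailable"))
        elif slug in claimed:
            # An earlier model in the same call already claimed this slug.
            results.append((name, None, "duplicate_semantic_key"))
        else:
            claimed.add(slug)
            results.append((name, f"metric.{slug}", None))
    return results


results = assign_semantic_keys(["net_revenue", "Net Revenue", "???"])
for row in results:
    print(row)
```

Under this assumed rule, the first model gets metric.net_revenue, while the other two still import but without a key.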
Like every other v1 entry, the import lands asserted, never "verified." asserted is the honest baseline: saved, but not yet checked against your data. To raise confidence, reconcile it (below).
Link your local context to the imported canon
You probably already have your own local notes or definitions for the same metric — an entry you saved with remember months ago. To tie them to the imported canon, give your local entry the same semantic_key the import derived. Ask your agent:
Set the metric.net_revenue semantic_key on my local "net revenue" definition so it links to the imported Databricks canon.
Your agent updates the local entry with remember, filling its semantic_key to metric.net_revenue. From then on the two entries share a concept-level identity and pair on that key. Both keep their own provenance and status — your local note stays yours; the import stays imported canon — but recall now knows they describe the same concept.
Reconcile through your agent's own warehouse access
The import is asserted until something checks it against the data. reconcile is how you ground it — and, true to the privacy posture, your agent runs the SQL, not ClariLayer.
Reconcile finance.net_revenue: run its definition against Databricks and check the result against what ClariLayer has.
Your agent runs the query with its own Databricks access, captures the result shape (the columns, an optional small preview of rows it chooses to include, an optional row count), and calls reconcile with that actual_sample. ClariLayer compares the declared signals against the reported result and records the outcome on the entry:
- a declared-vs-actual mismatch is flagged as a caveat, so you and your agent know to treat the definition with care;
- otherwise the entry stays asserted.
Those are the only two outcomes today. A clean reconcile pass does not stamp the entry verified in v1 — verified is the documented fast-follow, not today's behavior. ClariLayer never holds your Databricks credentials and never executes SQL server-side; your agent is the connector. Omit or redact any preview rows that carry sensitive values.
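As an illustration of what your agent might assemble before calling reconcile, the sketch below builds a result-shape summary from rows it has already fetched with its own Databricks access. The helper build_actual_sample and the field names inside the returned dict (columns, preview_rows, row_count) are assumptions for this example, not the documented actual_sample schema; check the reconcile reference for the real shape.

```python
def build_actual_sample(columns, rows, *, preview_limit=5, redact=(),
                        include_row_count=True):
    """Summarize a query result: column names, optional preview, optional count.

    Field names in the returned dict are hypothetical, chosen for illustration.
    """
    redacted = set(redact)
    sample = {"columns": list(columns)}

    # The preview is optional: pass preview_limit=0 to send no rows at all.
    if preview_limit > 0:
        sample["preview_rows"] = [
            {
                col: ("[REDACTED]" if col in redacted else value)
                for col, value in zip(columns, row)
            }
            for row in rows[:preview_limit]
        ]

    if include_row_count:
        sample["row_count"] = len(rows)
    return sample


columns = ["order_month", "net_revenue", "account_owner"]
rows = [
    ("2024-01-01", 1200.0, "alice"),
    ("2024-02-01", 900.0, "bob"),
]
sample = build_actual_sample(
    columns, rows, preview_limit=1, redact=["account_owner"]
)
print(sample)
```

Redaction happens on the agent side, before anything is sent, which matches the posture that ClariLayer only ever sees what your agent chooses to report.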
See conflicts in recall and the console
Where this gets useful: the moment your imported Databricks canon and your local definition disagree. Because they pair on metric.net_revenue, ClariLayer surfaces the canon-vs-local pair together in two places: inline in recall, so your agent sees the conflict the next time you ask a data question; and in the console, where you review the two definitions side by side and choose what to do — adopt the imported Metric View canon, keep your own local definition, or edit it. You stay in control of which one wins.
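Conceptually, pairing on a shared semantic_key works like a group-by. The sketch below is a mental model only, not ClariLayer's implementation: the entry dicts, the find_conflicts helper, and the choice of which contract fields to compare are all assumptions made for illustration.

```python
from collections import defaultdict


def find_conflicts(entries, fields=("grain", "aggregation", "filters")):
    """Pair canonical and local entries by semantic_key and list differences."""
    by_key = defaultdict(list)
    for entry in entries:
        if entry.get("semantic_key"):
            by_key[entry["semantic_key"]].append(entry)

    conflicts = []
    for key, group in by_key.items():
        canon = [e for e in group if e.get("canonical_status") == "canonical"]
        local = [e for e in group if e.get("canonical_status") != "canonical"]
        for c in canon:
            for l in local:
                # Compare only the chosen contract fields (an assumption).
                differing = [f for f in fields if c.get(f) != l.get(f)]
                if differing:
                    conflicts.append((key, c["name"], l["name"], differing))
    return conflicts


entries = [
    {
        "name": "finance.net_revenue",
        "semantic_key": "metric.net_revenue",
        "canonical_status": "canonical",
        "grain": "month",
        "aggregation": "SUM",
        "filters": ["status = 'settled'"],
    },
    {
        "name": "net revenue",
        "semantic_key": "metric.net_revenue",
        "grain": "month",
        "aggregation": "SUM",
        "filters": ["status IN ('settled', 'refunded')"],
    },
]
conflicts = find_conflicts(entries)
print(conflicts)
```

Here the two entries agree on grain and aggregation but not on the filter, which is the kind of disagreement you would then resolve in the console.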
Importing a large semantic layer
Most semantic layers have more than one metric. A few things to know when you import many models at once.
- Chunk large imports across calls. A single bootstrap call is bounded: roughly 200 KB of content and about 200 expanded writes per call, with a per-user daily call cap. Import a big semantic layer as several calls rather than one giant payload. Nothing is silently truncated.
- Per-source and per-model caps. A single semantic_model source accepts up to ~50 models (models beyond that are reported, not dropped silently). Each model's persisted payload is bounded (~50 KB), and a model's opaque extras vendor passthrough is separately bounded (~4 KB).
- Drop and degradation reasons are reported back. bootstrap returns a per-item reason for anything that didn't land cleanly, so nothing is hidden:
  - models_capped: the source carried more than the per-source model limit; the first N were processed.
  - extras_too_large: a model's extras blob exceeded its byte cap.
  - oversized: a single model's persisted payload exceeded the per-model byte cap.
  - model_invalid: a model had an empty or whitespace-only name.
  - no_metric_contract / metric_invalid: the model imported, but without a metric contract: either it carried no valid grain, or its candidate contract failed validation (or would have been lossily truncated). The full content is still saved as a content-only definition.
  - semantic_key_unavailable / duplicate_semantic_key: the model imported, but without a pairing key: its name didn't yield a valid concept slug, or another model in the same call already claimed that slug.
- Same-named metrics across sources pair on metric.<slug>, by design. If you import a net_revenue metric from two different sources (say a Databricks Metric View and a dbt semantic model), both derive metric.net_revenue and pair on it. That is intended: pairing surfaces them together so you can compare the two definitions of the same concept. It is not a collision bug; it is the mechanism that lets you spot where two sources define a metric differently.
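The chunking advice above can be sketched as a simple batching loop on the agent side. This is a minimal sketch under stated assumptions: chunk_models is a hypothetical helper, the 50-model figure mirrors the approximate per-source cap, and the byte budget is set below the rough 200 KB bound to leave headroom for the rest of the payload, since the exact limits and how they are measured are not specified here.

```python
import json

MAX_MODELS_PER_SOURCE = 50      # mirrors the approximate per-source model cap
MAX_BYTES_PER_BATCH = 150_000   # assumed budget, kept below the ~200 KB bound


def chunk_models(models, max_models=MAX_MODELS_PER_SOURCE,
                 max_bytes=MAX_BYTES_PER_BATCH):
    """Split models into batches that respect a count cap and a byte budget."""
    batches, current, current_bytes = [], [], 0
    for model in models:
        size = len(json.dumps(model).encode("utf-8"))
        # Start a new batch when adding this model would break either limit.
        if current and (len(current) >= max_models
                        or current_bytes + size > max_bytes):
            batches.append(current)
            current, current_bytes = [], 0
        # A single model larger than the budget still gets its own batch,
        # so any per-model rejection is left for bootstrap to report.
        current.append(model)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


models = [{"name": f"metric_{i}", "grain": "month"} for i in range(120)]
batches = chunk_models(models)
print([len(batch) for batch in batches])
```

Each batch then becomes the models array of its own semantic_model source in a separate bootstrap call, and the per-item reasons returned by each call tell you whether anything was capped or degraded.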
See also
- bootstrap: the full source-kind reference, including semantic_model.
- reconcile: ground an imported definition against your real warehouse result.
- remember: save or update a local entry (and set its semantic_key).
- Verified vs Asserted: what asserted and caveat mean, and why v1 never stamps verified.