
Use ClariLayer with Databricks

Import your Databricks Metric Views into ClariLayer as canonical metric definitions, link your local context to them, and reconcile each against your own warehouse access.

If your metric definitions already live in a Databricks semantic layer, you don't have to re-type them into ClariLayer. Your agent reads a Databricks Metric View locally, normalizes it into structured models, and imports each one as a canonical metric definition — so your agent recalls the same governed metric your Metric View defines, in-flow, the next time you ask a data question. This guide walks the full loop: connect, import, link, and reconcile.

Throughout this personal MCP flow, your agent keeps its own Databricks access; ClariLayer never holds your Databricks credentials and never runs SQL server-side. ClariLayer stores the context your agent sends and the result metadata it reports back, plus any optional preview rows your agent chooses to send.

Connect: two separate accesses

This flow uses two connections, and ClariLayer is only one of them.

Your agent's Databricks access (yours, not ClariLayer's). Your agent reaches Databricks with your own access — for example through Databricks' managed MCP servers, the Databricks CLI, or a SQL connector. Unity Catalog governs what that access can see. ClariLayer is never in this path: it holds no Databricks credentials and never connects to your workspace.
ClariLayer over MCP. Connect claude.ai through the OAuth custom Connector, or install the clarilayer MCP server into Claude Code, Cursor, or Codex and mint a context key — see the Quickstart. This is where your agent saves, recalls, and reconciles context.

With both connected, your agent reads from Databricks with your access and writes the resulting context to ClariLayer. The two never share credentials.

Import a Metric View with `semantic_model`

A Databricks Metric View is defined in YAML and managed in Unity Catalog. ClariLayer does not parse that YAML server-side — there is no vendor parser. Instead, your agent reads the Metric View, normalizes it into vendor-neutral models, and sends them to bootstrap under the semantic_model source kind. Each model becomes one canonical metric definition.

A worked example

Say your Metric View defines monthly net revenue, something like:

version: 0.1
source: prod.finance.fct_orders
filter: status = 'settled'
dimensions:
  - name: order_month
    expr: date_trunc('month', ordered_at)
measures:
  - name: net_revenue
    expr: SUM(amount_usd)

Ask your agent to import it:

Read my net_revenue Databricks Metric View and bootstrap it into ClariLayer as canonical context.

Your agent normalizes that artifact into a semantic_model source and calls bootstrap:

{
  "kind": "semantic_model",
  "name": "Finance metric views",
  "dialect": "databricks_metric_view",
  "source_ref": "prod.finance.metric_views/net_revenue",
  "dataset": "finance",
  "models": [
    {
      "name": "net_revenue",
      "description": "Monthly settled net revenue, in USD.",
      "measure_column": "amount_usd",
      "aggregation": "SUM",
      "grain": "month",
      "grain_expression": "date_trunc('month', ordered_at)",
      "filters": [{ "expression": "status = 'settled'" }],
      "unit": "USD",
      "canonical_table": "prod.finance.fct_orders"
    }
  ]
}

The dialect is a free-form lowercase slug naming the source layer (databricks_metric_view); source_ref is an identifier for the artifact so provenance is never lost. Each model needs only a name; everything else enriches the contract.
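The normalization step your agent performs can be sketched in Python. This is a minimal illustration, not ClariLayer's or Databricks' code: `normalize_metric_view` is a hypothetical helper, it assumes the Metric View YAML has already been parsed into a dict, and it assumes simple `AGG(column)` measures and a `date_trunc` dimension for the grain. Fields such as description and unit cannot be derived from the YAML and are left for the agent to fill in.

```python
import re

# Hypothetical parsing rules (assumptions): one AGG(column) per measure,
# and a date_trunc('<unit>', ...) dimension that declares the grain.
_AGG_RE = re.compile(r"^\s*([A-Za-z_]+)\s*\(\s*([A-Za-z_][A-Za-z0-9_.]*)\s*\)\s*$")
_TRUNC_RE = re.compile(r"date_trunc\(\s*'(\w+)'", re.IGNORECASE)


def normalize_metric_view(view, dataset, source_ref, name):
    """Turn an already-parsed Metric View (a dict) into a semantic_model source."""
    # Use the first date_trunc dimension as the grain, if there is one.
    grain = grain_expr = None
    for dim in view.get("dimensions", []):
        match = _TRUNC_RE.search(dim.get("expr", ""))
        if match:
            grain, grain_expr = match.group(1).lower(), dim["expr"]
            break

    models = []
    for measure in view.get("measures", []):
        model = {"name": measure["name"]}  # a name is the only required field
        agg = _AGG_RE.match(measure.get("expr", ""))
        if agg:  # only simple AGG(column) expressions are recognized
            model["aggregation"] = agg.group(1).upper()
            model["measure_column"] = agg.group(2)
        if grain:  # never invent a grain the view does not declare
            model["grain"] = grain
            model["grain_expression"] = grain_expr
        if view.get("filter"):
            model["filters"] = [{"expression": view["filter"]}]
        if view.get("source"):
            model["canonical_table"] = view["source"]
        models.append(model)

    return {
        "kind": "semantic_model",
        "name": name,
        "dialect": "databricks_metric_view",
        "source_ref": source_ref,
        "dataset": dataset,
        "models": models,
    }
```

Leaving out `grain` when no dimension declares one matches the behavior described here: such a model still imports, only without a metric contract.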

After this one call, ClariLayer creates an entry with:

a name, finance.net_revenue, formed as {dataset}.{model} (the source name is used as the prefix when you omit dataset);
a provenance of semantic_model and canonical_status: canonical, which together mark it as imported canon: the governing definition of its concept, imported from your semantic layer rather than asserted by hand;
a structured metric contract (grain month, aggregation SUM, the status = 'settled' filter, the source-of-truth table), built because the model carries a valid grain. A model without a valid grain still imports, but as a content-only definition with no contract; ClariLayer degrades honestly rather than fabricating a grain;
a metric.<slug> semantic_key (here metric.net_revenue), a concept-level identity key that lets this imported canon pair with your own local entry for the same concept.
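To make the naming rules concrete, here is a small Python sketch of how an entry name and a semantic_key could be derived. It is illustrative only: `entry_name` and `semantic_key` are hypothetical helpers, and the slug rule (lowercase letters, digits, and underscores, starting with a letter) and the verbatim use of the source name as a fallback prefix are assumptions, since this guide does not spell out ClariLayer's exact normalization.

```python
import re
from typing import Optional

# Assumption: a valid concept slug starts with a letter and contains only
# lowercase letters, digits, and underscores.
_SLUG_RE = re.compile(r"^[a-z][a-z0-9_]*$")


def entry_name(model: str, dataset: Optional[str], source_name: str) -> str:
    """Build {dataset}.{model}, falling back to the source name as the prefix."""
    prefix = dataset if dataset else source_name
    return f"{prefix}.{model}"


def semantic_key(model: str) -> Optional[str]:
    """Build metric.<slug>, or return None when the name yields no valid slug."""
    slug = re.sub(r"[^a-z0-9]+", "_", model.strip().lower()).strip("_")
    return f"metric.{slug}" if _SLUG_RE.match(slug) else None
```

Under these assumptions, `entry_name("net_revenue", "finance", "Finance metric views")` gives `finance.net_revenue` and `semantic_key("net_revenue")` gives `metric.net_revenue`. A name with no usable characters returns `None`, which corresponds to a model that imports without a pairing key.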

Like every other v1 entry, the import lands asserted, never "verified." asserted is the honest baseline: saved, but not yet checked against your data. To raise confidence, reconcile it (below).

Link your local context to the imported canon

You probably already have your own local notes or definitions for the same metric — an entry you saved with remember months ago. To tie them to the imported canon, give your local entry the same semantic_key the import derived. Ask your agent:

Set the metric.net_revenue semantic_key on my local "net revenue" definition so it links to the imported Databricks canon.

Your agent updates the local entry with remember, setting its semantic_key to metric.net_revenue. From then on the two entries share a concept-level identity and pair on that key. Both keep their own provenance and status (your local note stays yours; the import stays imported canon), but recall now knows they describe the same concept.

Reconcile through your agent's own warehouse access

The import starts asserted. The reconcile tool checks it against your data and, in keeping with the privacy posture, your agent runs the SQL, not ClariLayer.

Reconcile finance.net_revenue: run its definition against Databricks and check the result against what ClariLayer has.

Your agent runs the query with its own Databricks access, captures the result shape (the columns, an optional small preview of rows it chooses to include, an optional row count), and calls reconcile with that actual_sample. ClariLayer compares the declared signals against the reported result and records the outcome on the entry:

a declared-vs-actual mismatch is flagged as a caveat so you and your agent know to treat the definition with care;
otherwise the entry stays asserted. Applicable checks may have passed or been inconclusive; neither creates a stronger status.

Those are the only two outcomes today. A clean reconcile pass does not stamp the entry verified; that stronger status is currently gated off, with no public commitment to deliver it. In this personal MCP flow, ClariLayer never holds your Databricks credentials and never executes SQL server-side; your agent is the connector. Omit or redact any preview rows that carry sensitive values.
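As an illustration of what the agent does locally before calling reconcile, the Python sketch below summarizes a query result into a result shape and redacts sensitive columns in the preview. The field names `columns`, `row_count`, and `preview_rows` are assumptions made for this example; this guide describes what actual_sample carries but not its exact schema, and `build_actual_sample` is a hypothetical helper.

```python
from typing import Any, List, Sequence


def build_actual_sample(
    columns: Sequence[str],
    rows: Sequence[Sequence[Any]],
    sensitive: Sequence[str] = (),
    preview_limit: int = 5,
    include_preview: bool = True,
) -> dict:
    """Summarize a query result locally; only this summary leaves the agent."""
    names: List[str] = list(columns)
    sample: dict = {"columns": names, "row_count": len(rows)}
    if include_preview:
        # Positions of columns whose values must not be sent anywhere.
        hidden = {names.index(c) for c in sensitive if c in names}
        sample["preview_rows"] = [
            ["[redacted]" if i in hidden else value for i, value in enumerate(row)]
            for row in rows[:preview_limit]
        ]
    return sample
```

Passing `include_preview=False` sends only the columns and the row count, which is the safest choice when you are unsure whether a result contains sensitive values.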

See conflicts in recall and the console

This is most useful the moment your imported Databricks canon and your local definition disagree. Because they pair on metric.net_revenue, ClariLayer surfaces the canon-vs-local pair together in two places. Inline in recall, your agent sees the conflict the next time you ask a data question. In the console, you review the two definitions side by side and choose what to do: adopt the imported Metric View canon, keep your own local definition, or edit it. You stay in control of which one wins.
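The pairing idea can be sketched in a few lines of Python. This is a conceptual illustration, not ClariLayer's implementation: the entry fields used here (`semantic_key`, `canonical_status`, `definition`) are assumed shapes for the example, and `find_conflicts` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def find_conflicts(entries: List[dict]) -> List[Tuple[dict, dict]]:
    """Return (canon, local) pairs that share a semantic_key but disagree."""
    by_key: Dict[str, List[dict]] = defaultdict(list)
    for entry in entries:
        if entry.get("semantic_key"):  # entries without a key never pair
            by_key[entry["semantic_key"]].append(entry)

    conflicts = []
    for group in by_key.values():
        canon = [e for e in group if e.get("canonical_status") == "canonical"]
        local = [e for e in group if e.get("canonical_status") != "canonical"]
        for imported in canon:
            for mine in local:
                if imported["definition"] != mine["definition"]:
                    conflicts.append((imported, mine))
    return conflicts
```

Entries that agree, or that have no semantic_key, produce no pair. That is why setting the key on your local entry is what makes a disagreement visible.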

Importing a large semantic layer

Most semantic layers have more than one metric. A few things to know when you import many models at once.

Chunk large imports across calls. A single bootstrap call is bounded — roughly 200 KB of content and about 200 expanded writes per call, with a per-user daily call cap. Import a big semantic layer as several calls rather than one giant payload. Nothing is silently truncated.
Per-source and per-model caps. A single semantic_model source accepts up to ~50 models (models beyond that are reported, not dropped silently). Each model's persisted payload is bounded (~50 KB), and a model's opaque extras blob (vendor passthrough) is separately bounded (~4 KB).
Drop and degradation reasons are reported back. bootstrap returns a per-item reason for anything that didn't land cleanly, so nothing is hidden:
- models_capped — the source carried more than the per-source model limit; the first N were processed.
- extras_too_large — a model's extras blob exceeded its byte cap.
- oversized — a single model's persisted payload exceeded the per-model byte cap.
- model_invalid — a model had an empty/whitespace name.
- no_metric_contract / metric_invalid — the model imported, but without a metric contract: either it carried no valid grain, or its candidate contract failed validation (or would have been lossily truncated). The full content is still saved as a content-only definition.
- semantic_key_unavailable / duplicate_semantic_key — the model imported, but without a pairing key: its name didn't yield a valid concept slug, or another model in the same call already claimed that slug.
Same-named metrics across sources pair on metric.<slug> — by design. If you import a net_revenue metric from two different sources (say a Databricks Metric View and a dbt semantic model), both derive metric.net_revenue and pair on it. That is intended: pairing surfaces them together so you can compare the two definitions of the same concept. It is not a collision bug — it's the mechanism that lets you spot where two sources define a metric differently.
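One way to chunk a large import is to batch models on the agent side before each bootstrap call. The Python sketch below is a minimal illustration under stated assumptions: the limits are the approximate figures quoted above rather than exact server values, the size estimate counts only the serialized models and ignores the envelope around them, and `chunk_models` is a hypothetical helper.

```python
import json

MAX_MODELS_PER_SOURCE = 50    # "~50 models" per source; approximate
MAX_CALL_BYTES = 200 * 1000   # "roughly 200 KB" per call; exact value is an assumption


def chunk_models(models, max_models=MAX_MODELS_PER_SOURCE, max_bytes=MAX_CALL_BYTES):
    """Yield lists of models that stay within both the count and size budgets."""
    batch, batch_bytes = [], 0
    for model in models:
        size = len(json.dumps(model).encode("utf-8"))
        # Close the current batch before it would exceed either budget.
        if batch and (len(batch) >= max_models or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(model)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded batch becomes the `models` list of one semantic_model source in its own call. A single model larger than the byte budget is still emitted alone, so the server-side reason is reported back rather than the model being dropped client-side.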