Sources & compilation
How raw material becomes a graph of grounded claims.
Sources
A source is the origin of a piece of knowledge — a file, a document, a URL, a chat message, an API response. Each source is indexed by its URI and content-addressed with a BLAKE3 hash, so the engine can tell when a source has changed and only re-process what's different.
A source carries metadata the rest of the system relies on: its type, author, content hash, byte size, and a trust level. Registering a source does not extract anything yet — it records where the knowledge comes from.
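A minimal sketch of what registration might look like, assuming a simple in-memory registry. The `Source` field names, the `SourceRegistry` class, and the `kind` and `trust_level` values are hypothetical and are not the engine's actual schema. Python's standard library has no BLAKE3, so `hashlib.blake2b` stands in for it here.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def content_hash(data: bytes) -> str:
    # Stand-in digest: BLAKE2b from the standard library, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


@dataclass(frozen=True)
class Source:
    # Hypothetical field names mirroring the metadata listed above.
    uri: str
    kind: str          # the source's type, e.g. "file" or "url"
    author: str
    content_hash: str
    byte_size: int
    trust_level: str


class SourceRegistry:
    """Indexes sources by URI and detects changes by content hash."""

    def __init__(self) -> None:
        self._by_uri: dict[str, Source] = {}

    def get(self, uri: str) -> Optional[Source]:
        return self._by_uri.get(uri)

    def register(self, uri: str, data: bytes, *, kind: str,
                 author: str, trust_level: str) -> bool:
        """Record the source. Return True if it is new or its bytes changed.

        Nothing is extracted here; only provenance metadata is stored.
        """
        digest = content_hash(data)
        previous = self._by_uri.get(uri)
        changed = previous is None or previous.content_hash != digest
        if changed:
            self._by_uri[uri] = Source(uri, kind, author, digest,
                                       len(data), trust_level)
        return changed
```

Registering the same bytes under the same URI twice returns `False` the second time, which is the signal a compiler could use to skip re-processing.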
Compilation
Compilation is the pipeline that turns sources into graph content. For each source, the engine:
- Parses the bytes into spans.
- Derives witnesses — content-addressed units of primary bytes produced by named rules from a fixed catalog (see claims & witnesses).
- Extracts claims — atomic, typed statements, each locked to the span it came from, with a confidence score.
- Links entities and relations — the named things claims mention, and the typed edges between them.
- Admits the result through the engine's grounding checks before it becomes part of the queryable graph.
Compilation is idempotent and incremental: re-compiling an unchanged source is a no-op, and changing one file only re-processes what depends on it.
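The idempotence property can be illustrated with a toy compiler, sketched under heavy assumptions: the `Graph` class, `parse_spans`, and `compile_source` are hypothetical names, the parser treats each non-empty line as a span, the witness, entity, and admission stages are omitted, and BLAKE2b stands in for BLAKE3.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Graph:
    # uri -> digest of the bytes that were last compiled for that source
    compiled: dict[str, str] = field(default_factory=dict)
    # (uri, span text) pairs; each claim stays tied to its originating span
    claims: list[tuple[str, str]] = field(default_factory=list)


def parse_spans(data: bytes) -> list[str]:
    # Toy parser: one span per non-empty line.
    return [ln.strip() for ln in data.decode("utf-8").splitlines() if ln.strip()]


def compile_source(graph: Graph, uri: str, data: bytes) -> int:
    """Compile one source; return the number of claims written."""
    digest = hashlib.blake2b(data, digest_size=32).hexdigest()  # BLAKE3 stand-in
    if graph.compiled.get(uri) == digest:
        return 0  # unchanged bytes: re-compiling is a no-op

    # The source changed, so drop only the claims that depended on it.
    graph.claims = [c for c in graph.claims if c[0] != uri]
    new_claims = [(uri, span) for span in parse_spans(data)]
    graph.claims.extend(new_claims)
    graph.compiled[uri] = digest
    return len(new_claims)
```

Calling `compile_source` twice with the same bytes writes claims once and returns 0 the second time. Changing the bytes replaces only that source's claims and leaves the rest of the graph alone.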
Triggering a compile
Compile from the Sources tab in the Console, the POST /api/v1/ws/{ws}/compile REST endpoint (streaming variant at /compile/stream), the compile MCP tool, or root compile <path> on the CLI. The CLI also supports --watch to recompile on file changes.
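As a rough sketch, a REST call could be prepared like this. The base URL is an assumption, as is the guess that the streaming variant is also a POST and lives under the same workspace path. The request body is not described here, so none is sent. `compile_request` is a hypothetical helper, not part of any client library.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumption: replace with your deployment


def compile_request(workspace: str, *, stream: bool = False) -> Request:
    """Build (but do not send) a POST request for the compile endpoint."""
    path = f"/api/v1/ws/{quote(workspace, safe='')}/compile"
    if stream:
        # Assumed to be relative to the workspace compile path.
        path += "/stream"
    return Request(BASE_URL + path, method="POST")


req = compile_request("docs")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it. Any authentication headers or payload your deployment requires would need to be added first.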
Incremental and honest
Two properties make compilation trustworthy:
- Content addressing means re-ingesting the same bytes never duplicates knowledge: witnesses dedupe by BLAKE3(rule ‖ spans).
- Admission checks mean a claim that can't be grounded back to its source doesn't silently enter the graph. If nothing could be extracted, the result is empty, not invented.
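Both properties can be sketched in a few lines, under stated assumptions. BLAKE2b again stands in for BLAKE3, and the length-prefixed encoding of rule ‖ spans is one illustrative way to make the concatenation unambiguous, not necessarily the engine's byte layout. The rule name "heading" and the verbatim-substring grounding check are hypothetical simplifications.

```python
import hashlib


def witness_id(rule: str, spans: list[bytes]) -> str:
    """Content address for a witness: hash of the rule name and its spans."""
    h = hashlib.blake2b(digest_size=32)  # stand-in for BLAKE3
    # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
    for part in [rule.encode("utf-8"), *spans]:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


witnesses: dict[str, tuple[str, list[bytes]]] = {}


def add_witness(rule: str, spans: list[bytes]) -> bool:
    """Store a witness; return False if an identical one already exists."""
    wid = witness_id(rule, spans)
    if wid in witnesses:
        return False
    witnesses[wid] = (rule, spans)
    return True


def admit(claim_span: str, source_text: str) -> bool:
    # Toy grounding check: the claim's span must occur verbatim in the source.
    return bool(claim_span) and claim_span in source_text


def admitted_claims(candidates: list[str], source_text: str) -> list[str]:
    # Ungrounded candidates are dropped; the result may legitimately be empty.
    return [c for c in candidates if admit(c, source_text)]
```

Re-ingesting the same rule and spans yields the same identifier, so `add_witness` stores it once. If no candidate can be grounded, `admitted_claims` returns an empty list rather than a fabricated claim.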