# Lago

Event-sourced persistence substrate — append-only journal, content-addressed blobs, and knowledge index.
Lago is the persistence substrate of the Agent OS. It provides an append-only event journal, content-addressed blob storage, a knowledge index with graph traversal, and filesystem manifests with branching. Every state change in the system is an immutable event stored in Lago.
The name comes from "lake" -- a deep, still body that preserves everything deposited into it.
## Architecture
Lago is structured as a Rust workspace with these crates:
| Crate | Role |
|---|---|
| `lago-core` | Core types, event envelope, stream identifiers |
| `lago-journal` | Append-only event journal trait and redb implementation |
| `lago-store` | Content-addressed blob storage (SHA-256 + zstd compression) |
| `lago-fs` | Filesystem manifests with branching (tree of content-addressed nodes) |
| `lago-ingest` | SSE stream ingestion (OpenAI, Anthropic, Vercel, Lago formats) |
| `lago-api` | HTTP API server (axum, SSE endpoints) |
| `lago-policy` | RBAC policy engine (roles, permissions, hooks) |
| `lago-knowledge` | Knowledge index (frontmatter extraction, wikilinks, scored search, graph traversal) |
| `lago-auth` | JWT authentication with per-user vault sessions |
| `lago-aios-eventstore-adapter` | Adapter implementing the aiOS EventStore trait |
| `lago-cli` | CLI for journal inspection and management |
| `lagod` | Standalone daemon binary |
## Event journal
The journal is the heart of Lago. It stores events as immutable, ordered records in a redb v2 embedded database.
### Event envelope
Every event is wrapped in an EventEnvelope that provides identity, ordering, integrity, and provenance:
```rust
struct EventEnvelope {
    id: Ulid,            // Globally unique, time-ordered identifier
    stream_id: String,   // Logical stream (e.g., session ID)
    kind: EventKind,     // Type of event (from aiOS taxonomy)
    payload: Vec<u8>,    // Serialized event data
    checksum: [u8; 32],  // SHA-256 of the payload
    timestamp: u64,      // Unix timestamp in milliseconds
    metadata: Metadata,  // Trace context, actor, provenance
}
```

The `id` is a ULID -- lexicographically sortable and time-ordered, so events naturally sort in append order. The `checksum` is a SHA-256 hash of the payload, providing integrity verification without additional storage.
### Event kinds
Events follow the aiOS EventKind taxonomy:
| Category | Events | Description |
|---|---|---|
| Input | UserMessage, ExternalSignal | User input and external triggers |
| Session | SessionCreated, SessionResumed, SessionClosed | Session lifecycle |
| Cognition | AssistantMessage, ToolCall, ToolResult | LLM responses and tool execution |
| Memory | MemoryStored, MemoryRetrieved | Knowledge persistence |
| Approval | ApprovalRequested, ApprovalGranted, ApprovalDenied | Human-in-the-loop gates |
| Custom | Any string prefix | Subsystem events (autonomic.*, finance.*) |
The Custom kind enables forward-compatible persistence -- new event types (from Autonomic, Haima, or future subsystems) can be stored without schema migrations.
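The aiOS EventKind definition is not reproduced in this document; the sketch below assumes a `Custom` variant carrying the full string, to show how prefix-based routing of subsystem events might work (variant and method names are illustrative):

```rust
// Hypothetical sketch of an EventKind with a forward-compatible Custom
// variant. The real aiOS taxonomy is richer; names here are illustrative.
#[derive(Debug, Clone, PartialEq)]
enum EventKind {
    UserMessage,
    AssistantMessage,
    Custom(String), // e.g. "autonomic.heartbeat", "finance.invoice_created"
}

impl EventKind {
    /// Route custom events by prefix without knowing every concrete type.
    fn subsystem(&self) -> Option<&str> {
        match self {
            EventKind::Custom(s) => s.split('.').next(),
            _ => None,
        }
    }
}

fn main() {
    let kind = EventKind::Custom("autonomic.heartbeat".to_string());
    assert_eq!(kind.subsystem(), Some("autonomic"));
    assert_eq!(EventKind::UserMessage.subsystem(), None);
    println!("ok");
}
```

Because unknown prefixes still round-trip as opaque strings, new subsystems can persist events before any consumer understands them.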
### Append-only guarantee
Events are never modified or deleted. The journal is strictly append-only. This means:
- State at any point in time can be reconstructed by replaying events up to that timestamp
- Auditing is inherent -- the complete history of every agent action is preserved
- Branching is cheap -- create a branch by remembering a cursor position and appending new events from there
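The replay and branching model above can be sketched with an in-memory journal. The real journal is redb-backed; `Event` and `Journal` here are simplified stand-ins:

```rust
// Minimal in-memory sketch of append-only replay and cheap branching.
#[derive(Clone, Debug, PartialEq)]
struct Event { timestamp: u64, payload: String }

struct Journal { events: Vec<Event> }

impl Journal {
    fn new() -> Self { Journal { events: Vec::new() } }
    /// Events are only ever appended, never mutated or removed.
    fn append(&mut self, e: Event) { self.events.push(e) }
    /// Reconstruct state at a point in time by replaying events up to `ts`.
    fn replay_until(&self, ts: u64) -> Vec<&Event> {
        self.events.iter().filter(|e| e.timestamp <= ts).collect()
    }
    /// A branch is just a cursor: events before it are shared,
    /// events appended afterwards diverge.
    fn branch_at(&self, cursor: usize) -> Journal {
        Journal { events: self.events[..cursor].to_vec() }
    }
}

fn main() {
    let mut j = Journal::new();
    j.append(Event { timestamp: 1, payload: "a".into() });
    j.append(Event { timestamp: 2, payload: "b".into() });
    assert_eq!(j.replay_until(1).len(), 1);

    let mut branch = j.branch_at(1);
    branch.append(Event { timestamp: 3, payload: "c".into() });
    assert_eq!(j.events.len(), 2); // original stream untouched
    println!("ok");
}
```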
### Stream isolation
Events are organized into streams identified by a string ID (typically a session ID). Streams are independent -- appending to one stream does not affect others. Cross-stream queries are supported through the knowledge index.
## Blob storage
Large binary content (files, images, documents) is stored separately from events in content-addressed blob storage:
- Content addressing -- blobs are identified by their SHA-256 hash
- Deduplication -- identical content is stored only once, regardless of how many events reference it
- Compression -- all blobs are compressed with zstd before storage, reducing disk usage
- Integrity -- the hash serves as a checksum; any corruption is immediately detectable
Blobs are referenced from events by their hash. The blob store is backed by the local filesystem with a flat hash-based directory structure:
```
data/blobs/
  a1/b2c3d4...   # SHA-256 hash prefix for directory sharding
```

## Knowledge index
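The addressing and deduplication behavior can be sketched as follows. To keep the example dependency-free, std's `DefaultHasher` stands in for SHA-256 and zstd compression is omitted:

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sketch of content addressing and deduplication. The real store keys
// blobs by SHA-256 and compresses with zstd before writing to disk.
struct BlobStore { blobs: HashMap<u64, Vec<u8>> }

impl BlobStore {
    fn new() -> Self { BlobStore { blobs: HashMap::new() } }

    /// Store content and return its address; identical content dedupes.
    fn put(&mut self, content: &[u8]) -> u64 {
        let mut h = DefaultHasher::new();
        content.hash(&mut h);
        let addr = h.finish();
        self.blobs.entry(addr).or_insert_with(|| content.to_vec());
        addr
    }

    fn get(&self, addr: u64) -> Option<&[u8]> {
        self.blobs.get(&addr).map(|v| v.as_slice())
    }
}

fn main() {
    let mut store = BlobStore::new();
    let a = store.put(b"hello");
    let b = store.put(b"hello"); // duplicate content
    assert_eq!(a, b);
    assert_eq!(store.blobs.len(), 1); // stored once
    assert_eq!(store.get(a), Some(&b"hello"[..]));
    println!("ok");
}
```

Because the address is derived from the content, a mismatch between a blob's bytes and its address is itself the corruption signal.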
The knowledge index provides searchable, graph-structured access to the information stored in Lago.

### Frontmatter extraction
Documents ingested into Lago have their frontmatter parsed and indexed. Title, tags, dates, and custom fields become searchable metadata. The parser handles YAML frontmatter delimited by `---`.
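For example, a document shaped like this (illustrative content) would have its title, tags, and date indexed as searchable metadata:

```markdown
---
title: Event Sourcing Notes
tags: [persistence, architecture]
date: 2024-01-15
---

Body content follows the closing delimiter.
```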
### Wikilink graph

Documents can reference each other using `[[wikilink]]` syntax. These links form a directed graph that can be traversed to discover related content. The graph is maintained incrementally as documents are ingested.
### Scored search
Full-text search with relevance scoring. Queries are matched against document titles, frontmatter fields, and body content. Results are ranked by a TF-IDF-inspired scoring function that considers:
- Term frequency in the document
- Inverse document frequency across the corpus
- Title match bonus (matches in titles score higher)
- Frontmatter tag match bonus
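The factors above can be sketched as a single scoring function. This is a TF-IDF-style score with a flat title bonus; lago-knowledge's actual weights and formula are not shown in this document:

```rust
// Sketch of a TF-IDF-inspired score with a title-match bonus.
// Weights (0.5 bonus, +1 idf smoothing) are illustrative, not Lago's.
fn score(term: &str, doc: &[&str], title: &[&str], corpus: &[Vec<&str>]) -> f64 {
    // Term frequency in this document.
    let tf = doc.iter().filter(|w| **w == term).count() as f64 / doc.len() as f64;
    // Inverse document frequency across the corpus (smoothed).
    let containing = corpus.iter().filter(|d| d.iter().any(|w| *w == term)).count();
    let idf = ((corpus.len() as f64 + 1.0) / (containing as f64 + 1.0)).ln() + 1.0;
    // Flat bonus when the term also appears in the title.
    let title_bonus = if title.contains(&term) { 0.5 } else { 0.0 };
    tf * idf + title_bonus
}

fn main() {
    let d1 = vec!["event", "sourcing", "journal"];
    let d2 = vec!["event", "sourcing", "blobs"];
    let corpus = vec![d1.clone(), d2.clone()];
    // Same body-term frequency, but a title match ranks the first doc higher.
    let with_title = score("event", &d1, &["event", "sourcing"], &corpus);
    let without = score("event", &d2, &["blob", "storage"], &corpus);
    assert!(with_title > without);
    println!("ok");
}
```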
### Graph traversal
Starting from any document, you can traverse the wikilink graph to find related documents at a specified depth. This powers the "related content" and "see also" features in the platform, as well as the memory retrieval system that provides cross-session context.
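Depth-limited traversal amounts to a bounded breadth-first search over the link graph. A sketch, with the graph as a plain adjacency map (the real index stores this incrementally):

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Sketch of depth-limited traversal over a directed wikilink graph,
// mapping each document name to the documents it references.
fn related(graph: &HashMap<&str, Vec<&str>>, start: &str, max_depth: usize) -> HashSet<String> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    queue.push_back((start.to_string(), 0));
    while let Some((doc, depth)) = queue.pop_front() {
        if depth >= max_depth { continue; }
        for next in graph.get(doc.as_str()).into_iter().flatten() {
            if seen.insert(next.to_string()) {
                queue.push_back((next.to_string(), depth + 1));
            }
        }
    }
    seen.remove(start); // don't report the start document as its own relative
    seen
}

fn main() {
    let mut g = HashMap::new();
    g.insert("a", vec!["b"]);
    g.insert("b", vec!["c"]);
    g.insert("c", vec![]);
    let depth1 = related(&g, "a", 1);
    assert!(depth1.contains("b") && !depth1.contains("c"));
    let depth2 = related(&g, "a", 2);
    assert!(depth2.contains("c"));
    println!("ok");
}
```

The `seen` set doubles as cycle protection, so mutually linked documents terminate cleanly.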
## Filesystem manifests
Lago provides a virtual filesystem built on content-addressed nodes:
- Trees -- directory nodes containing references to child nodes
- Blobs -- file nodes referencing content in blob storage
- Branches -- named pointers to tree roots, enabling Git-like branching
- Snapshots -- immutable captures of a tree state at a point in time
This allows agents to maintain a workspace that is fully versioned and branchable without a traditional filesystem.
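The node kinds above can be sketched as a small enum. Real lago-fs nodes carry SHA-256 hashes; a `u64` stands in as the content address here, and the names are illustrative:

```rust
// Sketch of content-addressed filesystem nodes.
#[derive(Debug)]
enum Node {
    /// Directory: named references to child nodes, by content address.
    Tree(Vec<(String, u64)>),
    /// File: reference into blob storage.
    Blob(u64),
}

/// A branch is a named pointer to a tree root, so creating one
/// copies a pointer, not data.
struct Branch { name: String, root: u64 }

fn main() {
    let _file = Node::Blob(0xb10b);
    let _root = Node::Tree(vec![("README.md".to_string(), 0xb10b)]);

    let main_branch = Branch { name: "main".to_string(), root: 1 };
    // Branching copies only the pointer; both branches share the same tree.
    let feature = Branch { name: "feature".to_string(), root: main_branch.root };
    assert_eq!(main_branch.root, feature.root);
    println!("ok");
}
```

Snapshots fall out of the same structure: since trees are immutable and content-addressed, holding any root address is an immutable capture of that state.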
## RBAC policy
Lago includes a built-in policy engine for access control defined in lago-policy:
- 3 default roles -- `admin`, `user`, `reader`
- 5 default rules -- scoping journal access, blob access, and knowledge queries per role
- 2 hooks -- `pre-append` and `post-append` hooks for custom validation and side effects
Policies are evaluated synchronously before each operation. Denied operations return an error without modifying state.
```rust
// Policy evaluation example
let policy = Policy::default(); // 3 roles, 5 rules, 2 hooks
let result = policy.evaluate(actor_role, operation);
match result {
    PolicyResult::Allow => { /* proceed */ }
    PolicyResult::Deny(reason) => { /* return error */ }
}
```

## Running Lago
Lago typically runs embedded within Arcan through the arcan-lago bridge. For standalone use:
```sh
cd lago
cargo run -p lagod -- --data-dir /path/to/data --port 3001
```

### CLI
```sh
# List streams in the journal
cargo run -p lago-cli -- streams

# Count events in a stream
cargo run -p lago-cli -- count --stream my-session

# Cat events as JSON
cargo run -p lago-cli -- cat --stream my-session --format json

# Search the knowledge index
cargo run -p lago-cli -- search "event sourcing"
```

## Critical patterns
**redb is synchronous.** The redb embedded database does not support async I/O. All redb operations in Lago use `tokio::task::spawn_blocking` to avoid blocking the async runtime. Never call redb directly from an async context.
**The `Journal` trait uses `BoxFuture` for dyn-compatibility,** allowing different journal implementations to be swapped at runtime (redb for production, in-memory for tests).
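The reason boxing matters: a method returning `impl Future` cannot be called through `dyn Journal`, but a boxed future can. A std-only sketch (trait and method names are illustrative, not lago-journal's exact API; the tiny executor exists only to poll an immediately-ready future):

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// The BoxFuture alias: a heap-allocated, type-erased future.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

// Because the return type is a concrete boxed type, this trait is
// dyn-compatible and implementations can be swapped behind `dyn Journal`.
trait Journal {
    fn len(&self) -> BoxFuture<'_, usize>;
}

struct InMemoryJournal { events: Vec<String> }

impl Journal for InMemoryJournal {
    fn len(&self) -> BoxFuture<'_, usize> {
        Box::pin(async move { self.events.len() })
    }
}

// Minimal no-op waker + executor, enough to drive a ready future.
fn raw_clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
fn raw_nop(_: *const ()) {}
static VTABLE: RawWakerVTable = RawWakerVTable::new(raw_clone, raw_nop, raw_nop, raw_nop);

fn block_on<T>(mut fut: BoxFuture<'_, T>) -> T {
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(v) = fut.as_mut().poll(&mut cx) { return v; }
    }
}

fn main() {
    // Dynamic dispatch works because the future is boxed, not `impl Trait`.
    let journal: Box<dyn Journal> = Box::new(InMemoryJournal {
        events: vec!["a".into(), "b".into()],
    });
    assert_eq!(block_on(journal.len()), 2);
    println!("ok");
}
```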
## JWT authentication
lago-auth provides JWT validation for the HTTP API. Tokens are verified using the same signing key as the platform auth system. Per-user vault sessions allow each authenticated user to have isolated access to their own streams and blobs.