ADR-003: Content-Addressed Storage and Cryptographic Identity
- Status: Accepted
- Date: 2026-04-05
- Depends on: ADR-000 (Zero-Trust Architecture and Capability-Based Access Control)
- Context: Establishing the native storage and identity model — what a “file” is in CambiOS and how authorship, ownership, and provenance work
Problem
Traditional filesystems store bytes at paths. A file has no inherent notion of who created it, who controls it, or whether its contents have been tampered with. Metadata like ownership and permissions are bolted on as external attributes — maintained by the OS, not intrinsic to the data. This creates several structural problems:
No provenance. A file’s history (who created it, who modified it, what it was derived from) exists only in external logs that can be lost, falsified, or simply not kept.
Ownership is ambient. Unix-style ownership is an integer UID assigned by the local system. It has no meaning across machines, can be reassigned by root, and tells you nothing about the actual entity that produced the data.
Integrity is unchecked. A file can be silently modified — by a compromised process, a bug, or a malicious actor — with no detection mechanism unless the application layer separately implements checksums.
Copying strips identity. Copying a file to another system, email, or storage medium severs it from its ownership and permission metadata. The copy is indistinguishable from any other sequence of bytes.
CambiOS needs a storage model where every object is self-describing: it carries its own identity, authorship, ownership proof, and integrity guarantee regardless of where it lives.
Decision
CambiOS adopts a content-addressed object store as its native storage model and cryptographic Principals (Ed25519 public keys) as its identity primitive. Every stored object (an CambiObject) is identified by the hash of its content, carries immutable authorship and transferable ownership, and is signed by its owner. Identity is established by key generation, not by registration with an authority.
These two decisions — content-addressing and cryptographic identity — are co-dependent. Content-addressing without identity produces anonymous blobs. Identity without content-addressing produces signed bytes that can be silently swapped. Together they produce signed artifacts with unforgeable provenance.
Architecture
Principal: The Identity Primitive
A Principal is a 32-byte Ed25519 public key. It is the identity of a process, a user, or a service.
pub struct Principal {
pub public_key: [u8; 32],
pub key_hash: [u8; 16], // FNV-1a — fast comparison only, not cryptographic
}
Principals are:
- Self-issued. Generating a key pair creates the identity. No enrollment, no authority.
- Unforgeable. Only the holder of the corresponding private key can produce valid signatures.
- Bound to processes. The kernel binds a Principal to each process via
BindPrincipal(syscall 11). Only the bootstrap Principal can perform this binding — preventing identity theft.
IPC Sender Stamping
Every IPC message carries a sender_principal field stamped by the kernel in send_message_with_capability(). The sender cannot set or forge this field — the kernel reads the sender’s bound Principal from the CapabilityManager and writes it into the message.
This provides zero-cost unforgeable identity for local IPC. Receiving processes know who sent every message without cryptographic verification overhead. Signatures are reserved for network boundaries (future phases).
CambiObject: The Storage Unit
Every stored object in CambiOS is an CambiObject:
pub struct CambiObject {
pub content_hash: [u8; 32], // content address (FNV-1a Phase 0, Blake3 Phase 1)
pub author: [u8; 32], // creator's public key — IMMUTABLE
pub owner: [u8; 32], // current controller's public key — transferable
pub signature: [u8; 64], // owner's signature (Ed25519 Phase 1+)
pub acl: [(Principal, ObjectRights); 8], // access control list
pub lineage: Option<[u8; 32]>, // hash of parent object (provenance chain)
pub content: Vec<u8>, // the actual data
}
Key properties:
| Property | Meaning |
|---|---|
| Content-addressed | The object’s identity is the hash of its content. Same content = same address. Different content = different object. |
| Immutable author | The author field is set at creation and never changes. It records who brought the object into existence — historical fact. |
| Transferable owner | The owner field is the current controller. Starts as the author. Can be transferred via signed OwnershipTransfer objects. |
| Signed | The owner signs the object. Stripping or changing the owner field invalidates the signature. |
| Lineage | Optional pointer to a parent object hash. Forking a document produces a new object with the original in its lineage — the chain of attribution is permanent. |
ObjectStore Trait
The storage abstraction is a trait, not a filesystem:
pub trait ObjectStore {
fn get(&self, hash: &[u8; 32]) -> Result<&CambiObject, StoreError>;
fn put(&mut self, object: CambiObject) -> Result<[u8; 32], StoreError>;
fn delete(&mut self, hash: &[u8; 32]) -> Result<(), StoreError>;
fn list(&self) -> Result<Vec<[u8; 32]>, StoreError>;
fn count(&self) -> usize;
}
RamObjectStore is the Phase 0 implementation: a fixed-capacity (256 objects) heap-allocated store with linear scan. It validates that content hashes match on put and enforces basic capacity limits. This is deliberately minimal — a block-device-backed implementation will follow the same trait.
FS Service: User-Space ObjectStore Gateway
The filesystem service (user/fs-service/) is the first real user-space service in CambiOS. It runs as a Rust no_std ELF, registers IPC endpoint 16, and enters a service loop:
RecvMsg— receives IPC withsender_principal+ command payload- Parse command (PUT/GET/DELETE/LIST)
- Verify sender’s authorization via
sender_principal - Call ObjPut/ObjGet/ObjDelete/ObjList kernel syscalls
Writeresponse back to sender
The FS service demonstrates the microkernel pattern: storage policy (who can access what) runs in user-space, while storage mechanism (the ObjectStore) is a kernel-managed resource accessed via syscalls.
Syscalls
Seven new syscalls support identity and storage:
| Number | Name | Purpose |
|---|---|---|
| 11 | BindPrincipal | Bind a 32-byte Principal to a process (restricted to bootstrap Principal) |
| 12 | GetPrincipal | Read the calling process’s bound Principal |
| 13 | RecvMsg | Identity-aware receive: returns [sender_principal:32][from_endpoint:4][payload:N] |
| 14 | ObjPut | Store an CambiObject, returns 32-byte content hash |
| 15 | ObjGet | Retrieve object content by hash |
| 16 | ObjDelete | Delete object (ownership enforced — only owner can delete) |
| 17 | ObjList | List all object hashes (packed 32-byte hashes) |
Bootstrap Principal
At boot, the kernel generates a deterministic 32-byte bootstrap Principal (Phase 0 uses a fixed seed; Phase 1 will use real entropy). This Principal is bound to all kernel processes (PIDs 0–2) and boot modules. It serves as the initial trust anchor — the only Principal authorized to call BindPrincipal on other processes.
The bootstrap Principal is explicitly temporary in implementation but permanent in interface. The Principal type, the bind_principal()/get_principal() API, and the sender_principal stamping mechanism are the production interfaces. The bootstrap seed and FNV-1a hashing are Phase 0 scaffolding.
Why Content-Addressing
| Alternative | Problem |
|---|---|
| Path-based (Unix/NTFS) | Identity is location. Move or copy = lose provenance. |
| UUID-based | Random identifier with no relation to content. Duplicate detection impossible. |
| Block-addressed | Couples storage to physical layout. Meaningless across devices. |
| Content-addressed | Identity is intrinsic. Same content = same address everywhere. Tamper-evident. Deduplication is free. |
Content-addressing also makes the ObjectStore trait naturally implementable across backing stores: RAM, block device, network peer, or sovereign cloud. The object’s address doesn’t change when it moves between stores.
Why Author and Owner Are Separate
A single “owner” field conflates two distinct concepts:
- Authorship — who created this object. Historical fact. Immutable.
- Ownership — who currently controls this object. Current authority. Transferable.
An employee creates a document at work — they are the author, the employer is the owner. An independent contractor creates a document — they are both author and owner. Separating these fields preserves both provenance and authority through ownership transfers.
Lock Ordering
OBJECT_STORE sits at position 8 in the lock hierarchy — the highest-numbered lock:
SCHEDULER(1)* → TIMER(2)* → IPC_MANAGER(3) → CAPABILITY_MANAGER(4) →
PROCESS_TABLE(5) → FRAME_ALLOCATOR(6) → INTERRUPT_ROUTER(7) → OBJECT_STORE(8)
Position 8 reflects that the ObjectStore is the highest-level kernel subsystem — it depends on everything below it (IPC for message delivery, capabilities for access control, frames for storage) but nothing depends on it. Syscall handlers that touch the ObjectStore acquire its lock last.
BOOTSTRAP_PRINCIPAL is outside the hierarchy — written once at boot, read-only thereafter.
Phase Progression
The interfaces (Principal, CambiObject, bind_principal/get_principal, sender_principal stamping, ObjectStore trait) are stable. The backing implementations evolve through phases:
| Aspect | Initial design (Phase 0) | Production target |
|---|---|---|
| Content hashing | Stub | Blake3 (cryptographic) |
| Signatures | Placeholder field | Ed25519 verification |
| Bootstrap identity | Deterministic seed | Hardware-backed YubiKey root of trust |
| Storage backing | RAM-only | Block device + network peers |
| Ownership transfer | Not in scope | Signed OwnershipTransfer objects |
For which phase is currently realized in the code, see STATUS.md § Phase markers. The interfaces are final. The implementations upgrade in place.
Verification
Test counts and what each test covers (Principal construction, IPC sender stamping, CambiObject hashing, RamObjectStore put/get/delete/list/capacity, etc.) live in STATUS.md § Test coverage.
References
- ADR-000: Zero-Trust Architecture and Capability-Based Access Control
- ADR-004: Cryptographic integrity (Blake3 + Ed25519)
- identity.md: Identity architecture (authoritative design document)
- FS-and-ID-design-plan.md: Phase intent for identity + storage
src/fs/mod.rs,src/fs/ram.rs: CambiObject and RamObjectStoresrc/ipc/mod.rs: Principal type, sender_principal stampingsrc/ipc/capability.rs: Principal binding on ProcessCapabilitiesuser/fs-service/src/main.rs: User-space FS service
Divergence
- Date: 2026-04-17
- Implementation: commit
6aec800(fs/OBJECT_STORE: dyn dispatch → ObjectStoreBackend enum) - Trigger: Formal-verification audit of src/microkernel/main.rs surfaced
static OBJECT_STORE: Spinlock<Option<Box<dyn fs::ObjectStore + Send>>>(src/lib.rs:533) as adyntrait object on a kernel hot path. CLAUDE.md’s Formal Verification rule says: “No trait objects in kernel hot paths. Monomorphized generics are statically analyzable; dynamic dispatch is not.”ObjectStore::get/put/delete/listare called from everySYS_OBJ_*syscall handler — unambiguously a hot path.
What changed
The kernel-side dispatch at the OBJECT_STORE static moves from Box<dyn ObjectStore + Send> to an enum dispatch shim:
pub enum ObjectStoreBackend {
Ram(RamObjectStore),
LazyDisk(DiskObjectStore<VirtioBlkDevice>),
// future: Network(...), etc. — each new backend = one enum variant
}
impl ObjectStore for ObjectStoreBackend {
fn get(&self, hash: &[u8; 32]) -> Result<CambiObject, StoreError> {
match self {
Self::Ram(s) => s.get(hash),
Self::LazyDisk(s) => s.get(hash),
}
}
// ... put / delete / list / count delegated identically
}
pub static OBJECT_STORE: Spinlock<Option<ObjectStoreBackend>> = Spinlock::new(None);
What did not change
- The
ObjectStoretrait remains the specification — it still defines what every backend must implement, and individual backends (RamObjectStore,DiskObjectStore) stillimpl ObjectStore for …. This preserves the Formal Verification rule’s “separation of specification from implementation” — the trait is the spec, the enum is the impl shim that monomorphizes dispatch. - The lazy RAM → Disk swap pattern described in src/fs/lazy_disk.rs is preserved without behavior change. The atomic two-phase install (handshake outside the lock, install under the lock) becomes:instead of
*guard = Some(ObjectStoreBackend::LazyDisk(store));Some(Box::new(store)). Callers see the sameObjectStoreinterface through the enum’s trait impl. - Test code that uses
dyn ObjectStore(mock stores in unit tests with their own scope) is unchanged —dynis permitted in test code per the Formal Verification rule’s “non-test kernel code” qualifier.
Cost
Each new backend (e.g., a future NetworkObjectStore for peer sync per the original ADR’s openness) requires one new enum variant and one new arm in each delegated method. That is the exact cost a verifier wants to see — closed-world, exhaustive match, no unbounded extension point in kernel code. Adding a backend is a single-file change with a compile error if any method dispatch arm is missed.
Why not other options
| Considered | Why rejected |
|---|---|
Static-dispatch generics (Spinlock<Option<S: ObjectStore>>) | Doesn’t compile — static items require concrete types; can’t carry a generic parameter pinned at runtime. |
Drop the ObjectStore trait entirely; single struct with internal Backend enum | Loses the spec/impl separation the Formal Verification rule wants encoded. The trait is the spec; the enum is the impl. Conflating them erases the audit point. |
Keep dyn, document the verification debt in ASSUMPTIONS.md | Pure deferral. The fix is structurally cheap and the debt is on a hot path — paying it now beats carrying it. |
Related (not in this ADR’s scope)
The same rule and the same monomorphization pattern apply to the kernel-side IPC interceptor (Box<dyn IpcInterceptor> on both IpcManager and ShardedIpcManager, called on every IPC send). That site will receive the same enum-dispatch treatment in a separate change, with a divergence appended to the relevant policy/interceptor ADR (to be decided when that work is sequenced). The decision rule is identical; only the call site differs.
To keep the follow-up from being lost in an appendix:
- Source-level
// VERIFICATION DEBT:markers tag the twodyn IpcInterceptorfield sites in src/ipc/mod.rs. - A row in STATUS.md § Known issues names the debt at the project-status level.
- The CLAUDE.md “Policy /
on_syscall/ interceptor decisions” Required Reading row links back to this divergence so any future edit onsrc/ipc/interceptor.rspicks up the precedent before code is written.
Verification
After this change, every SYS_OBJ_* handler dispatches via match-arm calls (monomorphized at compile time, statically analyzable, exhaustive). The trait remains as the specification verifier targets implement against. The kernel binary contains no dyn ObjectStore references.
Divergence: 2026-04-30 — Principal as 32-byte AID
- Superseded by: ADR-025 (Principal as 32-byte AID, decoupled from key bytes)
- Trigger: Pre-v1 audit of post-quantum migration cost. ML-DSA-65 keys are 1952 bytes; if “Principal IS the Ed25519 pubkey” remained the contract, the v1.5 PQ upgrade would force
sender_principalout of the 256-byte IPC envelope (ADR-005) and force a redesign of the boot-sized capability tables (ADR-008). identity.md § Dynamic-Sized Field Space had already declared the identity layer “algorithm-agnostic” with dynamic-sized keys — a posture this ADR’s Principal definition contradicted.
What changed
The ratified contract for Principal is now: a 32-byte AID (Autonomic Identifier), not a public key. The 32-byte size is architectural — fixed by the AID model, invariant under key rotation and algorithm migration. In v1, the AID bytes coincide with an Ed25519 pubkey for backward continuity (no key event log yet, no rotation, no keystore). In v1.5+, the AID is blake3(key_event_log_inception_block) and the actual signing key is resolved at verify time via the keystore service.
The Principal struct in src/ipc/mod.rs renames public_key: [u8; 32] to aid: [u8; 32] and exposes two accessors: aid() for identity equality / IPC stamping / lookup, and current_key_bytes() for verifier sites. Pre-v1, current_key_bytes() is the identity function on aid(); at v1.5 its body becomes a keystore round-trip — a one-function migration instead of a five-site refactor.
What did not change
- The
sender_principalwire format stays 32 bytes (src/ipc/mod.rsMessage::sender_principal). The 256-byte IPC envelope (ADR-005) is preserved without modification. BindPrincipalsemantics. The kernel still accepts a raw 32-byte value from userspace; it does not hash. The semantic shift is in vocabulary: those 32 bytes are now the AID, and v1 happens to use raw pubkey bytes as a v1-only AID-derivation shortcut.- Bootstrap trust model. The bootstrap Principal continues to be compile-time embedded in
bootstrap_pubkey.bin. (The file format gains a header in a separate change so the algorithm is explicit, but the trust path is unchanged.) - CambiObject author/owner fields. Still raw 32-byte public keys per ADR-004. When the keystore lands at v1.5, those fields can either continue to carry pubkeys or migrate to AIDs; that choice belongs to the keystore-service ADR, not this one.
Note on the original Principal definition above
The struct shown in this ADR’s “Principal: The Identity Primitive” section (line 35) lists a key_hash: [u8; 16] FNV-1a fast-comparison field that was never implemented in the actual code (the kernel-side Principal only ever carried public_key: [u8; 32]). That field is dropped under ADR-025; comparison is by full 32-byte AID equality. Documenting here so the historical drift between the original ADR and the kernel code does not propagate further.