Valori Kernel makes specific engineering tradeoffs to prioritize determinism and portability over raw flexibility. Understanding these concepts is key to using the kernel effectively.
The primary goal of Valori is to guarantee that State A + Command B = State C holds bit-for-bit on every computer.
- The Problem: Floating-point math (`f32`) behaves differently on x86 vs ARM, and even with different compiler flags (e.g., FMA optimizations).
- The Solution: We forbid `f32` in the core logic.
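To make the problem concrete, here is a minimal sketch (not taken from the Valori codebase; all function names are illustrative) showing that `f32` addition depends on evaluation order while integer addition does not. Compilers, SIMD widths, and FMA can all change the effective order or rounding of float operations, which is how two machines can end up with different bits.

```rust
// Illustrative only: float sums depend on grouping, integer sums do not.

/// Groups as (a + b) + c.
fn sum_left_f32(a: f32, b: f32, c: f32) -> f32 {
    (a + b) + c
}

/// Groups as a + (b + c).
fn sum_right_f32(a: f32, b: f32, c: f32) -> f32 {
    a + (b + c)
}

fn sum_left_i64(a: i64, b: i64, c: i64) -> i64 {
    (a + b) + c
}

fn sum_right_i64(a: i64, b: i64, c: i64) -> i64 {
    a + (b + c)
}

fn main() {
    // Near 1e8 the spacing between adjacent f32 values is 8, so adding 1.0
    // to -1e8 rounds straight back to -1e8 and the 1.0 is lost.
    let (a, b, c) = (1.0e8_f32, -1.0e8_f32, 1.0_f32);
    println!("f32 (a + b) + c = {}", sum_left_f32(a, b, c));
    println!("f32 a + (b + c) = {}", sum_right_f32(a, b, c));

    // The same values as integers give one answer regardless of grouping.
    let (x, y, z) = (100_000_000_i64, -100_000_000_i64, 1_i64);
    println!("i64 (x + y) + z = {}", sum_left_i64(x, y, z));
    println!("i64 x + (y + z) = {}", sum_right_i64(x, y, z));
}
```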
Valori is not just a vector database. It is a Deterministic Memory Engine that fuses Semantic Vectors with a Knowledge Graph.
This hybrid approach allows AI agents to "remember" in two ways:
- Similarity (Vague): "Find things related to 'apples'."
- Structure (Precise): "Find the exact object linked to 'User:Alice' via 'Edge:Owns'."
The fundamental atomic unit of memory.
- What it is: A dense fixed-point vector (e.g., 16-dim or 1536-dim) representing meaning.
- Storage: Stored in a heap-allocated, dynamic memory pool that grows on demand.
- Addressing: Identified by a `RecordId` (integer).
- Self-Describing: The kernel auto-detects vector dimensions from the first ingestion, making it model-agnostic (Zero-Config).
- Metadata: Optional binary blob (up to 64KB). Deterministically hashed and snapshotted.
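The properties above can be sketched as a small record pool. This is an illustration under assumptions, not the kernel's actual implementation: `RecordPool`, `IngestError`, and the method names are hypothetical, `RecordId` is assumed to be a `u32`, and vectors are assumed to be stored as raw `i32` fixed-point values. The sketch locks the dimension on the first ingestion and enforces the 64 KB metadata limit.

```rust
/// Hypothetical types; the real kernel may define these differently.
pub type RecordId = u32;

const MAX_METADATA_BYTES: usize = 64 * 1024;

#[derive(Debug, PartialEq)]
pub enum IngestError {
    EmptyVector,
    DimensionMismatch { expected: usize, got: usize },
    MetadataTooLarge(usize),
}

#[derive(Default)]
pub struct RecordPool {
    /// Unset until the first record arrives, then fixed.
    dim: Option<usize>,
    /// All vectors flattened into one growable buffer of raw fixed-point values.
    values: Vec<i32>,
    /// One optional metadata blob per record.
    metadata: Vec<Option<Vec<u8>>>,
}

impl RecordPool {
    pub fn insert(
        &mut self,
        vector: &[i32],
        meta: Option<Vec<u8>>,
    ) -> Result<RecordId, IngestError> {
        if vector.is_empty() {
            return Err(IngestError::EmptyVector);
        }
        if let Some(m) = &meta {
            if m.len() > MAX_METADATA_BYTES {
                return Err(IngestError::MetadataTooLarge(m.len()));
            }
        }
        // The first ingestion decides the dimension; later ones must match it.
        let dim = *self.dim.get_or_insert(vector.len());
        if vector.len() != dim {
            return Err(IngestError::DimensionMismatch { expected: dim, got: vector.len() });
        }
        let id = self.metadata.len() as RecordId;
        self.values.extend_from_slice(vector);
        self.metadata.push(meta);
        Ok(id)
    }

    pub fn get(&self, id: RecordId) -> Option<&[i32]> {
        let dim = self.dim?;
        let start = (id as usize).checked_mul(dim)?;
        self.values.get(start..start + dim)
    }
}

fn main() {
    let mut pool = RecordPool::default();
    let id = pool.insert(&[65536, 0, -65536], Some(b"doc-1".to_vec())).unwrap();
    println!("stored record {id}: {:?}", pool.get(id));
    // A vector with a different dimension is rejected after the first ingestion.
    println!("{:?}", pool.insert(&[1, 2], None));
}
```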
A lightweight graph overlay sitting on top of the vectors.
- Node: A semantic entity. Can be a `Document`, a `Chunk`, a `User`, or a `Task`.
  - Note: A Node implementation points to a Record. This means every node in the graph has a "semantic embedding" attached to it.
- Edge: A directed link between nodes.
  - Example: `Document (Node A)` -> `ParentOf` -> `Chunk (Node B)`.
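A rough sketch of such an overlay follows. The type and method names (`Graph`, `add_node`, `linked`) are hypothetical and the `Owns` edge kind is borrowed from the 'Edge:Owns' example earlier on this page; the point is only that each node carries a record reference and that edges are directed and typed. A `BTreeMap` is used here on the assumption that ordered iteration is preferable wherever output must be reproducible.

```rust
use std::collections::BTreeMap;

type RecordId = u32;
type NodeId = u32;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NodeKind {
    Document,
    Chunk,
    User,
    Task,
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum EdgeKind {
    ParentOf,
    Owns,
}

#[derive(Debug)]
struct Node {
    kind: NodeKind,
    /// Every node points at a record, so it always has an embedding.
    record: RecordId,
}

#[derive(Default)]
struct Graph {
    nodes: Vec<Node>,
    /// Outgoing edges keyed by (source node, edge kind), in sorted order.
    out_edges: BTreeMap<(NodeId, EdgeKind), Vec<NodeId>>,
}

impl Graph {
    fn add_node(&mut self, kind: NodeKind, record: RecordId) -> NodeId {
        self.nodes.push(Node { kind, record });
        (self.nodes.len() - 1) as NodeId
    }

    /// Adds a directed edge `from -> to`; the reverse direction is not implied.
    fn add_edge(&mut self, from: NodeId, kind: EdgeKind, to: NodeId) {
        self.out_edges.entry((from, kind)).or_default().push(to);
    }

    /// Exact structural lookup: all nodes reachable from `from` via `kind`.
    fn linked(&self, from: NodeId, kind: EdgeKind) -> Vec<NodeId> {
        self.out_edges.get(&(from, kind)).cloned().unwrap_or_default()
    }
}

fn main() {
    let mut g = Graph::default();
    let doc = g.add_node(NodeKind::Document, 10);
    let chunk = g.add_node(NodeKind::Chunk, 11);
    g.add_edge(doc, EdgeKind::ParentOf, chunk);

    for id in g.linked(doc, EdgeKind::ParentOf) {
        let node = &g.nodes[id as usize];
        println!("child node {id}: {:?} backed by record {}", node.kind, node.record);
    }
}
```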
The mechanism for finding records similar to a query.
- Brute Force (Exact):
  - How it works: Scans every record in memory.
  - Pros: 100% accuracy (recall). Zero indexing time.
  - Cons: Query time grows linearly with the number of records.
  - Best For: Datasets < 1M vectors.
- HNSW (Approximate) [Coming Soon]:
  - How it works: Builds a "navigable small world" graph. Think of it like a highway system for vectors.
  - Pros: Extremely fast (roughly logarithmic time). Designed to search very large collections, up to billions of vectors, in milliseconds.
  - Cons: Uses more memory (RAM) to store links. Recall is approximate (around 99% rather than 100%).
  - Best For: Datasets > 1M vectors (Scale).
- Valori's Strategy: The Kernel uses a `VectorIndex` trait. This means you can start with Brute Force and hot-swap to HNSW when you scale, without changing your application code.
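The sketch below shows how a trait boundary of this kind lets calling code stay unchanged when the index is swapped. The method signatures are assumptions made for illustration, since the real `VectorIndex` trait may look different, and `BruteForceIndex` and `nearest` are hypothetical names. Distances are squared L2 over raw fixed-point values, accumulated in `i128` so the arithmetic stays in integers and cannot overflow; ties are broken by record ID so the ranking is fully determined.

```rust
type RecordId = u32;

/// Hypothetical trait shape; not necessarily the kernel's actual API.
trait VectorIndex {
    fn insert(&mut self, id: RecordId, vector: Vec<i32>);
    /// Returns up to `k` (id, squared distance) pairs, nearest first.
    fn search(&self, query: &[i32], k: usize) -> Vec<(RecordId, i128)>;
}

/// Exact search: compares the query against every stored vector.
#[derive(Default)]
struct BruteForceIndex {
    records: Vec<(RecordId, Vec<i32>)>,
}

/// Squared L2 distance in pure integer math.
/// Assumes both slices have the same dimension.
fn squared_l2(a: &[i32], b: &[i32]) -> i128 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            let d = x as i128 - y as i128;
            d * d
        })
        .sum()
}

impl VectorIndex for BruteForceIndex {
    fn insert(&mut self, id: RecordId, vector: Vec<i32>) {
        self.records.push((id, vector));
    }

    fn search(&self, query: &[i32], k: usize) -> Vec<(RecordId, i128)> {
        let mut hits: Vec<(RecordId, i128)> = self
            .records
            .iter()
            .map(|(id, v)| (*id, squared_l2(query, v)))
            .collect();
        // Sort by distance, then by id, so equal distances rank identically everywhere.
        hits.sort_by_key(|&(id, dist)| (dist, id));
        hits.truncate(k);
        hits
    }
}

/// Application code depends only on the trait, not on a concrete index.
fn nearest(index: &dyn VectorIndex, query: &[i32]) -> Option<RecordId> {
    index.search(query, 1).first().map(|&(id, _)| id)
}

fn main() {
    let mut index = BruteForceIndex::default();
    index.insert(1, vec![0, 0]);
    index.insert(2, vec![65536, 0]);
    index.insert(3, vec![0, 65536]);
    println!("nearest: {:?}", nearest(&index, &[0, 60000]));
}
```

An approximate index could implement the same trait later, and `nearest` would not need to change.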
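Traditional databases use float32 or float64. This is bad for distributed systems because floating-point results are not guaranteed to be bit-identical on all chips: beyond the familiar rounding error (0.1 + 0.2 != 0.3), operation ordering and fused instructions such as FMA can change the low bits of a result.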
Valori uses Fixed-Point Math (Q16.16):
- We treat numbers like integers.
- `1.0` is stored as `65536`.
- Addition/Multiplication is just integer math (multiplication is an integer multiply followed by a 16-bit shift).
- Result: If you run Valori on a Raspberry Pi and a Supercomputer, the resulting database binary will be identical bit-for-bit.
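As a minimal sketch of Q16.16 arithmetic (the `Fixed` type and its methods are hypothetical, and the wrapping overflow policy is an assumption rather than Valori's documented behavior):

```rust
/// A Q16.16 number: 16 integer bits and 16 fractional bits in an i32.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Fixed(i32);

impl Fixed {
    const FRAC_BITS: u32 = 16;
    /// 1.0 is represented by the raw value 65536 (1 << 16).
    const ONE: Fixed = Fixed(1 << 16);

    /// Converts a whole number by shifting it into the integer bits.
    fn from_int(n: i16) -> Self {
        Fixed((n as i32) << Self::FRAC_BITS)
    }

    /// Addition is plain integer addition on the raw values.
    /// Wrapping on overflow is an assumption made for this sketch.
    fn add(self, rhs: Self) -> Self {
        Fixed(self.0.wrapping_add(rhs.0))
    }

    /// Multiplication widens to i64, multiplies, then shifts the extra
    /// 16 fractional bits back out. The shift rounds toward negative infinity,
    /// and the final cast truncates results that do not fit in 32 bits.
    fn mul(self, rhs: Self) -> Self {
        let wide = (self.0 as i64) * (rhs.0 as i64);
        Fixed((wide >> Self::FRAC_BITS) as i32)
    }
}

fn main() {
    let one_and_half = Fixed(98304); // 1.5 * 65536
    let two = Fixed::from_int(2);
    println!("1.0 raw       = {}", Fixed::ONE.0);
    println!("1.5 * 2.0 raw = {}", one_and_half.mul(two).0);
    println!("1.5 + 2.0 raw = {}", one_and_half.add(two).0);
}
```

Because every step is ordinary integer arithmetic, the same inputs produce the same bits on any target.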
This enables:
- Verifiable AI: Prove that an agent's memory hasn't been tampered with.
- Instant Sync: Sync state by just sending the binary snapshot. No "replication logs" needed.
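Both use cases reduce to comparing bytes. The sketch below is an assumption-laden illustration, not Valori's snapshot format: `snapshot` and `fnv1a64` are hypothetical helpers, the layout (a length prefix followed by little-endian values) is invented for the example, and FNV-1a stands in for whatever hash the kernel actually uses. A tamper-evidence scheme would need a cryptographic hash instead.

```rust
/// Serializes raw fixed-point values with an explicit little-endian layout,
/// so the bytes do not depend on the host's native endianness.
/// The layout is hypothetical: a u64 count followed by each i32.
fn snapshot(values: &[i32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + values.len() * 4);
    out.extend_from_slice(&(values.len() as u64).to_le_bytes());
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// 64-bit FNV-1a, a simple non-cryptographic hash used here as a stand-in.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    h
}

fn main() {
    // Two replicas that applied the same commands hold the same raw values.
    let replica_a = snapshot(&[65536, -65536, 0]);
    let replica_b = snapshot(&[65536, -65536, 0]);

    // Verification: equal state means equal bytes and therefore equal hashes.
    println!("in sync: {}", fnv1a64(&replica_a) == fnv1a64(&replica_b));

    // Sync: a lagging replica can adopt the snapshot bytes directly.
    let adopted = replica_a.clone();
    println!("adopted {} bytes", adopted.len());
}
```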