OpenViking uses a dual-layer storage architecture that separates content storage from index storage, providing clear separation of concerns and enabling independent scaling.
Architecture Overview
┌─────────────────────────────────────────┐
│    VikingFS (URI Abstraction Layer)     │
│  • URI Mapping                          │
│  • Hierarchical Access                  │
│  • Relation Management                  │
└────────────────┬────────────────────────┘
        ┌────────┼────────┐
        │                 │
┌───────▼─────────┐ ┌─────▼───────────┐
│  Vector Index   │ │      AGFS       │
│  (Semantic      │ │  (Content       │
│   Search)       │ │   Storage)      │
│                 │ │                 │
│  • URIs         │ │  • L0/L1/L2     │
│  • Vectors      │ │  • Files        │
│  • Metadata     │ │  • Relations    │
│  • No content   │ │  • Multimedia   │
└─────────────────┘ └─────────────────┘
Dual-Layer Design
AGFS - Content Storage
Stores all actual file content:
L0/L1/L2 full content
Multimedia files
Relations and metadata
POSIX-style operations

Vector Index - Semantic Search
Stores only references and vectors:
URIs (pointers to AGFS)
Dense/sparse vectors
Metadata fields
No file content
Design Benefits
Clear responsibilities: The vector index handles semantic retrieval and filtering, AGFS handles content storage and file operations, and the two do not overlap.
Memory efficiency: The vector index holds URIs, vectors, and metadata rather than file content, which saves memory. Actual content is read from AGFS on demand.
Single source of truth: All content is read from AGFS and the vector index stores only references, so there is no second copy of the content to keep in sync.
Independent scaling: The vector index can scale for search performance and AGFS can scale for storage capacity, each on a backend suited to its needs.
VikingFS: Virtual Filesystem
VikingFS is the unified URI abstraction layer that hides underlying storage details and provides a consistent interface.
URI Mapping
VikingFS maps virtual URIs to physical paths in the AGFS backend.
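A minimal sketch of what such a mapping could look like, assuming AGFS keeps every URI under a single workspace root. `WORKSPACE_ROOT` and `uri_to_path` are hypothetical names used for illustration, not part of the OpenViking API, and the real mapping may differ.

```python
from pathlib import PurePosixPath

SCHEME = "viking://"
# Hypothetical workspace root; in practice this would come from the storage config.
WORKSPACE_ROOT = PurePosixPath("/home/user/openviking_workspace")


def uri_to_path(uri: str) -> PurePosixPath:
    """Map a viking:// URI to a path under the workspace root."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"not a viking URI: {uri!r}")
    # Drop empty segments so trailing slashes do not change the result.
    parts = [p for p in uri[len(SCHEME):].split("/") if p]
    if ".." in parts:
        raise ValueError("parent references are not allowed")
    return WORKSPACE_ROOT.joinpath(*parts)
```

Rejecting `..` segments keeps every mapped path inside the workspace, which is the main safety property a mapping layer like this needs.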
Core API
File Operations
from openviking.storage.viking_fs import get_viking_fs

# Run these calls inside an async function; later snippets reuse viking_fs
viking_fs = get_viking_fs()

# Read file content (L2)
content = await viking_fs.read("viking://resources/docs/api.md")

# Write file
await viking_fs.write(
    "viking://resources/docs/new.md",
    "# New Document\n..."
)

# Create directory
await viking_fs.mkdir("viking://resources/new-project/")

# Delete file/directory
await viking_fs.rm(
    "viking://resources/old-project/",
    recursive=True
)

# Move/rename
await viking_fs.mv(
    "viking://resources/docs/old.md",
    "viking://resources/docs/new.md"
)

Layer Operations
# Read L0 abstract
abstract = await viking_fs.abstract("viking://resources/docs/")

# Read L1 overview
overview = await viking_fs.overview("viking://resources/docs/")

# Write context (L0/L1/L2)
await viking_fs.write_context(
    uri="viking://resources/docs/auth/",
    abstract="Brief auth summary",
    overview="Detailed auth overview\n...",
    is_leaf=False
)

Relation Management
# Create relation
await viking_fs.link(
    from_uri="viking://resources/docs/auth",
    uris=["viking://resources/docs/security"],
    reason="Related security docs"
)

# Get relations
relations = await viking_fs.relations("viking://resources/docs/auth")
for rel in relations:
    print(f"Related: {rel.uri} - {rel.reason}")

Search Integration
# Semantic search (uses vector index)
results = await viking_fs.find(
    query="authentication methods",
    target_uri="viking://resources/"
)

# Results contain URIs and abstracts
for ctx in results.resources:
    # Full content loaded from AGFS on demand
    content = await viking_fs.read(ctx.uri)
AGFS: Backend Storage
AGFS (Agent Filesystem) provides POSIX-style file operations with multiple backend support.
Backend Types
LocalFS (Default)
Local filesystem storage.
{
  "storage": {
    "workspace": "/home/user/openviking_workspace"
  }
}
Uses local disk for storage
Best for development and single-node deployment
Fast read/write performance

S3FS
S3-compatible object storage.
{
  "storage": {
    "backend": "s3fs",
    "bucket": "openviking-data",
    "endpoint": "https://s3.amazonaws.com",
    "access_key": "...",
    "secret_key": "..."
  }
}
Uses S3 for distributed storage
Supports AWS S3, MinIO, Aliyun OSS, etc.
Enables multi-node deployment

Memory (Testing)
In-memory storage.
# Only available in code, not config
from openviking.storage.agfs import MemoryFS

agfs = MemoryFS()
Uses RAM for storage
Best for testing and temporary use
Data lost when process ends
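Because all three backends sit behind the same file operations, calling code does not need to know which one is active. The sketch below illustrates that idea with a hypothetical `StorageBackend` protocol and a dict-backed `InMemoryBackend`. Both names are assumptions made for this example; only the `read_file` and `write_file` method names are borrowed from the implementation example later in this page.

```python
import asyncio
from typing import Dict, Protocol


class StorageBackend(Protocol):
    """Hypothetical minimal interface a backend would need to satisfy."""

    async def read_file(self, path: str) -> str: ...

    async def write_file(self, path: str, data: str) -> None: ...


class InMemoryBackend:
    """Illustrative stand-in that keeps files in a dict (lost on exit)."""

    def __init__(self) -> None:
        self._files: Dict[str, str] = {}

    async def read_file(self, path: str) -> str:
        try:
            return self._files[path]
        except KeyError:
            raise FileNotFoundError(path) from None

    async def write_file(self, path: str, data: str) -> None:
        self._files[path] = data


async def roundtrip(backend: StorageBackend, path: str, data: str) -> str:
    """Write then read back through whichever backend is supplied."""
    await backend.write_file(path, data)
    return await backend.read_file(path)
```

For example, `asyncio.run(roundtrip(InMemoryBackend(), "/docs/a.md", "hello"))` returns the text that was written. A disk-backed or S3-backed class with the same two methods could be passed in its place.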
Directory Structure
Each context directory follows a unified structure in AGFS:
viking://resources/docs/auth/
├── .abstract.md # L0: Abstract (~100 tokens)
├── .overview.md # L1: Overview (~2k tokens)
├── .relations.json # Relations table
├── .meta.json # Metadata (timestamps, etc.)
├── oauth.md # L2: Full OAuth docs
├── jwt.md # L2: Full JWT docs
└── api-keys.md # L2: Full API key docs
All these files are physically stored in AGFS. The vector index only stores references to them.
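As a rough illustration of this layout, the following sketch groups the entries of a context directory by the layer they belong to, based only on the file names shown in the tree. The `group_by_layer` helper and the "sidecar" label for `.relations.json` and `.meta.json` are hypothetical and not part of OpenViking.

```python
from typing import Dict, List

# File names taken from the directory tree above.
LAYER_FILES = {".abstract.md": "L0", ".overview.md": "L1"}
SIDECAR_FILES = {".relations.json", ".meta.json"}


def group_by_layer(names: List[str]) -> Dict[str, List[str]]:
    """Group directory entries into L0, L1, L2, and sidecar files."""
    groups: Dict[str, List[str]] = {"L0": [], "L1": [], "L2": [], "sidecar": []}
    for name in names:
        if name in LAYER_FILES:
            groups[LAYER_FILES[name]].append(name)
        elif name in SIDECAR_FILES:
            groups["sidecar"].append(name)
        else:
            # Everything else is treated as full (L2) content.
            groups["L2"].append(name)
    return groups
```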
Vector Index
The vector index stores semantic indices, supporting vector search and scalar filtering.
Context Collection Schema
Field          Type           Description
id             string         Primary key (UUID)
uri            string         Resource URI (references AGFS)
parent_uri     string         Parent directory URI
context_type   string         resource/memory/skill
is_leaf        bool           Whether leaf node (file vs directory)
vector         vector         Dense vector (1024 or 3072 dims)
sparse_vector  sparse_vector  Sparse vector (optional)
abstract       string         L0 abstract text (for display)
level          int            Context level (0=L0, 1=L1, 2=L2)
name           string         Resource name
description    string         Description (for skills)
created_at     string         Creation timestamp
updated_at     string         Update timestamp
active_count   int64          Usage count
account_id     string         Account identifier
owner_space    string         User/agent space
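To make the schema more concrete, here is a sketch of one record as a Python dataclass covering a subset of the fields. The `ContextRecord` class and the choice of a `dict` for the sparse vector are assumptions for illustration; the actual record type used by OpenViking is not shown in this document.

```python
import uuid
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class ContextRecord:
    """Illustrative subset of the context collection schema."""

    uri: str                    # Points at the content in AGFS
    parent_uri: str
    context_type: str           # "resource", "memory", or "skill"
    is_leaf: bool
    vector: List[float]         # Dense embedding
    abstract: str = ""          # L0 text kept for display
    level: int = 0              # 0=L0, 1=L1, 2=L2
    sparse_vector: Optional[Dict[str, float]] = None  # Assumed term -> weight
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


record = ContextRecord(
    uri="viking://resources/docs/auth/",
    parent_uri="viking://resources/docs/",
    context_type="resource",
    is_leaf=False,
    vector=[0.0] * 1024,
    abstract="Brief auth summary",
)
```

Note that there is no field for file content: the record carries the URI, and the content itself stays in AGFS.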
Index Strategy
index_meta = {
    "IndexType": "flat_hybrid",  # Hybrid index (dense + sparse)
    "Distance": "cosine",        # Cosine similarity
    "Quant": "int8",             # Quantization for memory efficiency
}
The vector index uses hybrid search combining dense and sparse vectors for better retrieval accuracy.
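How the `flat_hybrid` index fuses the two signals is not specified here. One common approach is a weighted sum of dense cosine similarity and a sparse dot product, sketched below. The `alpha` weight and all function names are hypothetical, and the real index may combine scores differently.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sparse_dot(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Dot product of two sparse vectors stored as term -> weight."""
    if len(a) > len(b):
        a, b = b, a  # Iterate over the smaller mapping
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def hybrid_score(
    dense_q: Sequence[float],
    dense_d: Sequence[float],
    sparse_q: Dict[str, float],
    sparse_d: Dict[str, float],
    alpha: float = 0.7,  # Assumed weighting, not an OpenViking default
) -> float:
    """Weighted sum of dense and sparse similarity."""
    return alpha * cosine(dense_q, dense_d) + (1.0 - alpha) * sparse_dot(sparse_q, sparse_d)
```

The dense term captures semantic closeness while the sparse term rewards exact term overlap, which is why combining them tends to help with queries that contain rare keywords.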
Backend Support
Local (Default)
Local persistence using an embedded vector DB.
{
  "vectordb": {
    "type": "local",
    "path": "/home/user/openviking_workspace/vectordb"
  }
}
Embedded vector database
Best for development and small deployments
No external dependencies

HTTP Remote
Remote vector service via HTTP.
{
  "vectordb": {
    "type": "http",
    "url": "http://vectordb-service:8080"
  }
}
Connects to remote vector service
Enables distributed deployment
Shared vector index across multiple clients

Volcengine VikingDB
Volcengine managed vector database.
{
  "vectordb": {
    "type": "volcengine",
    "host": "api-vikingdb.volces.com",
    "region": "cn-beijing",
    "ak": "your-access-key",
    "sk": "your-secret-key"
  }
}
Fully managed vector database service
High performance and scalability
Production-ready
Vector Synchronization
VikingFS automatically keeps the vector index consistent with AGFS, so manual synchronization is not needed.
Delete Sync
When you delete from AGFS, the vector index is updated automatically:
# Delete directory
await viking_fs.rm("viking://resources/docs/auth", recursive=True)
# Automatically:
# 1. Deletes files from AGFS
# 2. Deletes all records with URI prefix "viking://resources/docs/auth" from vector index
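The prefix deletion in step 2 can be sketched against a plain dictionary standing in for the vector index. This version matches on path boundaries so that deleting `.../docs/auth` does not also remove a sibling such as `.../docs/authentication`. Whether OpenViking applies the same boundary rule is an assumption, and `delete_by_prefix` is a hypothetical helper.

```python
from typing import Dict


def delete_by_prefix(index: Dict[str, dict], prefix: str) -> int:
    """Remove the record at prefix and every record beneath it.

    Returns the number of records removed.
    """
    base = prefix.rstrip("/")
    doomed = [
        uri
        for uri in index
        if uri.rstrip("/") == base or uri.startswith(base + "/")
    ]
    for uri in doomed:
        del index[uri]
    return len(doomed)
```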
Move Sync
When you move or rename in AGFS, the URIs in the vector index are updated:
# Move directory
await viking_fs.mv(
    "viking://resources/docs/auth",
    "viking://resources/docs/authentication"
)
# Automatically:
# 1. Moves files in AGFS
# 2. Updates uri and parent_uri fields in vector index
# 3. Updates all descendant URIs
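Steps 2 and 3 amount to rewriting a URI prefix on every affected record. A minimal sketch, using a list of dictionaries in place of the vector index; `rebase` and `move_records` are hypothetical helpers, not OpenViking functions.

```python
from typing import Dict, List


def rebase(uri: str, old: str, new: str) -> str:
    """Replace the old root of uri with new; leave unrelated URIs unchanged."""
    old, new = old.rstrip("/"), new.rstrip("/")
    if uri == old:
        return new
    if uri.startswith(old + "/"):
        return new + uri[len(old):]
    return uri


def move_records(records: List[Dict[str, str]], old: str, new: str) -> None:
    """Update uri and parent_uri in place for the moved node and its descendants."""
    for rec in records:
        rec["uri"] = rebase(rec["uri"], old, new)
        rec["parent_uri"] = rebase(rec["parent_uri"], old, new)
```

Applying `rebase` to `parent_uri` as well as `uri` is what keeps children attached to the renamed directory.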
Write Sync
When you write new content, the vector index is updated:
# Write new resource
await viking_fs.write_context(
    uri="viking://resources/docs/new/",
    abstract="New documentation",
    overview="Detailed overview...",
    is_leaf=False
)
# Automatically:
# 1. Writes .abstract.md and .overview.md to AGFS
# 2. Enqueues embedding generation
# 3. Adds vector records to index after embedding
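Because embedding is enqueued rather than done inline, a record becomes searchable shortly after the write rather than at the moment it returns. The sketch below shows that pattern with `asyncio.Queue`. The `fake_embed` function is a placeholder rather than a real embedding model, and all names here are hypothetical.

```python
import asyncio
from typing import Dict, List, Tuple


def fake_embed(text: str) -> List[float]:
    """Placeholder embedding: deterministic numbers, NOT a real model."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


async def embedding_worker(queue: asyncio.Queue, index: Dict[str, List[float]]) -> None:
    """Consume (uri, text) pairs and add the resulting vectors to the index."""
    while True:
        uri, text = await queue.get()
        try:
            index[uri] = fake_embed(text)
        finally:
            queue.task_done()


async def write_and_index(items: List[Tuple[str, str]]) -> Dict[str, List[float]]:
    """Enqueue each item for embedding and wait until all are indexed."""
    queue: asyncio.Queue = asyncio.Queue()
    index: Dict[str, List[float]] = {}
    worker = asyncio.create_task(embedding_worker(queue, index))
    for uri, abstract in items:
        # Step 1 (writing files to AGFS) is omitted; step 2 enqueues the text.
        await queue.put((uri, abstract))
    await queue.join()  # Step 3 has completed for every queued item
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return index
```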
Data Flow Example
Adding a Resource
Searching and Reading
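The read path can be sketched with two plain dictionaries, one standing in for the vector index (URI to abstract) and one for AGFS (URI to content). Keyword matching is used here only as a stand-in for vector search, and `search_then_read` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def search_then_read(
    query: str,
    index: Dict[str, str],
    store: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Return (uri, content) pairs for entries whose abstract mentions a query word."""
    words = query.lower().split()
    hits = [
        uri
        for uri, abstract in index.items()
        if any(word in abstract.lower() for word in words)
    ]
    # Content is fetched from the store only for the URIs that matched.
    return [(uri, store[uri]) for uri in hits]
```

The index never returns content itself. It returns URIs, and the caller resolves them against the content store.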
Implementation Example
VikingFS API
Storage Configuration
# Simplified sketch of the VikingFS class in openviking.storage.viking_fs
from typing import Optional


class VikingFS:
    """Virtual filesystem with URI abstraction."""

    def __init__(self, agfs, vector_index):
        self.agfs = agfs                  # Content storage
        self.vector_index = vector_index  # Semantic index

    async def read(self, uri: str) -> str:
        """Read file content from AGFS."""
        return await self.agfs.read_file(uri)

    async def write(self, uri: str, data: str):
        """Write file to AGFS."""
        await self.agfs.write_file(uri, data)

    async def abstract(self, uri: str) -> str:
        """Read L0 abstract."""
        return await self.agfs.read_file(f"{uri.rstrip('/')}/.abstract.md")

    async def overview(self, uri: str) -> str:
        """Read L1 overview."""
        return await self.agfs.read_file(f"{uri.rstrip('/')}/.overview.md")

    async def find(self, query: str, target_uri: Optional[str] = None):
        """Semantic search using vector index."""
        # Vector index returns URIs + metadata
        results = await self.vector_index.search(
            query=query,
            filter={"parent_uri": target_uri} if target_uri else None
        )
        # Content loaded from AGFS on demand
        return results
Architecture: System architecture overview
Context Layers: L0/L1/L2 progressive loading
Viking URI: URI specification and operations
Retrieval: How vector index is used for search