Multi-Tenant Graph Scoping: Zero-Bleed Isolation Across 23 Faculty Graphs
The Problem: 23 Graphs, Zero Isolation
<!-- transition: slide-left --> <!-- animation: fade-in --> <!-- narration: Let's start with the problem. AitherOS has over twenty-three in-process knowledge graphs. Memory graphs, code graphs, event graphs, document graphs, security graphs — all running as faculty graphs inside the same process. They're fast, they're powerful, and they were completely unscoped. Every tenant's data sat in the same flat namespace. -->AitherOS runs 23+ in-process faculty graphs. MemoryGraph stores episodic and semantic memories. CodeGraph indexes every function and class via AST. EventGraph tracks causal event chains. DocGraph, RAGGraph, WikipediaGraph, SecurityGraph — each one a specialized knowledge store powering the cognitive pipeline.
They all share one problem: no tenant isolation.
When AitherOS runs as a single-user platform, this is fine. But the moment you add tenants — SaaS customers, workspace users, multi-agent deployments — every graph becomes a data leak vector.
The durable store (AitherKnowledgeGraph service) already enforced tenant isolation with a proper _tenant_index and intersection-filtered queries. But the hot in-process caches? Completely flat. A tenant's memories were indistinguishable from any other tenant's memories.
What We Needed
<!-- layout: bullets --> <!-- transition: slide-right --> <!-- animation: stagger --> <!-- narration: We needed four things. A scope hierarchy that goes from platform down to individual user. Automatic enforcement that doesn't require every caller to pass tenant I.D. explicitly. Zero breakage of existing code. And one implementation that covers all twenty-three graphs. -->- Scope hierarchy: Platform → Tenant → Workspace → User
- Automatic enforcement: Reads scope from async context — callers don't need to change
- Zero breakage: Every existing query path must keep working unchanged
- One fix, 23 graphs: Inheritance-based — change the base class, all children are secured
Architecture
<!-- layout: diagram --> <!-- diagram: mermaid --> <!-- transition: zoom --> <!-- animation: spring --> <!-- narration: Here's the high-level architecture. At the top, every incoming request gets a Tenant Context extracted from its JWT or headers. This flows through a context variable into a Graph Scope, which carries tenant I.D., workspace I.D., and user I.D. Every faculty graph inherits scope-aware filtering from the base class. The scope level — platform, tenant, workspace, or user — determines how deep the isolation goes. -->

```mermaid
graph TD
    REQ[Incoming Request] --> JWT[JWT / Headers]
    JWT --> TC[TenantContext]
    TC --> GS[GraphScope]
    GS --> CV[ContextVar Propagation]
    CV --> BFG[BaseFacultyGraph._scope_filter]
    BFG --> MG[MemoryGraph<br/>scope: tenant]
    BFG --> EG[EventGraph<br/>scope: workspace]
    BFG --> CG[CodeGraph<br/>scope: platform]
    BFG --> DG[DocGraph<br/>scope: tenant]
    BFG --> SG[SecurityGraph<br/>scope: tenant]
    BFG --> MORE[... 18 more graphs]
```
<!-- pause: 1.0 -->
The Scope Hierarchy
<!-- transition: wipe --> <!-- animation: cascade --> <!-- narration: The scope hierarchy has four levels. Platform means no filtering at all — the operator sees everything. This is what you get when you run AitherOS on your own hardware. Tenant means data is isolated by tenant I.D. — one SaaS customer can never see another's data. Workspace adds a second dimension — within a tenant, different projects are isolated. And User goes all the way down to individual identity. -->The ScopeLevel enum defines four isolation depths:
```python
from enum import Enum

class ScopeLevel(str, Enum):
    PLATFORM = "platform"    # No isolation — operator sees all
    TENANT = "tenant"        # Isolated by tenant_id
    WORKSPACE = "workspace"  # Isolated by tenant_id + workspace_id
    USER = "user"            # Isolated by tenant_id + workspace_id + user_id
```
The GraphScope dataclass carries the full hierarchy through async chains:
```python
from dataclasses import dataclass

@dataclass
class GraphScope:
    tenant_id: str = "platform"
    workspace_id: str = ""
    user_id: str = ""
    agent_id: str = ""

    @property
    def is_platform(self) -> bool:
        return self.tenant_id == "platform"

    def matches(self, node_meta: dict, level: ScopeLevel) -> bool:
        if self.is_platform:
            return True
        # Platform-owned nodes are visible to everyone (shared infra)
        if node_meta.get("tenant_id", "platform") == "platform":
            return True
        # Tenant check
        if node_meta["tenant_id"] != self.tenant_id:
            return False
        # Workspace check (if scope level requires it)
        if level in (ScopeLevel.WORKSPACE, ScopeLevel.USER):
            if self.workspace_id and node_meta.get("workspace_id"):
                if node_meta["workspace_id"] != self.workspace_id:
                    return False
        # User check (USER level only)
        if level is ScopeLevel.USER:
            if self.user_id and node_meta.get("user_id"):
                if node_meta["user_id"] != self.user_id:
                    return False
        return True
```
The key insight: platform-owned nodes (shared infrastructure) are always visible to every tenant. A tenant can see CodeGraph results for AitherOS internals, but never another tenant's private memories or documents.
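These visibility rules can be exercised in isolation. The sketch below is a simplified, standalone `Scope` (tenant level only, illustrative rather than the production `GraphScope` class) that demonstrates the three checks: platform operators see everything, platform-owned nodes are visible to all, and everything else is tenant-matched.

```python
from dataclasses import dataclass

# Simplified standalone sketch of the visibility rules — "Scope" here is
# illustrative, not the actual GraphScope class.
@dataclass
class Scope:
    tenant_id: str = "platform"

    @property
    def is_platform(self) -> bool:
        return self.tenant_id == "platform"

    def matches(self, node_meta: dict) -> bool:
        if self.is_platform:
            return True  # Operator sees everything
        if node_meta.get("tenant_id", "platform") == "platform":
            return True  # Shared infrastructure is visible to all tenants
        return node_meta["tenant_id"] == self.tenant_id

acme = Scope(tenant_id="acme")
assert acme.matches({"tenant_id": "platform"})    # shared infra: visible
assert acme.matches({"tenant_id": "acme"})        # own data: visible
assert not acme.matches({"tenant_id": "globex"})  # other tenant: invisible
```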
The Base Class Fix
<!-- narration: This is the single most important change. Every faculty graph in AitherOS inherits from BaseFacultyGraph. We added three things to it. First, a scope level class attribute that each subclass overrides. Second, a scope filter node method that checks visibility for a single node. Third, a scope filter results method that batch-filters query results. One change here means all twenty-three graphs inherit the filtering machinery. --> <!-- layout: code --> <!-- transition: wipe --> <!-- animation: typewriter -->Every faculty graph inherits from BaseFacultyGraph. We added scope-aware filtering at this level:
```python
class BaseFacultyGraph:
    # Subclasses override this to declare isolation level
    _scope_level: str = "platform"  # Default: no filtering

    def _scope_filter_node(self, node_meta: dict) -> bool:
        """Check if a node passes scope filtering."""
        if self._scope_level == "platform":
            return True
        scope, level = self._get_scope()  # From ContextVar
        if scope is None or scope.is_platform:
            return True
        return scope.matches(node_meta, level)

    def _scope_filter_results(self, results: list) -> list:
        """Batch-filter query results by scope."""
        if self._scope_level == "platform":
            return results  # Fast path: no filtering
        scope, level = self._get_scope()
        if scope is None or scope.is_platform:
            return results
        return [r for r in results if scope.matches(
            self._extract_meta(r), level
        )]
```
One change. 23 graphs secured.
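The machinery that makes this implicit is Python's `contextvars`. The sketch below is a hypothetical reduction of the plumbing (the names mirror the article's API but the bodies are illustrative): the scope is set once at the request boundary, read anywhere down the async call chain, and each asyncio task gets its own copy of the context, so concurrent tenants never observe each other's value.

```python
import asyncio
import contextvars

# Hypothetical sketch of the ContextVar plumbing; the real AitherTenant
# service carries a full GraphScope, not a plain dict.
_current_scope: contextvars.ContextVar[dict] = contextvars.ContextVar(
    "graph_scope", default={"tenant_id": "platform"}
)

def get_current_graph_scope() -> dict:
    return _current_scope.get()

async def query_graph() -> str:
    # Deep in the call chain: no tenant_id parameter needed.
    return get_current_graph_scope()["tenant_id"]

async def handle_request(tenant_id: str) -> str:
    # Set once at the request boundary (e.g. after JWT extraction).
    _current_scope.set({"tenant_id": tenant_id})
    return await query_graph()

async def main() -> None:
    # asyncio copies the context per task, so concurrent requests keep
    # independent scopes.
    results = await asyncio.gather(
        asyncio.create_task(handle_request("acme")),
        asyncio.create_task(handle_request("globex")),
    )
    assert results == ["acme", "globex"]

asyncio.run(main())
```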
MemoryGraph: The Critical Case
<!-- transition: slide-left --> <!-- animation: fade-in --> <!-- narration: MemoryGraph is the most sensitive graph in the system. It stores episodic memories, semantic knowledge, procedural skills — everything the AI has learned. Before this change, hybrid query had no tenant parameter at all. It used agent I.D. and scope — which is a different concept entirely, referring to whether a memory is shared or private within a single tenant. We added tenant I.D. to the query pipeline and wired it into the eligible nodes filter. -->MemoryGraph is where tenant isolation matters most. Before:
```python
# BEFORE: No tenant awareness
def hybrid_query(self, query, agent_id=None, scope="shared"):
    eligible = self._get_eligible_nodes(...)  # All tenants mixed
```
After:
```python
# AFTER: Tenant-isolated queries
def hybrid_query(self, query, agent_id=None, scope="shared",
                 tenant_id=None):
    # Auto-resolve from ContextVar when not passed
    if tenant_id is None:
        tenant_id = get_current_graph_scope().tenant_id
    eligible = self._get_eligible_nodes(..., tenant_id=tenant_id)
```
The filter in _get_eligible_nodes is surgical:
```python
# Non-platform tenants only see their own + platform-shared memories
if tenant_id and tenant_id != "platform":
    node_tenant = getattr(mem, "tenant_id", "platform")
    if node_tenant != "platform" and node_tenant != tenant_id:
        continue  # Invisible — different tenant
```
Platform-owned memories (system knowledge, shared procedures) remain visible to all tenants. Tenant-specific memories are walled off.
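To see the filter's effect concretely, here is an illustrative recreation of it applied to a small in-memory list (the `SimpleNamespace` objects stand in for real MemoryGraph nodes, and `eligible` is a stand-in for the loop inside `_get_eligible_nodes`):

```python
from types import SimpleNamespace

# Stand-ins for MemoryGraph nodes; real nodes carry many more fields.
memories = [
    SimpleNamespace(text="shared procedure", tenant_id="platform"),
    SimpleNamespace(text="acme secret", tenant_id="acme"),
    SimpleNamespace(text="globex secret", tenant_id="globex"),
]

def eligible(mems, tenant_id):
    out = []
    for mem in mems:
        # Same rule as the snippet above: non-platform tenants only see
        # their own memories plus platform-shared ones.
        if tenant_id and tenant_id != "platform":
            node_tenant = getattr(mem, "tenant_id", "platform")
            if node_tenant != "platform" and node_tenant != tenant_id:
                continue  # Invisible — different tenant
        out.append(mem)
    return out

assert [m.text for m in eligible(memories, "acme")] == [
    "shared procedure", "acme secret"
]
assert len(eligible(memories, "platform")) == 3  # operator sees all
```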
Graph Classification
<!-- layout: stats --> <!-- transition: zoom --> <!-- animation: spring --> <!-- narration: We classified all twenty-three graphs into scope levels. Eight are platform-only — they index AitherOS infrastructure like code, services, configs, and tests. These are shared knowledge that every tenant can see. Ten are tenant-scoped — memories, documents, RAG data, media, logs. Each tenant only sees their own. Two are workspace-scoped — event graphs and K.V. cache graphs are isolated even further, down to the workspace level within a tenant. -->Platform (8 graphs): CodeGraph, ServiceGraph, InfraGraph, ConfigGraph, ScriptGraph, TestGraph, TypeGraph, APIGraph
Tenant-scoped (10 graphs): MemoryGraph, DocGraph, RAGGraph, StrataGraph, WikipediaGraph, MediaGraph, DirectoryGraph, FluxGraph, LogGraph, SecurityGraph
Workspace-scoped (2 graphs): EventGraph, KVCacheGraph
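In code, the classification reduces to each subclass overriding the base class attribute. The sketch below is a hypothetical skeleton (the real classes carry full query machinery) showing how one declaration per graph encodes the table above:

```python
# Hypothetical skeleton: each graph declares its isolation depth via the
# class attribute described earlier. The real classes do far more.
class BaseFacultyGraph:
    _scope_level: str = "platform"  # Default: shared infrastructure

class CodeGraph(BaseFacultyGraph):
    pass  # Inherits "platform": visible to every tenant

class MemoryGraph(BaseFacultyGraph):
    _scope_level = "tenant"  # Every instance is tenant-scoped

class EventGraph(BaseFacultyGraph):
    _scope_level = "workspace"  # Isolated per workspace within a tenant

# The level is a class attribute, so it cannot vary per instance:
assert CodeGraph()._scope_level == "platform"
assert MemoryGraph()._scope_level == "tenant"
```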
Pipeline Wiring: Automatic Propagation
<!-- transition: slide-right --> <!-- animation: cascade --> <!-- narration: The beautiful part is how little the pipeline callers had to change. The Agent Runtime reads the graph scope once at the start of a faceted run and passes it to the Tiered Context Assembler. The assembler passes it to every hybrid query call. But even callers that don't pass it explicitly still get scoped — because the ContextVar auto-resolves from the tenant context. Existing code works unchanged. -->The pipeline propagation required minimal changes:
```python
# AgentRuntime._faceted_run() — reads scope once
from lib.core.AitherTenant import get_current_graph_scope

tenant_id = get_current_graph_scope().tenant_id

# Passes to TieredContextAssembler
assembler = TieredContextAssembler(
    session_id=session_id,
    effort_level=effort,
    tenant_id=tenant_id,  # NEW — propagated to all graph queries
)
```
Callers that don't pass tenant_id explicitly? They still get scoped. hybrid_query() auto-resolves from the ContextVar:
```python
if tenant_id is None:
    tenant_id = get_current_graph_scope().tenant_id
```
Zero existing callers needed changes. The ContextVar propagation does the work.
The Enforcement Stack
<!-- transition: flip --> <!-- animation: stagger --> <!-- narration: Isolation isn't just one layer. It's five services working together. AitherIdentity extracts the tenant from the JWT. AitherTenant propagates it through context variables. The Graph Scope filters every query. AitherStrata scopes storage paths. And AitherFlux scopes event channels with tenant-prefixed names. If you're a tenant, you physically cannot see another tenant's data — the filters are applied at every layer. -->Isolation is enforced by five services working in concert:
| Service | Role |
|---|---|
| AitherIdentity | Extracts tenant from JWT → TenantContext |
| AitherTenant | Propagates TenantContext + GraphScope via ContextVars |
| BaseFacultyGraph | Filters every graph query by scope level |
| AitherStrata | Scopes storage paths: tenants/{slug}/ |
| AitherFlux | Scopes event channels: {slug}:event_name |
Non-local requests without a valid tenant are fail-closed to PUBLIC — never PLATFORM. This is enforced in resolve_full_caller_context().
Design Decisions & Trade-offs
<!-- transition: slide-left --> <!-- animation: cascade --> <!-- narration: A few key design decisions. First, platform-owned nodes are visible to all tenants. This means shared infrastructure knowledge — like how AitherOS services work — is available to everyone. A tenant can ask about CodeGraph, but never see another tenant's memories. Second, we chose ContextVar auto-resolution over mandatory parameters. This means existing code works unchanged, but it also means scope is implicit. We accept this trade-off because the alternative — refactoring every graph caller — would have been a months-long project with high regression risk. Third, the scope level is a class attribute, not an instance attribute. This is deliberate: a MemoryGraph is always tenant-scoped, everywhere, always. You can't accidentally create an unscoped instance. -->- Platform nodes visible to all: Shared infrastructure knowledge (CodeGraph, ServiceGraph) is accessible to every tenant. A tenant can query how AitherOS services work, but never see another tenant's memories.
- ContextVar over mandatory params: Scope is resolved implicitly from the async context. Existing callers work unchanged. The trade-off is implicit behavior — but the alternative (refactoring every caller) would have taken months.
- Class-level scope declaration: _scope_level is a class attribute, not an instance attribute. A MemoryGraph is always tenant-scoped. You can't create an unscoped instance by accident.
- Fail-closed defaults: No scope context → treated as platform (backwards compatible for self-hosted). Non-local requests without tenant → forced to PUBLIC (never PLATFORM).
What's Next
<!-- transition: slide-up --> <!-- animation: stagger --> <!-- narration: This is the foundation. Next, we're building graph-level capability gates in capabilities dot yaml — so RBAC can control which tenants can even access which graph types. We're adding workspace-level isolation to more graphs as workspace semantics mature. And we're building a scope audit tool that can verify, at any point, that no cross-tenant data is leaking through any graph query path. -->- Capability gates: Add graph-level RBAC in
capabilities.yaml— control which plan tiers can access which graphs - Workspace isolation expansion: More graphs moving from TENANT to WORKSPACE as workspace semantics mature
- Scope audit tooling: Automated verification that no cross-tenant data leaks through any query path
- Neuron scoping: Wire GraphScope into NeuronFire so speculative prefetch respects tenant boundaries
<!-- layout: closing --> <!-- transition: zoom --> <!-- animation: spring --> <!-- narration: That's the full story. One base class change. Twenty-three graphs secured. Zero callers broken. Multi-tenant isolation that goes from platform to tenant to workspace to user — enforced automatically through async context propagation. The key insight? Scope filtering belongs in the base class, not in every caller. Build the fence once, every graph inherits it. Thanks for watching. --> <!-- pause: 2.0 -->
One base class. 23 graphs. Zero bleed. That's how you build multi-tenant isolation without breaking everything.