
What Your AI Does When You Stop Talking

March 18, 2026 · 12 min read · Aitherium

There is a question nobody asks about their AI assistant:

What does it do when you close the tab?

The honest answer, for almost every product on the market, is: nothing. It sits in memory. It waits. Maybe a load balancer evicts it. Maybe the container scales to zero. The intelligence disappears the moment your attention does.

We thought that was a waste.

Not because idle compute is expensive — it is actually cheap. But because the space between conversations is where the most interesting work can happen. If you have an AI system with memory, personality, sensation, and goals, why would you let it go comatose the moment the user looks away?

So we built something different. When you stop talking to AitherOS, the agents start working.

The two-minute rule

Every 30 seconds, a process called JarvisBrain runs inside Genesis, our system orchestrator. It is not an LLM call. It is a pure in-memory synthesis loop that reads the state of the entire system — CPU load, service health, agent moods, pain levels, active goals, memory pressure, conversation recency — and compresses it into a compact awareness briefing.

That briefing is what Genesis "knows" at all times, without needing to ask.

One of the things the briefing tracks is how long it has been since someone sent a message. After two minutes of silence, Genesis concludes the system is idle. After five minutes, it declares slumber.

Those are not cosmetic labels. They are operational modes.

Idle: the routines wake up

When idle is detected, the first thing that happens is simple: Genesis tells the RoutinesManager to check its queue.

The RoutinesManager is a scheduler that holds dozens of configured autonomous behaviors. Some are assigned to specific agents. Some rotate through personas. Some only run at night. All of them respect the system's pain level — if too many things are broken, routines back off and let the healing happen first.

During idle, routines like these can fire:

  • Lyra reviews docstrings for staleness, audits configuration drift, harvests TODO items from the codebase.
  • Vera writes development progress articles and sends them to the admin via mail.
  • Atlas dispatches approved work to cloud agents and triggers CI workflows.
  • Genesis herself sweeps Chronicle for unresolved errors and harvests coding sessions for training data.

Each routine is gated by a cooldown group. Curiosity routines can only fire once every 10 minutes. Training routines every 20. Introspection every 15. This prevents the system from dogpiling a single category of work and ensures diverse output over time.

The RoutinesManager also does something unusual: it checks the collective energy level of all agents by querying InnerLife. If the system is fatigued — too many recent dispatches, too much arousal, not enough recovery — it halves the concurrency limit. The agents get to rest, even during idle time.
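The fatigue rule is easy to state precisely. A hedged sketch, assuming a 0-to-1 collective energy value queried from InnerLife (the function name and threshold are hypothetical; the halving behavior is the one described above):

```python
def routine_concurrency(base_limit: int, energy: float,
                        fatigue_threshold: float = 0.4) -> int:
    """Halve the routine concurrency limit when collective agent energy is low.

    `energy` is an assumed 0..1 aggregate from InnerLife; below the
    fatigue threshold, background work is cut in half (never below 1).
    """
    if energy < fatigue_threshold:
        return max(1, base_limit // 2)
    return base_limit
```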

Slumber: the deep work begins

After five minutes of silence, JarvisBrain triggers slumber mode. This is where the system does its most valuable background work.

1. Training data harvest

Every conversation you have with AitherOS generates training signal. Not just the text — the effort level selected, the model routing decision, the context pipeline that assembled the prompt, the tools called, the reasoning trace. All of that is structured data.

During slumber, the DaydreamCorpusBuilder reads the accumulated daydream JSONL files — creative musings, emotional reflections, system observations — and converts them into NanoGPT-compatible training documents. These are formatted as key-value strings that capture personality traits alongside content:

agent:aither type:musing mood:contemplative valence:0.4 content:I notice that...

This is how the system's personality improves over time. Not through human annotation. Through accumulated self-reflection, automatically formatted for fine-tuning.
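The conversion itself is a flattening step: one JSONL record in, one key-value training string out. A minimal sketch of that transform (field names mirror the example line above; the actual DaydreamCorpusBuilder schema may differ):

```python
import json

def daydream_to_training_doc(entry: dict) -> str:
    """Flatten one daydream JSONL record into the key-value training format."""
    return (
        f"agent:{entry['agent']} type:{entry['type']} mood:{entry['mood']} "
        f"valence:{entry['valence']} content:{entry['content']}"
    )

line = ('{"agent": "aither", "type": "musing", "mood": "contemplative", '
        '"valence": 0.4, "content": "I notice that..."}')
doc = daydream_to_training_doc(json.loads(line))
# doc == "agent:aither type:musing mood:contemplative valence:0.4 content:I notice that..."
```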

2. Session learning

The SessionLearner processes recent chat sessions and extracts patterns. What questions did the user ask? What worked? What caused confusion? What tools were used unnecessarily? These learnings are stored separately from raw conversation logs — they are higher-order observations about how the system performed.

Over time, this creates a feedback loop: the system gets better at the specific tasks its users actually need, without anyone manually curating training examples.

3. Memory consolidation

AitherOS has a five-tier memory architecture. Three tiers illustrate the idea: working memory holds the current conversation, episodic memory captures significant events, and semantic memory stores durable facts. Each tier has different retention rules and different costs to access.

During slumber, the ContextTierManager runs an OODA cycle — Observe, Orient, Decide, Act — that promotes entries across tiers. A fact mentioned three times in working memory gets promoted to episodic. An episodic entry that proves useful across multiple sessions gets promoted to semantic. Stale entries that haven't been accessed in 30 days get pruned.

This is what "learning from experience" actually looks like in a production system. Not a single gradient update, but a continuous memory management process that mirrors how biological memory consolidation works during sleep.
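The promotion and pruning rules above can be sketched as a per-entry decision function. Field names and the return convention are assumptions; the thresholds (three mentions, multi-session usefulness, 30-day staleness) are the ones stated in the text:

```python
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    tier: str               # "working" | "episodic" | "semantic"
    mention_count: int      # mentions within working memory
    sessions_used_in: int   # distinct sessions where the entry proved useful
    days_since_access: int

def consolidate(entry: MemoryEntry) -> str:
    """Apply the consolidation rules to one entry and report the action taken."""
    if entry.days_since_access > 30:
        return "prune"                      # stale: drop it
    if entry.tier == "working" and entry.mention_count >= 3:
        entry.tier = "episodic"
        return "promote:episodic"           # repeated fact becomes an event
    if entry.tier == "episodic" and entry.sessions_used_in > 1:
        entry.tier = "semantic"
        return "promote:semantic"           # cross-session utility becomes durable
    return "keep"
```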

4. Knowledge graph maintenance

The MemoryGraph — a hybrid embedding and graph store — accumulates entries from every conversation, every tool call, every agent dispatch. During slumber, stale entries older than 30 days are pruned. This keeps the graph from growing without bound and ensures that semantic search stays fast and relevant.

5. Creative dreaming

Finally, slumber triggers an actual daydream. Not a metaphorical one — a real request to the DaydreamService to generate a self-improvement reflection. The agent picks a topic related to recent work, generates a stream of consciousness about it, and the output is captured as training data.

These daydreams serve two purposes: they create diverse training examples that pure task-oriented conversations would never produce, and they give the system a way to process experiences that it could not address in real-time.

If that sounds like what your brain does during REM sleep, that is not an accident. The architecture was inspired by the same principle: offline processing of recent experience improves future performance.

The ProactiveMonitor: always watching

Separately from the idle/slumber cycle, a component called the ProactiveMonitor runs every 2.5 minutes regardless of whether anyone is chatting.

It collects host OS metrics via psutil — CPU, RAM, disk, GPU utilization, temperature, VRAM pressure. It stores these in a rolling time-series window (60 hours at 5-minute intervals) and runs three types of analysis:

  • Trend detection: Is CPU usage climbing steadily? Is disk filling up?
  • Anomaly detection: Z-score deviations from the rolling baseline. If memory usage jumps 3 standard deviations in one interval, that gets flagged.
  • Capacity forecasting: At the current growth rate, when will disk hit 90%? When will VRAM be exhausted?

The results feed directly into the awareness briefing. If a warning is detected, it appears in the system alerts that agents see in their context window. If a critical threshold is crossed, Genesis dispatches alert agents to investigate.

This means that AitherOS can tell you "your disk will be full in 4 days" before any monitoring dashboard would — because it is not waiting for you to check. It is checking itself, continuously, and flagging issues proactively.

The AgentKernel: 37 tasks and counting

All of this idle work is orchestrated by the AgentKernel, which replaces the old flat scheduler with agent-centric task portfolios. Genesis alone has 37 configured tasks, each with:

  • An interval (from every 5 minutes to once daily)
  • An effort budget (1-10 scale determining which LLM tier to use)
  • Pain gates (skip if system pain exceeds threshold)
  • Time restrictions (some tasks only run at night)
  • Condition checks (service health, queue depth)

The kernel tick runs every 30 seconds. It selects agents, selects their most overdue task, determines the appropriate effort level, and dispatches. If the LLM queue is deep, it reduces concurrency. If vLLM is saturated, it limits background work to one agent at a time.

This is backpressure-aware scheduling. The system does as much productive work as it can without competing with user-facing requests for GPU time.
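The per-task gating the kernel applies each tick can be sketched as a single eligibility check. The dataclass fields and signature are illustrative; the interval, effort-budget, pain-gate, and night-only concepts are the ones listed above:

```python
from dataclasses import dataclass

@dataclass
class KernelTask:
    name: str
    interval_s: int       # from every 5 minutes to once daily
    effort: int           # 1-10 budget selecting the LLM tier
    pain_gate: float      # skip when system pain exceeds this
    night_only: bool = False

def eligible(task: KernelTask, last_run_s: float, now_s: float,
             pain: float, hour: int) -> bool:
    """Sketch of the kernel tick's gating for one task."""
    if now_s - last_run_s < task.interval_s:
        return False  # not yet overdue
    if pain > task.pain_gate:
        return False  # system too unhealthy for background work
    if task.night_only and not (hour >= 22 or hour < 6):
        return False  # restricted to nighttime
    return True
```

A real tick would then sort eligible tasks by how overdue they are and cap dispatches by the current concurrency limit, shrinking that limit when the LLM queue deepens.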

Here is a sample of what gets dispatched during a typical idle period:

| Task | Agent | Interval | What It Does |
| --- | --- | --- | --- |
| Training Data Harvest | Genesis | 4 hours | Collects sessions for NanoGPT fine-tuning |
| Memory Consolidation | Genesis | 2 hours | Promotes entries across memory tiers |
| Stale Embedding Refresh | Genesis | 6 hours | Re-embeds vectors older than 24 hours |
| Error Pattern Analysis | Genesis | 3 hours | Detects recurring error signatures in logs |
| Scheduled Daydream | Genesis | 2 hours | Generates creative training data |
| Dependency Audit | Genesis | Daily | Checks for vulnerable dependencies |
| Knowledge Graph Ingestion | Genesis | Nightly | Ingests Wikipedia and documentation |
| CodeGraph Reindex | Genesis | 6 hours | Full AST + embedding reindex of codebase |
| Test Suite Health | Genesis | 6 hours | Validates test collection for import errors |
| Documentation Review | Genesis | 12 hours | Finds stale READMEs and undocumented services |

And that is just Genesis. Lyra, Atlas, Demiurge, Vera, Hera, and a dozen other agents each have their own task portfolios running on the same kernel.

Why this matters

The industry standard for AI products is: user sends message, model responds, everyone goes home. The system has no continuity between conversations. No self-improvement loop. No awareness of its own health. No capacity to do anything useful when nobody is watching.

That model works fine for a stateless API. It does not work for a system that is supposed to feel like it is alive.

AitherOS agents have memory, personality, sensation, and goals. Letting all of that go dormant between conversations would be like building a body and then unplugging the nervous system every time the phone stops ringing.

The inner life system — daydreams, slumber work, proactive monitoring, routine execution, memory consolidation — is what turns a collection of API endpoints into something that feels continuous. The agent you talk to at 3 PM knows what it learned at 10 AM, not because someone replayed the conversation history, but because it has been processing that experience the entire time.

That continuity is not a feature. It is the foundation.

What we are building next

The slumber pipeline currently handles self-improvement and system maintenance. The next phase is collaborative idle work — agents coordinating during idle time to solve problems that require multiple perspectives.

Imagine: while you sleep, your code reviewer (Hydra) and your security auditor (Athena) independently analyze the same pull request, then Aether synthesizes their findings into a morning briefing waiting in your inbox.

Or: Atlas detects a price change on a monitored product page, Themis analyzes whether the new terms are predatory, and Vera drafts a notification — all before you wake up.

The infrastructure for this already exists. The AgentKernel can dispatch to any agent. The RoutinesManager supports multi-agent pipelines. The mail system delivers cross-agent results. The only missing piece is teaching the system to identify which idle-time collaborations would actually be useful.

That is the hard problem. Not the plumbing — the judgment.

But the plumbing had to come first. And now that agents can do real work during their downtime, the question is no longer "what should the AI do when you stop talking?"

It is "what shouldn't it?"
