Your AI Should Work For You -- Not the Other Way Around
Perplexity just announced a product called the Personal Computer: a Mac Mini with their software baked in, running their AI continuously on your desk. Always listening. Always indexing. Always connected to their cloud.
The pitch is compelling: an AI that knows your files, your habits, your context -- running 24/7 on dedicated hardware right next to you.
We have been building exactly this for over a year. Except ours is open source, runs on hardware you already own, uses any model you choose, and keeps every byte of your data on your machine.
This is not a competitive jab. Perplexity shipping a hardware product validates the thesis we have been building on since early 2025: the future of AI is local, personal, and always-on. The question is who controls it.
The Appliance vs. The Operating System
Perplexity's approach is the appliance model. You buy their hardware. You run their software. Your data flows through their servers for the hard parts. It works out of the box, and that is genuinely valuable for people who want zero setup.
AitherOS is the operating system model. You install it on whatever hardware you have. You choose the models. You own the data pipeline end to end. It takes more setup, but you get something fundamentally different: sovereignty.
Here is what that looks like in practice:
| | Perplexity Personal Computer | AitherOS |
|---|---|---|
| Hardware | Their Mac Mini (purchase required) | Your existing hardware -- any x86 with a GPU |
| Assistants | 1 general-purpose assistant | 16 specialized agents with distinct capabilities |
| Models | Their proprietary models | Any model -- Llama, Qwen, DeepSeek, Mistral, or your own fine-tune |
| Model routing | Single model | Effort-routed: simple queries hit a 3B CPU model, complex work hits a 14B reasoning model |
| Data | Flows through their servers | Never leaves your machine |
| Source code | Closed | Open source (BSL-1.1) |
| Agent coordination | No multi-agent orchestration | 11-agent coding swarms, autonomous task delegation, fleet mode |
| GPU management | Unknown | VRAM-aware scheduling, priority queues, concurrent model serving |
| Getting started | Buy their hardware | `pip install aither-adk` |
Why 16 Agents Instead of 1
A single assistant is fine for answering questions. It is not fine for running a business, managing infrastructure, or building software autonomously.
AitherOS runs 16 specialized agents, each with a distinct identity, toolset, and security boundary. Among them:
- Genesis orchestrates everything -- the central brain that plans and delegates
- Demiurge writes and refactors code across any language
- Hydra reviews code with security and quality analysis
- Athena runs security audits and threat modeling
- Apollo handles performance optimization
- Atlas manages service discovery and infrastructure mapping
- Prometheus builds simulation environments and procedural worlds
- Viviane manages long-term memory and knowledge retrieval
Each agent has HMAC-signed capability tokens that control exactly what it can access. No agent gets God-mode permissions. The security model is cryptographic, not trust-based.
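As an illustrative sketch of how HMAC-signed capability tokens work in general (not AitherOS's actual token format, field names, or key management), an agent's capability claim can be signed once by the kernel and verified on every access:

```python
import hashlib
import hmac
import json

SECRET = b"kernel-signing-key"  # hypothetical: the real key would live in the agent kernel

def issue_token(agent: str, capabilities: list[str]) -> dict:
    """Sign a capability claim so it can be verified later without trusting the bearer."""
    claim = json.dumps({"agent": agent, "caps": sorted(capabilities)}, sort_keys=True)
    sig = hmac.new(SECRET, claim.encode(), hashlib.sha256).hexdigest()
    return {"claim": claim, "sig": sig}

def allows(token: dict, capability: str) -> bool:
    """Reject the request unless the signature is valid and the capability is granted."""
    expected = hmac.new(SECRET, token["claim"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False  # claim was tampered with or signed by someone else
    return capability in json.loads(token["claim"])["caps"]
```

The key property: an agent cannot grant itself a capability by editing its own token, because any change to the claim invalidates the signature.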
When you ask AitherOS to build something complex, Genesis dispatches a swarm -- for example, 3 coders, 2 testers, 2 security reviewers, and a documentation writer, all running in parallel, each backed by the agent identity best suited to the role. The result gets reviewed, tested in a sandbox, and delivered as a complete package.
A single assistant cannot do this. Multi-agent orchestration is not a feature you bolt on -- it is an architectural foundation.
Effort Routing: The Right Model for Every Task
Perplexity routes everything through their models. You get what they give you.
AitherOS uses effort-based routing. Every request gets classified by complexity, and the system picks the right model automatically:
- Effort 1-2 (quick answers, greetings): Llama 3.2 3B on CPU. Zero GPU cost. Responds in milliseconds.
- Effort 3-6 (tool use, coding, analysis): Nemotron-Orchestrator-8B on GPU. Fine-tuned on our own tool-routing dataset.
- Effort 7-10 (deep reasoning, research): DeepSeek-R1 14B on GPU. Full chain-of-thought reasoning.
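The tiers above reduce to a simple classifier. This is a minimal sketch of that mapping; the model identifier strings are illustrative, not AitherOS's internal names:

```python
def route(effort: int) -> dict:
    """Map a 1-10 effort score to a model/device pair, per the tiers above."""
    if not 1 <= effort <= 10:
        raise ValueError("effort must be in 1-10")
    if effort <= 2:
        # Quick answers and greetings: small model on CPU, zero GPU cost
        return {"model": "llama3.2:3b", "device": "cpu"}
    if effort <= 6:
        # Tool use, coding, analysis: mid-size orchestrator model on GPU
        return {"model": "nemotron-orchestrator-8b", "device": "gpu"}
    # Deep reasoning and research: full chain-of-thought model on GPU
    return {"model": "deepseek-r1:14b", "device": "gpu"}
```

The payoff of this shape is economic: the cheap path handles the high-volume traffic, so the expensive reasoning model only spins up when the request actually warrants it.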
MicroScheduler coordinates all of this -- tracking VRAM budgets, managing priority queues, and preempting lower-priority work when a high-priority request arrives. You never choose a model. The system knows.
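To make the VRAM-budgeting idea concrete, here is a toy admission-control sketch (priority queue plus a VRAM budget; preemption is omitted for brevity). It is an assumption-laden simplification, not MicroScheduler's actual implementation:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    priority: int                       # lower number = higher priority
    name: str = field(compare=False)
    vram_gb: float = field(compare=False)

class SchedulerSketch:
    """Admit jobs in priority order while staying within a VRAM budget."""

    def __init__(self, vram_budget_gb: float):
        self.budget = vram_budget_gb
        self.queue: list[Job] = []      # min-heap keyed on priority
        self.running: list[Job] = []

    def submit(self, job: Job) -> None:
        heapq.heappush(self.queue, job)

    def schedule(self) -> list[str]:
        """Admit the highest-priority queued jobs that still fit in VRAM."""
        used = sum(j.vram_gb for j in self.running)
        while self.queue and used + self.queue[0].vram_gb <= self.budget:
            job = heapq.heappop(self.queue)
            self.running.append(job)
            used += job.vram_gb
        return [j.name for j in self.running]
```

On a 16 GB budget, a 10 GB reasoning job and a 4 GB chat job are admitted, while a lower-priority 8 GB batch job waits in the queue until VRAM frees up.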
Your Data Never Leaves
This is the part that matters most.
Perplexity's Personal Computer connects to their cloud for model inference. Your files, your context, your queries -- they flow through their servers. They say they do not train on your data. Maybe that is true today; you cannot verify it, and policies change.
With AitherOS, there is no "their servers." The models run on your GPU. The knowledge graph lives in SQLite on your disk. The embeddings are computed locally. The training pipeline -- if you choose to fine-tune -- runs on your hardware using your data.
The architectural guarantee is simple: there is no outbound connection to any AI provider unless you explicitly configure one. No telemetry. No phoning home. No "we updated our privacy policy" emails.
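A local-only memory store is not exotic; it is a file on disk. This hypothetical sketch (the real schema and table names will differ) shows the shape of a knowledge store that imports nothing network-capable:

```python
import sqlite3

def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open a local knowledge store; the only 'server' involved is a file on disk."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS facts ("
        "  id INTEGER PRIMARY KEY,"
        "  subject TEXT, predicate TEXT, object TEXT)"
    )
    return db

def remember(db: sqlite3.Connection, subject: str, predicate: str, obj: str) -> None:
    db.execute(
        "INSERT INTO facts (subject, predicate, object) VALUES (?, ?, ?)",
        (subject, predicate, obj),
    )
    db.commit()

def recall(db: sqlite3.Connection, subject: str) -> list[tuple[str, str]]:
    rows = db.execute(
        "SELECT predicate, object FROM facts WHERE subject = ?", (subject,)
    )
    return list(rows)
```

Auditing a design like this is trivial: grep for socket and HTTP imports, find none, done. That is the sense in which the guarantee is architectural rather than contractual.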
Open Source Means You Can Verify
AitherOS is licensed under BSL-1.1. The source code is on GitHub. You can read every line of the agent kernel, the security middleware, the capability token system, the memory graph. You can audit the network calls. You can verify that no data leaves your machine.
You cannot do this with a closed appliance. You are trusting a company's privacy policy instead of verifying the code yourself.
The On-Ramp: From pip install to Enterprise
Getting started should not require buying hardware.
```
pip install aither-adk
aither init my-agent
aither run
```
That gives you a single-agent setup with local LLM inference (via Ollama), SQLite memory, and a FastAPI server with an OpenAI-compatible API. It runs on a laptop.
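Because the server speaks the OpenAI chat-completions format, any OpenAI-compatible client can talk to it. A stdlib-only sketch, assuming the server listens on localhost:8000 (the port and model name are illustrative, check your own config):

```python
import json
import urllib.request

# Assumed local endpoint; adjust host, port, and model to your setup.
BASE_URL = "http://localhost:8000/v1"

def chat_request(prompt: str, model: str = "local") -> urllib.request.Request:
    """Build an OpenAI-style chat completion request against the local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send: json.load(urllib.request.urlopen(chat_request("Hello")))
```

Note the request never leaves the loopback interface -- the same client code that talks to a cloud API talks to your own GPU.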
From there, the path scales:
Free tier -- Local: One machine, local models, full agent capabilities. Zero cloud cost. This is the aither-adk experience.
Pro tier -- Cloud acceleration: Connect to the AitherOS mesh for cloud GPU overflow, the MCP SaaS gateway at mcp.aitherium.com, and federated agent discovery. Your data still stays local -- cloud resources handle compute overflow only.
Enterprise tier -- Sovereign deployment: Full AitherOS on bare metal. Rocky Linux 9, one bootstrap script, 97 microservices, air-gapped if needed. This is for organizations that need the complete platform under their own roof.
The point is that every tier keeps your data under your control. Cloud features are opt-in compute resources, not data pipelines.
What Perplexity Got Right
Credit where it is due: Perplexity understood that always-on, local-first AI is the next platform shift. Most companies are still building chatbots. Perplexity is building an ambient computing layer.
They also understood that dedicated hardware matters. A GPU sitting on your desk, running inference 24/7, is fundamentally different from an API call to a datacenter. The latency is lower. The availability is higher. The marginal cost per token is effectively zero after the hardware purchase.
We agree on all of this. We just disagree on who should control the stack.
What They Got Wrong
Locking it to their hardware is the wrong move. People already have powerful machines. Gamers have RTX 4090s and 5090s. Developers have workstations. Small businesses have servers. None of these people need another Mac Mini.
Locking it to their models is the wrong move. The model landscape changes every month. Qwen 3 just shipped. DeepSeek R1 rewrote the reasoning benchmark. Llama 4 is imminent. An AI platform that locks you to one provider's models is a platform with a built-in expiration date.
Routing data through their servers is the wrong move. Not because they are malicious, but because the architecture should make the question irrelevant. If the models run locally and the data never leaves, there is nothing to trust and nothing to audit.
The Thesis
The future of personal AI is not an appliance you buy. It is an operating system you install.
It runs on your hardware. It uses whichever models are best this month. It keeps your data under your physical control. It coordinates multiple specialized agents instead of routing everything through a single assistant. And when you outgrow a single machine, it scales to a mesh -- without changing the trust model.
We have been building this for over a year. It has 97 microservices, 16 agents, 300+ MCP tools, VRAM-aware GPU scheduling, autonomous training pipelines, and a knowledge graph that learns from every conversation.
It starts with one command:
```
pip install aither-adk
```
Links
- GitHub: github.com/Aitherium/aither
- PyPI: pypi.org/project/aither-adk
- AitherZero (PowerShell automation): github.com/Aitherium/AitherZero
- Live demo: demo.aitherium.com
- Website: aitherium.com