Tags: agents, computer-use, playwright, expeditions, creative-engine, migration, saas-lock-in, rebrand, untether, command-and-control, portal

Point at a SaaS and Walk Away: Autonomous Rebrand & Migrate

June 2, 2026 · 11 min read · AitherOS Engineering

One URL in. Mirrored data, a rebranded app, and a go-live gate out — autonomously, watched live in your workspace.

The thing nobody wants to do by hand

You're locked into a SaaS. Maybe it has no API. Maybe the export is a one-record-at-a-time PDF. Your data is in there — clients, invoices, sessions, history — and the only way out is a person clicking through pages for a week. Meanwhile the product you actually want is the same data, in a tool you own, with your brand on it.

This is two miserable jobs stapled together: a data migration out of a hostile system, and a rebrand + rebuild of the front end. Both are mechanical. Both are exactly the kind of work that should be a button.

So we made it a button. In your workspace, there's now a Rebrand & Migrate tab. You give it a URL and a redesign prompt, pick "point at it" or "let the agent log in," and hit run. Then you watch — live — as it mirrors the data, rebrands the site, deploys a working app, and stops at a go-live gate for your approval.

The interesting part isn't the demo. It's that making this real meant wiring five separate subsystems into one autonomous flow, and the work was 20% architecture and 80% the unglamorous bugs that only show up when you actually run it on one GPU in a real container.

What it actually does

A single run decomposes into four phases, each streamed to your screen as it happens:

ACQUIRE — mirror the data. Two paths, same destination:
- Point-at: the agent crawls the public site and mirrors every page as an idempotent record.
- Authenticate the agent to drive it: the agent logs in itself. It observes the page, finds the login form, fills credentials pulled from the secrets vault (never inline), submits, handles an MFA challenge by pausing for a one-time gate, then walks the authenticated app extracting structured records.
Either way, everything lands in Untether, our canonical mirror store, keyed by (tenant, entity, source, source_id), so re-running is a safe upsert, never a duplicate.
DESIGN — rebrand it. The creative engine extracts the brand (palette, type, logo, tone) from the live site, crawls it for real content and product imagery, and assembles a redesigned, scroll-animated page — before/after, real content, your new look.
DEPLOY — ship a real app. Either a self-contained HTML artifact for instant review, or a promote-to-Next.js path that produces a deployable application.
REVIEW — gate the go-live. Nothing publishes without a human. The run opens a role-scoped approval gate — only a workspace owner (or whatever role you set) can approve it.

All four run under a single Expedition — our unit of tracked, governed work — so the whole thing is scoped, budgeted, resumable, and auditable.
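
The "safe upsert, never a duplicate" property of the mirror store can be illustrated with a small sketch. This is not Untether's actual schema or storage engine: the table name, the payload column, and the use of SQLite are assumptions made for illustration. Only the four-part key comes from the description above.

```python
import json
import sqlite3

# Hypothetical schema: only the key (tenant, entity, source, source_id)
# is taken from the post; everything else is illustrative.
SCHEMA = """
CREATE TABLE IF NOT EXISTS mirror (
    tenant    TEXT NOT NULL,
    entity    TEXT NOT NULL,
    source    TEXT NOT NULL,
    source_id TEXT NOT NULL,
    payload   TEXT NOT NULL,
    PRIMARY KEY (tenant, entity, source, source_id)
)
"""

def upsert_record(db, tenant, entity, source, source_id, payload):
    """Insert a mirrored record, or overwrite it if the key already exists."""
    db.execute(
        """
        INSERT INTO mirror (tenant, entity, source, source_id, payload)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (tenant, entity, source, source_id)
        DO UPDATE SET payload = excluded.payload
        """,
        (tenant, entity, source, source_id, json.dumps(payload, sort_keys=True)),
    )

def count_records(db):
    """Return the number of rows currently in the mirror table."""
    return db.execute("SELECT COUNT(*) FROM mirror").fetchone()[0]

db = sqlite3.connect(":memory:")
db.execute(SCHEMA)
# Mirroring the same source record twice leaves one row with the latest payload.
upsert_record(db, "acme", "invoice", "legacy-saas", "inv-001", {"total": 100})
upsert_record(db, "acme", "invoice", "legacy-saas", "inv-001", {"total": 120})
```

Because the key is the primary key, a crashed or repeated run can simply start over: every record it has already seen resolves to an update rather than a new row.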

The spine: an Expedition with real command and control

The temptation with "autonomous agent does a big multi-step job" is to let one agent free-run a tool loop and hope. That's how you get a confident, untraceable mess.

Instead, every Rebrand & Migrate run is an Expedition: a first-class record with phases, tasks, human gates, a budget envelope, and a live event stream. We already used Expeditions for code work; the new requirement was command and control at five levels — platform, workspace, tenant, user, and agent — because this capability lives in a multi-tenant portal where one workspace's migration must never be visible to, or approvable by, another's.

So we extended the Expedition model:

- Workspace scoping on every expedition, phase, task, and gate. A tenant/workspace caller only ever sees and touches their own work; a platform operator sees everything. (The schema migration is additive and back-compatible: legacy rows stay visible to their owners.)
- Parent→child expeditions, so a big migration can fan out into tracked sub-jobs with a rolled-up status.
- Role-ranked gates. A gate can require owner, admin, or member; a member trying to approve an owner-gated go-live gets a clean 403, not a silent override. Roles rank (viewer < member < admin < owner), so the check is one comparison, not a tangle of special cases.
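
A minimal sketch of how a role-ranked, workspace-scoped gate check might look. The rank order is the one given above; the `Gate` and `Caller` types, the use of `None` to mark a platform operator, and the 404 for another workspace's gate are assumptions for illustration, not the actual Expedition API.

```python
from dataclasses import dataclass
from typing import Optional

# Rank order from the post: viewer < member < admin < owner.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2, "owner": 3}

@dataclass(frozen=True)
class Gate:
    workspace_id: str
    required_role: str

@dataclass(frozen=True)
class Caller:
    # Assumption: None marks a platform operator who is not bound to a workspace.
    workspace_id: Optional[str]
    role: str

def approval_status(caller, gate):
    """Return an HTTP-style status code for an approval attempt."""
    # Workspace scoping: a gate in another workspace is treated as invisible.
    # (Returning 404 here is an assumption; the post only specifies the 403.)
    if caller.workspace_id is not None and caller.workspace_id != gate.workspace_id:
        return 404
    # Role ranking: a single comparison decides whether the caller may approve.
    if ROLE_RANK[caller.role] < ROLE_RANK[gate.required_role]:
        return 403
    return 200
```

For example, `approval_status(Caller("ws-1", "member"), Gate("ws-1", "owner"))` evaluates to 403, while the same call with an owner in `ws-1` evaluates to 200.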

The result: an autonomous job you can actually govern. The agent does the work; the org keeps the controls.

"Authenticate the agent to drive it": computer-use, honestly

The single biggest gap was authenticated browsing. Crawling a public site is easy. Logging into a gated SaaS — finding the form, filling it, surviving MFA, then navigating an authenticated app — is the hard, valuable part, and our browser service had no concept of a session at all.

We built a computer-use session API on top of the existing headless browser: open a persistent (optionally pre-authenticated) session, observe it (the interactive elements, their selectors, the visible text, a screenshot), act on it (goto / click / fill / press / select / scroll), export the logged-in session state, and close.

That observe → act pair is the whole game. It's an OODA loop the agent runs to log in like a person: observe the page, decide the next action, act, observe again. The login logic is a heuristic that finds the email and password fields by their attributes and submits — with an injectable hook so a model can take over for weird forms, and an MFA branch that pauses for a human/OTP gate instead of failing. Credentials come from the secrets vault by reference; they never appear in a prompt, a log, or the job payload.
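
The observe → act loop can be sketched against an in-memory stand-in for a browser session. Everything named here (`Element`, `Observation`, `find_field`, `login`, `FakeSession`, the keyword lists) is hypothetical; the real session API and heuristics are not shown in this post. The model-takeover hook is omitted for brevity, and credentials are passed in as plain arguments where the real flow resolves them from the vault by reference.

```python
from dataclasses import dataclass

@dataclass
class Element:
    selector: str
    attrs: dict

@dataclass
class Observation:
    elements: list
    text: str = ""

def find_field(obs, keywords):
    """Return the selector of the first element whose attributes mention a keyword."""
    for el in obs.elements:
        haystack = " ".join(str(v) for v in el.attrs.values()).lower()
        if any(k in haystack for k in keywords):
            return el.selector
    return None

def login(session, username, password, request_otp, max_steps=5):
    """Observe, decide, act until no login or MFA form remains. True on success."""
    for _ in range(max_steps):
        obs = session.observe()
        otp = find_field(obs, ("otp", "one-time", "verification"))
        user = find_field(obs, ("email", "username"))
        pwd = find_field(obs, ("password",))
        if otp:
            # MFA branch: pause for a human/OTP gate instead of failing.
            session.act("fill", otp, request_otp())
            session.act("press", otp, "Enter")
        elif user and pwd:
            session.act("fill", user, username)
            session.act("fill", pwd, password)
            session.act("press", pwd, "Enter")
        else:
            # Crude success test: nothing that looks like a login form is left.
            return True
    return False

class FakeSession:
    """Stand-in session that walks login page -> MFA page -> app."""
    def __init__(self):
        self.page = "login"
        self.actions = []

    def observe(self):
        if self.page == "login":
            return Observation([
                Element("#email", {"type": "email", "name": "email"}),
                Element("#pw", {"type": "password", "name": "password"}),
            ])
        if self.page == "mfa":
            return Observation([Element("#code", {"name": "otp"})])
        return Observation([Element("#logout", {"role": "button"})], text="Dashboard")

    def act(self, verb, selector, value=None):
        self.actions.append((verb, selector, value))
        if verb == "press":  # submitting advances to the next page
            self.page = {"login": "mfa", "mfa": "app"}[self.page]
```

Calling `login(FakeSession(), "user@example.com", "secret", lambda: "123456")` drives the fake through both forms. Against a real site, the success test and the attribute keywords are exactly the parts that would need hardening.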

We're being honest about the maturity here: the session primitives are live-verified, and the login loop is unit- and smoke-tested. The full hands-off login against an arbitrary real-world SaaS — with its CAPTCHAs, its bot detection, its bespoke DOM — is the part that needs a real target and credentials to harden. The architecture is built for it; the last mile is empirical, and we'd rather say so than oversell it.

The bugs that actually stood in the way

Architecture diagrams don't ship. Here's what did the blocking, because this is the part other teams will recognize:

Image generation was dead on a single GPU, and the cause was a moved model. We'd relocated the reasoning model to a remote box. A well-meaning safety rule (never preempt the orchestrator for image gen) combined with that move to silently disable all VRAM reclamation: the only reclamation path the system knew was "sleep the local reasoning model," and that model was now remote. So the orchestrator quietly expanded to fill the freed space, image generation could never get VRAM, and every render returned a 503. The fix was a reclamation path appropriate to the new topology: unload idle image/3D models to make room, while never touching the orchestrator. Image gen came back to life.
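
The reclamation rule (unload idle image/3D models, never the orchestrator) can be sketched as a small planning function. The `LoadedModel` type, the idle threshold, and the longest-idle-first ordering are assumptions for illustration; the post does not describe the actual scheduler.

```python
from dataclasses import dataclass

@dataclass
class LoadedModel:
    name: str
    kind: str            # e.g. "orchestrator", "image", "3d"
    vram_gb: float
    idle_seconds: float

# The orchestrator is deliberately absent, so it can never be selected.
RECLAIMABLE_KINDS = {"image", "3d"}

def plan_reclaim(models, free_gb, needed_gb, min_idle_seconds=30.0):
    """Pick idle image/3D models to unload, longest idle first, until the request fits.

    Returns the names to unload, or None if the request cannot fit
    (the case a caller would surface as a 503).
    """
    candidates = sorted(
        (m for m in models
         if m.kind in RECLAIMABLE_KINDS and m.idle_seconds >= min_idle_seconds),
        key=lambda m: m.idle_seconds,
        reverse=True,
    )
    to_unload = []
    for m in candidates:
        if free_gb >= needed_gb:
            break
        to_unload.append(m.name)
        free_gb += m.vram_gb
    return to_unload if free_gb >= needed_gb else None
```

The useful property is that the protected model is excluded by construction rather than by a special case: there is no code path in which the orchestrator appears in the result.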

A one-line crash had taken out the entire image backend. The fast image model passed an empty negative prompt as None; a downstream call did .lower() on it; and every generation returned a 500. A tiny guard, or "", restored it.
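
The shape of that bug and its fix, as a minimal sketch. The function name is hypothetical; only the `None`, `.lower()`, and `or ""` details come from the description above.

```python
def normalize_negative_prompt(negative_prompt):
    """Lower-case a negative prompt, treating a missing one as empty.

    Without the `or ""`, a None value raises AttributeError on .lower(),
    which is the failure mode described above.
    """
    return (negative_prompt or "").lower()
```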

A self-deadlock on a single slot. There was exactly one image "slot." The canvas service grabbed it, then called the image model — which tried to grab the same slot, blocked, and timed out 60 seconds later. The service was deadlocking against itself. The fix was teaching it to let the backend own the slot.
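
The self-deadlock is easy to reproduce with a single-permit semaphore. This is a sketch with hypothetical function names, using a short timeout in place of the 60 seconds mentioned above; the real services are separate processes, not functions sharing a module.

```python
import threading

slot = threading.Semaphore(1)  # exactly one image slot

def backend_generate(timeout):
    """The image backend claims the slot itself before rendering."""
    if not slot.acquire(timeout=timeout):
        raise TimeoutError("image slot unavailable")
    try:
        return "image"
    finally:
        slot.release()

def canvas_render_deadlocking(timeout):
    """Buggy shape: the caller holds the slot while the backend waits for it."""
    with slot:
        return backend_generate(timeout)

def canvas_render_fixed(timeout):
    """Fixed shape: the caller never takes the slot; the backend owns it."""
    return backend_generate(timeout)
```

`canvas_render_deadlocking(0.05)` raises `TimeoutError` every time, because the only permit is held by its own caller. `canvas_render_fixed(0.05)` returns immediately.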

The browser had no browser. The container shipped the automation library but not the actual Chromium binary, so every screenshot, crawl, and brand-extraction silently degraded to nothing — which is why our first rebrands came back blank. And once we baked Chromium into the image, it still hung, because headless Chromium as a non-root container user needs --no-sandbox or it just sits there. Both are now baked into the image and survive every recreate.
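
For readers hitting the same hang, here is a sketch of passing the flag at launch time. It assumes the automation library is Playwright for Python (the post's tags mention Playwright, but the post does not show its launch code), and the helper names are hypothetical.

```python
def chromium_launch_options():
    """Launch options for headless Chromium running as a non-root container user."""
    return {"headless": True, "args": ["--no-sandbox"]}

def take_screenshot(url, path):
    """Open a page and save a screenshot, failing loudly if the browser is missing."""
    # Imported lazily so this module still loads where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # launch() raises if the Chromium binary was never installed,
        # for example when `playwright install chromium` was skipped in the image build.
        browser = p.chromium.launch(**chromium_launch_options())
        try:
            page = browser.new_page()
            page.goto(url)
            page.screenshot(path=path)
        finally:
            browser.close()
```

Letting the launch error propagate, rather than catching it and returning an empty result, is what turns a blank rebrand into an obvious failure. Note that `--no-sandbox` disables Chromium's process sandbox, so it is only reasonable where the container itself is the isolation boundary.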

None of these are glamorous. All of them were load-bearing. The lesson we keep relearning: the gap between "the architecture is right" and "it works when you press the button" is paved with infrastructure papercuts, and the only way to find them is to actually press the button.

Watching it work

The point of all this is that you see it. The whole run streams to your workspace over Server-Sent Events: each phase lights up as it starts and completes, the mirrored-record count ticks up during ACQUIRE, the rebranded site renders in an iframe the moment it's built, and the go-live gate appears when it's time to decide. A point-at run against a real site streamed end-to-end on the first live try — mirror, rebrand, deploy, gate — exactly as designed.
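
On the consuming side, a Server-Sent Events stream is just lines of text, so the client logic is small. The sketch below parses the wire format and folds events into a view state. The event names (`phase_started`, `record_mirrored`, `gate_opened`) and payload fields are invented for illustration; the post does not document the actual event schema.

```python
import json

def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            name, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if name == "event":
                event = value
            elif name == "data":
                data.append(value)

def apply_events(events):
    """Fold a stream of (event, data) pairs into a simple UI state."""
    state = {"phase": None, "records": 0, "gate_open": False}
    for event, data in events:
        body = json.loads(data)
        if event == "phase_started":
            state["phase"] = body["phase"]
        elif event == "record_mirrored":
            state["records"] = body["count"]
        elif event == "gate_opened":
            state["gate_open"] = True
    return state

# Hypothetical stream, as the lines would arrive over the wire.
STREAM = [
    "event: phase_started", 'data: {"phase": "ACQUIRE"}', "",
    ": keep-alive",
    "event: record_mirrored", 'data: {"count": 42}', "",
    "event: gate_opened", 'data: {"gate": "go-live"}', "",
]
state = apply_events(parse_sse(STREAM))
```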

It lives in the portal's Creative workspace next to Campaign Studio (the same engine, pointed at out-of-home ad campaigns — brand to dark interactive map to billboards to a photoreal mockup to a landing page). One creative engine, two surfaces, both watchable.

Why this shape matters

You could build each piece of this as a standalone tool: a scraper, a migration script, a website builder, an approval workflow. Plenty of those exist. What's different here is that they're one governed, autonomous job — the agent drives the browser, mirrors the data, designs the rebrand, and deploys the app, while the Expedition keeps it scoped to your workspace, budgeted, gated, and on the record.

That's the bet: the valuable autonomy isn't a smarter free-running agent. It's an agent doing real, multi-system work inside a structure that an organization can actually trust — point at the thing you're locked into, walk away, and come back to a decision you get to make.

Point at a SaaS. Get your data and a rebranded app back. Approve the go-live yourself.
