
Video Hosting & Live Streaming: From Upload to Adaptive HLS in One Platform

March 15, 2026 · 12 min read · Aitherium

Most AI platforms treat media as an afterthought — upload a file, get a URL, show a <video> tag. That works until your 4K training recording buffers for 30 seconds on a spotty connection, or your demo stream has no authentication, or you realize you're storing terabytes of uncompressed originals with no CDN strategy.

Today we shipped a complete video hosting and live streaming platform inside AitherOS. Two new Python services, one third-party container, and a full React frontend — all wired into the existing tenant isolation, billing, event bus, and durable queue infrastructure.

The Architecture

Three components, clean separation of concerns:

STREAMER (OBS/ffmpeg)          VIEWER (Browser)
       |                              |
       | RTMP :1935                   | HLS.js
       v                              v
  +-----------+              +---------------+
  | MediaMTX  |--webhook---->| AitherStream  |
  | (Go bin)  |              |   :8145       |
  | RTMP+HLS  |              | Catalog, Auth |
  +-----------+              | HLS serving   |
                             +-------+-------+
                                     |
                                     v
                              +----------------+
                              |AitherTranscoder|
                              |     :8146      |
                              | ffmpeg workers |
                              |  HLS segments  |
                              +----------------+

AitherStream (:8145) owns the catalog, handles uploads, serves HLS, and authenticates live streams. AitherTranscoder (:8146) does the heavy lifting — GPU-accelerated ffmpeg transcoding into adaptive bitrate HLS. MediaMTX is a single Go binary that handles RTMP ingest and real-time HLS output for live streams.

Why Not Just Use S3 + CloudFront?

Three reasons:

  1. Tenant isolation at the data layer. Every video, stream key, and view event is scoped to a tenant. The same CallerContext + X-Tenant-ID header pattern that isolates agent sessions, memory graphs, and billing works here too. No shared S3 buckets with prefix-based isolation hacks.

  2. Integrated billing. Transcoding costs flow into ACTA just like LLM inference costs. When a tenant transcodes a 4K video into 5 profiles, that compute time appears on their ledger next to their agent dispatch minutes.

  3. Event bus visibility. Every upload, transcode, and view emits Flux events (VID_UPLOAD, VID_TRANSCODE, VID_READY, VID_VIEW). JarvisBrain's awareness tick sees "3 videos processing" the same way it sees "5 agent sessions active." The system stays conscious of media workload.

Chunked Upload: 5MB at a Time

Large videos need resilient uploads. We implemented a three-phase protocol:

  1. Init: POST /v1/videos/upload with filename and size. Returns an upload_id and the total chunk count.
  2. Chunks: PUT /v1/videos/upload/{id}/chunk/{n} with 5MB binary payloads. Each chunk is written to disk and tracked in SQLite.
  3. Complete: POST /v1/videos/upload/{id}/complete. Server verifies all chunks arrived, concatenates them, runs ffprobe for metadata, and enqueues the transcode job.

The upload is resumable. If your connection drops at chunk 47 of 200, the client can query which chunks were received and resume from 48. A background task runs every 6 hours to clean up abandoned uploads older than 24 hours.
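The resume logic boils down to tracking which chunk indices have arrived. A minimal sketch of the server-side bookkeeping (class and method names are illustrative, not the actual AitherStream code):

```python
import math

CHUNK_SIZE = 5 * 1024 * 1024  # 5MB, matching the protocol above

class UploadSession:
    """Tracks received chunks so a client can resume after a dropped connection."""

    def __init__(self, file_size: int):
        self.total_chunks = math.ceil(file_size / CHUNK_SIZE)
        self.received: set[int] = set()

    def mark_received(self, n: int) -> None:
        self.received.add(n)

    def missing(self) -> list[int]:
        # The client queries this after a disconnect to find where to resume
        return [i for i in range(self.total_chunks) if i not in self.received]

    def is_complete(self) -> bool:
        return len(self.received) == self.total_chunks
```

A client that died at chunk 47 asks for `missing()` and picks up from there instead of restarting the whole file.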

On the frontend, the upload wizard shows per-chunk progress:

for (let i = 0; i < total_chunks; i++) {
    const chunk = file.slice(i * chunkSize, (i + 1) * chunkSize)
    // Retry each chunk a few times before surfacing the failure
    let res
    for (let attempt = 0; attempt < 3; attempt++) {
        res = await fetch(`/api/stream/upload?action=chunk&upload_id=${id}&chunk_n=${i}`, {
            method: 'PUT',
            body: chunk,
        })
        if (res.ok) break
    }
    if (!res.ok) throw new Error(`chunk ${i} failed after 3 attempts`)
    setProgress(Math.round(((i + 1) / total_chunks) * 100))
}

Simple sequential uploads with retry. No need for multipart form uploads or presigned URLs.

GPU-Accelerated Transcoding

AitherTranscoder consumes from the video_transcode ResilientQueue. Each job runs through a 7-step pipeline:

  1. Probe — ffprobe extracts codec, resolution, frame rate, duration
  2. Profile selection — choose which of the 5 profiles fit the source resolution (no upscaling)
  3. Thumbnails — extract poster frame at 25% mark + a 10x10 sprite sheet for seek preview
  4. Segment — per profile, run ffmpeg with HLS output (-f hls -hls_time 6)
  5. Master playlist — generate master.m3u8 with EXT-X-STREAM-INF per variant
  6. DB update — mark video as ready, store HLS paths
  7. Flux event — VID_READY so the dashboard updates in real time
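Step 4 can be sketched as a per-profile ffmpeg invocation. Only `-f hls` and `-hls_time 6` come from the pipeline above; the remaining flags are a plausible sketch, not the exact AitherTranscoder command line:

```python
def hls_command(src: str, out_dir: str, label: str, width: int, height: int,
                v_bitrate: str, a_bitrate: str, encoder: str = "libx264") -> list[str]:
    """Build an ffmpeg argv that transcodes one variant into 6-second HLS segments."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale={width}:{height}",
        "-c:v", encoder, "-b:v", v_bitrate,
        "-c:a", "aac", "-b:a", a_bitrate,
        "-f", "hls",
        "-hls_time", "6",                 # 6-second segments, per step 4
        "-hls_playlist_type", "vod",
        "-hls_segment_filename", f"{out_dir}/{label}/segment_%03d.ts",
        f"{out_dir}/{label}/playlist.m3u8",
    ]
```

Running one of these per selected profile produces the variant playlists that step 5 stitches into the master.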

The five profiles:

Label   Resolution   Video Bitrate   Audio   Max FPS
360p    640x360      800 kbps        96k     30
480p    854x480      1500 kbps       128k    30
720p    1280x720     3000 kbps       128k    30
1080p   1920x1080    6000 kbps       192k    60
4K      3840x2160    15000 kbps      256k    60

Hardware acceleration detection follows the same pattern as our existing video_tools.py: try NVENC first (RTX GPUs), fall back to AMD AMF, then Intel QSV, then software libx264. A GPU semaphore limits concurrent transcode jobs to 2, and ComputeCapacityGate prevents transcoding from starving LLM inference.
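The fallback chain reduces to a preference scan over ffmpeg's encoder listing (the stdout of `ffmpeg -hide_banner -encoders`); the function name is illustrative:

```python
# Preference order from above: NVENC, then AMD AMF, then Intel QSV, then software.
ENCODER_PREFERENCE = ["h264_nvenc", "h264_amf", "h264_qsv", "libx264"]

def pick_encoder(encoder_listing: str) -> str:
    """Return the first preferred H.264 encoder that ffmpeg reports as available.

    `encoder_listing` is the stdout of `ffmpeg -hide_banner -encoders`.
    """
    for enc in ENCODER_PREFERENCE:
        if enc in encoder_listing:
            return enc
    return "libx264"  # software fallback is always available
```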

The HLS file layout is straightforward:

Library/stream/hls/{tenant_id}/{video_id}/
  master.m3u8
  thumbnail.jpg
  sprites.jpg
  360p/playlist.m3u8 + segment_000.ts...
  720p/playlist.m3u8 + segment_000.ts...
  1080p/playlist.m3u8 + segment_000.ts...

Nginx serves .ts segments with Cache-Control: immutable (they never change) and .m3u8 playlists with no-cache (they update during transcoding and live streams).
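Nginx applies this in production, but the rule itself is tiny: segments are write-once, playlists are not. Expressed as a helper (illustrative, not the actual config):

```python
def cache_control_for(path: str) -> str:
    """Cache policy for HLS assets: segments never change, playlists do."""
    if path.endswith(".ts"):
        return "public, max-age=31536000, immutable"  # segments are write-once
    if path.endswith(".m3u8"):
        return "no-cache"  # playlists grow during transcoding and live streams
    return "public, max-age=3600"  # thumbnails, sprites
```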

Live Streaming with MediaMTX

Live streaming adds a real-time dimension. MediaMTX handles the protocol complexity — accepting RTMP pushes from OBS or ffmpeg and converting them to low-latency HLS on the fly.

The auth flow uses webhooks:

  1. User creates a stream in the Studio page → AitherStream generates an HMAC-SHA256 stream key
  2. OBS pushes to rtmp://host:1935/live/{stream_key}
  3. MediaMTX fires an auth webhook to AitherStream → single indexed DB lookup → 200 or 403
  4. On success: stream status flips to live, Flux emits VID_LIVE
  5. Browser plays http://host:8889/live/{stream_key}/index.m3u8 via HLS.js

The auth webhook must respond in under 2 seconds — it's a single indexed SQLite lookup, no external calls. When the streamer disconnects, MediaMTX fires a done webhook, and if recording was enabled, we enqueue VOD conversion with a 5-second delay to let the file flush.
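A sketch of the key derivation and the webhook check, assuming the webhook body carries the RTMP path and keys live in an indexed SQLite table (function and column names are illustrative):

```python
import hashlib
import hmac
import sqlite3

def make_stream_key(secret: bytes, tenant_id: str, stream_id: str) -> str:
    """Derive an HMAC-SHA256 stream key, as described above."""
    msg = f"{tenant_id}:{stream_id}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()

def authorize(db: sqlite3.Connection, rtmp_path: str) -> int:
    """Auth webhook: accept if the key after 'live/' exists and is active.

    A single indexed lookup, no external calls, so it answers well
    within MediaMTX's timeout.
    """
    stream_key = rtmp_path.rsplit("/", 1)[-1]
    row = db.execute(
        "SELECT 1 FROM streams WHERE stream_key = ? AND active = 1",
        (stream_key,),
    ).fetchone()
    return 200 if row else 403
```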

Viewer tracking uses an in-memory dictionary with deduplication. Same IP + user_id within a 30-second window counts as one viewer. A background flush pushes counts to SQLite every 10 seconds.

The Frontend: HLS.js + Quality Selection

The HLS player uses hls.js for Chrome and Firefox (which don't support HLS natively) and falls back to the native <video> element on Safari. The hook pattern is clean:

const { levels, currentLevel, setQuality, stats, isReady } = useHlsPlayer(
    videoRef,
    { src: hlsMasterUrl, lowLatency: isLive }
)

Quality levels are read from the master playlist. Users can select "Auto" (bandwidth-adaptive) or pin a specific quality. For live streams, low-latency mode sets liveSyncDurationCount: 2 and liveMaxLatencyDuration: 5 — about 4-6 seconds of latency, which is good enough for most use cases.

The Stream Studio page gives streamers everything they need: stream key management (with show/hide toggle and copy button), OBS configuration instructions, and a regenerate button that invalidates the old key immediately.

What We Reused

A significant portion of this feature is glue between existing systems:

  • ResilientQueue for durable job processing with retry and dead letter
  • FluxEmitter events so the awareness loop, dashboards, and analytics all see video activity
  • CallerContext tenant isolation — the same header extraction pattern used everywhere
  • AitherIntegration one-line service setup with health endpoints, middleware stack, Chronicle logging
  • video_tools.py hardware acceleration detection patterns
  • Nginx veil-lb.conf with WebSocket support already in place
  • AppShell component wrapper for consistent page layout

The new code is about catalog management (SQLite CRUD), upload orchestration (chunk tracking), HLS generation (ffmpeg invocations), and the React UI components.

Numbers

  • 2 new Python services (AitherStream: 1494 lines, AitherTranscoder: 757 lines)
  • 7 React components (player, upload wizard, video card, grid, transcode status, live badge, stream key panel)
  • 5 pages (catalog, watch, live browser, live viewer, studio)
  • 5 API routes (stream proxy, upload proxy, HLS passthrough, live CRUD, analytics)
  • 2 hooks (video catalog with chunked upload, HLS.js player wrapper)
  • 86 passing tests (59 stream + 27 transcoder)
  • 1 MediaMTX container (RTMP:1935, HLS:8889, RTSP:8554)
  • 5831 lines across 30 files

What's Next

The foundation is solid. Upcoming work:

  • Content moderation — during transcoding, sample frames to AitherVision for automated flagging
  • Storage tiering — cold-tier HLS segments after 30 days, keeping only thumbnails and master playlists hot
  • WebRTC — for sub-second latency on interactive streams (MediaMTX supports it, we just need to wire it)
  • Clip extraction — select a time range in the player, generate a new video from the HLS segments without re-encoding

The pattern holds: build the core infrastructure with clean APIs and event integration first, then layer intelligence on top. Video hosting is just another service in the mesh — it gets the same monitoring, billing, tenant isolation, and awareness that every other AitherOS service enjoys.

That's the advantage of building on an integrated platform instead of stitching together SaaS products. When your video pipeline speaks the same language as your agent orchestrator, your billing ledger, and your event bus, features compose instead of collide.
