A Three‑Layer Terminal Architecture

How AiStudio helped me design and build a terminal engine that’s more capable than anything I could have built alone

This is the architecture behind the second‑generation terminal engine.
It emerged over several iterations, with AiStudio assisting in both design and implementation. The final structure is cleaner and more capable than anything I would have produced alone.

The terminal is built as three layers:

L1 — Duplex Device Communication, Protocols, libvterm

The lowest layer handles bidirectional communication with a device:

SSH
MacTelnet
(future: serial, telnet, docker exec, kubectl exec, etc.)

This layer is not just a byte pump.
It is duplex, meaning:

the device sends output, including terminal queries, to libvterm
libvterm sends its replies (for example, device‑attribute and cursor‑position reports) back
the device adapts its output accordingly

This is how a real terminal emulator behaves.
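To make the round trip concrete, here is a minimal sketch in Python. It is not the engine's code: ToyEmulator is a hypothetical stand‑in for libvterm, pump_once is an invented helper name, and the only query it understands is the standard cursor‑position request (ESC [ 6 n), which a terminal answers with ESC [ row ; col R.

```python
# Cursor position query (DSR 6): the device asks where the cursor is.
DSR_CURSOR_QUERY = b"\x1b[6n"


class ToyEmulator:
    """Hypothetical stand-in for libvterm: consumes bytes, queues replies."""

    def __init__(self) -> None:
        self.row, self.col = 1, 1   # 1-based cursor position
        self.outbox = bytearray()   # bytes the emulator wants sent back

    def feed(self, data: bytes) -> None:
        i = 0
        while i < len(data):
            if data.startswith(DSR_CURSOR_QUERY, i):
                # Reply with a cursor position report: ESC [ row ; col R
                self.outbox += b"\x1b[%d;%dR" % (self.row, self.col)
                i += len(DSR_CURSOR_QUERY)
                continue
            ch = data[i:i + 1]
            if ch == b"\r":
                self.col = 1        # carriage return: back to column 1
            elif ch == b"\n":
                self.row += 1       # line feed: down one row
            else:
                self.col += 1       # treat anything else as printable
            i += 1

    def take_replies(self) -> bytes:
        out, self.outbox = bytes(self.outbox), bytearray()
        return out


def pump_once(from_device: bytes, emulator: ToyEmulator) -> bytes:
    """One duplex step: feed device bytes in, return bytes to send back."""
    emulator.feed(from_device)
    return emulator.take_replies()
```

A one‑way pipe would drop the query on the floor. Here the reply has to travel back over the same transport, which is the whole reason L1 must be duplex.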

L1 responsibilities:

transport
protocol framing
PTY semantics
capability negotiation
escape‑sequence handling
feeding libvterm
sending libvterm’s responses back to the device

Everything above L1 is protocol‑agnostic.
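One way to picture that boundary is a small transport contract that every protocol implements. The sketch below is an assumption, not the real interface: Transport, its method names, LoopbackTransport, and type_line are all invented for illustration, and the loopback class merely echoes bytes where an SSH or MacTelnet transport would talk to a device.

```python
from typing import Protocol


class Transport(Protocol):
    """Hypothetical L1 contract; the method names are assumptions."""

    def send(self, data: bytes) -> None: ...
    def recv(self) -> bytes: ...
    def resize(self, rows: int, cols: int) -> None: ...


class LoopbackTransport:
    """In-memory fake standing in for an SSH or MacTelnet transport."""

    def __init__(self) -> None:
        self._pending = bytearray()
        self.size = (24, 80)

    def send(self, data: bytes) -> None:
        self._pending += data       # echo back whatever was sent

    def recv(self) -> bytes:
        out, self._pending = bytes(self._pending), bytearray()
        return out

    def resize(self, rows: int, cols: int) -> None:
        self.size = (rows, cols)    # a real transport would notify the device


def type_line(transport: Transport, line: str) -> bytes:
    """Upper-layer code: it sees only the Transport contract."""
    transport.send(line.encode() + b"\r")
    return transport.recv()
```

Because type_line never names a protocol, adding serial or docker exec later would mean writing one more class with the same three methods, with nothing above L1 changing.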

Note

Why duplex matters
L1 isn’t a one‑way stream.
libvterm and the device negotiate capabilities (dimensions, colors, cursor modes, keypad modes, etc.).
This is what makes the terminal behave like a real emulator instead of a raw pipe.

L2 — Damage Events → Messages → DOM → WebView2

libvterm maintains the authoritative terminal state.
Whenever something changes, it emits damage events:

cell updates
line scrolls
cursor movement
attribute changes
screen clears
alternate buffer switches

These are converted into structured messages and sent through a one‑way channel to a DOM wrapper, which mirrors the terminal state inside WebView2.

This gives:

flicker‑free rendering
correct colors and attributes
clean resizing
no custom drawing code
a fully browser‑based terminal UI

L2 is purely about presentation.
It doesn’t know anything about commands or LLMs.
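The damage‑to‑message step can be sketched as follows. Everything here is illustrative: the message shapes (CellUpdate, CursorMove), the JSON encoding, and MirrorGrid are assumptions, and MirrorGrid is a plain Python stand‑in for the DOM wrapper rather than actual WebView2 code.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class CellUpdate:
    """Hypothetical message: a run of cells changed on one row."""
    row: int
    col: int
    text: str
    fg: str = "default"
    kind: str = "cells"


@dataclass
class CursorMove:
    """Hypothetical message: the cursor moved."""
    row: int
    col: int
    kind: str = "cursor"


def encode(event) -> str:
    """Serialize an event for the one-way channel."""
    return json.dumps(asdict(event))


class MirrorGrid:
    """Stand-in for the DOM wrapper: applies messages, never replies."""

    def __init__(self, rows: int, cols: int) -> None:
        self.lines = [[" "] * cols for _ in range(rows)]
        self.cursor = (0, 0)

    def apply(self, message: str) -> None:
        msg = json.loads(message)
        if msg["kind"] == "cells":
            for offset, ch in enumerate(msg["text"]):
                self.lines[msg["row"]][msg["col"] + offset] = ch
        elif msg["kind"] == "cursor":
            self.cursor = (msg["row"], msg["col"])

    def text(self) -> List[str]:
        return ["".join(line).rstrip() for line in self.lines]
```

The mirror only ever receives messages, so it can be rebuilt or replaced without touching the terminal state that libvterm owns.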

L3 — Terminal Service for the LLM

The top layer exposes the terminal as a service with three operations:

terminal_command(cmd)
terminal_snapshot()
terminal_abort()

terminal_command(cmd)

This returns a delta — the exact consequence of the command as interpreted by libvterm.

Not the whole screen.
Not raw stdout.
Just the meaningful change.

This covers roughly 95% of console interactions: line‑oriented commands that print their output and return to a prompt.
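How the delta is computed is not spelled out here, so the following is only one plausible approach: compare the screen rows before and after the command and keep the rows that differ. The function name screen_delta and the row‑indexed result are assumptions, and a real implementation would also have to account for scrolling.

```python
from typing import Dict, List


def screen_delta(before: List[str], after: List[str]) -> Dict[int, str]:
    """Return {row_index: new_text} for every row that changed."""
    delta: Dict[int, str] = {}
    for r in range(max(len(before), len(after))):
        old = before[r] if r < len(before) else ""
        new = after[r] if r < len(after) else ""
        if old != new:
            delta[r] = new
    return delta


# Screen rows as the emulator sees them, before and after "echo hi".
before = ["$ ", "", ""]
after = ["$ echo hi", "hi", "$ "]
changed = screen_delta(before, after)
```

Diffing emulator state rather than raw output means escape sequences, overwrites, and cursor tricks are already resolved by the time the LLM sees the result.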

terminal_snapshot()

Captures the full screen state.
Used for full‑screen applications (top, htop, less, vim, etc.).

terminal_abort()

Terminates the running program and restores the terminal state.

L3 is what makes the terminal usable by an LLM.
It provides deterministic state transitions and a clean, structured interface.
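The shape of that interface might look like the sketch below. Only the three operation names come from the design above; the return types, FakeSession, and its canned behavior are assumptions made so the example runs without a real device.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeSession:
    """Stand-in for L1 plus libvterm; real state comes from the emulator."""
    lines: List[str] = field(default_factory=list)
    running: bool = False

    def submit(self, cmd: str) -> None:
        self.lines.append("$ " + cmd)
        if cmd == "top":
            self.running = True     # pretend a full-screen app took over
            self.lines.append("(full-screen app running)")
        else:
            self.lines.append("output of " + cmd)

    def interrupt(self) -> None:
        self.running = False


class TerminalService:
    """Hypothetical L3 facade exposing the three operations."""

    def __init__(self, session: FakeSession) -> None:
        self._session = session

    def terminal_command(self, cmd: str) -> List[str]:
        mark = len(self._session.lines)
        self._session.submit(cmd)
        return self._session.lines[mark:]   # only what the command changed

    def terminal_snapshot(self) -> List[str]:
        return list(self._session.lines)    # copy of the full screen state

    def terminal_abort(self) -> bool:
        was_running = self._session.running
        if was_running:
            self._session.interrupt()
        return was_running                  # True if something was stopped
```

An LLM tool loop would call terminal_command for ordinary commands, switch to terminal_snapshot when a full‑screen program is running, and fall back to terminal_abort if that program has to be stopped.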

Why This Architecture Works

The layers separate concerns cleanly:

Layer	Responsibility
L1	Duplex protocol handling + capability negotiation
L2	Rendering and UI mirroring
L3	LLM‑friendly terminal service

The result is a terminal engine that:

handles SSH and MacTelnet now, with room for more protocols later
negotiates capabilities like a real terminal
supports full‑screen ncurses applications
produces clean deltas for the LLM
renders smoothly in WebView2
is protocol‑agnostic above L1
is easy to extend

It works better than I expected when I started.
The combination of iterative design and AiStudio assistance produced something more mature than either would have produced on its own.