A Three‑Layer Terminal Architecture
How AiStudio helped me design and build a terminal engine more capable than anything I could have built alone
This is the architecture behind the second‑generation terminal engine.
It emerged over several iterations, with AiStudio assisting in both design and implementation. The final structure is cleaner and more capable than anything I would have produced alone.
The terminal is built as three layers:
L1 — Duplex Device Communication, Protocols, libvterm
The lowest layer handles bidirectional communication with a device:
- SSH
- MacTelnet
- (future: serial, telnet, docker exec, kubectl exec, etc.)
This layer is not just a byte pump.
It is duplex, meaning:
- the device sends data to libvterm
- libvterm sends capability‑negotiation sequences back
- the device responds accordingly
This is how a real terminal emulator behaves.
L1 responsibilities:
- transport
- protocol framing
- PTY semantics
- capability negotiation
- escape‑sequence handling
- feeding libvterm
- sending libvterm’s responses back to the device
Everything above L1 is protocol‑agnostic.
Note
Why duplex matters
L1 isn’t a one‑way stream.
libvterm and the device negotiate capabilities (dimensions, colors, cursor modes, keypad modes, etc.).
This is what makes the terminal behave like a real emulator instead of a raw pipe.
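To make the duplex idea concrete, here is a minimal sketch of the reply path. It is not the engine's code: `TinyEmulator` and `write_back` are hypothetical names standing in for libvterm and its output path. It answers only two standard queries, the cursor position request (`ESC [ 6 n`) and primary device attributes (`ESC [ c`), and it assumes each query arrives whole in a single chunk.

```python
import re

# Hypothetical stand-in for the emulator side of L1. In the real engine this
# role belongs to libvterm; the names below are illustrative only.
CSI_QUERY = re.compile(rb"\x1b\[(\d*)([nc])")


class TinyEmulator:
    def __init__(self, write_back):
        self.write_back = write_back  # callback for bytes going back to the device
        self.row, self.col = 1, 1     # 1-based cursor position
        self.text = bytearray()       # printable bytes received so far

    def feed(self, data: bytes) -> None:
        """Consume device output, answering any queries found in it."""
        pos = 0
        for match in CSI_QUERY.finditer(data):
            self._print(data[pos:match.start()])
            self._answer(match.group(1), match.group(2))
            pos = match.end()
        self._print(data[pos:])

    def _print(self, chunk: bytes) -> None:
        # Simplification: no line wrapping, no newline handling.
        self.text += chunk
        self.col += len(chunk)

    def _answer(self, param: bytes, final: bytes) -> None:
        if final == b"n" and param == b"6":
            # Device Status Report 6: reply with a cursor position report.
            self.write_back(b"\x1b[%d;%dR" % (self.row, self.col))
        elif final == b"c" and param in (b"", b"0"):
            # Primary Device Attributes: one common VT100-style reply.
            self.write_back(b"\x1b[?1;2c")


sent_to_device = []
emu = TinyEmulator(sent_to_device.append)
emu.feed(b"hello\x1b[6n")  # five characters, then "where is the cursor?"
```

After the call, `sent_to_device` holds the cursor report the device would read back. A raw pipe would have dropped the query and left the remote program waiting.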
L2 — Damage Events → Messages → DOM → WebView2
libvterm maintains the authoritative terminal state.
Whenever something changes, it emits damage events:
- cell updates
- line scrolls
- cursor movement
- attribute changes
- screen clears
- alternate buffer switches
These are converted into structured messages and sent through a one‑way channel to a DOM wrapper, which mirrors the terminal state inside WebView2.
This gives:
- flicker‑free rendering
- correct colors and attributes
- clean resizing
- no custom drawing code
- a fully browser‑based terminal UI
L2 is purely about presentation.
It doesn’t know anything about commands or LLMs.
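The damage-to-message step might look roughly like the sketch below. The message fields (`type`, `row`, `col`, `text`) and the names `DamageRect` and `damage_to_messages` are assumptions made for illustration, not the engine's actual wire format. The rectangle uses exclusive end values, and colors and attributes are left out to keep the example short.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DamageRect:
    # Hypothetical damage rectangle; end_row and end_col are exclusive.
    start_row: int
    end_row: int
    start_col: int
    end_col: int


def damage_to_messages(screen: List[str], rect: DamageRect) -> List[str]:
    """Turn one damaged rectangle into one JSON message per affected row."""
    messages = []
    for row in range(rect.start_row, rect.end_row):
        messages.append(json.dumps({
            "type": "cells",
            "row": row,
            "col": rect.start_col,
            "text": screen[row][rect.start_col:rect.end_col],
        }))
    return messages


# Only the first five cells of row 1 changed.
screen = ["$ ls       ", "a.txt b.txt", "$          "]
msgs = damage_to_messages(screen, DamageRect(1, 2, 0, 5))
```

Because each message describes only what changed, the DOM side can patch the matching cells without re-rendering the whole screen.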
L3 — Terminal Service for the LLM
The top layer exposes the terminal as a service with three operations:
- terminal_command(cmd)
- terminal_snapshot()
- terminal_abort()
terminal_command(cmd)
This returns a delta — the exact consequence of the command as interpreted by libvterm.
Not the whole screen.
Not raw stdout.
Just the meaningful change.
This covers roughly 95% of console interactions; full‑screen applications are the main exception and need terminal_snapshot() instead.
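The post does not spell out how the delta is computed. One plausible approach, shown here purely as an assumption, is to compare the screen rows before and after the command and keep only the rows that differ. `screen_delta` is a hypothetical helper, and a real implementation would also have to account for scrolling.

```python
from typing import Dict, List


def screen_delta(before: List[str], after: List[str]) -> Dict[int, str]:
    """Return {row_index: new_text} for every row whose visible text changed."""
    rows = max(len(before), len(after))
    delta = {}
    for i in range(rows):
        old = before[i] if i < len(before) else ""
        new = after[i] if i < len(after) else ""
        # Trailing blanks are padding, not content, so ignore them.
        if old.rstrip() != new.rstrip():
            delta[i] = new.rstrip()
    return delta


before = ["$ ", "", ""]
after = ["$ echo hi", "hi", "$ "]
delta = screen_delta(before, after)
```

Here `delta` contains the echoed command line, the output line, and the fresh prompt, and nothing else.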
terminal_snapshot()
Captures the full screen state.
Used for full‑screen applications (top, htop, less, vim, etc.).
terminal_abort()
Terminates the running program and restores the terminal state.
L3 is what makes the terminal usable by an LLM.
It provides deterministic state transitions and a clean, structured interface.
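As a rough picture of the call shape, the three operations could be expressed as an interface like the one below. The return types, the `FakeTerminal` stand-in, and its canned output are all assumptions for illustration; the real service talks to L1 and libvterm rather than to an in-memory list.

```python
from typing import Dict, List, Protocol


class TerminalService(Protocol):
    """Hypothetical shape of the L3 interface; return types are assumed."""

    def terminal_command(self, cmd: str) -> Dict[int, str]: ...
    def terminal_snapshot(self) -> List[str]: ...
    def terminal_abort(self) -> None: ...


class FakeTerminal:
    """In-memory stand-in so the interface can be exercised without a device."""

    def __init__(self, rows: int = 3) -> None:
        self.screen = [""] * rows
        self.running = False

    def terminal_command(self, cmd: str) -> Dict[int, str]:
        before = list(self.screen)
        # Pretend the device echoed the command and printed one line of output.
        self.screen[0] = "$ " + cmd
        self.screen[1] = "(output of %s)" % cmd
        self.running = cmd in ("top", "vim")  # full-screen programs keep running
        return {i: row for i, row in enumerate(self.screen) if row != before[i]}

    def terminal_snapshot(self) -> List[str]:
        return list(self.screen)

    def terminal_abort(self) -> None:
        # Stop the pretend program and reset the screen to a clean state.
        self.running = False
        self.screen = [""] * len(self.screen)


term: TerminalService = FakeTerminal()
delta = term.terminal_command("uname")
full = term.terminal_snapshot()
```

The caller gets `delta` for ordinary commands and falls back to `terminal_snapshot()` when the whole screen matters.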
Why This Architecture Works
The layers separate concerns cleanly:
| Layer | Responsibility |
|---|---|
| L1 | Duplex protocol handling + capability negotiation |
| L2 | Rendering and UI mirroring |
| L3 | LLM‑friendly terminal service |
The result is a terminal engine that:
- handles SSH and MacTelnet now, with room for more protocols later
- negotiates capabilities like a real terminal
- supports full‑screen ncurses applications
- produces clean deltas for the LLM
- renders smoothly in WebView2
- is protocol‑agnostic above L1
- is easy to extend
It works better than I expected when I started.
The combination of iterative design and AiStudio assistance produced something that feels more mature than the sum of its parts.