Snap: Visual Annotation for Agentic Coding

Adjective built Snap to solve a problem we kept hitting during our own development work: AI coding agents are powerful, but they cannot see your screen. When a layout breaks, a component misaligns, or a color is wrong, you have to stop coding, take a screenshot, paste it into chat, and describe the problem in words. That friction kills flow state and wastes time. Snap eliminates it entirely — press a hotkey, circle the bug, type a label, hit enter. Your agent sees exactly what you see and fixes it.

Client: Adjective (Open Source)
Timeline: 10 days to production
Sector: Developer Tooling

Challenge

During intensive vibe engineering sessions (rapid AI-assisted development in which a developer and an AI agent co-build in real time), visual bugs cause a disproportionate slowdown. The agent can read code, run tests, and modify files, but it cannot see the rendered output. Every UI issue forces the developer to break flow, switch to a screenshot tool, capture the screen, paste or upload the image, and then describe the problem in words precisely enough for the agent to locate and fix it. This round trip takes 30-90 seconds per issue and introduces ambiguity. Over a typical frontend session with 15-25 visual issues, that friction compounds into 10-30 minutes of lost time and a measurable drop in fix accuracy: agents frequently misinterpret verbal descriptions and fix the wrong element on the first attempt.

Approach

01

Annotation Overlay (Tauri 2.x)

Built a native desktop application using Tauri 2.x (Rust backend, vanilla JS frontend) that captures the screen on a global hotkey and presents a fullscreen annotation canvas

Global hotkey (Ctrl+Shift+S) works from any application
6 annotation tools: circle, rectangle, arrow, freehand, text label, numbered marker
Color swatches, stroke widths, dim toggle for busy screens
Undo support (Ctrl+Z) and cancel (Escape)
Annotated PNG composited with all drawings baked into the image
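
The canvas behaviour listed above can be pictured as a stack of shapes. The sketch below is written in Python to match the MCP server rather than the overlay's JavaScript. The names `Canvas`, `Shape`, and `TOOLS`, and the rule that numbered markers count up from 1, are illustrative assumptions, not Snap's actual code.

```python
from dataclasses import dataclass, field

# The six tools come from the feature list; the identifiers are assumptions.
TOOLS = {"circle", "rectangle", "arrow", "freehand", "text", "marker"}


@dataclass
class Shape:
    kind: str            # one of TOOLS
    points: list         # [(x, y), ...] in screen pixels
    label: str = ""
    color: str = "#ff0000"
    stroke: int = 3


@dataclass
class Canvas:
    shapes: list = field(default_factory=list)

    def add(self, kind, points, **style):
        if kind not in TOOLS:
            raise ValueError(f"unknown tool: {kind}")
        if kind == "marker" and "label" not in style:
            # Assumed rule: markers are numbered 1, 2, 3... in drawing order.
            style["label"] = str(sum(s.kind == "marker" for s in self.shapes) + 1)
        shape = Shape(kind, list(points), **style)
        self.shapes.append(shape)
        return shape

    def undo(self):
        """Ctrl+Z: drop the most recent shape, if there is one."""
        return self.shapes.pop() if self.shapes else None

    def cancel(self):
        """Escape: discard every shape without saving."""
        self.shapes.clear()
```

Keeping shapes in a simple list makes undo a single `pop`, and the same list can later be serialized as annotation metadata.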
02

Structured Metadata Layer

Every saved annotation produces a matching JSON sidecar with structured data that agents can parse programmatically, rather than a screenshot alone

Annotation coordinates, types, labels, and colors as structured data
Source window title, process ID, and window class for context
Display resolution and monitor identification
Timestamp for inbox ordering
Machine-readable format eliminates verbal description ambiguity
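
As a rough illustration of what such a sidecar could contain, the sketch below assembles one in Python. The source lists only the categories of data (coordinates, types, labels, colors, window title, process ID, window class, resolution, monitor, timestamp). Every key name, the file name, and the sample values here are assumptions, not Snap's actual schema.

```python
import json
from datetime import datetime, timezone


def build_sidecar(image_name, annotations, window, display):
    """Assemble the metadata stored next to the annotated PNG (hypothetical layout)."""
    return {
        "image": image_name,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "display": display,
        "window": window,
        "annotations": annotations,
    }


def describe(sidecar):
    """Turn each annotation into a one-line instruction an agent could act on."""
    title = sidecar["window"]["title"]
    return [
        f'{a["type"]} at ({a["x"]}, {a["y"]}) in "{title}": {a["label"]}'
        for a in sidecar["annotations"]
    ]


sidecar = build_sidecar(
    "snap-0001.png",
    annotations=[
        {"type": "circle", "x": 412, "y": 880, "radius": 36,
         "label": "button overlaps footer", "color": "#ff3b30"},
    ],
    window={"title": "localhost:3000 - Browser", "pid": 4242, "class": "browser"},
    display={"width": 2560, "height": 1440, "monitor": "DP-1"},
)

print(json.dumps(sidecar, indent=2))
print(describe(sidecar)[0])
```

With coordinates and a label attached to each shape, the agent gets a precise target rather than having to infer it from a prose description.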
03

MCP Server (Python / fastmcp)

Built an MCP server that exposes the annotation inbox to any AI agent via stdio transport, making Snap agent-native from day one

5 MCP tools: check_new_annotations, get_latest_annotation, list_annotations, get_annotation, clear_inbox
Compatible with Claude Code, Claude Desktop, Cursor, Windsurf, and any MCP client
Agent reads annotated image + structured metadata in a single call
Inbox model: annotations queue up, agent processes them, clears when confirmed
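
A minimal sketch of the inbox model follows, assuming the inbox is a directory of `<id>.json` sidecars with matching `<id>.png` images, and that each sidecar carries a sortable `timestamp` string. The helper names, the return shapes, and the server name are assumptions. Only three of the five tools named above are shown, and the image is returned as a file path rather than as image content.

```python
import json
from pathlib import Path
from typing import Optional


def read_inbox(inbox: Path) -> list:
    """Load every JSON sidecar in the inbox, oldest first."""
    items = []
    for sidecar in inbox.glob("*.json"):
        meta = json.loads(sidecar.read_text())
        meta["id"] = sidecar.stem
        meta["image_path"] = str(sidecar.with_suffix(".png"))
        items.append(meta)
    return sorted(items, key=lambda m: m["timestamp"])


def latest(inbox: Path) -> Optional[dict]:
    """Return the newest annotation, or None when the inbox is empty."""
    items = read_inbox(inbox)
    return items[-1] if items else None


def clear(inbox: Path) -> int:
    """Delete every sidecar and its image; return how many were removed."""
    removed = 0
    for sidecar in list(inbox.glob("*.json")):
        sidecar.with_suffix(".png").unlink(missing_ok=True)
        sidecar.unlink()
        removed += 1
    return removed


def build_server(inbox: Path):
    """Expose the helpers as MCP tools. Requires the third-party fastmcp package."""
    from fastmcp import FastMCP  # lazy import: the helpers above need only the stdlib

    mcp = FastMCP("snap")

    @mcp.tool()
    def list_annotations() -> list:
        """List queued annotations, oldest first."""
        return read_inbox(inbox)

    @mcp.tool()
    def get_latest_annotation() -> Optional[dict]:
        """Return the most recent annotation, if any."""
        return latest(inbox)

    @mcp.tool()
    def clear_inbox() -> int:
        """Empty the inbox once the fixes are confirmed."""
        return clear(inbox)

    return mcp
```

Calling `build_server(some_inbox_dir).run()` would serve these tools over stdio, which is fastmcp's default transport; the inbox location is left to the caller because the source does not state where Snap stores it.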
04

Cross-Platform Setup

Shipped setup scripts and documentation for Linux (X11, Wayland/GNOME, Sway, Hyprland) and macOS with automated MCP registration

One-command MCP registration for all supported AI tools
Makefile-based build system (make deps && make build)
GNOME, Sway, Hyprland, and X11 hotkey configuration
macOS .app bundle with permission prompt handling
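
Automated registration can be pictured as merging one entry into a client's JSON config. The sketch below assumes the `mcpServers` map with `command` and `args` keys that Claude Desktop and several other MCP clients read. The function name, the server name `snap`, and the script path in the usage note are hypothetical, and Snap's actual setup scripts may work differently.

```python
import json
from pathlib import Path


def register_mcp_server(config_path: Path, name: str, command: str, args: list) -> dict:
    """Add or update one entry under "mcpServers", leaving other servers untouched."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": args}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `register_mcp_server(path_to_client_config, "snap", "python", ["/path/to/snap_mcp_server.py"])` would add a `snap` entry while keeping any servers already configured. Each client keeps its config in a different location, so the path is passed in rather than guessed.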

Impact

60% faster
Bug Resolution Speed

UI bug resolution during vibe coding sessions dropped from 60-90 seconds (screenshot + describe + wait for fix) to under 10 seconds (hotkey + annotate + enter). Measured across 200+ visual issues during internal development.

3x improvement
First-Attempt Fix Accuracy

Agents fix the correct element on the first attempt more than 90% of the time when given annotated screenshots with coordinates, versus roughly 30% with verbal descriptions alone. Structured metadata eliminates the "which button do you mean?" back-and-forth.

< 10 seconds
Flow State Preservation

Complete round-trip from spotting a bug to having the agent working on the fix. No app-switching, no file uploads, no typing paragraphs of description. The developer stays in flow.

10 days
Time to Production

From zero to a working cross-platform annotation tool with MCP integration, shipped as open source under the MIT license.

"
Once you have Snap in your workflow, going back to describing UI bugs in words feels like faxing a screenshot. The agent just sees what you see now."
Adjective Engineering
Internal Development

Stack

Tauri 2.x
Rust
JavaScript
Python
fastmcp
MCP Protocol
