Snap: Visual Annotation for Agentic Coding
Challenge
Adjective built Snap to solve a problem we kept hitting during our own development work: AI coding agents are powerful, but they cannot see your screen. When a layout breaks, a component misaligns, or a color is wrong, you have to stop coding, take a screenshot, paste it into chat, and describe the problem in words. That friction kills flow state and wastes time. Snap eliminates it entirely: press a hotkey, circle the bug, type a label, hit enter. Your agent sees exactly what you see and fixes it.
Approach
Annotation Overlay (Tauri 2.x)
Built a native desktop application using Tauri 2.x (Rust backend, vanilla JS frontend) that captures the screen on a global hotkey and presents a fullscreen annotation canvas
Structured Metadata Layer
Every annotation save produces a matched JSON sidecar with structured data that agents can parse programmatically — not just a screenshot
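The sidecar idea can be sketched in a few lines of Python. Field names and the function below are illustrative assumptions, not Snap's actual schema:

```python
import json
import time
from pathlib import Path

def save_annotation(screenshot: Path, label: str, box: tuple[int, int, int, int]) -> Path:
    """Write a JSON sidecar next to the screenshot so an agent can read the
    annotation programmatically. Field names are illustrative, not Snap's
    actual format."""
    x, y, w, h = box
    sidecar = screenshot.with_suffix(".json")
    sidecar.write_text(json.dumps({
        "screenshot": screenshot.name,
        "label": label,
        "region": {"x": x, "y": y, "width": w, "height": h},
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }, indent=2))
    return sidecar
```

An agent reading the sidecar gets the label and pixel coordinates directly, with no image processing required.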
MCP Server (Python / fastmcp)
Built an MCP server that exposes the annotation inbox to any AI agent via stdio transport, making Snap agent-native from day one
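A minimal sketch of the inbox-reading side (the directory layout, pairing-by-stem convention, and function name are assumptions, not Snap's actual tool surface). With fastmcp, a plain function like this is registered as a tool and served to agents over stdio:

```python
import json
from pathlib import Path

def list_annotations(inbox: Path) -> list[dict]:
    """Return parsed metadata for each screenshot/sidecar pair in the inbox.
    Pairing files by shared stem is an assumption about Snap's layout."""
    items = []
    for sidecar in sorted(inbox.glob("*.json")):
        screenshot = sidecar.with_suffix(".png")
        if screenshot.exists():  # skip sidecars whose screenshot is missing
            meta = json.loads(sidecar.read_text())
            meta["screenshot_path"] = str(screenshot)
            items.append(meta)
    return items

# With fastmcp, this would be exposed to any MCP-capable agent roughly as:
#
#   from fastmcp import FastMCP
#   mcp = FastMCP("snap")
#   mcp.tool(list_annotations)
#   mcp.run()  # stdio transport by default
```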
Cross-Platform Setup
Shipped setup scripts and documentation for Linux (X11, Wayland/GNOME, Sway, Hyprland) and macOS with automated MCP registration
Impact
UI bug resolution during vibe coding sessions dropped from 60-90 seconds (screenshot + describe + wait for fix) to under 10 seconds (hotkey + annotate + enter). Measured across 200+ visual issues during internal development.
Agents fix the correct element on the first attempt 90%+ of the time when given annotated screenshots with coordinates vs. ~30% accuracy with verbal descriptions alone. Structured metadata eliminates the "which button do you mean?" back-and-forth.
Complete round-trip from spotting a bug to having the agent working on the fix. No app-switching, no file uploads, no typing paragraphs of description. The developer stays in flow.
From zero to a working cross-platform annotation tool with MCP integration, shipped as open source under the MIT license.