Snap
Visual annotation layer for agentic coding. See a bug? Circle it, label it, hit enter. Your AI agent sees exactly what you see.
THE_FLOW
01 Ctrl+Shift+S
02 Circle the bug
03 Type "fix this"
04 Enter
05 Agent reads it. Agent fixes it.
What Snap Does
Snap captures your screen, lets you annotate it with circles, arrows, rectangles, freehand marks, numbered markers, and text labels, then saves the annotated screenshot plus structured metadata to a local inbox.
An MCP server exposes the inbox to any AI agent. Claude Code, custom GPTs, Cursor, Windsurf — anything that speaks the Model Context Protocol can read your annotations and act on them.
No copy-pasting screenshots into chat. No describing what you see. You mark it up, the agent reads it. Structured annotation data includes coordinates, labels, colors, source window context, and display resolution.
Two Components
Annotation Overlay
Tauri 2.x app (Rust + vanilla JS). Global hotkey triggers a fullscreen overlay that captures your screen and gives you a canvas to draw on.
MCP Server
Python (fastmcp) server exposes the inbox via stdio transport. AI agents call tools like get_latest_annotation() to read your marked-up screenshots with full structured metadata.
Annotation Tools
| Tool | Key | Description |
|---|---|---|
| Circle | C | Click-drag to draw ellipses. Shift constrains to perfect circle. |
| Rectangle | R | Click-drag to draw boxes. Shift constrains to square. 10% fill for visibility. |
| Arrow | A | Click start, drag to end. Arrowhead on the endpoint. |
| Freehand | F | Click-drag to draw smoothed paths with quadratic curve interpolation. |
| Text | T | Click to place a label. Type your instruction. Enter to confirm. |
| Numbered Marker | N | Click to place auto-incrementing circled numbers (1, 2, 3...). |
What Gets Saved
Every save produces a matched pair: an annotated PNG with all drawings composited directly into the image, and a structured JSON metadata file.
The JSON includes annotation coordinates, labels, colors, source window title, process ID, display resolution, and timestamp. Agents use this structured data to understand exactly what you marked and where.
"source": {
"window_title": "localhost:3000/dashboard",
"resolution": [3840, 2160]
},
"annotations": [
{
"type": "circle",
"center": [340, 220],
"label": "fix this alignment"
}
]
}
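Because the metadata records the capture resolution, an agent can remap annotation coordinates to whatever viewport it is working with. A minimal sketch, assuming the field names shown above (the `scale_point` and `summarize` helpers are hypothetical, not part of Snap):

```python
def scale_point(point, src_resolution, dst_resolution):
    """Map a pixel coordinate from the capture resolution to a target resolution."""
    sx = dst_resolution[0] / src_resolution[0]
    sy = dst_resolution[1] / src_resolution[1]
    return [round(point[0] * sx), round(point[1] * sy)]

def summarize(metadata, dst_resolution=(1920, 1080)):
    """Describe each annotation with its coordinates rescaled to dst_resolution."""
    src = metadata["source"]["resolution"]
    lines = []
    for ann in metadata["annotations"]:
        center = scale_point(ann["center"], src, dst_resolution)
        lines.append(f'{ann["type"]} at {center}: {ann.get("label", "")}')
    return lines
```

For the example above, a circle centered at `[340, 220]` on a 3840x2160 capture lands at `[170, 110]` in a 1920x1080 viewport.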
Works With
Claude Code, Cursor, Windsurf, custom GPTs, and anything else that speaks the Model Context Protocol.
Show Your Agent What You See.
Open source. MIT licensed. Built by Adjective.