Local MCP server · Windows · single .exe

NO MORE full screenshots
every single turn.

Your AI agent's eyes, 89–93% cheaper. Verdesk sends agents modulated visual deltas instead of full screenshots — same result, a fraction of the vision tokens.

Free & full-featured · No account · No telemetry · Works with Claude, GPT & local models

Scanned clean by VirusTotal Windows 10/11 macOS & Linux soon Tested on Claude Code
Ultrafast install

Running in 60 seconds. 4 steps.

  1. 1

    Download

    Grab the setup from GitHub Releases — 198 MB, single .exe, WebView2 bundled offline.

  2. 2

    Install

    Run Verdesk_x.y.z_x64-setup.exe. No admin needed, no telemetry. The tray icon lands quietly in the system tray.

  3. 3

    Copy the prompt

    Tray icon → SettingsAcceso. One button copies the MCP server config to your clipboard — paste it into your AI client.

  4. 4

    Restart the terminal

    Close and reopen your terminal (Claude Code, etc.) so it picks up the new MCP server. Done — your agent now sees deltas, not screenshots.

01 — How it works

Most of the screen doesn't change. So don't re-send it.

Today every vision agent — Computer Use, Copilot Vision, Browser Use — ships a full screenshot on every turn. The model pays, again and again, to re-read a navbar it has already seen forty times.

Verdesk watches the screen on a fixed grid and hands the agent only the cells that actually changed. One full capture, then just the deltas — that is where the saving comes from.

Baseline screenshot agent

full frame × 4 turns — 100% vision cost

Verdesk modulated deltas

1 full, then only what changed — 89–93% fewer vision tokens on a typical session

02 — What you get

A vision layer the agent can steer.

F1

Modulated capture

The agent picks resolution, quality and color per region — a cheap grayscale glance across the whole screen, a crisp 1× color crop only where the decision actually lives.

F2

Deterministic delta detection

A fixed 12×8 grid + perceptual hashing tags every cell new, changed or unchanged. Only the deltas travel — no AI in the loop.

F3

It doesn't just watch — it drives the desktop.

Verdesk operates any Windows app for the agent: click, type, scroll, fire key combos — Win32, WPF, UWP, Electron, browsers. It acts semantically through the accessibility tree, and falls back to coordinates and on-screen text when an app exposes nothing — even games and canvas UIs. Reactive, too: wait for a control to appear or change before the next step.

list_uia_elementsact_uiaclick_text click_atsend_keyspress_key scrollwait_for_uia
F4

Live visual buffer

Verdesk keeps a visual model for the agent — inspect it, refine a cell, query by hash. Every reset archives to a navigable history.

F5

MCP-standard, model-agnostic

Around 40 versioned tools over the Model Context Protocol. Verdesk assumes nothing about the model that drives it — Claude, GPT, a local model, any MCP client.

capturelookrefine_cell query_buffercompare_cellsfocus_zone get_buffer_stateget_capabilities
F6

Plain-text reading

The agent pulls plain text straight from any region of the screen — no vision tokens spent just to read.

02b — Remote

Your agent on one machine. Verdesk on another. One SSH tunnel.

In Pro, Verdesk binds to LAN or runs through an SSH tunnel — keys you generate, no SaaS broker. Tailscale optional for CGNAT.

03 — Two ways to get it

Free to run. Pro to ship.

Free $0

The complete Verdesk vision layer, free for personal and non-commercial use.

  • Full modulated vision pipeline
  • Local-mode MCP server
  • All base tools — capture, buffer, history, UIA
  • Modulation profiles
  • Single .exe · no install · no telemetry
Download free
For shipping
Pro $49/year

Yearly subscription, one developer. Everything in Free, plus what you need to ship.

  • + Remote access over LAN / WAN
  • + The run_command tool
  • + Commercial-use license — 1 developer
  • + Up to 3 device activations
  • + Updates included while subscribed
Buy Verdesk Pro
Coming soon
Team $199/year

Yearly subscription, up to 5 developers. Shared license keys and centralized billing.

  • + Everything in Pro
  • + Up to 5 developer seats
  • + Centralized billing & activations
  • + Priority email support

Stop paying vision tokens for pixels that didn't move.

Download free

Windows 10 / 11 · 64-bit · single executable · macOS & Linux soon