Screenshot Capture Automation
Use when the user explicitly asks for a desktop or system screenshot (full screen, specific app or window, or a pixel region), or when tool-specific capture capabilities are unavailable and an OS-level capture is needed.
Content
Follow these save-location rules every time:
1) If the user specifies a path, save there.
2) If the user asks for a screenshot without a path, save to the OS default screenshot location.
3) If Codex needs a screenshot for its own inspection, save to the temp directory.
Tool priority
- -Prefer tool-specific screenshot capabilities when available (for example: a Figma MCP/skill for Figma files, or Playwright/agent-browser tools for browsers and Electron apps).
- -Use this skill when explicitly asked, for whole-system desktop captures, or when a tool-specific capture cannot get what you need.
- -Otherwise, treat this skill as the default for desktop apps without a better-integrated capture tool.
macOS permission preflight (reduce repeated prompts)
On macOS, run the preflight helper once before window/app capture. It checks
Screen Recording permission, explains why it is needed, and requests it in one
place.
The helpers route Swift's module cache to $TMPDIR/codex-swift-module-cache
to avoid extra sandbox module-cache prompts.
To avoid multiple sandbox approval prompts, combine preflight + capture in one
command when possible:
For Codex inspection runs, keep the output in temp:
Use the bundled scripts to avoid re-deriving OS-specific commands.
macOS and Linux (Python helper)
Run the helper from the repo root:
Common patterns:
- -Default location (user asked for "a screenshot"):
- -Temp location (Codex visual check):
- -Explicit location (user provided a path or filename):
- -App/window capture by app name (macOS only; substring match is OK; captures all matching windows):
- -Specific window title within an app (macOS only):
- -List matching window ids before capturing (macOS only):
- -Pixel region (x,y,w,h):
- -Focused/active window (captures only the frontmost window; use
--appto capture all windows):
- -Specific window id (use --list-windows on macOS to discover ids):
The script prints one path per capture. When multiple windows or displays match, it prints multiple paths (one per line) and adds suffixes like -w<windowId> or -d<display>. View each path sequentially with the image viewer tool, and only manipulate images if needed or requested.
Workflow examples
- -"Take a look at <App> and tell me what you see": capture to temp, then view each printed path in order.
- -"The design from Figma is not matching what is implemented": use a Figma MCP/skill to capture the design first, then capture the running app with this skill (typically to temp) and compare the raw screenshots before any manipulation.
Multi-display behavior
- -On macOS, full-screen captures save one file per display when multiple monitors are connected.
- -On Linux and Windows, full-screen captures use the virtual desktop (all monitors in one image); use
--regionto isolate a single display when needed.
Linux prerequisites and selection logic
The helper automatically selects the first available tool:
1) scrot
2) gnome-screenshot
3) ImageMagick import
If none are available, ask the user to install one of them and retry.
Coordinate regions require scrot or ImageMagick import.
--app, --window-name, and --list-windows are macOS-only. On Linux, use
--active-window or provide --window-id when available.
Windows (PowerShell helper)
Run the PowerShell helper:
Common patterns:
- -Default location:
- -Temp location (Codex visual check):
- -Explicit path:
- -Pixel region (x,y,w,h):
- -Active window (ask the user to focus it first):
- -Specific window handle (only when provided):
Direct OS commands (fallbacks)
Use these when you cannot run the helpers.
macOS
- -Full screen to a specific path:
- -Pixel region:
- -Specific window id:
- -Interactive selection or window pick:
Linux
- -Full screen:
- -Pixel region:
- -Active window:
Error handling
- -On macOS, run
bash <path-to-skill>/scripts/ensure_macos_permissions.shfirst to request Screen Recording in one place. - -If you see "screen capture checks are blocked in the sandbox", "could not create image from display", or Swift
ModuleCachepermission errors in a sandboxed run, rerun the command with escalated permissions. - -If macOS app/window capture returns no matches, run
--list-windows --app "AppName"and retry with--window-id, and make sure the app is visible on screen. - -If Linux region/window capture fails, check tool availability with
command -v scrot,command -v gnome-screenshot, andcommand -v import. - -If saving to the OS default location fails with permission errors in a sandbox, rerun the command with escalated permissions.
- -Always report the saved file path in the response.
FAQ
Discussion
Loading comments...