Scrapeman

Everything.
Nothing extra.

Scrapeman is built for scraping engineers: six auth schemes, a load runner with watched headers, a WebSocket client, pre-request and post-response scripts, native Scrape.do mode, and a git-friendly .sman file format. Apache 2.0, no account, no cloud sync.

Built on undici

Scrapeman's HTTP core uses undici, Node's official HTTP client and the layer behind built-in fetch. Spec-compliant, fast, with proxy and HTTP/2 support out of the box.

All HTTP methods

GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS, plus custom verbs like PROPFIND and QUERY.

HTTP/1.1 and HTTP/2

Per-request allowH2 toggle. When it is on, HTTP/2 is negotiated via ALPN.

Per-request TLS verify toggle

Skip cert checks the way curl does with -k. Useful for self-signed proxies, expired certs, and mitmproxy debugging. The URL bar shows a red Insecure badge while it is on.

Custom headers

Set any header, override auto-managed ones, or disable them per request. Bulk-edit toggle switches the table to a Key: Value textarea.
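
The bulk-edit textarea can be read as a line-oriented view of the header table. A minimal sketch of turning that text back into rows, assuming one Key: Value pair per line; the function name and row shape are hypothetical, not Scrapeman's internals:

```javascript
// Hypothetical helper: parse a "Key: Value" textarea into header rows.
function parseBulkHeaders(text) {
  const rows = [];
  for (const line of text.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx <= 0) continue; // skip blank lines and lines without a key
    const key = line.slice(0, idx).trim();
    if (!key) continue;
    // Split on the first colon only, so values may contain colons.
    rows.push({ key, value: line.slice(idx + 1).trim() });
  }
  return rows;
}

const headerRows = parseBulkHeaders("Accept: application/json\nX-Trace: a:b\n\n");
console.log(headerRows);
```

Splitting on the first colon only keeps values such as URLs or timestamps intact.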

Request body types

JSON, form-urlencoded, multipart, raw text, binary, plus a CodeMirror editor with {{var}} highlight, autocomplete, and lenient JSON beautify with auto-fix.

Query parameters

URL bar and Params table stay two-way synced. Disabled rows survive saves and reloads.
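
One way to picture the two-way sync: the table writes enabled rows into the URL, and the URL is read back into rows while disabled rows are carried over. The sketch below uses the standard URL and URLSearchParams globals; the row shape { key, value, enabled } and both function names are assumptions for illustration:

```javascript
// Table -> URL: only enabled rows end up in the query string.
function applyParams(url, rows) {
  const u = new URL(url);
  u.search = ""; // the table is the source of truth in this direction
  for (const row of rows) {
    if (row.enabled) u.searchParams.append(row.key, row.value);
  }
  return u.toString();
}

// URL -> table: rebuild enabled rows, keep previously disabled rows.
// Simplification: disabled rows are appended at the end, not kept in place.
function readParams(url, previousRows = []) {
  const enabled = [...new URL(url).searchParams].map(([key, value]) => ({
    key,
    value,
    enabled: true,
  }));
  return [...enabled, ...previousRows.filter((r) => !r.enabled)];
}

const paramRows = [
  { key: "q", value: "a b", enabled: true },
  { key: "page", value: "2", enabled: false },
];
const syncedUrl = applyParams("https://example.com/search?old=1", paramRows);
console.log(syncedUrl);
console.log(readParams(syncedUrl, paramRows));
```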

Cmd+R parallel send

Each press fires another request without cancelling the in-flight one. Live HUD tracks every parallel send with status and elapsed duration.

Proxy support

HTTP / HTTPS proxies with basic auth via undici ProxyAgent.

Nothing hidden, nothing lost

Responses arrive fully decompressed and rendered with syntax highlighting. Large bodies scroll smoothly thanks to virtualized rendering.

Auto-decompression

gzip, brotli, and deflate responses decode automatically.

SSE event streaming

text/event-stream responses get an Events mode with id, event, retry, and a JSON tree per data block. Export as JSON.
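
For readers unfamiliar with the wire format, here is a simplified sketch of splitting a text/event-stream body into blocks with id, event, retry, and data, plus a parsed JSON value when the data is valid JSON. It follows the field rules of the Server-Sent Events format but attaches id and retry per block for display purposes; it is not Scrapeman's parser:

```javascript
// Simplified SSE parser: one output entry per blank-line-terminated block.
function parseEventStream(text) {
  const events = [];
  let current = { data: [] };

  const flush = () => {
    if (current.data.length > 0) {
      const data = current.data.join("\n");
      let json;
      try {
        json = JSON.parse(data);
      } catch {
        json = undefined; // data is not JSON; keep the raw string only
      }
      events.push({
        id: current.id,
        event: current.event ?? "message",
        retry: current.retry,
        data,
        json,
      });
    }
    current = { data: [] };
  };

  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === "") { flush(); continue; }   // blank line ends a block
    if (line.startsWith(":")) continue;       // comment, often a keep-alive
    const idx = line.indexOf(":");
    const field = idx === -1 ? line : line.slice(0, idx);
    let value = idx === -1 ? "" : line.slice(idx + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
    if (field === "data") current.data.push(value);
    else if (field === "event") current.event = value;
    else if (field === "id") current.id = value;
    else if (field === "retry" && /^\d+$/.test(value)) current.retry = Number(value);
  }
  return events;
}

const sseEvents = parseEventStream(
  'id: 1\nevent: price\ndata: {"v": 2}\n\n: keepalive\nretry: 3000\ndata: a\ndata: b\n\n'
);
console.log(sseEvents);
```

A block that is not terminated by a blank line is dropped, which matches how an interrupted stream is treated by the format.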

200 MB body cap

Responses up to 200 MB are captured in full. Virtualized rendering means a 5 MB body still scrolls smoothly. Save the response to a file with one click.

Pretty modes for every kind

CodeMirror with one-dark in dark mode for JSON, HTML, XML, JavaScript, and CSS. JSON also has a collapsible Tree view with JSONPath copy.

HTML preview with styles

Sandboxed iframe renders external CSS, images, and fonts. A base href is injected so relative URLs resolve to the originating server.
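
The base href idea can be illustrated with a small string-level sketch. It assumes the request URL is used as the base and relies on regular expressions instead of a real HTML parser, so treat it as an illustration only; injectBase is a hypothetical name:

```javascript
// Insert <base href="..."> so relative links resolve against the request URL.
function injectBase(html, requestUrl) {
  if (/<base\b/i.test(html)) return html; // the page already declares a base
  const escaped = requestUrl.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  const tag = `<base href="${escaped}">`;
  const headOpen = /<head\b[^>]*>/i; // \b keeps <header> from matching
  if (headOpen.test(html)) return html.replace(headOpen, (m) => m + tag);
  return tag + html; // no <head>: prepend the tag
}

const previewHtml = injectBase(
  "<html><head><title>x</title></head><body><img src=\"/logo.png\"></body></html>",
  "https://example.com/a/b?x=1&y=2"
);
console.log(previewHtml);
```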

Response body search

150 ms debounce, line-indexed match table, Enter / Shift+Enter navigation. The query persists across sends and auto re-runs.

Dev Tools tab

Timing waterfall (DNS / TCP / TLS / TTFB / Download), sent URL and headers, redirect chain, TLS cert info with days-remaining warning, remote IP and HTTP version.

Timing breakdown

DNS, connect, TLS, TTFB, and download segmented with millisecond labels.

Every scheme. No friction.

Six auth types built in. Tokens are cached, refreshed before expiry, and deduplicated across concurrent sends.

Basic, Bearer, API Key

Plus AWS Signature v4 via aws4. API Key supports header and query placement.

OAuth 2.0 client credentials

The token is fetched automatically, cached, and refreshed shortly before it expires. Concurrent requests share one in-flight fetch.
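
The caching and sharing behavior described here is often called single-flight. A minimal sketch under stated assumptions: fetchToken, the returned { accessToken, expiresInSec } shape, and the 30-second default skew are all hypothetical stand-ins chosen for the example:

```javascript
// Token cache where concurrent callers share one in-flight fetch.
function createTokenCache(fetchToken, { skewMs = 30_000, now = Date.now } = {}) {
  let cached = null;   // { accessToken, expiresAt }
  let inFlight = null; // promise shared by concurrent callers

  return function getToken() {
    // Serve from cache unless the token is within skewMs of expiring.
    if (cached && now() < cached.expiresAt - skewMs) {
      return Promise.resolve(cached.accessToken);
    }
    if (!inFlight) {
      inFlight = fetchToken()
        .then(({ accessToken, expiresInSec }) => {
          cached = { accessToken, expiresAt: now() + expiresInSec * 1000 };
          return accessToken;
        })
        .finally(() => {
          inFlight = null; // allow a new fetch after success or failure
        });
    }
    return inFlight;
  };
}

let tokenFetches = 0;
const getToken = createTokenCache(() => {
  tokenFetches += 1;
  return Promise.resolve({ accessToken: "t1", expiresInSec: 3600 });
});
const firstCall = getToken();
const secondCall = getToken(); // issued while the first is still pending
firstCall.then((token) => console.log(token, "fetches:", tokenFetches));
```

Both calls receive the same promise, so only one request would reach the token endpoint.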

OAuth 2.0 authorization code

Browser-based flow with a local loopback callback on an ephemeral port. State validated on return.

OAuth 2.0 + PKCE

S256 code challenge, no client secret required. Refresh token flow with proactive refresh 30 seconds before expiry.

OIDC discovery

Point at a .well-known/openid-configuration URL and Scrapeman autofills Token URL, Auth URL, and supported scopes.

Token placement

Authorization header (default), query param, or form body field.

JWT inspector

Decodes header and payload of access_token and id_token with a live exp countdown. Display only, no signature check.
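
Decoding without verification is a small amount of work: a JWT is three base64url segments separated by dots. A sketch of the display-only decode and the exp countdown, using Node's Buffer; decodeJwt is a hypothetical name and, like the inspector, it performs no signature check:

```javascript
// Decode a JWT's header and payload for display. No signature verification.
function decodeJwt(token, nowMs = Date.now()) {
  const [headerPart, payloadPart] = token.split(".");
  const decode = (part) =>
    JSON.parse(Buffer.from(part, "base64url").toString("utf8"));
  const header = decode(headerPart);
  const payload = decode(payloadPart);
  // exp is seconds since the Unix epoch; null when the claim is absent.
  const secondsLeft =
    typeof payload.exp === "number" ? payload.exp - Math.floor(nowMs / 1000) : null;
  return { header, payload, secondsLeft };
}

// Build a throwaway unsigned token for the demo.
const encodePart = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");
const demoJwt = `${encodePart({ alg: "HS256", typ: "JWT" })}.${encodePart({
  sub: "u1",
  exp: 1700000060,
})}.sig`;
const decodedJwt = decodeJwt(demoJwt, 1700000000 * 1000);
console.log(decodedJwt);
```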

Pre/post scripts and a real load runner

JavaScript before and after every request. Stress-test any endpoint without leaving the app.

Pre-request scripts

Mutate URL, headers, and body via the req proxy. Read and write variables across folder, collection, environment, and global scopes via bru.

Post-response scripts

Inspect res.getStatus() and res.getBody() (auto-parsed JSON). Run assertions with test() / expect().toBe(). Failures render in the Scripts response tab.
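
A sketch of what a post-response script body might look like. Only res.getStatus(), res.getBody(), test(), and expect().toBe() come from the description above; the stand-in objects at the top are hypothetical and exist only so the example runs outside the app:

```javascript
// --- Stand-ins so the sketch runs on its own (not Scrapeman's implementation) ---
const res = {
  getStatus: () => 200,
  getBody: () => ({ items: [{ id: 1 }, { id: 2 }] }),
};
const results = [];
function expect(actual) {
  return {
    toBe(expected) {
      if (!Object.is(actual, expected)) {
        throw new Error(`expected ${expected}, got ${actual}`);
      }
    },
  };
}
function test(name, fn) {
  try {
    fn();
    results.push({ name, passed: true });
  } catch (err) {
    results.push({ name, passed: false, error: err.message });
  }
}

// --- The script body itself ---
test("status is 200", () => {
  expect(res.getStatus()).toBe(200);
});
test("returns two items", () => {
  expect(res.getBody().items.length).toBe(2);
});
test("first id is 7", () => {
  expect(res.getBody().items[0].id).toBe(7); // deliberately fails
});

console.log(results);
```

The third test fails on purpose to show the kind of failure record a runner could surface.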

vm sandbox

Node vm context with a 5-second timeout. No require, process, or import. Scripts round-trip through .sman as YAML literal blocks.

Load runner

Bounded concurrency, live RPS, p50/p95/p99 latency, status histogram, error kind breakdown. Per-tab isolation so a run survives tab switches.
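
To make the latency figures concrete, here is a sketch of computing RPS and p50/p95/p99 from recorded latencies. It assumes the nearest-rank percentile method; the source does not say which method the load runner uses, and the function names are hypothetical:

```javascript
// Nearest-rank percentile over an ascending array (assumed method).
function percentile(sorted, p) {
  if (sorted.length === 0) return NaN;
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100)); // 1-based rank
  return sorted[rank - 1];
}

function summarizeRun(latenciesMs, elapsedSec) {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  return {
    rps: latenciesMs.length / elapsedSec,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}

// 100 samples (100 ms down to 1 ms) collected over 4 seconds.
const sampleLatencies = Array.from({ length: 100 }, (_, i) => 100 - i);
const runSummary = summarizeRun(sampleLatencies, 4);
console.log(runSummary);
```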

Watched response headers

Track up to 10 headers across iterations. Per-status value distribution, top-5 values, and numeric stats (min/max/avg/p50/p95/p99) when 95% of values parse as numbers.

Save failed responses

Ring buffer (1 to 1000) captures bodies of failed iterations. Export as JSON for offline triage.
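
A ring buffer keeps only the most recent N entries by overwriting the oldest. A minimal sketch with the 1 to 1000 capacity range from above; the class shape is an assumption, not the app's code:

```javascript
// Fixed-capacity buffer that overwrites the oldest entry when full.
class RingBuffer {
  constructor(capacity) {
    if (!Number.isInteger(capacity) || capacity < 1 || capacity > 1000) {
      throw new RangeError("capacity must be an integer from 1 to 1000");
    }
    this.capacity = capacity;
    this.items = new Array(capacity);
    this.next = 0; // index the next push writes to
    this.size = 0;
  }

  push(item) {
    this.items[this.next] = item;
    this.next = (this.next + 1) % this.capacity;
    this.size = Math.min(this.size + 1, this.capacity);
  }

  // Returns entries oldest first.
  toArray() {
    const start = this.size < this.capacity ? 0 : this.next;
    return Array.from(
      { length: this.size },
      (_, i) => this.items[(start + i) % this.capacity]
    );
  }
}

const failedBodies = new RingBuffer(3);
for (let i = 1; i <= 5; i++) failedBodies.push({ iteration: i, status: 500 });
console.log(JSON.stringify(failedBodies.toArray()));
```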

Collection runner

Run any folder sequentially or in parallel. CSV-driven iterations, abort mid-run, export the report as JSON, CSV, or self-contained HTML.

Scoped variables, git-friendly files

One .sman YAML file per request. Variables resolve through five scopes with a clear precedence.

.sman file format

Custom YAML with stable key order. Bodies above 4 KB auto-promote to a sidecar file. Legacy .req.yaml files read as-is and migrate on first save.

Folder, collection, global

Variables live at every level via _folder.yaml, .scrapeman/collection.yaml, and .scrapeman/globals.yaml. Auth blocks at the folder or collection level inherit down the tree.

Scope precedence

Folder chain > active environment > collection > global > built-in. Highest match wins.
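
The precedence rule amounts to walking an ordered list of scopes and taking the first hit. A sketch under stated assumptions: the scope objects, the {{name}} pattern, and leaving unresolved placeholders untouched are choices made for the example:

```javascript
// Scopes are passed highest precedence first:
// folder chain, active environment, collection, global, built-in.
function resolveVariable(name, scopes) {
  for (const scope of scopes) {
    if (Object.prototype.hasOwnProperty.call(scope.vars, name)) {
      const raw = scope.vars[name];
      // Function values model dynamics that re-resolve on every send.
      return { value: typeof raw === "function" ? raw() : raw, from: scope.name };
    }
  }
  return undefined;
}

function render(template, scopes) {
  return template.replace(/\{\{\s*([\w.-]+)\s*\}\}/g, (match, name) => {
    const hit = resolveVariable(name, scopes);
    return hit ? String(hit.value) : match; // unresolved placeholders stay as-is
  });
}

const demoScopes = [
  { name: "folder", vars: { host: "staging.example.com" } },
  { name: "environment", vars: { host: "env.example.com", token: "abc" } },
  { name: "collection", vars: {} },
  { name: "global", vars: { scheme: "https" } },
  { name: "built-in", vars: { timestamp: () => Date.now() } },
];
const renderedUrl = render("{{scheme}}://{{host}}/v1?t={{token}}&x={{missing}}", demoScopes);
console.log(renderedUrl);
```

The folder value for host wins over the environment value because the folder scope comes first in the list.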

Built-in dynamics

{{random}}, {{uuid}}, {{timestamp}}, {{timestampSec}}, {{isoDate}}, {{randomInt}} re-resolve on every send.

Multi-workspace

Open several workspaces in one window and switch from the sidebar header. Per-workspace tabs, env, and view persist across restart.

Restore unsaved tabs on restart

Builder state (URL, headers, body, auth, settings, scripts) round-trips through localStorage. Transient runtime is stripped before saving.

Per-request git sync toggle

Right-click a request to stop syncing it. Backed by .git/info/exclude so teammates never see it. Cmd+Shift+H toggles on the active tab.

In-app git

Status bar branch, source-control panel, stage / commit / push / pull, diff viewer. Diverged-branch dialog offers Rebase or Merge commit.

Built for real-world targets

Scrape.do native mode, anti-bot detection, UA presets, rotating proxies, and rate limiting. The scraping use case is a one-toggle affair.

Scrape.do native mode

One toggle rewrites the URL to api.scrape.do and forwards residential rotation, JS rendering, geo targeting, and ban retry parameters.

Anti-bot detection banner

Cloudflare, HTTP 429, CAPTCHA markers, and bot-block bodies surface as a dismissable banner above the response with a Retry-After countdown.
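
A countdown needs the Retry-After header turned into seconds. HTTP allows two forms, a number of seconds or an HTTP-date, and the sketch below handles both; retryAfterSeconds is a hypothetical helper, not the app's detection logic:

```javascript
// Convert a Retry-After header value into whole seconds to wait.
function retryAfterSeconds(headerValue, nowMs = Date.now()) {
  if (headerValue == null) return null;
  const trimmed = String(headerValue).trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed); // delay-seconds form
  const dateMs = Date.parse(trimmed);                 // HTTP-date form
  if (Number.isNaN(dateMs)) return null;              // unparseable value
  return Math.max(0, Math.ceil((dateMs - nowMs) / 1000));
}

console.log(retryAfterSeconds("120"));
console.log(
  retryAfterSeconds("Wed, 21 Oct 2015 07:28:00 GMT", Date.UTC(2015, 9, 21, 7, 27, 30))
);
```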

User-Agent presets

9 presets: Scrapeman default, Chrome 124 macOS / Windows, Firefox 125 macOS / Windows, Safari 17 macOS / iOS, Googlebot, curl. Custom UA in Headers always overrides.

Rotating proxy

List of proxy URLs with round-robin or random strategy. Collection Runner rotates per request, Load Runner rotates per concurrent slot.
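
The two strategies differ only in how the next index is chosen. A minimal sketch, with a hypothetical createProxyRotator and an injectable random source so the behavior can be checked deterministically:

```javascript
// Returns a function that yields the next proxy URL for each call.
function createProxyRotator(proxies, strategy = "round-robin", random = Math.random) {
  if (proxies.length === 0) throw new Error("proxy list is empty");
  let cursor = 0;
  return function next() {
    if (strategy === "random") {
      return proxies[Math.floor(random() * proxies.length)];
    }
    const proxy = proxies[cursor];
    cursor = (cursor + 1) % proxies.length; // wrap around at the end
    return proxy;
  };
}

const proxyList = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"];
const nextProxy = createProxyRotator(proxyList);
const picked = [nextProxy(), nextProxy(), nextProxy()];
console.log(picked);
```

A runner would call the returned function once per request, or once per concurrent slot, depending on which rotation scope it wants.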

Per-request rate limit

Fixed delay plus optional jitter (min / max ms). Honoured by Collection Runner and Load Runner.
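
A sketch of how a per-request delay could be derived from these settings. It assumes the jitter is drawn uniformly between min and max and added on top of the fixed delay; the source does not spell out the exact formula, and the option names are hypothetical:

```javascript
// Delay before the next request: fixed delay plus uniform jitter in [min, max].
function nextDelayMs({ delayMs, jitterMinMs = 0, jitterMaxMs = 0 }, random = Math.random) {
  const jitter = jitterMinMs + random() * (jitterMaxMs - jitterMinMs);
  return Math.round(delayMs + jitter);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Usage inside a runner loop (sendRequest is a placeholder for real work):
async function runWithRateLimit(count, limit, sendRequest) {
  for (let i = 0; i < count; i++) {
    await sendRequest(i);
    if (i < count - 1) await sleep(nextDelayMs(limit));
  }
}

console.log(nextDelayMs({ delayMs: 500, jitterMinMs: 100, jitterMaxMs: 300 }, () => 0.5));
```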

Cookie jar UI

Domain filter, manual add and inline edit, httpOnly value masking with reveal, JSON and Netscape exports, paste-import accepting document.cookie or cookies.txt.
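
The two paste formats are easy to tell apart: cookies.txt lines carry seven tab-separated fields, while document.cookie is a single line of name=value pairs joined by semicolons. A sketch of an importer that accepts either; the function names and output shape are assumptions, and the #HttpOnly_ prefix follows the convention curl uses in cookies.txt files:

```javascript
// cookies.txt: domain, subdomain flag, path, secure flag, expiry, name, value.
function parseNetscape(text) {
  const cookies = [];
  for (let line of text.split(/\r?\n/)) {
    let httpOnly = false;
    if (line.startsWith("#HttpOnly_")) {
      httpOnly = true;
      line = line.slice("#HttpOnly_".length);
    } else if (line.startsWith("#") || line.trim() === "") {
      continue; // comment or blank line
    }
    const f = line.split("\t");
    if (f.length !== 7) continue;
    cookies.push({
      domain: f[0],
      includeSubdomains: f[1] === "TRUE",
      path: f[2],
      secure: f[3] === "TRUE",
      expires: Number(f[4]), // Unix seconds
      name: f[5],
      value: f[6],
      httpOnly,
    });
  }
  return cookies;
}

// document.cookie carries no domain, so the caller supplies one.
function parseDocumentCookie(text, domain) {
  return text
    .split(";")
    .map((pair) => pair.trim())
    .filter(Boolean)
    .map((pair) => {
      const idx = pair.indexOf("=");
      return idx === -1
        ? { domain, name: pair, value: "" }
        : { domain, name: pair.slice(0, idx), value: pair.slice(idx + 1) };
    });
}

function parseCookiePaste(text, domain) {
  return text.includes("\t") ? parseNetscape(text) : parseDocumentCookie(text, domain);
}

console.log(parseCookiePaste("sid=abc; theme=dark", "example.com"));
```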

Headers Overflow fix

undici maxHeaderSize bumped to 256 KiB so Cloudflare-fronted responses with big rotating cookies and signed proof-of-work tokens stop bouncing.

Bidirectional WebSocket on every tab

A WebSocket pane lives next to the HTTP request builder. Connect, send, and trace messages without leaving the tab.

Live timeline

Each message has a direction arrow, timestamp, and payload. JSON payloads expand inline with the Tree viewer.

Application-level ping

Keep-alive ping every 30 seconds with round-trip latency in the pong row.

Per-connection proxy

Routes through standard HTTP proxies or Scrape.do WS proxy endpoints.

Tab-switch survival

Switching tabs leaves the connection open. The timeline picks up where you left off when you come back.

Export timeline

Save the full message log as JSON for replay or audit.

Bring your existing collections

Import from OpenAPI and Swagger, Postman, Bruno, Insomnia, curl, and HAR. Export history as HAR. More export targets are on the roadmap.

OpenAPI 3.0.x / 3.1.x and Swagger 2.0

Parse from URL or paste. Tags become folders, paths and methods become requests, security schemes become auth, server URLs go to {{base_url}}.

Postman v2.1

Folder hierarchy, auth (basic, bearer, apikey, oauth2, awsSigV4), variables, body modes round-trip.

Bruno .bru folders

Reads INI-like .bru files: methods, headers, auth (bearer / basic), bodies, query and path params.

Insomnia v4

Walks the resource list by type, rebuilds the folder tree from _id / parentId, maps all five auth types.

curl

Built on @scrape-do/curl-parser. Handles ANSI-C $'...' quoting, multipart, urlencoded, proxy, referer, basic auth, cookies, custom UA.

HAR 1.2 round-trip

Import Chrome DevTools HAR exports. Export history back to HAR. Round-trip tested.

Code export

Generate curl, JS fetch, Python requests, or Go net/http from the current request. Inline resolved values or keep {{var}} templates.
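
As an illustration of the curl target, the sketch below turns a request object into a shell-safe command. The request shape and function names are hypothetical; {{var}} templates pass through unchanged unless the caller resolves them first:

```javascript
// Wrap a value in single quotes, escaping embedded single quotes for POSIX shells.
function shellQuote(value) {
  return "'" + String(value).replace(/'/g, "'\\''") + "'";
}

function toCurl({ method = "GET", url, headers = {}, body }) {
  const parts = ["curl", "-X", method, shellQuote(url)];
  for (const [key, value] of Object.entries(headers)) {
    parts.push("-H", shellQuote(`${key}: ${value}`));
  }
  if (body !== undefined) parts.push("--data-raw", shellQuote(body));
  return parts.join(" ");
}

const curlCommand = toCurl({
  method: "POST",
  url: "https://example.com/api?x={{id}}",
  headers: { "Content-Type": "application/json" },
  body: '{"name":"O\'Brien"}',
});
console.log(curlCommand);
```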

No data ever leaves your machine

No analytics. No crash reporting. No cloud sync. No account. Your requests and responses stay on your disk.

Local YAML and JSONL

Collections live as .sman files in a workspace folder you choose. History is per-workspace JSONL in app data.

Template-preserving history

{{token}} stays {{token}} on disk. Secrets are never baked in.

No telemetry

No metrics, analytics, or usage data are collected or transmitted.

No account required

Download and run. No registration, no email, no OAuth.

No cloud sync

Nothing is backed up to a third-party server. Your data is your responsibility.

Open source

Apache 2.0 with an explicit patent grant in §3. Read the source on GitHub.

Ready to try it?

Free. Local. No account. Download and run in 30 seconds.

Download Scrapeman