HTML API
Overview
Here is the list of the different parameters you can use with SparkProxy's HTML API.
You can also discover this API using our Postman collection covering every SparkProxy feature.
All Parameters
Quick reference of every parameter supported by the API.
| Parameter | Type | Default |
|---|---|---|
| api_key | string | required |
| url | string | required |
| render_js | boolean | true |
| js_scenario | JSON object | {} |
| wait_for | string | "" |
| wait | integer | 0 |
| wait_for_and_click | string | "" |
| scroll | boolean | true |
| human | boolean | true |
| block_resources | boolean | false |
| block_ads | boolean | false |
| window_width | integer | 1280 |
| window_height | integer | 720 |
| premium_proxy | boolean | false |
| country_code | string | "" |
| stealth | boolean | false |
| own_proxy | string | "" |
| proxy_type | string | "" |
| forward_headers | JSON object | {} |
| referrer | string | "" |
| format | string | "html" |
| json_response | boolean | false |
| transparent_status_code | boolean | false |
| session_id | string | "" |
| cookies | JSON array | [] |
| device | string | "desktop" |
| custom_ua | string | "" |
| callback_url | string | "" |
| extract_rules | JSON object | {} |
| tag | string | "" |
Quick Start
Send your first request in under 60 seconds.
Get your API key
Sign up at app.sparkproxy.io and copy your API key from the dashboard. Your key looks like sk-xxxxxxxxxxxxxxxx.
Send your first request
Pass the target URL and your API key. The API returns the fully-rendered HTML of the page.
import requests
API_KEY = "YOUR_API_KEY"
response = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": API_KEY},
params={"url": "https://example.com", "render_js": "true"}
)
print(response.text)
Inspect the response
By default the response body is the raw HTML of the page. Use json_response=true to get metadata alongside the content.
// Add json_response=true to get metadata alongside content
// GET /api/v1?url=https://example.com&render_js=true&json_response=true
{
"job_id": "job_abc123",
"result_url": "https://scrape.sparkproxy.io/api/v1/files/job_abc123",
"format": "html",
"expires_at": "2026-06-27T16:00:00.000Z",
"status_code": 200,
"duration_ms": 3840,
"meta": {
"title": "Example Domain",
"description": "This domain is for use in illustrative examples."
},
"credits_used": 5
}
Authentication
You can pass your API key in three ways. We recommend the header.
| Method | Example | Notes |
|---|---|---|
| Header | X-API-Key: YOUR_KEY | Recommended |
| Query param | ?api_key=YOUR_KEY | Easy for quick tests |
| Body field | { "api_key": "YOUR_KEY" } | POST requests only |
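As a rough illustration of the three options in the table, the sketch below builds the request pieces for each one without sending anything. The helper name `auth_parts`, the method labels, and the return shape are assumptions made for this example; only the header name, query parameter, and body field come from the table.

```python
from typing import Dict, Tuple

API_URL = "https://scrape.sparkproxy.io/api/v1"


def auth_parts(method: str, api_key: str) -> Tuple[Dict[str, str], Dict[str, str], Dict[str, str]]:
    """Return (headers, query_params, json_body) for one auth method.

    `method` is "header", "query", or "body": hypothetical labels for the
    three rows of the table, not values the API itself understands.
    """
    headers: Dict[str, str] = {}
    params: Dict[str, str] = {}
    body: Dict[str, str] = {}
    if method == "header":
        headers["X-API-Key"] = api_key   # recommended option
    elif method == "query":
        params["api_key"] = api_key      # handy for quick tests
    elif method == "body":
        body["api_key"] = api_key        # POST requests only
    else:
        raise ValueError(f"unknown auth method: {method!r}")
    return headers, params, body
```

The three dictionaries can be passed to an HTTP client of your choice as headers, query parameters, and JSON body respectively.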
https://scrape.sparkproxy.io/api/v1
API Reference
Full parameter documentation with code examples in 7 languages.
URL
url
The page you want to scrape. It must be a full http:// or https:// URL; private and internal addresses are blocked. For batch scraping, pass multiple comma-separated URLs (see the Batch Scraping section).
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={"url": "https://example.com"},
)
print(r.text)
Batch Scraping
Pass multiple comma-separated URLs in the url parameter to scrape them all in one request. You get back a results array with one entry per URL, each containing its own status, content, and timing data.
Batch mode requires render_js=false (plain HTTP fetch). The whole batch costs 1 credit regardless of URL count. For headless scraping, send separate requests.
Response shape
{
"job_id": "abc123",
"duration_ms": 1840,
"credits_used": 1,
"tag": "my-label", // only present when tag param was provided
"results": [
{ "url": "https://example.com", "success": true, "httpStatus": 200, "body": "...", "error": null },
{ "url": "https://example.com/about", "success": false, "httpStatus": null, "body": null, "error": "Connection refused" }
]
}
import requests
urls = [
"https://example.com",
"https://example.com/about",
"https://example.com/contact",
]
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": ",".join(urls),
"render_js": "false", # required for batch mode
},
)
data = r.json()
for result in data["results"]:
    print(result["url"], result["httpStatus"], result["success"])
Headless Browser
SparkProxy uses a headless Chromium browser by default. The following parameters control how the browser behaves.
JS Rendering
render_js
When true (the default), Chromium fully renders the page: JavaScript runs, dynamic content loads, and SPAs hydrate. Set it to false to skip the browser and do a plain HTTP fetch instead. A plain fetch is 3x faster and costs 1 credit instead of 5.
import requests
# render_js=false → plain HTTP fetch (1 credit, faster)
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={"url": "https://example.com", "render_js": "false"},
)
print(r.text)
JS Execution
js_scenario
A sequence of browser actions that run after the page loads. Pass an object with an instructions array (max 50 entries). Each instruction is an object where the key is the action name. Great for clicking buttons, filling forms, or triggering lazy-loaded content before capture.
| Key | Value | Description |
|---|---|---|
| click | CSS selector | Wait for element then click it |
| wait_for | CSS selector | Pause until element appears in DOM (30 s timeout) |
| fill | {"selector": "...", "value": "..."} | Type into an input field |
| wait | milliseconds (number) | Pause for N milliseconds (capped at 30 000 ms) |
| scroll | pixels (number) | Scroll down by N pixels |
| evaluate | JS string | Run arbitrary JavaScript on the page |
| screenshot | true | Capture a JPEG at this point in the sequence |
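The limits in the table can be checked locally before a scenario is sent. The following is a minimal sketch, not part of the API: the function name `build_scenario` is hypothetical, and the rule that each instruction carries exactly one action key is an assumption drawn from the "key is the action name" description.

```python
from typing import Any, Dict, List

MAX_INSTRUCTIONS = 50   # documented cap on the instructions array
MAX_WAIT_MS = 30_000    # documented cap for the "wait" action
ACTIONS = {"click", "wait_for", "fill", "wait", "scroll", "evaluate", "screenshot"}


def build_scenario(instructions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Validate instructions locally and wrap them as {"instructions": [...]}."""
    if len(instructions) > MAX_INSTRUCTIONS:
        raise ValueError(f"at most {MAX_INSTRUCTIONS} instructions are allowed")
    for step in instructions:
        # Assumption: one action key per instruction object.
        if len(step) != 1:
            raise ValueError(f"each instruction needs exactly one action key: {step!r}")
        action, value = next(iter(step.items()))
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        if action == "fill":
            if not (isinstance(value, dict) and {"selector", "value"} <= value.keys()):
                raise ValueError("fill expects an object with 'selector' and 'value'")
        if action == "wait" and isinstance(value, (int, float)) and value > MAX_WAIT_MS:
            raise ValueError(f"wait is capped at {MAX_WAIT_MS} ms")
    return {"instructions": list(instructions)}
```

The returned dictionary has the shape the js_scenario parameter expects, so it can be dropped into a POST body as-is.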
import requests
import json
# Each instruction is { "action_key": value } — NOT { "type": "...", "key": value }
scenario = {
"instructions": [
{ "click": "#cookie-banner-close" },
{ "wait_for": ".main-content" },
{ "scroll": 800 },
{ "wait": 1500 },
{ "fill": { "selector": "#search", "value": "laptop" } },
{ "click": "#search-submit" },
{ "evaluate": "window.scrollTo(0, 0)" },
]
}
r = requests.post(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY", "Content-Type": "application/json"},
json={
"url": "https://example.com",
"js_scenario": scenario,
},
)
print(r.text)
Wait for Selector
wait_for
A CSS selector the scraper waits for before capturing the page. The browser stays open until that element shows up in the DOM. Handy for pages with skeleton screens or lazy-loaded content. Gives up after 30 seconds.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"wait_for": "#main-content", # wait until element is in DOM
},
)
print(r.text)
Wait for Browser
wait
Extra seconds to wait after the browser's load event before capturing the page. Use this when content appears after page load but there's no reliable CSS selector to wait for.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"wait": "3", # wait 3 seconds after load event
},
)
print(r.text)
Blocking Images & CSS
block_resources, block_ads
block_resources=true blocks images, fonts, stylesheets, and media so pages load faster and use less bandwidth. block_ads=true cuts out ad networks and tracking scripts.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"block_resources": "true", # block images, fonts, CSS
"block_ads": "true", # block ad/tracking scripts
},
)
print(r.text)
Viewport Size
window_width, window_height
Set the browser viewport size in pixels. Go wider for desktop layouts, narrower to test responsive breakpoints. For tall full-page screenshots, bump up window_height or use device=mobile.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"format": "screenshot",
"window_width": "1920",
"window_height": "1080",
},
)
# r.content is a JPEG image
with open("screenshot.jpg", "wb") as f:
    f.write(r.content)
Wait and Click
wait_for_and_click
Waits for a CSS selector to appear in the DOM, then clicks it before capturing the page. Great for dismissing cookie banners, closing modals, or triggering content hidden behind a gated click. Times out after 30 seconds if the selector never appears.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"wait_for_and_click": "#accept-cookies", # dismiss banner then capture
},
)
print(r.text)
Scroll & Human Mode
scroll, human
scroll=true automatically scrolls the page after load to trigger infinite scroll and lazy-loaded images. human=true simulates human-like mouse movement and random interaction delays, helping bypass sites that fingerprint cursor behaviour. Both are on by default; set either to false to disable.
import requests
# Disable scroll and human simulation (e.g. for fast static pages)
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"scroll": "false",
"human": "false",
},
)
print(r.text)
Proxies
SparkProxy routes requests through a pool of rotating proxies. Use these parameters to control proxy type, location, and anonymity level.
GeoLocation
country_code
Exit the request through a proxy in a specific country. Pass an ISO 3166-1 alpha-2 code like US, GB, DE, or JP. Great for geo-restricted content or comparing prices across regions. Adds 5 credits on top of the base cost regardless of rendering mode.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"country_code": "US", # route through US proxy
},
)
print(r.text)
Stealth Mode
stealth
Every request already includes randomised browser fingerprints, stealth Chromium patches, human-like mouse movement, and WebRTC leak prevention. Enabling stealth=true adds three extra layers on top: a homepage pre-warm visit before the target page, a forced Google referrer, and extended idle delays (1.5–3.5 s vs 0.5–2 s). Use it on the toughest targets like e-commerce, travel, and finance sites.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"stealth": "true",
},
)
print(r.text)
Own Proxy
own_proxy
Route the request through your own proxy server. Accepted formats: ip:port, ip:port:user:pass, http://user:pass@host:port, or socks5://host:port. SOCKS5 requires render_js=true.
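The SOCKS5 restriction can be caught on the client before a request is made. This is a small sketch under stated assumptions: the helper names `proxy_scheme` and `check_own_proxy` are hypothetical, and treating the bare ip:port forms as HTTP proxies is a guess about how the server interprets them.

```python
import re


def proxy_scheme(own_proxy: str) -> str:
    """Return the scheme implied by an own_proxy string, e.g. "http" or "socks5"."""
    match = re.match(r"^([a-z0-9]+)://", own_proxy, re.IGNORECASE)
    if match:
        return match.group(1).lower()
    # Bare ip:port and ip:port:user:pass carry no scheme; treating them
    # as HTTP proxies is an assumption of this sketch.
    return "http"


def check_own_proxy(own_proxy: str, render_js: bool) -> None:
    """Raise ValueError for the documented SOCKS5 + render_js=false conflict."""
    if proxy_scheme(own_proxy) == "socks5" and not render_js:
        raise ValueError("SOCKS5 proxies require render_js=true")
```

Calling `check_own_proxy` before sending saves a round trip that would otherwise end in a 422.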
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"own_proxy": "http://user:pass@host:8080", # placeholders: substitute your proxy's credentials and host
},
)
print(r.text)
Proxy Type
proxy_type
Select from your server's file-based proxy pools defined in proxies.config.js. Accepts premium or ad_free. Can be combined with country_code to pick a country-specific entry from the chosen pool. proxy_type=premium costs 10 credits without JS or 25 credits with JS (same tier as premium_proxy=true); ad_free has no extra credit cost.
import requests
# premium pool (10 credits without JS, 25 with JS): high-quality residential IPs from proxies.config.js
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"proxy_type": "premium", # or "ad_free" (no extra credits)
"country_code": "US", # optional: pick US entry from the pool
},
)
print(r.text)
Headers
Control the HTTP headers sent to the target site.
Forward Headers
forward_headers
Send custom HTTP headers to the target URL. Pass a JSON object of key/value pairs. Your headers are merged with the browser defaults, and any key you set will override that default.
import requests
import json
headers_to_forward = {
"Accept-Language": "en-US,en;q=0.9",
"Referer": "https://google.com",
}
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"forward_headers": json.dumps(headers_to_forward),
},
)
print(r.text)
Pure Header Forwarding
Need complete control over every header, including User-Agent, Accept-Language, and Referer? Set them all in forward_headers and we'll send exactly what you provide, no additions from our side.
import requests
import json
# Override all key headers to appear as a real Chrome browser
custom_headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/125.0.0.0 Safari/537.36",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language": "en-US,en;q=0.9",
"Accept-Encoding": "gzip, deflate, br",
"Referer": "https://www.google.com/",
"DNT": "1",
}
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"forward_headers": json.dumps(custom_headers),
},
)
print(r.text)
Custom Referrer
referrer
Sets the HTTP Referer header on the request. Some sites check this to decide whether to serve content or show a paywall. Pass a plausible origin URL like a search engine result page to make the request look like it came from organic traffic. When omitted, the API automatically picks a realistic referrer from a weighted pool (Google, Bing, DuckDuckGo, Reddit, Facebook, or none) so you get varied, natural-looking traffic without doing anything.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com/article",
"referrer": "https://www.google.com/search?q=example",
},
)
print(r.text)
Response Format
format
Controls what the API returns. The result file is stored and served via /api/v1/files/:jobId. screenshot and pdf require render_js=true; the API returns 422 if you combine them with render_js=false. md, mdx, and json are rendered properly only with render_js=true. With render_js=false the raw HTTP response body is returned without an error, so the format parameter is effectively ignored in plain HTTP mode.
html: Fully-rendered HTML page source
md: Markdown, stripped to clean readable text (render_js=true)
mdx: MDX file extension, same Markdown content as md (render_js=true)
screenshot: Full-page JPEG screenshot, quality 80 (render_js=true)
pdf: PDF printout of the page (render_js=true)
json: Structured JSON with headings, links, images, JSON-LD, and page meta (render_js=true)
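The interaction between format and render_js described above can be mirrored in a small client-side check. This is only a sketch: `check_format` and its "ok" / "ignored" return labels are hypothetical names chosen for this example, and the server remains the authority on what it accepts.

```python
FORMATS = {"html", "md", "mdx", "screenshot", "pdf", "json"}
BROWSER_ONLY = {"screenshot", "pdf"}  # documented to return 422 with render_js=false


def check_format(fmt: str, render_js: bool = True) -> str:
    """Return "ok" or "ignored" for a format/render_js pair, or raise ValueError.

    "ignored" marks the documented case where plain HTTP mode returns the
    raw response body and the format parameter has no effect.
    """
    if fmt not in FORMATS:
        raise ValueError("Invalid format. Valid: html, md, mdx, screenshot, pdf, json")
    if not render_js:
        if fmt in BROWSER_ONLY:
            raise ValueError("format=screenshot and format=pdf require render_js=true")
        if fmt != "html":
            return "ignored"
    return "ok"
```

A caller might log a warning when the result is "ignored", since the request succeeds but does not produce the format that was asked for.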
import requests
# Screenshot example: response body is a JPEG
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"format": "screenshot", # html | md | mdx | screenshot | pdf | json
},
)
with open("page.jpg", "wb") as f:
    f.write(r.content)
json_response
Returns a JSON envelope instead of streaming the content directly. The response shape is { "job_id", "status_code", "duration_ms", "body", "encoding": "base64", "credits_used" }, with the page content base64-encoded in body. Headless mode additionally includes a meta object with page title, description, and other metadata. Both status_code and meta may be null if the page did not respond normally. If a tag was provided, it is also echoed back in the envelope.
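Because body arrives base64-encoded, a client has to decode it before use. The sketch below assumes the envelope shape described in the json_response paragraph; the function name `decode_envelope` and the added `content` key are hypothetical conveniences, not fields the API returns.

```python
import base64
import json
from typing import Any, Dict


def decode_envelope(raw: str) -> Dict[str, Any]:
    """Parse a json_response envelope and attach the decoded body as bytes."""
    envelope = json.loads(raw)
    if envelope.get("encoding") == "base64" and envelope.get("body") is not None:
        # "content" is added by this sketch; the API itself only sends "body".
        envelope["content"] = base64.b64decode(envelope["body"])
    return envelope


# Hand-built sample envelope used only to exercise the helper.
sample = json.dumps({
    "job_id": "abc123",
    "status_code": 200,
    "duration_ms": 1840,
    "body": base64.b64encode(b"<html></html>").decode("ascii"),
    "encoding": "base64",
    "credits_used": 1,
})
decoded = decode_envelope(sample)
```

The decoded bytes can then be written to disk or decoded to text with the charset appropriate for the page.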
transparent_status_code
When true, the API HTTP status mirrors the target page's status (e.g. 404 if the target 404s). Only applies to plain HTTP mode (render_js=false). Headless mode always returns 200 on success or 530 on failure.
Session
session_id
Labels this job's browser profile in per-job logs. Each request gets a fresh browser profile regardless of the session_id value, so cookies and login state do not persist across separate API requests.
import requests
SESSION = "my-session-abc123"
# Step 1: Log in
requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com/login?user=alice&pass=secret",
"session_id": SESSION,
},
)
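# Step 2: Request a second page under the same session_id label.
# As described above, each request still gets a fresh browser profile,
# so the login from step 1 does not carry over to this request.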
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com/dashboard",
"session_id": SESSION,
},
)
print(r.text)
Timeout
wait
Extra seconds the browser waits after the page load event before capturing. Capped at 30. The page navigation timeout is handled server-side: 90 seconds on the first attempt, then automatic retries at 120 s and 180 s, for up to 3 attempts in total. If all attempts fail, the job returns 530 and credits are refunded.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"wait": "10", # wait 10 s after the load event (max 30)
},
)
print(r.text)
Devices
device
Emulate a device type. This sets the viewport size, User-Agent, and touch events to match that device.
desktop: 1280 × 720, desktop UA (default)
mobile: 390 × 844, iPhone UA
tablet: 768 × 1024, iPad UA
random: randomly selects desktop, mobile, or tablet with equal probability
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"device": "mobile", # desktop | mobile | tablet | random
},
)
print(r.text)
Custom User-Agent
custom_ua
Override the browser's User-Agent string entirely. When set, the API ignores the device pool and uses exactly the string you provide. The viewport is still determined by the device parameter. Useful for mimicking a specific browser version, a known bot, or any custom identity.
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"custom_ua": "Mozilla/5.0 (compatible; MyBot/1.0; +https://mysite.com/bot)",
},
)
print(r.text)
Webhooks
callback_url
Fire-and-forget mode. Pass a callback_url and the API returns 202 Accepted immediately with the job ID, without waiting for Chromium to finish. When the job completes (or fails), SparkProxy POSTs the full result JSON to your URL. For headless requests (render_js=true), the payload mirrors the synchronous response plus a top-level success boolean. For plain HTTP requests (render_js=false), the webhook always delivers a JSON envelope with the page content base64-encoded in a body field, regardless of the json_response parameter.
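A webhook receiver has to cope with both payload styles described in this section. The sketch below is one way to branch on them; `summarize_webhook` is a hypothetical helper, and treating a payload as failed only when success is explicitly false is an assumption, since the text does not say whether plain HTTP payloads carry that field.

```python
import base64
from typing import Any, Dict


def summarize_webhook(payload: Dict[str, Any]) -> str:
    """Return a one-line summary of a webhook payload."""
    job_id = payload.get("job_id")
    # Assumption: only an explicit success=false marks a failed job.
    if payload.get("success") is False:
        return f"job {job_id} failed: {payload.get('error')}"
    if payload.get("body") is not None:
        # Plain HTTP mode: page content is base64-encoded in "body".
        size = len(base64.b64decode(payload["body"]))
        return f"job {job_id} ok: {size} bytes inline"
    # Headless mode: the result is fetched from result_url.
    return f"job {job_id} ok: result at {payload.get('result_url')}"
```

The function does no I/O, so it can be called from whatever web framework receives the POST.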
202 response (immediate)
{ "job_id": "abc123", "status": "queued" }
Webhook POST body (on completion)
{
"success": true,
"job_id": "abc123",
"result_url": "https://scrape.sparkproxy.io/api/v1/files/abc123",
"format": "html",
"expires_at": "2026-06-27T16:00:00.000Z",
"status_code": 200,
"duration_ms": 4200,
"meta": { "title": "Example Domain", "description": "...", "canonical": "...", "wordCount": 42 },
"credits_used": 5,
"tag": "my-label" // only present when tag param was provided
}
// On failure:
{
"success": false,
"job_id": "abc123",
"error": "CAPTCHA challenge could not be bypassed",
"reason": "captcha_blocked",
"captcha_type": "cloudflare_turnstile",
"attempts": 3
}
import requests
# Step 1 — fire the scrape, get 202 immediately
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"callback_url": "https://your-server.com/webhook/sparkproxy",
},
)
print(r.status_code) # 202
print(r.json()) # {"job_id": "abc123", "status": "queued"}
# Step 2 — receive the result at your endpoint (e.g. Flask)
# from flask import Flask, request
# app = Flask(__name__)
# @app.route("/webhook/sparkproxy", methods=["POST"])
# def webhook():
#     data = request.json
#     print("Job", data["job_id"], "success:", data["success"])
#     return "", 200
POST / PUT
All parameters work exactly the same way with POST and a JSON body. Switch to POST when payloads like js_scenario get too big to fit in a URL.
import requests
r = requests.post(
"https://scrape.sparkproxy.io/api/v1",
headers={
"X-API-Key": "YOUR_API_KEY",
"Content-Type": "application/json",
},
json={
"url": "https://example.com",
"render_js": True,
"stealth": True,
"format": "screenshot",
"js_scenario": {
"instructions": [
{"click": "#accept-cookies"},
{"wait": 1000},
]
},
},
)
print(r.json())
Data Extraction
extract_rules
Pull structured data directly out of a page without writing a parser. Pass a JSON object where each key becomes a field in the response. The value can be a plain CSS selector string (returns the element's text content) or an object with a selector and type for richer extraction.
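Rules can be sanity-checked before they are sent. This is a minimal sketch, assuming the object form always needs both selector and type and that type is one of the three values listed in this section (list, href, src); the helper name `check_rules` is hypothetical.

```python
from typing import Any, Dict

RULE_TYPES = {"list", "href", "src"}  # types shown in this section


def check_rules(rules: Dict[str, Any]) -> Dict[str, Any]:
    """Validate an extract_rules mapping locally and return it unchanged."""
    for field, rule in rules.items():
        if isinstance(rule, str):
            # Plain string form: a CSS selector whose text content is returned.
            if not rule.strip():
                raise ValueError(f"{field}: selector must not be empty")
            continue
        if not isinstance(rule, dict) or not rule.get("selector"):
            raise ValueError(f"{field}: expected a selector string or an object with 'selector'")
        # Assumption: 'type' is mandatory in the object form.
        if rule.get("type") not in RULE_TYPES:
            raise ValueError(f"{field}: type must be one of {sorted(RULE_TYPES)}")
    return rules
```

Since the mapping is returned unchanged, the call can wrap the value passed as extract_rules in a request body.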
Selector formats
{
"title": "h1", // string: querySelector → textContent
"links": { "selector": "a.nav", "type": "list" }, // list: querySelectorAll → string[]
"href": { "selector": "a.cta", "type": "href" }, // href: → el.href (full URL)
"img": { "selector": "img.hero", "type": "src" } // src: → el.src (full URL)
}
Response shape (replaces the normal result_url response)
{
"job_id": "abc123",
"duration_ms": 2340,
"credits_used": 5,
"tag": "my-label", // only present when tag param was provided
"extracted": {
"title": "Example Domain",
"description": "This domain is for illustrative examples.",
"links": ["Home", "About", "Contact"],
"href": "https://example.com/signup"
},
"meta": {
"title": "Example Domain",
"description": "This domain is for use in illustrative examples.",
"canonical": "https://example.com/",
"h1": ["Example Domain"],
"wordCount": 42
}
}
import requests
rules = {
# Plain string: querySelector → textContent
"title": "h1",
"description": "meta[name='description']",
"price": ".product-price",
# Object form: querySelectorAll → list of textContent strings
"features": { "selector": "ul.features li", "type": "list" },
# Object form: returns el.href (full URL)
"buy_link": { "selector": "a.buy-now", "type": "href" },
# Object form: returns el.src (full URL)
"hero_img": { "selector": "img.hero", "type": "src" },
}
r = requests.post(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY", "Content-Type": "application/json"},
json={
"url": "https://example.com/product",
"extract_rules": rules,
},
)
data = r.json()
print(data["extracted"])
# { "title": "...", "price": "$29.99", "features": ["Fast", "Reliable"], "buy_link": "https://..." }
Tag
tag
Attach a custom label to any request. The tag comes back in the response so you can group and filter by project, campaign, or feature. Allowed characters: alphanumeric, _, -, ., :, @, /.
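The allowed character set translates directly into a regular expression. The sketch below assumes "alphanumeric" means ASCII letters and digits and does not enforce any length limit, since none is stated; `is_valid_tag` is a hypothetical helper name.

```python
import re

# Letters, digits, and the punctuation listed above: _ - . : @ /
TAG_RE = re.compile(r"[A-Za-z0-9_\-.:@/]+")


def is_valid_tag(tag: str) -> bool:
    """Return True if every character of a non-empty tag is in the allowed set."""
    return TAG_RE.fullmatch(tag) is not None
```

An empty string does not match, which fits the default of "" meaning that no tag is attached.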
import requests
r = requests.get(
"https://scrape.sparkproxy.io/api/v1",
headers={"X-API-Key": "YOUR_API_KEY"},
params={
"url": "https://example.com",
"tag": "price-monitor/amazon", # echoed back in the response
"json_response": "true", # request the JSON envelope, which echoes the tag
},
)
data = r.json()
print(data.get("tag")) # "price-monitor/amazon"
Credit Cost
Credits are deducted before the request runs and automatically refunded on failure. Base cost is determined by proxy tier × rendering mode. The four add-ons (stealth, country_code, js_scenario, screenshot/PDF) each add 5 credits on top of the base cost.
| Feature | Credits |
|---|---|
| Rotating proxy, no JS (render_js=false) | +1 |
| Rotating proxy, with JS (render_js=true) | +5 |
| Premium proxy, no JS (premium_proxy=true, render_js=false) | +10 |
| Premium proxy, with JS (premium_proxy=true, render_js=true) | +25 |
| Stealth Proxy with residential network | coming soon |
| Add-on: stealth browser mode (stealth=true, render_js=true) | +5 |
| Add-on: geo-targeted proxy (country_code) | +5 |
| Add-on: JS scenario (js_scenario, render_js=true) | +5 |
| Add-on: screenshot or PDF format (render_js=true) | +5 |
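The table can be turned into a quick estimator. This is a sketch under the assumption that add-ons stack additively on the base cost; the function `credit_cost` and its argument names are illustrative, and it does not check that stealth, js_scenario, and screenshot/PDF are only used with render_js=true.

```python
# Base cost keyed by (premium_proxy, render_js), taken from the table above.
BASE_COST = {
    (False, False): 1,
    (False, True): 5,
    (True, False): 10,
    (True, True): 25,
}
ADDON_COST = 5  # each add-on row in the table is +5


def credit_cost(render_js: bool = True, premium_proxy: bool = False,
                stealth: bool = False, country_code: str = "",
                js_scenario: bool = False, fmt: str = "html") -> int:
    """Estimate the credits for one request (assumes add-ons simply add up)."""
    cost = BASE_COST[(premium_proxy, render_js)]
    addons = [stealth, bool(country_code), js_scenario, fmt in ("screenshot", "pdf")]
    return cost + ADDON_COST * sum(addons)
```

For example, a default headless request comes to 5 credits, and a premium headless request with a country code comes to 30 under these assumptions.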
Status Codes
HTTP status codes returned by the SparkProxy API, not the page you scraped.
| Code | Meaning |
|---|---|
| 200 | Result body or JSON envelope returned |
| 202 | Job queued (webhook mode); the result is POSTed to callback_url when ready |
| 401 | Missing or invalid API key |
| 402 | Not enough credits; top up your account |
| 404 | File not found or does not belong to this API key (files endpoint) |
| 410 | File expired and was not backed up to remote storage |
| 422 | Invalid parameters; check the error message |
| 429 | Rate limit or concurrency limit exceeded |
| 500 | Internal server error; credits are refunded |
| 503 | Service temporarily unavailable (database or upstream down) |
| 530 | Scrape failed; the target returned an error or timed out |
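One way a client might act on these codes is to retry only 429 and 503. The sketch below uses the retry_after_seconds field that 429 bodies carry (see Error response bodies) and exponential back-off for 503; the function name `retry_delay`, the 5-second fallback, and the 60-second cap are assumptions of this example.

```python
from typing import Any, Dict, Optional


def retry_delay(status: int, body: Dict[str, Any], attempt: int) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry is not suggested.

    `attempt` counts retries already made, starting at 0.
    """
    if status == 429:
        # Prefer the server's hint; the 5 s fallback is an assumption.
        return float(body.get("retry_after_seconds", 5))
    if status == 503:
        # Exponential back-off: 1, 2, 4, ... seconds, capped at 60 (assumed cap).
        return min(60.0, 2.0 ** attempt)
    return None
```

Codes such as 401, 402, and 422 return None here because repeating the same request would not change the outcome.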
Error response bodies
Every error response is JSON with at least an error string. Some include extra fields to help you handle the failure without guessing.
401 Missing or invalid API key
{ "error": "api_key is required" }
// or
{ "error": "Invalid API key" }
402 Insufficient credits
{
"error": "Insufficient credits",
"credits_required": 10,
"credits_remaining": 3
}
422 Invalid parameters
{ "error": "url is required" }
// or
{ "error": "Invalid format. Valid: html, md, mdx, screenshot, pdf, json" }
// or
{ "error": "Requests to private, loopback, or internal addresses are not allowed", "url": "..." }
// or
{ "error": "SOCKS5 proxies are not supported with render_js=false. Use render_js=true or supply an HTTP proxy." }
// or
{ "error": "stealth=true requires render_js=true" }
// or
{ "error": "js_scenario requires render_js=true" }
// or
{ "error": "format=screenshot and format=pdf require render_js=true" }
// or
{ "error": "Batch scraping (multiple URLs) requires render_js=false. For headless scraping, send one URL per request." }
// or
{ "error": "callback_url: Requests to private, loopback, or internal addresses are not allowed" }
// or
{ "error": "own_proxy cannot point to a private or internal address" }
429 Rate limit or concurrency limit
// Rate limit (per-minute window)
{
"error": "Rate limit exceeded",
"retry_after_seconds": 60,
"limit": 60
}
// Concurrency limit (too many parallel requests)
{
"error": "Concurrency limit reached",
"active": 3,
"limit": 3,
"retry_after_seconds": 5
}
530 Scrape failed
// General failure (network error, timeout, anti-bot block, etc.)
{
"error": "Timeout 30000ms exceeded", // raw error from Playwright / axios
"reason": "Page load exceeded configured timeout", // human-readable translation; "Unexpected error — check the full error message" for unclassified failures
"captcha_type": null, // null for non-CAPTCHA failures
"attempts": 3, // total attempts made (always equals MAX_RETRIES=3 when all fail)
"job_id": "abc123"
}
// CAPTCHA block (reason is always "captcha_blocked" for CAPTCHA failures)
{
"error": "CAPTCHA challenge (cloudflare_turnstile) could not be bypassed with the current proxy",
"reason": "captcha_blocked",
"captcha_type": "cloudflare_turnstile", // "cloudflare_turnstile" | "hcaptcha" | "recaptcha" — null for behavioral blocks (DataDome, PerimeterX, Akamai)
"attempts": 3,
"job_id": "abc123"
}
500 Internal server error
{ "error": "Internal server error" }
// job_id is included when the failure occurred after the job was created:
{ "error": "Internal server error", "job_id": "abc123" }
// Credits are always refunded on 500
503 Service temporarily unavailable
{ "error": "Service temporarily unavailable" }
// Retry with exponential back-off
// Files endpoint: local file expired and remote storage fetch failed
{ "error": "File unavailable — remote storage fetch failed" }
410 File expired (files endpoint only)
{ "error": "File has expired and was not backed up to remote storage" }
// Files are stored locally for FILE_TTL_HOURS (default 2 h) then deleted.
// If remote backup is enabled, the file remains available after local expiry.
Response Headers
Plain HTTP responses (render_js=false) include these headers. Headless responses return the same values inside the JSON body (job_id, credits_used, duration_ms).
| Header | Description |
|---|---|
| X-Job-Id | Unique job ID, use it with /api/v1/files/:jobId to fetch the result |
| X-Credits-Used | Number of credits consumed by this request |
| X-Duration-Ms | Total server-side execution time in milliseconds |
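These headers can be collected into the same field names the JSON body uses. The sketch below is illustrative: `job_info` is a hypothetical helper, and it accepts any mapping of header names to string values so that it does not depend on a particular HTTP client.

```python
from typing import Dict, Mapping, Optional, Union


def job_info(headers: Mapping[str, str]) -> Dict[str, Optional[Union[str, int]]]:
    """Extract job_id, credits_used, and duration_ms from response headers."""
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        return int(value) if value is not None else None

    return {
        "job_id": lowered.get("x-job-id"),
        "credits_used": as_int("x-credits-used"),
        "duration_ms": as_int("x-duration-ms"),
    }
```

Missing headers come back as None, which is what a headless response would produce since it reports these values in the JSON body instead.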
