If you’re collecting data from the web at scale, you’ll encounter CAPTCHAs and blocks. This is normal. It’s not a sign that your proxies are broken. It’s a sign that the target website is doing its job. This page explains what causes detection, what SOAX can and can’t control, and what you can do to improve your success rates.

How detection works

Websites use anti-bot systems to identify and block automated traffic. These systems look at signals like request frequency, browser fingerprints, header patterns, behavioral cues, and IP reputation. When a website detects something suspicious, it typically responds in one of these ways:
  • Serving a CAPTCHA challenge instead of the page content
  • Returning an HTTP 403 (Forbidden) or 429 (Too Many Requests) response
  • Returning a fake or empty page that looks normal but contains no real data
  • Silently rate-limiting your requests so they slow down or time out
  • Blocking the IP address entirely so future requests fail
The important thing to understand is that these decisions are made by the target website, not by SOAX. The proxy’s job is to route your request through a clean IP. What the target does with that request is outside the proxy’s control.

What SOAX can and can’t see

SOAX proxies use CONNECT tunnels for HTTPS traffic. The proxy establishes a connection to the target on your behalf, but it can’t read or inspect the encrypted traffic that flows through it. In practice, this means:
  • SOAX can detect infrastructure failures: the node is offline, the connection was refused, the request timed out before the tunnel was established.
  • SOAX can’t see HTTP status codes from the target (403, 429, 503, etc.) because they’re inside the encrypted tunnel.
  • SOAX can’t detect CAPTCHAs, soft blocks, or fake pages because those are part of the HTTP response body, which is encrypted.
This is why the error handling rules (onerror-replace, onerror-retry_N, onerror-fail) only respond to infrastructure-level failures, not target-level blocks. SOAX works at the transport layer; if a target returns a CAPTCHA page with an HTTP 200 status, the proxy sees a successful connection and passes it through unchanged. Detecting and handling target-side blocks is your application’s responsibility.

What’s a normal CAPTCHA / block rate?

There’s no universal answer. It depends entirely on the target website, the volume of your requests, and how your scraper behaves. Here are some general ranges:
  • Low-protection targets (most news sites, public directories, open APIs): Block rates under 1% are typical. Residential IPs rarely get challenged on these sites.
  • Medium-protection targets (ecommerce product pages, review sites, job boards): Block rates between 2% and 10% are common, depending on your request volume and patterns.
  • High-protection targets (Google, Amazon, social media platforms, sneaker sites): Block rates of 10% to 30% or higher are expected. These sites invest heavily in anti-bot systems and actively fingerprint incoming traffic. Even real users occasionally see CAPTCHAs on these sites.
If your block rate on a specific target is significantly higher than these ranges, it’s usually a sign that something in your request pattern is triggering detection, not that the proxies are bad.
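To compare your own numbers against ranges like these, you need a block rate measured per target. The following is a minimal Python sketch, not part of any SOAX SDK: the class name BlockRateTracker, the window size, and the threshold passed to exceeds() are illustrative assumptions you would tune for your own workload.

```python
from collections import defaultdict, deque


class BlockRateTracker:
    """Tracks the share of blocked responses per target over a sliding window.

    Hypothetical helper for illustration; not a SOAX API.
    """

    def __init__(self, window: int = 200):
        # One bounded deque per target; each entry is True if that request was blocked.
        self._outcomes = defaultdict(lambda: deque(maxlen=window))

    def record(self, target: str, blocked: bool) -> None:
        """Store the outcome of one request; the oldest outcome drops out when the window is full."""
        self._outcomes[target].append(blocked)

    def block_rate(self, target: str) -> float:
        """Return the blocked fraction (0.0 to 1.0) for the target, or 0.0 if nothing was recorded."""
        outcomes = self._outcomes.get(target)
        if not outcomes:
            return 0.0
        return sum(outcomes) / len(outcomes)

    def exceeds(self, target: str, expected_max: float) -> bool:
        """True if the observed block rate is above the rate you expect for this target."""
        return self.block_rate(target) > expected_max
```

For example, after recording outcomes for a medium-protection target, tracker.exceeds("shop.example", 0.10) would flag a block rate above the 10% upper end of the range quoted above.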

How to improve success rates

Use the right network type

Mobile proxies generally have lower block rates than residential on heavily protected targets. This is because mobile carrier IPs are shared across many real users via NAT, making them harder to block without affecting legitimate traffic. If residential IPs are getting blocked on your target, try network-mob.

Rotate IPs appropriately

Sending too many requests from the same IP is the most common trigger for blocks. For high-volume scraping, use rotating mode (no session parameter) so every request gets a fresh IP. If you need sessions for multi-page workflows (login, navigate, scrape), use rotate-timed_N to force a fresh IP periodically. A common pattern is rotating every 3 to 5 minutes.

Pace your requests

Sending hundreds of requests per second to the same domain is the fastest way to get blocked, regardless of how clean your IPs are. Anti-bot systems track request frequency per IP, per subnet, and per behavioral pattern. Add reasonable delays between requests. Even 1 to 2 seconds between requests to the same domain significantly reduces detection risk.
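One way to enforce a minimum gap between requests to the same domain is a small client-side throttle. This is a sketch under stated assumptions: DomainThrottle is a hypothetical helper, the 1.5-second default simply sits inside the 1 to 2 second range mentioned above, and the clock and sleep parameters exist only so the behavior can be tested without real waiting.

```python
import time
from urllib.parse import urlsplit


class DomainThrottle:
    """Enforces a minimum interval between requests to the same hostname.

    Hypothetical helper for illustration; not a SOAX API.
    """

    def __init__(self, min_interval: float = 1.5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # hostname -> time the last request was allowed to go out

    def wait(self, url: str) -> float:
        """Block until the URL's host may be contacted again; return the seconds slept."""
        host = urlsplit(url).hostname or ""
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            # Sleep only for the part of the interval that has not already elapsed.
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last_request[host] = now + delay
        return delay
```

Call throttle.wait(url) immediately before each request. Requests to different hosts do not delay each other, because the interval is tracked per hostname.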

Vary your request patterns

Anti-bot systems look for patterns that real users don’t produce:
  • Requesting the same page structure repeatedly without variation
  • Missing standard browser headers (User-Agent, Accept, Accept-Language, etc.)
  • Sending requests at perfectly regular intervals (exactly every N seconds)
  • Never loading CSS, images, or JavaScript (for browser-based checks)
Make sure your scraper sends realistic headers and varies its timing slightly between requests.
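As a rough illustration of both points, the sketch below defines a set of browser-like headers and a delay function with random jitter. The header values are example strings, not recommendations from SOAX, and jittered_delay is a hypothetical helper; a real scraper would keep the User-Agent current and consistent with the rest of its fingerprint.

```python
import random

# Example browser-like headers. The exact values are illustrative assumptions.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
}


def jittered_delay(base: float = 1.5, jitter: float = 0.5, rng=random) -> float:
    """Return a delay in seconds that varies around `base`.

    `jitter` is a fraction of `base`: with base=1.5 and jitter=0.5 the result
    falls between 0.75 and 2.25 seconds, so intervals are never perfectly regular.
    """
    return base * (1 + rng.uniform(-jitter, jitter))
```

Passing BROWSER_HEADERS to your HTTP client and sleeping for jittered_delay() between requests avoids two of the patterns listed above: missing standard headers and perfectly regular intervals.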

Target broader geos

Sending 1,000 requests per minute from IPs in a single city looks suspicious. Spreading your traffic across a wider geographic area (country-level instead of city-level) gives you access to more IPs and makes your traffic pattern look more natural.

Handle blocks in your application

Since SOAX can’t see target-side responses, your application needs to detect and react to blocks:
  • Check the HTTP status code and body of every response. A 403, a 429, or a 200 whose body contains a CAPTCHA page means you’ve been detected.
  • When you detect a block, rotate to a new IP by using a new session ID or switching to rotating mode.
  • Track your success rate per target. If it drops below your threshold, slow down, switch network types, or broaden your geo-targeting.
  • Consider exponential backoff when you detect rate-limiting (429 responses). Hammering a target that’s already blocking you only makes it worse.
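The checks above could be sketched in Python roughly as follows. Everything here is an assumption for illustration: the function names, the CAPTCHA marker strings (real targets use many different challenge pages, so you would tune these per site), and the backoff parameters. The backoff uses the "full jitter" variant, which picks a random delay up to an exponentially growing ceiling.

```python
import random

# Assumed substrings that hint at a challenge page; adjust per target.
CAPTCHA_MARKERS = ("captcha", "verify you are human")


def classify_response(status: int, body: str) -> str:
    """Classify a target response as 'ok', 'captcha', 'empty', 'blocked', 'rate_limited', or 'other'."""
    if status == 429:
        return "rate_limited"
    if status == 403:
        return "blocked"
    if status == 200:
        text = body.lower()
        if not text.strip():
            return "empty"  # a 200 with no content is a possible soft block
        if any(marker in text for marker in CAPTCHA_MARKERS):
            return "captcha"
        return "ok"
    return "other"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0, rng=random) -> float:
    """Full-jitter exponential backoff: a random delay between 0 and min(cap, base * 2**attempt)."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))
```

In a request loop, a "rate_limited" result would trigger a sleep of backoff_delay(attempt) before the next try, while "blocked" or "captcha" would trigger a switch to a new IP as described in the list above.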

Common misconceptions

“I’m getting blocked, so the IPs must be dirty.” Not necessarily. Even fresh, never-before-used IPs get blocked if the request pattern triggers detection. Clean IPs with bad request patterns get blocked faster than “used” IPs with good patterns.

“Residential IPs should never get blocked.” Residential IPs are harder to detect than datacenter IPs, but they’re not invisible. Anti-bot systems have evolved beyond simple IP classification. They look at TLS fingerprints, header ordering, mouse movements (for browser-based checks), and many other signals.

“Higher success rates mean better proxies.” Success rate depends on target difficulty, request patterns, and scraper quality as much as it depends on IP quality. Two customers using the same SOAX plan on the same target can have very different success rates based on how their scrapers behave.

“More retries will fix the problem.” Retrying the same request through different IPs helps with transient failures, but it doesn’t help if the target is detecting something about your request other than the IP (e.g. headers, fingerprint, behavior). Fix the detection signal first, then retry.

When to contact support

Reach out to support if:
  • Your success rate drops significantly and suddenly across multiple targets (not just one). This could indicate a pool issue rather than target-side detection.
  • You’re getting SOAX-level errors (407, 429, 502, 503) rather than target-side blocks. These are infrastructure issues that the SOAX team can investigate. See Error Codes for the full list.
  • You’re seeing IP geo-mismatches where the exit IP doesn’t match the country you targeted. This is a routing issue on the SOAX side.
If your block rate is high on a single target but everything else works fine, it’s almost certainly target-side detection. The suggestions above are your best path forward.

Next steps

Error codes

Distinguish between SOAX errors and target-side blocks.

Connection debugging

Step-by-step checklist for diagnosing connection issues.

Residential proxies

Session, rotation, and error handling parameter reference.

Mobile proxies

Lower detection rates for heavily protected targets.