We use Cloudflare as the CDN/security layer for our website and use Browsertrix for archiving. More and more of our crawls are now being blocked by captchas.
Can you point me in the right direction to bypass the captcha for a website where I control everything?
Hi,
I tried running a crawl with a browser profile, which made the captcha go away for now - sometimes simply saving a profile for the site you’re crawling is enough on its own.
But since it’s your site, probably the most foolproof way is to bypass the captcha by IP address or user agent. I think this can be configured via Firewall rules in Cloudflare; it seems like there’s more info here:
https://community.cloudflare.com/t/how-to-turn-off-captcha/305523/6
Alternatively, you can set a custom User Agent in the Browser Settings; I think that may be another option.
I’ll reach out directly with the IP ranges we use for our hosted service.
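In case it helps, here is a rough sketch of what the matching expression for such a Cloudflare rule might look like, built in Python so the ranges get validated first. This is only an illustration: the helper name `build_bypass_expression`, the `192.0.2.0/24` and `198.51.100.0/24` ranges (reserved documentation addresses, not our real ones), and the `my-archive-bot` token are all placeholders I made up. I believe `ip.src` and `http.user_agent` are the relevant fields in Cloudflare's rules language, but please check the current Cloudflare docs before relying on this.

```python
import ipaddress


def build_bypass_expression(ip_ranges=(), user_agent_token=None, require_both=False):
    """Build a Cloudflare-style rule expression that matches the crawler.

    Hypothetical helper: it only produces a string to paste into a rule;
    it does not talk to Cloudflare.
    """
    clauses = []

    if ip_ranges:
        # Validate each CIDR range; ip_network raises ValueError on bad input.
        networks = [str(ipaddress.ip_network(r)) for r in ip_ranges]
        clauses.append("(ip.src in {" + " ".join(networks) + "})")

    if user_agent_token:
        # Refuse characters that would break out of the quoted string.
        if '"' in user_agent_token or "\\" in user_agent_token:
            raise ValueError("user agent token must not contain quotes or backslashes")
        clauses.append(f'(http.user_agent contains "{user_agent_token}")')

    if not clauses:
        raise ValueError("provide at least one IP range or a user agent token")

    # "and" is stricter: a user agent on its own is easy to spoof.
    joiner = " and " if require_both else " or "
    return joiner.join(clauses)


if __name__ == "__main__":
    # Placeholder values only; substitute the real ranges and token.
    print(build_bypass_expression(
        ["192.0.2.0/24", "198.51.100.0/24"],
        user_agent_token="my-archive-bot",
        require_both=True,
    ))
```

Since a user agent string can be sent by anyone, requiring both the IP range and the user agent (`require_both=True`) is probably the safer choice if you go the user agent route.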