I’m not sure how they actually implemented it, but you can easily block ML crawlers via cloud flare. Isn’t just about every small site/service behind CF anyway?
Last I checked, cloudflare requires the user to have JavaScript and cookies enabled. My institution doesn’t want to require those because it would likely impact legitimate users as well as bots.
I’m clearly not on the same setup as you are, but my off-the-cuff guess is that your curl command was issued from a system that cloudflare already recognized (IP whitelist, cookies, I dunno).
Of course their challenge requires those things. How else could they implement it? Most users will never be presented with a challenge though and it is trivial to disable if you don’t want to ever challenge anyone. I was just saying CF blocks ML crawlers.
I’m not sure how they actually implemented it, but you can easily block ML crawlers via cloud flare. Isn’t just about every small site/service behind CF anyway?
Last I checked, cloudflare requires the user to have JavaScript and cookies enabled. My institution doesn’t want to require those because it would likely impact legitimate users as well as bots.
Huh? I can reach my site via curl that has neither. How did you come up with this random set of requirements?
Odd. I just tried
and got
I’m clearly not on the same setup as you are, but my off-the-cuff guess is that your curl command was issued from a system that cloudflare already recognized (IP whitelist, cookies, I dunno).
Anyways, I’m reading through this blog post on using cURL with cloudflare-protected sites and I’m finding it interesting.
Of course their challenge requires those things. How else could they implement it? Most users will never be presented with a challenge though and it is trivial to disable if you don’t want to ever challenge anyone. I was just saying CF blocks ML crawlers.