
Key Points
- Perplexity AI Crawlers Exposed for Stealth Tactics by Cloudflare
- Cloudflare claims Perplexity is bypassing anti-bot protections
- AI bots allegedly disguise themselves as Chrome browsers using rotating IPs
- Cloudflare removes Perplexity from the verified bot list
- Perplexity denies wrongdoing, calls the report a publicity stunt
Perplexity AI's crawlers are back in the spotlight, and not for the right reasons.
Cloudflare, a global internet infrastructure giant, has accused the AI search startup of stealthily crawling websites that explicitly blocked it. According to a recent Cloudflare report, Perplexity’s bots are allegedly bypassing site restrictions by masking their identity and rotating IP addresses to avoid detection.
This isn’t the first time Perplexity has been accused of scraping content from the web without permission. Last year, the company faced criticism for ignoring robots.txt files and accessing paywalled content. At the time, CEO Aravind Srinivas deflected blame onto third-party crawlers. But this time, Cloudflare is pointing the finger directly at Perplexity.
I’ve been enjoying testing Comet but this is not a good look for Perplexity -> Cloudflare says Perplexity uses stealth crawling techniques, like undeclared user agents and rotating IP addresses, to evade robots.txt rules and network blocks
“Although Perplexity initially crawls… pic.twitter.com/xmjfOhR21o
— Glenn Gabe (@glenngabe) August 4, 2025
In their investigation, Cloudflare created test domains with crawling restrictions designed to block Perplexity’s bots. Initially, the AI bot identified itself correctly using names like “PerplexityBot” or “Perplexity-User.”
However, when blocked, it allegedly switched tactics, impersonating a Chrome browser on macOS using a different user agent. This trick allowed the bot to slip past defenses unnoticed, much like a disguised intruder.
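A small sketch can show why a changed user agent matters for robots.txt. The rules are keyed on the name a crawler declares, so a client presenting a generic browser string is matched only by the wildcard entry. The robots.txt content, the example URL, and the Chrome-on-macOS string below are illustrative assumptions, not values from Cloudflare's report; the parser is Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that blocks Perplexity's declared crawlers
# but allows everyone else. Not taken from Cloudflare's test domains.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative Chrome-on-macOS user agent string (an assumption for this sketch).
CHROME_MAC_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    for ua in ("PerplexityBot", "Perplexity-User", CHROME_MAC_UA):
        print(f"{ua[:40]!r}: allowed={is_allowed(ua, url)}")
```

Under these assumed rules, the declared crawler names are disallowed while the browser-style string is permitted. robots.txt is advisory and relies on crawlers identifying themselves honestly, which is why infrastructure providers look at other signals as well.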
The situation escalated when Cloudflare noticed that these bots were not only rotating IP addresses but also switching between networks with different autonomous system numbers (ASNs), which makes network-level blocks harder to enforce. Cloudflare reported this behavior across tens of thousands of domains and millions of daily requests.
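To see why switching source networks undermines a static block, consider a minimal sketch of a blocklist keyed on IP prefix and ASN. Everything here is hypothetical: the addresses come from ranges reserved for documentation, the ASNs from the private-use range, and the `is_blocked` helper is invented for illustration. It is not how Cloudflare's systems work.

```python
import ipaddress

# Hypothetical blocklist: a documentation-only prefix and a private-use ASN.
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
BLOCKED_ASNS = {64512}


def is_blocked(ip: str, asn: int) -> bool:
    """Return True if the source IP or its ASN is on the static blocklist."""
    addr = ipaddress.ip_address(ip)
    return asn in BLOCKED_ASNS or any(addr in net for net in BLOCKED_NETWORKS)


# Invented (source IP, source ASN) pairs for the same hypothetical client.
sample_requests = [
    ("192.0.2.10", 64512),    # blocked network, blocked ASN
    ("192.0.2.77", 64512),    # rotated IP, same network: still blocked
    ("198.51.100.5", 64513),  # different ASN and prefix: passes the static list
]

passed = [req for req in sample_requests if not is_blocked(*req)]

if __name__ == "__main__":
    print(passed)
```

In this toy setup, rotating the address within a blocked network changes nothing, but moving to a different ASN gets past the list entirely. That is the gap a purely network-based block leaves open.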
Cloudflare responded by removing Perplexity from its verified bots list and rolling out new systems to help website owners detect and block similar unauthorized AI activity.
These actions mirror how other platforms are tightening their grip, especially as tools like ChatGPT and AI search models become more data-hungry.
Perplexity has been observed engaging in stealth crawling behavior to evade website no-crawl directives. This involves modifying their user agent and changing their source ASNs
• Perplexity has been de-listed as a verified bot and blocked by Cloudflare’s managed rules pic.twitter.com/UzzJeptd0o
— Donnie (@vibedonnie) August 4, 2025
Perplexity Pushes Back Against Bot Allegations
In response, Perplexity issued a public denial. Spokesperson Jesse Dwyer dismissed the Cloudflare report as a “publicity stunt,” claiming it was full of misunderstandings. The company emphasized that what Cloudflare observed was not AI bot activity but “user-driven agents” triggered by specific user queries.
According to Perplexity, the 20 to 25 million requests cited by Cloudflare were misattributed. The company claims most of the suspicious traffic came from Browserbase, a separate cloud browser service for AI agents, which it says it uses only occasionally.
Cloudflare says Perplexity AI uses secret crawlers to scrape websites that say “no crawling.”
> They ignore robots.txt rules
🧵👇🏻 pic.twitter.com/MNNTPBB0R6
— Harshith (@HarshithLucky3) August 4, 2025
Perplexity’s blog post also argued that Cloudflare confused human-directed browser interactions with bot-driven scraping, and that their systems do not automatically vacuum up web data in violation of content policies.
Despite the defense, Cloudflare has taken a hard stance. The company is pushing for clearer boundaries on what constitutes fair crawling behavior by AI tools. Its recent measures allow site owners to opt out of AI bot traffic and even demand compensation for access.
This debate is similar to broader issues emerging in the AI space. For example, Google’s ongoing AI development recently faced scrutiny after its AI bug hunter uncovered 20 open-source flaws, raising questions about transparency and ethical responsibility in AI systems.
🏴☠️ The brave lads at #Perplexity — AI’s pirate Robin Hoods.
Cloudflare says they’re stealth-scraping sites, dodging robots.txt and spoofing Google. Crawlers in disguise, chasing knowledge no matter the cost.
🤖 The data wars are heating up. https://t.co/VjcuuOAlND
— DeAi (@DeAI_Insights) August 6, 2025
A Growing Clash Between AI and Internet Gatekeepers
The clash between Perplexity AI and Cloudflare is just one example of a broader battle brewing across the internet.
AI startups rely heavily on access to web data to train their models and deliver real-time results. But as these tools grow more powerful and more commercial, website owners and infrastructure providers are pushing back.
Cloudflare’s new anti-scraping stance aligns with a shift happening across the industry. More companies are starting to see uncontrolled AI crawling as a threat, not just to revenue, but to the stability and ethics of the internet itself.
Even major players like OpenAI are under pressure. Despite boasting 180 million weekly ChatGPT users, OpenAI is now facing challenges over how its tools access and use third-party content. Similarly, Apple’s AI Answer Engine aims to generate responses based on multiple data sources—but without crossing ethical boundaries.
And with powerful models like Gemini 2.5 entering the arena, the hunger for clean, high-quality data is only growing. The risk? More stealth crawling tactics, more pushback from internet gatekeepers, and more legal and ethical battles over who controls web content in the AI age.
The Perplexity case may be a sign of what’s to come: AI startups will need to balance innovation with respect for digital boundaries, or risk losing access to the very web they depend on.