Perplexity AI Crawlers Exposed for Stealth Tactics by Cloudflare


Key Points

  • Cloudflare claims Perplexity is bypassing anti-bot protections
  • AI bots allegedly disguise themselves as Chrome browsers using rotating IPs
  • Cloudflare removes Perplexity from its verified bots list
  • Perplexity denies wrongdoing, calls the report a publicity stunt

Perplexity AI's crawlers are back in the spotlight, and not for the right reasons.

Cloudflare, a global internet infrastructure giant, has accused the AI search startup of stealthily crawling websites that explicitly blocked it. According to a recent Cloudflare report, Perplexity’s bots are allegedly bypassing site restrictions by masking their identity and rotating IP addresses to avoid detection.

This isn’t the first time Perplexity has been caught scraping content from the web without permission. Last year, the company faced criticism for ignoring robots.txt files and accessing paywalled content. CEO Aravind Srinivas then deflected blame onto third-party crawlers. But this time, Cloudflare is directly pointing fingers.

In its investigation, Cloudflare created test domains with crawling restrictions designed to block Perplexity’s bots. Initially, the AI bot identified itself correctly using names like “PerplexityBot” or “Perplexity-User.”

However, when blocked, it allegedly switched tactics, sending a different user agent that impersonated a Chrome browser on macOS. Because the request no longer named Perplexity, rules keyed to the declared bot names would not match it, and the bot could slip past defenses unnoticed.
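To make the mechanism concrete, here is a minimal Python sketch of why blocking by declared name stops working once the name changes. It uses the standard library's robots.txt parser. The robots.txt content, the example URL, and the Chrome-on-macOS string are illustrative assumptions; only the two bot names come from the report as described above. This is not Cloudflare's test setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a test site; only the two bot names
# are taken from the article, everything else is assumed.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative generic browser string (assumed, not from Cloudflare's report).
CHROME_ON_MACOS = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str = "https://example.test/article") -> bool:
    """Return True if the rules above permit this agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("PerplexityBot", "Perplexity-User", CHROME_ON_MACOS):
        print(f"{agent[:40]!r}: allowed={is_allowed(agent)}")
```

The declared names are refused, while the browser-style string falls through to the wildcard rule. Note that robots.txt is advisory: it only has an effect when the crawler checks it and honors the answer.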

The situation escalated when Cloudflare noticed that these bots were not only using rotating IPs but also switching autonomous system numbers (ASNs), the identifiers of the networks that traffic originates from. Changing ASNs defeats blocks that target a single IP range or network. Cloudflare reported this behavior across tens of thousands of domains and millions of daily requests.
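As a rough illustration of how rotation can be spotted, the Python sketch below groups requests by a stable client signal and flags any client seen from several IPs and several ASNs. The `Request` record, the `fingerprint` field, the thresholds, and the sample log are all hypothetical. Cloudflare's actual detection method is not described here, and this sketch does not claim to reproduce it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    fingerprint: str  # hypothetical stable client signal, e.g. a header hash
    ip: str
    asn: int


def find_rotating_clients(requests, min_ips=3, min_asns=2):
    """Return fingerprints seen from at least min_ips IPs and min_asns ASNs."""
    ips = defaultdict(set)
    asns = defaultdict(set)
    for req in requests:
        ips[req.fingerprint].add(req.ip)
        asns[req.fingerprint].add(req.asn)
    return sorted(
        fp for fp in ips
        if len(ips[fp]) >= min_ips and len(asns[fp]) >= min_asns
    )


# Made-up log using documentation IP ranges and private-use ASNs.
SAMPLE_LOG = [
    Request("fp-a", "192.0.2.10", 64512),
    Request("fp-a", "198.51.100.7", 64513),
    Request("fp-a", "203.0.113.5", 64514),
    Request("fp-b", "192.0.2.44", 64512),
    Request("fp-b", "192.0.2.44", 64512),
]

if __name__ == "__main__":
    print(find_rotating_clients(SAMPLE_LOG))
```

In this sample, only the client that appears from three addresses on three networks is flagged. The client that keeps one address is left alone, which is why a block keyed to a single IP or ASN misses the rotating one.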

Cloudflare responded by removing Perplexity from its verified bots list and rolling out new systems to help website owners detect and block similar unauthorized AI activity.

These actions mirror how other platforms are tightening their grip, especially as tools like ChatGPT and AI search models become more data-hungry.

Perplexity Pushes Back Against Bot Allegations

In response, Perplexity issued a public denial. Spokesperson Jesse Dwyer dismissed the Cloudflare report as a “publicity stunt,” claiming it was full of misunderstandings. The company emphasized that what Cloudflare observed was not AI bot activity but “user-driven agents” triggered by specific user queries.

According to Perplexity, the 20 to 25 million requests cited by Cloudflare were misattributed. The company claims most of the suspicious traffic came from BrowserBase, a separate cloud browser service for AI agents that it says it uses only occasionally.

Perplexity’s blog post also argued that Cloudflare confused human-directed browser interactions with bot-driven scraping, and that its systems do not automatically vacuum up web data in violation of content policies.

Despite the defense, Cloudflare has taken a hard stance. The company is pushing for clearer boundaries on what constitutes fair crawling behavior by AI tools. Its recent measures allow site owners to opt out of AI bot traffic and even demand compensation for access.

This debate is similar to broader issues emerging in the AI space. For example, Google’s ongoing AI development recently faced scrutiny after its AI bug hunter uncovered 20 open-source flaws, raising questions about transparency and ethical responsibility in AI systems.

A Growing Clash Between AI and Internet Gatekeepers

The clash between Perplexity AI and Cloudflare is just one example of a broader battle brewing across the internet.

AI startups rely heavily on access to web data to train their models and deliver real-time results. But as these tools grow more powerful and more commercial, website owners and infrastructure providers are pushing back.

Cloudflare’s new anti-scraping stance aligns with a shift happening across the industry. More companies are starting to see uncontrolled AI crawling as a threat, not just to revenue, but to the stability and ethics of the internet itself.

Even major players like OpenAI are under pressure. Despite boasting 180 million weekly ChatGPT users, OpenAI is now facing challenges over how its tools access and use third-party content. Similarly, Apple’s AI Answer Engine aims to generate responses based on multiple data sources—but without crossing ethical boundaries.

And with powerful models like Gemini 2.5 entering the arena, the hunger for clean, high-quality data is only growing. The risk? More stealth crawling tactics, more pushback from internet gatekeepers, and more legal and ethical battles over who controls web content in the AI age.

The Perplexity case may be a sign of what’s to come: AI startups will need to balance innovation with respect for digital boundaries, or risk losing access to the very web they depend on.

Aishwarya Patole
Aishwarya is an experienced AI and tech content specialist with 5+ years of experience in turning intricate tech concepts into engaging, relatable stories. With expertise in AI applications, blockchain, and SaaS, she creates data-driven articles, explainer pieces, and trend reports that drive impact.
