
robots.txt for AI Agents: How to Control Which Bots Read Your Products

2025-01-15 · 7 min read
Research · Technical · AI Agents

The New robots.txt Landscape

The robots.txt file has been a cornerstone of web crawling etiquette since 1994. But in 2025, it has a new purpose: managing AI agent access. Major AI companies have introduced dedicated crawlers — GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, GoogleOther (Google AI), and CCBot (Common Crawl, used by many AI companies) — each with distinct behaviors and purposes.

For e-commerce merchants, this creates a strategic decision: which AI bots should you allow, which should you restrict, and how do you maximize shopping visibility while protecting sensitive content?

Known AI Bot User-Agents

Here are the AI crawlers you should know about:

GPTBot — OpenAI's training crawler. The data it gathers feeds the models behind ChatGPT's product recommendations. Blocking it means ChatGPT cannot recommend your products.

ChatGPT-User — OpenAI's real-time browsing bot, used when a ChatGPT user asks it to visit a URL. Distinct from GPTBot, which crawls for training.

ClaudeBot / Claude-Web — Anthropic's crawlers for Claude's knowledge and web access.

PerplexityBot — Perplexity's shopping and search crawler.

GoogleOther — Google's crawler for non-search purposes (including research and development), separate from Googlebot (search). Note that opting out of Gemini training is controlled by the separate Google-Extended token, not by GoogleOther.

CCBot — Common Crawl's bot, whose data is used by many AI companies for training.
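If you want to see which of these crawlers already visit your store, you can match server-log user-agent strings against the tokens above. A minimal sketch (the token-to-operator mapping and the sample UA string are illustrative, not official formats):

```python
from typing import Optional

# Known AI crawler tokens mapped to their operators (labels are ours).
AI_BOT_TOKENS = {
    "GPTBot": "OpenAI (training crawler)",
    "ChatGPT-User": "OpenAI (live browsing)",
    "ClaudeBot": "Anthropic",
    "Claude-Web": "Anthropic",
    "PerplexityBot": "Perplexity",
    "GoogleOther": "Google",
    "CCBot": "Common Crawl",
}

def identify_ai_bot(user_agent: str) -> Optional[str]:
    """Return the operator label if the UA string names a known AI crawler."""
    ua = user_agent.lower()
    for token, operator in AI_BOT_TOKENS.items():
        if token.lower() in ua:
            return operator
    return None

ua = "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)"
print(identify_ai_bot(ua))  # OpenAI (training crawler)
```

Substring matching is deliberately loose here: real crawler UA strings embed the token inside a longer Mozilla-style string, and anyone can spoof a UA, so treat this as a first-pass classifier, not verification.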

Recommended Configuration for E-Commerce

For most e-commerce stores, the optimal strategy is to allow AI shopping bots while restricting access to sensitive areas:

# AI Shopping Agents — ALLOW product pages
User-agent: GPTBot
Allow: /products/
Allow: /collections/
Disallow: /account/
Disallow: /checkout/
Disallow: /cart/

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /products/
Allow: /collections/
Disallow: /account/

User-agent: PerplexityBot
Allow: /

User-agent: GoogleOther
Allow: /

# Common Crawl (training corpora) — keep products in, blog out
User-agent: CCBot
Disallow: /blog/
Allow: /products/

This configuration maximizes product visibility while protecting checkout flows and account pages, and keeps blog content out of Common Crawl's training corpus. Note that robots.txt defaults to allow: any path not explicitly disallowed remains crawlable, so the Allow lines mainly document intent rather than change behavior.
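You can sanity-check a record like the GPTBot one above with Python's standard-library robots.txt parser. Parsers differ on rule precedence (RFC 9309 specifies longest-match, while `urllib.robotparser` applies rules in file order), so testing your exact file is worthwhile. The domain below is a placeholder:

```python
from urllib import robotparser

# The GPTBot record from the configuration above.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /products/
Allow: /collections/
Disallow: /account/
Disallow: /checkout/
Disallow: /cart/
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

base = "https://shop.example"  # placeholder domain
for path in ("/products/blue-mug", "/checkout/", "/blog/post"):
    print(path, "->", rp.can_fetch("GPTBot", base + path))
# /products/blue-mug -> True
# /checkout/ -> False
# /blog/post -> True
```

The `/blog/post` result illustrates the default-allow behavior: with no `Disallow: /` in the record, any unlisted path stays crawlable.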

How MerchantStamp Checks This

MerchantStamp's AI Readiness audit includes a "robots.txt AI-friendly" check that looks for explicit AI agent directives. Stores that mention GPTBot, ClaudeBot, or other AI user-agents in their robots.txt earn points for proactive AI visibility management. Run a free scan to see your current status.
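As a rough illustration (not MerchantStamp's actual implementation), a check like this can be sketched as a substring scan for known AI user-agents in the robots.txt body:

```python
# Known AI user-agents to look for (list is illustrative).
AI_AGENTS = ["GPTBot", "ChatGPT-User", "ClaudeBot", "Claude-Web",
             "PerplexityBot", "GoogleOther", "CCBot"]

def ai_agents_mentioned(robots_txt: str) -> list:
    """Return the known AI user-agents named anywhere in a robots.txt body."""
    body = robots_txt.lower()
    return [agent for agent in AI_AGENTS if agent.lower() in body]

sample = "User-agent: GPTBot\nAllow: /products/\n\nUser-agent: CCBot\nDisallow: /blog/\n"
print(ai_agents_mentioned(sample))  # ['GPTBot', 'CCBot']
```

A real audit would also parse the directives to confirm product paths are actually allowed, not just that the agents are named.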

Evaluate your AI readiness

See how well AI agents can read your product data.

Run free audit

Related articles

Complete Guide to JSON-LD Product Schema for E-Commerce

12 min read


Why AI Shopping Assistants Skip Your Products

11 min read


Product Feeds & Google Merchant Center: Complete Guide

13 min read
