Bulk up to 100 • Detect blocks + missing robots

Robots.txt Checker

Paste domains or URLs — we’ll fetch /robots.txt, follow redirects, and flag global blocks like Disallow: /.

Paste up to 100 domains/URLs (one per line)
Tip: A missing robots.txt is usually OK, but a global block (User-agent: * + Disallow: /) is not.

Results

Columns: Host, Robots.txt, HTTP, Hops, Time, Issues
“Global block” means crawlers are blocked from the whole site.

Quick interpretation

Robots.txt is a crawl-control file. A mistake in it can block crawlers from an entire site and cost the site its search visibility.

  • OK: robots.txt fetches and looks normal
  • Warning: missing robots.txt, unusual content type, no User-agent: * group, or a global block
  • Error: fetch errors, redirect loops, or HTTP 4xx/5xx
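
As a rough illustration, the three levels above could be assigned from a handful of per-host signals. This sketch is not the tool's actual logic: the `FetchResult` fields and the `classify` function are hypothetical names, and treating a 404 as "missing robots" (Warning) rather than as a generic 4xx (Error) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FetchResult:
    """Hypothetical per-host summary; field names are illustrative only."""
    status: int                       # final HTTP status, 0 if no response
    content_type: str = "text/plain"
    redirect_loop: bool = False
    has_wildcard_group: bool = True   # a "User-agent: *" group was found
    global_block: bool = False        # "Disallow: /" under "User-agent: *"


def classify(result: FetchResult) -> str:
    """Map a fetch summary to OK / Warning / Error."""
    if result.status == 0 or result.redirect_loop:
        return "Error"                # fetch error or redirect loop
    if result.status == 404:
        return "Warning"              # assumption: 404 counts as "missing robots"
    if result.status >= 400:
        return "Error"                # any other 4xx/5xx
    odd_type = not result.content_type.lower().startswith("text/plain")
    if odd_type or not result.has_wildcard_group or result.global_block:
        return "Warning"
    return "OK"
```

For example, `classify(FetchResult(status=200, global_block=True))` yields a Warning even though the fetch itself succeeded.
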
Crawl control

Robots.txt Checker: detect accidental site blocks

Robots.txt tells crawlers what they may crawl. A single wrong line can block an entire site. This tool fetches robots.txt in bulk, follows redirects, and flags risky patterns.
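
To make the bulk step concrete, here is a minimal sketch of turning pasted lines into robots.txt URLs. The function names, the choice to default bare domains to HTTPS, and the de-duplication are assumptions for illustration. Only the 100-entry limit comes from the page.

```python
from urllib.parse import urlsplit

MAX_ENTRIES = 100  # limit stated on the page


def robots_url(entry: str) -> str:
    """Turn a domain or full URL into that host's /robots.txt URL."""
    entry = entry.strip()
    if "://" not in entry:
        entry = "https://" + entry          # assumption: bare domains use HTTPS
    parts = urlsplit(entry)
    # Path, query, and fragment are dropped; robots.txt lives at the host root.
    return f"{parts.scheme}://{parts.netloc.lower()}/robots.txt"


def parse_input(text: str) -> list[str]:
    """Return up to MAX_ENTRIES unique robots.txt URLs, in input order."""
    urls: list[str] = []
    for line in text.splitlines():
        if not line.strip():
            continue                        # skip blank lines
        url = robots_url(line)
        if url not in urls:
            urls.append(url)
        if len(urls) == MAX_ENTRIES:
            break
    return urls
```

Because the path is discarded, `example.com` and `https://example.com/about` both resolve to the same robots.txt URL and are checked once.
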

Common problems

  • Disallow: / under User-agent: * blocks everything.
  • No sitemap isn’t fatal, but adding it can help discovery.
  • Redirect chains add latency and can fail for some bots.

FAQ

What does this tool check?

It fetches /robots.txt for each host, follows redirects (up to a safe limit), and reports the final URL, HTTP status, content type, file size, and basic crawl-control signals.
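
A fetch of this kind can be sketched with Python's standard library by switching off automatic redirects and following them by hand, which makes the hop count and the final URL easy to report. This is an illustrative sketch, not the tool's code: `fetch_once`, `fetch_robots`, the result keys, and the limit of 5 hops are all assumptions (the page only says "a safe limit").

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

MAX_HOPS = 5  # assumption: the page only says "a safe limit"
REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Make urllib surface 3xx responses instead of following them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_once(url, timeout=10):
    """One request, no redirects: returns (status, lowercased headers, body)."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            headers = {k.lower(): v for k, v in resp.headers.items()}
            return resp.status, headers, resp.read()
    except urllib.error.HTTPError as err:   # 3xx/4xx/5xx arrive here
        headers = {k.lower(): v for k, v in err.headers.items()}
        return err.code, headers, err.read()


def fetch_robots(start_url, fetch=fetch_once, max_hops=MAX_HOPS):
    """Follow redirects by hand and describe the final response."""
    url, seen = start_url, set()
    for hops in range(max_hops + 1):
        if url in seen:
            return {"final_url": url, "hops": hops, "issue": "redirect loop"}
        seen.add(url)
        status, headers, body = fetch(url)
        location = headers.get("location")
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)    # Location may be relative
            continue
        return {
            "final_url": url,
            "status": status,
            "hops": hops,
            "content_type": headers.get("content-type", ""),
            "size": len(body),
            "issue": None,
        }
    return {"final_url": url, "hops": max_hops, "issue": "too many redirects"}
```

The `fetch` parameter exists so the redirect logic can be exercised with canned responses. Network-level failures (DNS, TLS, timeouts) are not caught here and would propagate from `fetch_once` as `urllib.error.URLError`.
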

Is it bad if robots.txt is missing (404)?

Not necessarily. A missing robots.txt usually means “no special crawl restrictions”. The real risk is a robots.txt that blocks important sections by mistake.

What does “global block (Disallow: /)” mean?

It means robots.txt contains a User-agent: * group with the rule Disallow: /, which tells every crawler that honors the file not to crawl any part of the site. That’s a common cause of sudden indexing drops.
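
A simplified detector for this pattern might look like the sketch below. It groups rules under the user-agent lines that precede them and reports whether Disallow: / appears in a group that includes *. The function name `has_global_block` is hypothetical, and the check deliberately ignores Allow: overrides and wildcard path patterns, so it is a coarse flag rather than a full parser.

```python
def has_global_block(robots_txt: str) -> bool:
    """True if a group containing "User-agent: *" has the rule "Disallow: /"."""
    agents = []        # user-agents of the group currently being read
    in_rules = False   # becomes True once the current group has a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:                          # a rule was seen: new group starts
                agents, in_rules = [], False
            agents.append(value.lower())
        elif key in ("allow", "disallow"):
            in_rules = True
            if key == "disallow" and value == "/" and "*" in agents:
                return True
    return False
```

A file that blocks only a named bot (for example `User-agent: BadBot` followed by `Disallow: /`) is not flagged, because the rule does not sit in the * group.
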

Do you validate robots.txt syntax fully?

This checker is a fast audit: it fetches the file and highlights common risky patterns. It doesn’t attempt a full “Google-level” parser for every edge case.

Why do you flag “noindex” inside robots.txt?

noindex in robots.txt is non-standard and not reliably supported; Google stopped honoring it in September 2019. If you need deindexing, use a robots meta tag or an X-Robots-Tag response header.
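
Spotting such lines is straightforward. The sketch below returns the line numbers of any Noindex: directive in a robots.txt body; the function name `find_noindex_lines` is a hypothetical helper, not part of the tool.

```python
import re

# A directive line that starts with "noindex:" in any letter case.
NOINDEX_LINE = re.compile(r"^\s*noindex\s*:", re.IGNORECASE)


def find_noindex_lines(robots_txt: str) -> list[int]:
    """Return 1-based line numbers of Noindex: directives (comments are ignored)."""
    return [
        number
        for number, line in enumerate(robots_txt.splitlines(), start=1)
        if NOINDEX_LINE.match(line)
    ]
```
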

What about sitemap lines in robots.txt?

Sitemap: directives can help discovery. The tool counts how many sitemap entries it sees. Sitemaps can still exist even if the robots.txt has none.
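
Counting sitemap entries can be sketched as below. `sitemap_urls` is a hypothetical helper name; the count the tool reports would correspond to the length of the returned list.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the values of all Sitemap: lines, in file order."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()       # drop comments
        key, sep, value = line.partition(":")     # split at the first colon only
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls
```

Sitemap: lines are independent of user-agent groups, so the sketch scans the whole file rather than any one group.
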

Why can HTTP be 0?

HTTP 0 means no valid HTTP response was received (request failed). Typical causes: DNS errors, TLS/SSL handshake failures, firewall/CDN blocks, or timeouts.
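
One way to picture this: when the request raises an exception instead of returning a response, the checker records status 0 and a short reason. The sketch below classifies exceptions as Python's `urllib` would raise them. The function names and reason labels are hypothetical, and a firewall or CDN block usually shows up as a connection reset or a timeout rather than as its own error type.

```python
import socket
import ssl
import urllib.error


def failure_reason(exc: Exception) -> str:
    """Name the likely cause of a request that produced no HTTP response."""
    # urllib wraps low-level failures in URLError; look at the wrapped cause.
    cause = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(cause, socket.gaierror):
        return "dns"            # host name could not be resolved
    if isinstance(cause, ssl.SSLError):
        return "tls"            # handshake or certificate failure
    if isinstance(cause, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(cause, ConnectionError):
        return "connection"     # refused or reset, e.g. by a firewall
    return "unknown"


def failed_row(host: str, exc: Exception) -> dict:
    """Build a result row for a host whose request failed outright."""
    return {"host": host, "http": 0, "issue": failure_reason(exc)}
```
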

Does robots.txt block indexing?

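Robots.txt mainly controls crawling, not indexing. A blocked URL can still be indexed without its content (a URL-only listing), for example when other pages link to it, but rankings commonly drop when important pages can’t be crawled. For index control, use a robots meta tag or an X-Robots-Tag header, and leave the page crawlable so crawlers can actually see the directive.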
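
For comparison, index control lives in the page response itself: either a `<meta name="robots" content="noindex">` tag in the HTML or an `X-Robots-Tag: noindex` response header. The sketch below checks a header value for an indexing block. `header_blocks_indexing` is a hypothetical helper that handles only the common comma-separated form with an optional bot-name prefix, not every documented variant.

```python
def header_blocks_indexing(x_robots_tag: str) -> bool:
    """True if an X-Robots-Tag value contains "noindex" or "none"."""
    for part in x_robots_tag.split(","):
        # Drop an optional "botname:" prefix such as "googlebot: noindex".
        directive = part.rsplit(":", 1)[-1].strip().lower()
        if directive in ("noindex", "none"):   # "none" implies noindex, nofollow
            return True
    return False
```
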

Why do redirects matter for robots.txt?

Redirect chains add latency and can fail for some bots. Best practice is a clean, direct https://host/robots.txt response with minimal hops.