Bulk up to 100 • Detect blocks + missing robots

Robots.txt Checker

Paste domains or URLs — we’ll fetch /robots.txt, follow redirects, and flag global blocks like Disallow: /.
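As a rough illustration of the input step, the sketch below turns pasted lines into robots.txt URLs. The function names `robots_url` and `parse_input` are hypothetical, and defaulting bare domains to https and capping the batch at 100 entries are assumptions, not a description of how the tool itself is built.

```python
from urllib.parse import urlsplit


def robots_url(entry: str) -> str:
    """Turn a pasted domain or URL into the robots.txt URL for its host."""
    entry = entry.strip()
    # Assumption: bare domains have no scheme, so default to https.
    if "://" not in entry:
        entry = "https://" + entry
    parts = urlsplit(entry)
    if not parts.netloc:
        raise ValueError(f"no host found in {entry!r}")
    # Path, query, and fragment are dropped: robots.txt lives at the host root.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def parse_input(text: str, limit: int = 100) -> list[str]:
    """One entry per line; blank lines skipped, duplicates removed, capped at `limit`."""
    seen: dict[str, None] = {}  # dict keeps insertion order
    for line in text.splitlines():
        if line.strip():
            seen.setdefault(robots_url(line), None)
    return list(seen)[:limit]
```

Pasting `example.com` and `https://example.com/page` on separate lines would yield a single entry, because both resolve to the same robots.txt URL.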

Results

Host	Robots.txt	HTTP	Hops	Time	Issues

“Global block” means crawlers are blocked from the whole site.

Quick interpretation

Robots.txt is a crawl-control file: one mistake can block crawlers from the whole site and push pages out of search results.

OK: robots.txt was fetched and looks normal
Warning: missing robots.txt, unexpected content type, no User-agent: * group, or a global block
Error: fetch errors, redirect loops, or HTTP 4xx/5xx responses
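The three levels above could be derived from a check result roughly as follows. This is only a sketch: the `Check` fields and the `severity` function are hypothetical names, and treating 404 as "missing robots" (Warning) while other 4xx/5xx codes count as errors is an assumption about how the two rules combine.

```python
from dataclasses import dataclass


@dataclass
class Check:
    status: int                      # final HTTP status, 0 if no response arrived
    content_type: str = "text/plain"
    redirect_loop: bool = False
    has_wildcard_group: bool = True  # a "User-agent: *" group exists
    global_block: bool = False       # "Disallow: /" applies to "*"


def severity(c: Check) -> str:
    # Errors: nothing usable came back.
    if c.status == 0 or c.redirect_loop:
        return "Error"
    # Assumption: 404 means "missing robots" (Warning); other 4xx/5xx are errors.
    if c.status == 404:
        return "Warning"
    if c.status >= 400:
        return "Error"
    # Warnings: the file was fetched but something looks risky.
    if not c.content_type.lower().startswith("text/plain"):
        return "Warning"
    if not c.has_wildcard_group or c.global_block:
        return "Warning"
    return "OK"
```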

FAQ

What does this tool check?

It fetches /robots.txt for each host, follows redirects (up to a safe limit), and reports the final URL, HTTP status, content type, file size, and basic crawl-control signals.
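The fetch-and-follow step can be sketched like this. The names `fetch_robots`, `FetchReport`, and `Transport` are hypothetical, the limit of 5 hops is an assumed value for the "safe limit", and the HTTP call is passed in as a function (assumed to return lowercase header names and not follow redirects itself) so the redirect logic can be read and tried without a network.

```python
from dataclasses import dataclass
from typing import Callable, Mapping
from urllib.parse import urljoin

# A transport takes a URL and returns (status, headers, body).
# Assumption: it does not follow redirects and uses lowercase header names.
Transport = Callable[[str], tuple[int, Mapping[str, str], bytes]]


@dataclass
class FetchReport:
    final_url: str
    status: int
    hops: int
    content_type: str
    size: int
    loop: bool = False


def fetch_robots(url: str, send: Transport, max_hops: int = 5) -> FetchReport:
    seen = {url}
    hops = 0
    while True:
        status, headers, body = send(url)
        location = headers.get("location")
        is_redirect = status in (301, 302, 303, 307, 308) and location
        # Stop on a final response, or when the hop limit is reached.
        if not is_redirect or hops >= max_hops:
            content_type = headers.get("content-type", "")
            return FetchReport(url, status, hops, content_type, len(body))
        target = urljoin(url, location)  # resolves relative Location values
        if target in seen:
            return FetchReport(target, status, hops, "", 0, loop=True)
        seen.add(target)
        url, hops = target, hops + 1
```

In practice `send` would wrap a real HTTP client; in a quick experiment it can be a dictionary lookup over canned responses.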

Is it bad if robots.txt is missing (404)?

Not necessarily. A missing robots.txt usually means “no special crawl restrictions”. The real risk is a robots.txt that blocks important sections by mistake.

What does “global block (Disallow: /)” mean?

It means robots.txt contains a User-agent: * group with Disallow: /, which tells every compliant crawler not to crawl any part of the site. That’s a common cause of sudden indexing drops.
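A simplified way to spot this pattern is sketched below. The functions `wildcard_rules` and `has_global_block` are hypothetical, and the parsing is deliberately minimal: it groups rules under consecutive User-agent lines and ignores Allow overrides and path wildcards, so it is an approximation rather than a full robots.txt parser.

```python
def wildcard_rules(robots_txt: str) -> list[tuple[str, str]]:
    """Return (directive, path) pairs from groups that apply to User-agent: *."""
    rules: list[tuple[str, str]] = []
    agents: list[str] = []   # agents named by the group currently being read
    reading_agents = False   # True while consecutive User-agent lines continue
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if not reading_agents:
                agents = []  # a new group starts here
            agents.append(value.lower())
            reading_agents = True
        elif key in ("allow", "disallow"):
            reading_agents = False
            if "*" in agents:
                rules.append((key, value))
    return rules


def has_global_block(robots_txt: str) -> bool:
    # Simplification: any "Disallow: /" for "*" counts, even if Allow rules exist.
    return ("disallow", "/") in wildcard_rules(robots_txt)
```

A file that blocks only a named bot, such as `User-agent: BadBot` followed by `Disallow: /`, is not flagged, because the rule does not belong to the `*` group.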

Do you validate robots.txt syntax fully?

This checker is a fast audit: it fetches the file and highlights common risky patterns. It doesn’t attempt a full “Google-level” parser for every edge case.

Why do you flag “noindex” inside robots.txt?

noindex in robots.txt is non-standard and not reliably supported. If you need deindexing, use meta robots or X-Robots-Tag headers.
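Flagging such lines needs only a line scan. The sketch below uses a hypothetical `noindex_lines` helper and assumes the directive appears at the start of a line in the form `Noindex: <path>`; paths that merely contain the word "noindex" are not reported.

```python
import re

# Matches a "noindex" directive at the start of a line, e.g. "Noindex: /tmp/".
NOINDEX_LINE = re.compile(r"^\s*noindex\s*:", re.IGNORECASE)


def noindex_lines(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line number, text) for each noindex directive found."""
    hits = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0]  # ignore comments
        if NOINDEX_LINE.match(line):
            hits.append((number, line.strip()))
    return hits
```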

What about sitemap lines in robots.txt?

Sitemap: directives can help discovery. The tool counts how many sitemap entries it sees. Sitemaps can still exist even if the robots.txt has none.
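Counting these entries could look like the following. `sitemap_urls` is a hypothetical name, and the sketch assumes one `Sitemap:` directive per line with a case-insensitive key; it does not fetch or validate the sitemap itself.

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the values of all Sitemap: lines, in file order."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        # Split on the first colon only, so "https://..." stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls
```

The count reported per host is then simply `len(sitemap_urls(text))`.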

Why can HTTP be 0?

HTTP 0 means no valid HTTP response was received (request failed). Typical causes: DNS errors, TLS/SSL handshake failures, firewall/CDN blocks, or timeouts.
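One way a client might arrive at status 0 is sketched here with Python's standard library. The helper names `failure_reason` and `http_status`, the reason labels, and the 10-second timeout are assumptions for illustration; the `opener` parameter exists only so the error handling can be exercised without a network.

```python
import socket
import ssl
import urllib.error
import urllib.request


def failure_reason(exc: BaseException) -> str:
    # urllib wraps low-level errors in URLError; unwrap to the real cause.
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason
    if isinstance(exc, socket.gaierror):
        return "dns"
    if isinstance(exc, ssl.SSLError):
        return "tls"
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "timeout"
    if isinstance(exc, ConnectionError):
        return "connection"  # refused or reset, e.g. by a firewall
    return "other"


def http_status(url: str, timeout: float = 10.0, opener=urllib.request.urlopen):
    """Return (status, reason); status is 0 when no HTTP response arrived."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status, None
    except urllib.error.HTTPError as err:
        return err.code, None  # a real HTTP response, e.g. 404
    except (OSError, ValueError) as err:
        return 0, failure_reason(err)
```

The key distinction is that an `HTTPError` still carries a real status code, while every other failure here never produced a response at all.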

Does robots.txt block indexing?

Robots.txt mainly controls crawling, not indexing. A blocked page can still be indexed without its content (for example, as a URL-only result), and rankings often drop when important pages can’t be crawled. For index control, use meta robots or X-Robots-Tag.
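For illustration, here is a minimal WSGI application that sends both index-control signals. The app and its page content are hypothetical; the only parts taken from real conventions are the `X-Robots-Tag: noindex` response header and the `<meta name="robots" content="noindex">` tag. Note that a crawler has to be allowed to fetch the page in order to see either signal.

```python
def app(environ, start_response):
    """Hypothetical page that asks search engines not to index it."""
    body = (
        b"<!doctype html><html><head>"
        b'<meta name="robots" content="noindex">'  # HTML-level signal
        b"<title>Internal report</title></head>"
        b"<body>Not for search results.</body></html>"
    )
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("X-Robots-Tag", "noindex"),  # header-level signal, also works for non-HTML files
    ]
    start_response("200 OK", headers)
    return [body]
```

Either signal alone is normally enough; the header form is the usual choice for PDFs and other files that cannot carry a meta tag.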

Why do redirects matter for robots.txt?

Redirect chains add latency and can fail for some bots. Best practice is a clean, direct https://host/robots.txt response with minimal hops.
