Robots.txt Validator

Validate robots.txt syntax and test URL paths against allow/disallow rules for any user-agent. Debug crawling issues instantly. Free, 100% in your browser.


What is robots.txt?

robots.txt is a plain text file placed at the root of a website (e.g., example.com/robots.txt) that tells web crawlers which pages or sections they may access and which they should avoid. It follows the Robots Exclusion Protocol (standardized as RFC 9309), which is supported by all major search engines including Google, Bing, and Yahoo. robots.txt is advisory rather than enforceable: crawlers can choose to ignore it, but well-behaved bots respect its rules.

Robots.txt directives

User-agent — specifies which crawler the rules apply to (* means all crawlers).
Disallow — blocks access to a URL path or prefix.
Allow — explicitly allows access to a path (overrides Disallow in some implementations).
Sitemap — points to the XML sitemap URL for the site.
Crawl-delay — requests a delay between successive crawl requests (supported by some bots, ignored by Google).
Wildcards — Google and Bing support * and $ in path patterns for advanced matching.
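The directives above combine into per-agent rule groups. A hypothetical robots.txt using placeholder paths and a placeholder sitemap URL might look like this:

```
User-agent: *
Disallow: /admin/
Allow: /admin/public/
Crawl-delay: 10

User-agent: BadBot
Disallow: /

Sitemap: https://example.com/sitemap.xml
```

Blank lines separate groups: the first group applies to all crawlers, while the second blocks a specific bot from the entire site. Sitemap lines stand alone and apply regardless of user-agent.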

Common use cases

SEO debugging — verify that important pages are not accidentally blocked from search engines.
Pre-launch checks — ensure robots.txt does not carry over "Disallow: /" from staging.
Crawler management — block specific bots from resource-heavy pages.
Privacy — prevent indexing of admin panels, user profiles, or internal tools.
Migration — validate rules after a site restructure or domain migration.
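Checks like these can also be scripted. As a sketch, Python's standard-library urllib.robotparser performs the same allow/disallow matching against a rule set (note that it does plain prefix matching and does not support Google-style * and $ wildcards); the rules and URLs below are illustrative placeholders:

```python
from urllib import robotparser

# Parse a robots.txt supplied as a list of lines (placeholder rules,
# not a real site's file).
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /admin/",
    "Disallow: /staging/",
])

# A path under a disallowed prefix is blocked for all agents.
print(rp.can_fetch("*", "https://example.com/admin/settings"))  # False

# A path matching no Disallow rule is allowed by default.
print(rp.can_fetch("*", "https://example.com/blog/post"))       # True
```

Running a script like this against a list of critical URLs before launch catches an accidental "Disallow: /" before search engines ever see it.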

Privacy

All validation and testing runs 100% in your browser. No data is sent to any server.