Robots.txt Validator
Validate robots.txt syntax and test URL paths against allow/disallow rules for any user-agent. Debug crawling issues instantly. Free, 100% in your browser.
Reference
What is robots.txt?
robots.txt is a plain text file placed at the root of a website (e.g., example.com/robots.txt) that tells web crawlers which pages or sections of the site they may crawl. It follows the Robots Exclusion Protocol, a convention standardized as RFC 9309 and supported by all major search engines, including Google, Bing, and Yahoo. Because robots.txt is advisory, crawlers can choose to ignore it; well-behaved bots, however, respect these rules.
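A minimal example of what such a file can look like (the paths and sitemap URL are illustrative, not prescriptive):

```
User-agent: *
Disallow: /admin/
Allow: /admin/help
Sitemap: https://example.com/sitemap.xml
```

Here every crawler is blocked from /admin/ except the more specific /admin/help path, and the sitemap location is advertised for discovery.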
Robots.txt directives
User-agent — specifies which crawler the rules apply to (* matches all crawlers).
Disallow — blocks access to a URL path or prefix.
Allow — explicitly permits access to a path (in Google's implementation, the longer, more specific rule wins over a conflicting Disallow).
Sitemap — points to the site's XML sitemap URL.
Crawl-delay — requests a delay between successive crawl requests (honored by some bots, such as Bing, but ignored by Google).
Wildcards — Google and Bing support * (any character sequence) and $ (end of URL) in path patterns for advanced matching.
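The Allow/Disallow resolution described above can be sketched in a few lines. This is a simplified model of Google-style longest-match semantics, not the tool's actual implementation; the function names are hypothetical. The longest matching pattern wins, and on a tie the Allow rule prevails:

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    # Translate a robots.txt path pattern into a regex:
    # '*' matches any run of characters, '$' anchors the end of the URL path.
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "$":
            parts.append("$")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """rules: ("allow" | "disallow", pattern) pairs for the matched user-agent.
    Longest matching pattern wins; on a tie, Allow wins (least restrictive)."""
    best_len = -1
    best_verdict = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern and pattern_to_regex(pattern).match(path):
            specificity = len(pattern)
            allowed = (kind == "allow")
            if specificity > best_len or (specificity == best_len and allowed):
                best_len, best_verdict = specificity, allowed
    return best_verdict

rules = [("disallow", "/admin/"), ("allow", "/admin/help")]
print(is_allowed("/admin/settings", rules))  # False: only Disallow matches
print(is_allowed("/admin/help", rules))      # True: longer Allow rule wins
print(is_allowed("/blog/post", rules))       # True: no rule matches
```

Note that real crawlers differ in the details (e.g., some older implementations use first-match rather than longest-match order), which is exactly why testing paths against a specific user-agent's rule group matters.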
Common use cases
SEO debugging — verify that important pages are not accidentally blocked from search engines.
Pre-launch checks — ensure robots.txt does not carry over "Disallow: /" from staging.
Crawler management — block specific bots from resource-heavy pages.
Privacy — prevent indexing of admin panels, user profiles, or internal tools.
Migration — validate rules after a site restructure or domain migration.
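The pre-launch check above is easy to automate. A minimal sketch (the function name is hypothetical) that scans a robots.txt body for a blanket "Disallow: /" rule, the classic staging leftover:

```python
def has_blanket_disallow(robots_txt: str) -> bool:
    """Return True if any 'Disallow: /' rule appears in the file --
    such a rule blocks the entire site for the matched user-agent."""
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip() == "/":
            return True
    return False

staging = "User-agent: *\nDisallow: /\n"
production = "User-agent: *\nDisallow: /admin/\n"
print(has_blanket_disallow(staging))     # True
print(has_blanket_disallow(production))  # False
```

A real check would also resolve which user-agent group the rule belongs to, since "Disallow: /" under a specific bot's group may be intentional.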
Privacy
All validation and testing runs 100% in your browser. No data is sent to any server.