Robots.txt Lint Validator

Lint robots.txt syntax, flag risky rules, and test important URLs before you ship crawler directives

Example Results

1 example

Catch a malformed Disallow rule before deploy

Validate a robots.txt draft, then test important URLs like /admin and /blog before publishing.

{
  "summary": {
    "errorCount": 1,
    "testedUrlCount": 3
  }
}
View input parameters
{
  "robotsText": "User-agent: *\nDisallow /admin\nAllow: /admin/help",
  "siteOrigin": "https://example.com",
  "testUrls": "/admin\n/admin/help\n/blog"
}

Maximum file size: 0MB. Supported formats: text/plain, application/octet-stream, .txt

Key Facts

Category: Security & Validation
Input Types: textarea, file, text
Output Type: json
Sample Coverage: 4
API Ready: Yes

Overview

Validate your robots.txt files for syntax errors, check for risky crawler directives, and test critical URLs against your rules before deploying them to production.

When to Use

  • Before deploying a new robots.txt file to production to prevent accidental search engine de-indexing.
  • When troubleshooting why search engine crawlers are not indexing specific pages or directories.
  • During website migrations or redesigns to verify that staging rules are correctly updated for the live site origin.

How It Works

  • Paste your robots.txt content directly into the text area or upload your robots.txt file.
  • Enter your site origin URL and list the specific paths or URLs you want to test.
  • Run the validator to parse the syntax, identify structural errors, and check which test URLs are allowed or disallowed.
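The parsing step above can be sketched in a few lines of Python. This is only an illustration of the kind of check a linter might run, not the tool's actual implementation: the function name `lint_robots`, the message wording, and the list of recognized directives are all assumptions made for this example.

```python
# Directives this sketch recognizes (an assumption; the tool's own list is not documented here).
KNOWN_DIRECTIVES = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def lint_robots(text):
    """Return (line_number, message) tuples for lines that look malformed."""
    issues = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue  # blank lines and comment-only lines are fine
        if ":" not in line:
            issues.append((number, "missing ':' after directive"))
            continue
        name = line.split(":", 1)[0].strip().lower()
        if name not in KNOWN_DIRECTIVES:
            issues.append((number, f"unknown directive '{name}'"))
    return issues


draft = "User-agent: *\nDisallow /admin\nAllow: /admin/help"
print(lint_robots(draft))  # [(2, "missing ':' after directive")]
```

Run against the draft from the example above, the sketch reports a single issue on line 2, the Disallow rule without a colon.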

Use Cases

  • Auditing staging robots.txt files to ensure sensitive admin paths are blocked before going live.
  • Testing if critical landing pages or blog posts are accidentally blocked by broad wildcard rules.
  • Verifying syntax compliance with major search engine crawler standards.

Examples

1. Preventing Admin Directory Indexing

SEO Specialist
Background
An SEO specialist is preparing to launch a new website and wants to ensure search engines do not crawl the admin panel while keeping the help section accessible.
Problem
A draft robots.txt has a syntax error where the colon is missing in the Disallow rule, which might cause crawlers to ignore the directive.
How to Use
Paste the draft robots.txt content, set the site origin to 'https://example.com', and add '/admin' and '/admin/help' to the test URLs list.
Example Config
robotsText: "User-agent: *\nDisallow /admin\nAllow: /admin/help"
siteOrigin: "https://example.com"
testUrls: "/admin\n/admin/help"
Outcome
The tool flags the syntax error on the Disallow rule, allowing the specialist to correct it to 'Disallow: /admin' before deployment.

2. Testing Wildcard Rules for Blog Paths

Web Developer
Background
A developer is updating the robots.txt file to block temporary preview URLs but wants to ensure live blog posts remain crawlable.
Problem
Complex wildcard rules might accidentally match and block valid blog URLs like '/blog/post-1'.
How to Use
Upload the robots.txt file, input the production site origin, and list several blog URLs in the test URLs field.
Example Config
robotsText: "User-agent: *\nDisallow: /blog/*-preview\nAllow: /blog/"
siteOrigin: "https://example.com"
testUrls: "/blog/my-first-post\n/blog/draft-preview"
Outcome
The validator confirms that '/blog/my-first-post' is allowed, while '/blog/draft-preview' is correctly blocked.
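To see why these two URLs come out differently, here is a minimal sketch of wildcard matching with longest-match precedence, the rule described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie). Whether the validator resolves conflicts exactly this way is an assumption; the helper names `pattern_to_regex` and `is_allowed` are made up for this example.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' matches any run of characters,
    and a trailing '$' anchors the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """rules is a list of (directive, pattern) pairs for one user-agent group.
    The longest matching pattern wins; on a tie, Allow wins."""
    best = None  # (pattern_length, is_allow)
    for directive, pattern in rules:
        if pattern and pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive.lower() == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]  # no matching rule means allowed


rules = [("Disallow", "/blog/*-preview"), ("Allow", "/blog/")]
for path in ("/blog/my-first-post", "/blog/draft-preview"):
    print(path, "allowed" if is_allowed(rules, path) else "blocked")
```

Both rules match '/blog/draft-preview', but the Disallow pattern is longer than '/blog/', so it takes precedence. Only the Allow rule matches '/blog/my-first-post'.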

FAQ

What does this tool validate in a robots.txt file?

It checks for syntax errors such as missing colons and invalid directives, and it tests specific URLs against your rules.

Can I test URLs with different origins?

Set the Site Origin field to the origin you want to test so that relative paths are resolved and tested against it accurately.
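As an illustration of what resolving relative paths against an origin means, the following sketch uses Python's standard `urllib.parse`. The function name `resolve_test_urls` is hypothetical, and the tool's own resolution logic is not documented here, so treat this as one plausible approach.

```python
from urllib.parse import urljoin, urlsplit


def resolve_test_urls(site_origin, test_urls):
    """Resolve newline-separated test URLs against an origin.

    Returns (absolute_url, path) pairs; the path is what robots.txt rules match on.
    """
    resolved = []
    base = site_origin.rstrip("/") + "/"
    for entry in test_urls.splitlines():
        entry = entry.strip()
        if not entry:
            continue  # skip blank lines in the textarea input
        absolute = urljoin(base, entry)
        resolved.append((absolute, urlsplit(absolute).path or "/"))
    return resolved


for url, path in resolve_test_urls("https://example.com", "/admin\n/admin/help\n/blog"):
    print(url, path)
```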

Why is my Disallow rule flagged as an error?

Common issues include missing colons after directives, incorrect wildcards, or invalid user-agent declarations.

Does this tool support sitemap declarations?

Yes, it parses and validates the format of Sitemap directives within your robots.txt file.
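A format check for Sitemap lines could look like the sketch below. It assumes that a valid value is an absolute http or https URL, which is what the sitemaps protocol expects; the exact checks the tool performs are not documented here, and `check_sitemap_lines` is a name invented for this example.

```python
from urllib.parse import urlsplit


def check_sitemap_lines(text):
    """Return (line_number, message) tuples for Sitemap lines without an absolute URL."""
    issues = []
    for number, raw in enumerate(text.splitlines(), start=1):
        name, separator, value = raw.partition(":")
        if not separator or name.strip().lower() != "sitemap":
            continue  # not a Sitemap directive
        parts = urlsplit(value.strip())
        if parts.scheme not in ("http", "https") or not parts.netloc:
            issues.append((number, "Sitemap value should be an absolute http(s) URL"))
    return issues


sample = "User-agent: *\nSitemap: https://example.com/sitemap.xml\nSitemap: /sitemap.xml"
print(check_sitemap_lines(sample))  # [(3, 'Sitemap value should be an absolute http(s) URL')]
```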

Can I upload a robots.txt file directly?

Yes, you can upload your robots.txt file using the file input option instead of copying and pasting.

API Documentation

Request Endpoint

POST /en/api/tools/robots-txt-lint-validator

Request Parameters

Parameter Name   Type                     Required   Description
robotsText       textarea                 No         -
robotsFile       file (Upload required)   No         -
siteOrigin       text                     Yes        -
testUrls         textarea                 No         -

Parameters of type file must be uploaded first via POST /upload/robots-txt-lint-validator, which returns a filePath. Pass that filePath as the value of the corresponding file field.
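A request to the endpoint above might be built as in the following sketch. Two things here are assumptions rather than documented facts: the host `https://elysiatools.com` (taken from the MCP configuration on this page) and the use of a JSON request body. The helper name `build_request` is also made up. The sketch builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://elysiatools.com"  # assumption: host borrowed from the MCP config
ENDPOINT = "/en/api/tools/robots-txt-lint-validator"


def build_request(robots_text, site_origin, test_urls):
    """Build (but do not send) a POST request using the text-based parameters."""
    payload = {
        "robotsText": robots_text,
        "siteOrigin": site_origin,          # the only required parameter
        "testUrls": "\n".join(test_urls),   # newline-separated, as in the examples
    }
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption: JSON body
        method="POST",
    )


request = build_request(
    "User-agent: *\nDisallow: /admin\nAllow: /admin/help",
    "https://example.com",
    ["/admin", "/admin/help", "/blog"],
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to `urllib.request.urlopen` and decode the body with `json.load`. If the API expects form-encoded data instead, only the `data` and `headers` arguments need to change.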

Response Format

{
  "key": {...},
  "metadata": {
    "key": "value"
  },
  "error": "Error message (optional)",
  "message": "Notification message (optional)"
}

AI MCP Documentation

Add this tool to your MCP server configuration:

{
  "mcpServers": {
    "elysiatools-robots-txt-lint-validator": {
      "name": "robots-txt-lint-validator",
      "description": "Lint robots.txt syntax, flag risky rules, and test important URLs before you ship crawler directives",
      "baseUrl": "https://elysiatools.com/mcp/sse?toolId=robots-txt-lint-validator",
      "command": "",
      "args": [],
      "env": {},
      "isActive": true,
      "type": "sse"
    }
  }
}

You can chain multiple tools in one URL by listing their tool IDs separated by commas, e.g. `https://elysiatools.com/mcp/sse?toolId=png-to-webp,jpg-to-webp,gif-to-webp` (maximum 20 tools).
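If you generate the chained URL programmatically, a small helper can enforce the 20-tool limit before the URL reaches your configuration. This is a sketch; `chained_mcp_url` is a hypothetical helper, not part of any published client.

```python
MCP_SSE_URL = "https://elysiatools.com/mcp/sse"
MAX_TOOLS = 20  # limit stated in the documentation above


def chained_mcp_url(tool_ids):
    """Join tool IDs into a single toolId query value, enforcing the tool limit."""
    if not tool_ids:
        raise ValueError("at least one tool ID is required")
    if len(tool_ids) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} tools can be chained")
    return f"{MCP_SSE_URL}?toolId={','.join(tool_ids)}"


print(chained_mcp_url(["robots-txt-lint-validator", "png-to-webp"]))
```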

Supports URL file links or Base64 encoding for file parameters.
