Robots.txt Files Need to be Smaller than 500 KB, says Google
According to John Mueller from Google, the search engine can only process the first 500 KB of a robots.txt file. Anything beyond that limit may be ignored, so rules placed late in an oversized file might never reach Googlebot. This can cause serious problems for your website’s search engine optimisation (SEO) performance in the Google search results.
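A quick way to see where a file stands against this limit is to measure its size in bytes. The sketch below is only an illustration: the function name `robots_size_report` is hypothetical, and it assumes "500 KB" means 500 × 1024 bytes.

```python
# Assumption: "500 KB" is read as 500 * 1024 bytes; use 500_000 if you prefer
# the decimal interpretation.
MAX_ROBOTS_BYTES = 500 * 1024


def robots_size_report(content: bytes, limit: int = MAX_ROBOTS_BYTES) -> dict:
    """Compare the size of a robots.txt payload with the size limit."""
    size = len(content)
    return {
        "size_bytes": size,
        "within_limit": size <= limit,
        # Bytes past the limit are the part a crawler might not process.
        "excess_bytes": max(0, size - limit),
    }
```

You would pass in the raw bytes of your own robots.txt file, for example after reading it from disk in binary mode.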
The robots.txt specifications explain how Google handles the robots.txt file, which matters for your SEO strategy. All automated crawlers at Google follow these guidelines.
Given below is a list of valid robots.txt URLs:
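As a rough illustration of what a valid location looks like: a robots.txt file sits at the top level of a host and applies to that protocol, host and port. The sketch below derives that location from any page URL. The helper name `robots_url_for` and the example.com addresses are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL that would govern the given page URL."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url_for("http://example.com/folder/page.html?x=1"))
print(robots_url_for("https://www.example.com:8181/a/b"))
```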
Fetching a robots.txt file generally leads to one of the following outcomes:
- Full allow, where all content may be crawled.
- Full disallow, where no content may be crawled.
- Conditional allow, where the directives in the robots.txt file determine whether certain content may be crawled.
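The three outcomes can be modelled as a small enumeration. The mapping from HTTP status codes in the sketch below is an assumption that this article does not state: it follows the commonly described behaviour where a missing file means no restrictions and a server error is treated as a temporary block. The names `CrawlOutcome` and `outcome_for_status` are hypothetical.

```python
from enum import Enum


class CrawlOutcome(Enum):
    FULL_ALLOW = "full allow"
    FULL_DISALLOW = "full disallow"
    CONDITIONAL_ALLOW = "conditional allow"


def outcome_for_status(status: int) -> CrawlOutcome:
    """Map the HTTP status of a robots.txt fetch to an outcome (assumed mapping)."""
    if 200 <= status < 300:
        # File was served: its directives decide what may be crawled.
        return CrawlOutcome.CONDITIONAL_ALLOW
    if 400 <= status < 500:
        # Assumption: no usable robots.txt, so nothing is restricted.
        return CrawlOutcome.FULL_ALLOW
    if 500 <= status < 600:
        # Assumption: server trouble is treated as "do not crawl for now".
        return CrawlOutcome.FULL_DISALLOW
    # Redirects and other codes are left out of this sketch.
    raise ValueError(f"status {status} is not handled in this sketch")
```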
The robots.txt file must be plain text encoded in UTF-8. It consists of records (lines) separated by CR, CR/LF or LF.
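A minimal sketch of reading such a file: decode the bytes as UTF-8, then split on any of the three line endings. The function name `split_records` is hypothetical, and skipping blank lines is a convenience of this example rather than part of the specification.

```python
import re
from typing import List


def split_records(raw: bytes) -> List[str]:
    """Decode robots.txt bytes as UTF-8 and split them into non-empty records."""
    # Undecodable bytes are replaced rather than raising an error.
    text = raw.decode("utf-8", errors="replace")
    # CR/LF is listed first so it is not counted as two separate line breaks.
    lines = re.split(r"\r\n|\r|\n", text)
    return [line for line in lines if line.strip()]
```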
The robots.txt specifications also describe how records are grouped, which is relevant from an SEO point of view. Records are categorised as start-of-group, group-member and non-group.
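The three categories can be sketched as a small classifier. The article names only the categories, so the field names used here (`user-agent`, `allow`, `disallow`, `sitemap`) are an assumption based on the fields commonly used in robots.txt files, and `classify_record` is a hypothetical helper.

```python
# Assumed field names for each category; the article itself does not list them.
START_OF_GROUP = {"user-agent"}
GROUP_MEMBER = {"allow", "disallow"}
NON_GROUP = {"sitemap"}


def classify_record(line: str) -> str:
    """Return the category of a single robots.txt record."""
    # Strip a trailing comment, then surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    if ":" not in line:
        return "unknown"
    field = line.split(":", 1)[0].strip().lower()
    if field in START_OF_GROUP:
        return "start-of-group"
    if field in GROUP_MEMBER:
        return "group-member"
    if field in NON_GROUP:
        return "non-group"
    return "unknown"
```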
This is how each crawler chooses the relevant group:
| Name of crawler | Record group followed | Comments |
|---|---|---|
| Googlebot News | (group 1) | Only the most specific group is followed; all others are ignored. |
| Googlebot (web) | (group 3) | |
| Googlebot Images | (group 3) | There is no specific googlebot-images group, so the more generic group is followed. |
| Googlebot News (when crawling images) | (group 1) | These images are crawled for and by Googlebot News, so only the Googlebot News group is followed. |
| Otherbot (web) | (group 2) | |
| Otherbot (News) | (group 2) | Even if there is an entry for a related crawler, it applies only if it matches specifically. |
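The selection logic in the table can be sketched in a few lines. This example assumes that group 1 targets `googlebot-news`, group 2 targets every crawler (`*`) and group 3 targets `googlebot`, which is consistent with the table but not spelled out in it. Treating "most specific" as "longest matching prefix" is a simplification, and `select_group` and the crawler tokens are hypothetical.

```python
from typing import Dict, Optional

# Assumed groups: user-agent token -> group label.
GROUPS: Dict[str, str] = {
    "googlebot-news": "group 1",
    "*": "group 2",
    "googlebot": "group 3",
}


def select_group(crawler_token: str, groups: Dict[str, str] = GROUPS) -> Optional[str]:
    """Pick the single most specific group for a crawler, or the '*' fallback."""
    token = crawler_token.lower()
    # Simplification: a group matches when its name is a prefix of the token.
    matches = [name for name in groups
               if name != "*" and token.startswith(name.lower())]
    if matches:
        # The longest matching name is treated as the most specific one.
        return groups[max(matches, key=len)]
    return groups.get("*")


for bot in ["googlebot-news", "googlebot", "googlebot-images", "otherbot", "otherbot-news"]:
    print(bot, "->", select_group(bot))
```

Only one group is ever returned, which mirrors the table: a crawler follows its most specific group and ignores the rest.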
To keep robots.txt working as part of your website’s search engine optimisation strategy, keep the file size under 500 KB.