Disallow Directives in Robots.txt
The disallow directive (added within a website’s robots.txt file) is used to instruct search engines not to crawl a page on a site. This will normally also prevent a page from appearing within search results.
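For example, a minimal robots.txt with a single disallow rule might look like the following (the path shown is purely illustrative):

```
User-agent: *
Disallow: /private/
```

This tells all crawlers not to crawl any URL whose path begins with /private/.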
Within the SEO Office Hours recaps below, we share insights from Google Search Central on how Google handles disallow directives, along with SEO best practice advice and examples.
For more on disallow directives, check out our article, Noindex, Nofollow & Disallow.
Disallow Rule Must Start with a Slash
When specifying a path in the robots.txt file, the rule must start with a slash, not a * wildcard. This has always been the case, but it was only recently added to the documentation and to the Search Console robots.txt testing tool.
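As a sketch, the first rule below is valid because the path begins with a slash, while the commented-out second rule is not (the paths are illustrative):

```
User-agent: *
# Valid: the path starts with a slash
Disallow: /internal-search/

# Invalid: a rule should not start with a * wildcard
# Disallow: *internal-search/
```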
Disallowed URLs Can Be Indexed
Even if a URL is disallowed in robots.txt, it can still show up in the index.
Disallow Doesn't Prevent Indexing
A disallowed URL will still be indexed and shown in search results if Google has sufficient external signals, such as links from other pages.
Disallow Prevents PageRank from Being Passed
PageRank can be inherited by a disallowed URL, but because the page can't be crawled, it can't pass that PageRank on.
You Can Escape URLs in Robots.txt
In robots.txt you can escape characters in URLs if you want; escaped and unescaped versions are treated as equivalent.
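For instance, assuming a UTF-8 path (the path itself is illustrative), the escaped and unescaped rules below are treated as matching the same URLs:

```
User-agent: *
# Unescaped form
Disallow: /café/
# Percent-escaped form, treated as equivalent to the rule above
Disallow: /caf%C3%A9/
```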
Submit Updated Robots.txt via Search Console
If you submit your updated robots.txt file via the Search Console robots.txt testing tool, Google will recrawl it immediately instead of waiting for the normal daily check.
Noindex Pages Can’t Accumulate PageRank
Noindexed pages can't accumulate PageRank for the site, even though they can be crawled, so this isn't an advantage over disallowing them.
Use Disallow to Improve Crawling Efficiency
John generally recommends against blocking URLs with robots.txt, because it prevents Google from consolidating authority signals, but notes there are occasions when crawling efficiency is more important.
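As an example of prioritising crawling efficiency, a site with crawl budget problems might block internal search results and parameter-only URL variations; the paths and parameter name here are purely illustrative:

```
User-agent: *
# Internal search result pages add little value and waste crawl budget
Disallow: /search/
# Faceted URLs that only differ by a sort parameter
Disallow: /*?sort=
```

The trade-off John describes still applies: signals pointing at these blocked URLs can no longer be consolidated.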
Disallowed URLs Don’t Pass PageRank
If a URL is disallowed in robots.txt, it won't be crawled and therefore can't pass any PageRank.
Backlink Disavow Doesn’t Do Anything Until Google Updates Penguin Data
If you have a Penguin penalty for backlinks and you remove or disavow those backlinks, you won't see any effect until Google updates the algorithm again and refreshes the data. John says it's a long delay, and you should go as far as possible with the disavow, including domain-level disavow entries, to make sure you cover as much as you can.
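For reference, a disavow file (uploaded through Google's disavow links tool, which is separate from robots.txt) lists one URL or domain per line; the domains below are made up for illustration:

```
# Disavow a single linking page
http://spam.example.com/links.html
# Disavow every link from an entire domain
domain:spammy-directory.example
```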