Noindex
A noindex directive, applied via a robots meta tag or an X-Robots-Tag HTTP header, instructs search engines not to include a page in their index, preventing it from appearing in search results. Our SEO Office Hours Notes below explain the use of this directive, along with further advice compiled from Google's Office Hours sessions and real-world examples.
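For reference, the sketch below shows the two common ways a noindex directive is applied: a robots meta tag in the page's HTML and an X-Robots-Tag HTTP header. It is a minimal illustration using a hypothetical Flask route, not an example from the Office Hours notes.

```python
# Minimal sketch (assumption: a Flask app) of the two common ways to apply noindex.
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/private-page")
def private_page():
    html = (
        "<html><head>"
        '<meta name="robots" content="noindex">'  # meta tag form of noindex
        "</head><body>Not for search results</body></html>"
    )
    response = make_response(html)
    # Header form of noindex; also usable for non-HTML files such as PDFs.
    response.headers["X-Robots-Tag"] = "noindex"
    return response
```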
For more on noindex directives, check out our article: Noindex, Nofollow & Disallow.
Google Will Ignore Links on Noindexed Pages Over Time
If pages are noindexed, Google will ignore the links on them over time. If you have pages that are only linked to from noindexed pages, Google may not see those linked pages as important.
Google May Treat Noindex Pages as Soft 404
Google may treat a noindexed page as a soft 404, as the two are handled equivalently in search results. If you want such pages to be re-indexed, you need to let Google know they have changed, for example by submitting a sitemap with an updated last modified date.
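As a rough illustration of that sitemap approach, the snippet below builds a single sitemap entry with a lastmod date using Python's standard library; the URL is a placeholder.

```python
# Sketch: build a sitemap entry with an updated <lastmod> date so Google
# re-crawls a page that previously carried a noindex / soft-404 signal.
# The URL is a placeholder, not taken from the article.
import xml.etree.ElementTree as ET
from datetime import date

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)

url = ET.SubElement(urlset, "url")
ET.SubElement(url, "loc").text = "https://www.example.com/updated-page"
ET.SubElement(url, "lastmod").text = date.today().isoformat()  # signal the change

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```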
Options For Out of Stock Items Include Noindexing, Returning a 404, Adding Schema or Redirecting to a Replacement
Out of stock items can be handled by marking the product as unavailable in the page's HTML and schema markup. Alternatively, the page can be noindexed, return a 404, or be redirected to a replacement product.
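The sketch below puts those options side by side in one hypothetical Flask handler: a 404 for a product that no longer exists, a 301 redirect to a replacement, schema.org availability markup, and a noindex header while the item is unavailable. The routes and product data are illustrative assumptions, not from the article.

```python
# Sketch of the out-of-stock options described above (hypothetical routes and data).
import json
from flask import Flask, abort, redirect, make_response

app = Flask(__name__)

# Hypothetical catalogue used purely for illustration.
PRODUCTS = {
    "blue-widget": {"name": "Blue Widget", "in_stock": False, "replaced_by": "green-widget"},
    "green-widget": {"name": "Green Widget", "in_stock": True, "replaced_by": None},
}

@app.route("/products/<slug>")
def product(slug):
    item = PRODUCTS.get(slug)
    if item is None:
        abort(404)  # option: item permanently gone, return a 404
    if not item["in_stock"] and item["replaced_by"]:
        # option: redirect to the replacement product
        return redirect(f"/products/{item['replaced_by']}", code=301)

    # option: keep the page live and flag availability in schema.org markup
    schema = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": item["name"],
        "offers": {
            "@type": "Offer",
            "availability": "https://schema.org/InStock"
            if item["in_stock"]
            else "https://schema.org/OutOfStock",
        },
    }
    html = (
        "<html><head>"
        f'<script type="application/ld+json">{json.dumps(schema)}</script>'
        f"</head><body>{item['name']}</body></html>"
    )
    response = make_response(html)
    if not item["in_stock"]:
        # option: noindex the page while the item is unavailable
        response.headers["X-Robots-Tag"] = "noindex"
    return response
```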
Google Would View a Page Canonicalized to a Noindex URL as a Noindexed Page
If a page has a canonical link pointing to a noindexed page, that canonicalized page would also be treated as noindexed. This is because Google would view the canonical as a redirect to a noindex page and therefore drop it.
There is No Risk of a Noindex Signal Being Transferred to the Target Canonical Page
If a page is marked as noindex and also has a canonical link to an indexable page, there is no risk of the noindex signal being transferred to the target canonical page.
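As a concrete illustration of that combination, the small sketch below renders the head of a noindexed page that canonicalizes to an indexable URL; the URLs are placeholders.

```python
# Sketch: the head of a noindexed page that canonicalizes to an indexable URL.
# Per the note above, the noindex does not transfer to the canonical target.
def render_head(canonical_url: str) -> str:
    return (
        "<head>"
        '<meta name="robots" content="noindex">'
        f'<link rel="canonical" href="{canonical_url}">'
        "</head>"
    )

print(render_head("https://www.example.com/main-category"))
```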
Using a Noindex X-Robots-Tag Header Will Not Prevent Google From Viewing a Page
Including a noindex X-Robots-Tag HTTP header on a sitemap file will not affect how Google is able to process the file. You can also include this directive on other documents, such as CSS files, as it will not affect how Google views them; it will simply prevent them from appearing in web search results.
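One common way to apply that header to non-HTML responses is at the framework level; the Flask sketch below adds X-Robots-Tag: noindex to sitemap and CSS responses. The routes and file paths are assumptions for illustration.

```python
# Sketch: adding an X-Robots-Tag noindex header to non-HTML responses
# (hypothetical Flask setup and file paths).
from flask import Flask, send_file

app = Flask(__name__)

@app.route("/sitemap.xml")
def sitemap():
    return send_file("sitemap.xml")

@app.after_request
def add_noindex_header(response):
    # Google can still fetch and process these files; the header only
    # keeps them out of web search results.
    if response.mimetype in ("application/xml", "text/xml", "text/css"):
        response.headers["X-Robots-Tag"] = "noindex"
    return response
```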
Either Disallow Pages in Robots.txt or Noindex Them, Not Both
Noindexing a page and also blocking it in robots.txt means the noindex will not be seen, as Googlebot won't be able to crawl the page to find the directive. John recommends using one or the other instead.
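A quick way to catch that conflict is to check, for each URL you have noindexed, whether robots.txt also disallows it. The audit sketch below does this with Python's standard library; the site and URL list are placeholders.

```python
# Sketch: flag URLs that are both noindexed and disallowed in robots.txt,
# since Googlebot cannot crawl them to see the noindex.
# The site and URL list are placeholders.
import urllib.robotparser

SITE = "https://www.example.com"
NOINDEXED_URLS = [f"{SITE}/old-category/", f"{SITE}/thank-you/"]

parser = urllib.robotparser.RobotFileParser(f"{SITE}/robots.txt")
parser.read()

for url in NOINDEXED_URLS:
    if not parser.can_fetch("Googlebot", url):
        print(f"Conflict: {url} is disallowed, so its noindex will never be seen")
```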
Noindex Thin Pages That Provide Value to Users on Site But Not in Search
Some pages on your site may have thin content, so it won't be as valuable to have them indexed and shown in search. If they are still useful to users navigating your website, you can noindex them rather than removing them.
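One way to implement this is to emit the robots meta tag conditionally for thin pages while keeping them linked in navigation; the sketch below uses an assumed word-count threshold purely for illustration.

```python
# Sketch: conditionally noindex thin pages while keeping them on the site
# for navigation. The threshold is an illustrative assumption, tuned per site.
THIN_CONTENT_WORD_LIMIT = 150

def robots_meta(word_count: int) -> str:
    if word_count < THIN_CONTENT_WORD_LIMIT:
        return '<meta name="robots" content="noindex">'
    return ""  # no tag needed; pages are indexable by default

print(robots_meta(80))   # thin page -> noindex
print(robots_meta(900))  # substantive page -> no directive
```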
Google Will Use Other Canonicalization Factors If the Canonical Is Noindex
Google would receive conflicting signals if a canonical points to a noindex page. John suggested that Google would rely on other canonicalization factors in this scenario to decide which page should be indexed, such as internal links.
Only Use Sitemap Files Temporarily for Serving Removed URLs to be Deindexed
Sitemap files are a good temporary solution for getting Google to crawl and deindex lists of removed URLs quickly. However, make sure these sitemaps aren’t being served to Google for too long.
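As one way to do this, the snippet below writes a temporary sitemap listing removed URLs, with lastmod set to the removal date so Google re-crawls them and sees the 404 or 410. The URLs and date are placeholders, and the file should be taken down once the URLs have dropped out of the index.

```python
# Sketch: a temporary sitemap of removed URLs so Google re-crawls them and
# drops them from the index. URLs and the removal date are placeholders.
import xml.etree.ElementTree as ET

REMOVED_URLS = [
    "https://www.example.com/discontinued-product-1",
    "https://www.example.com/discontinued-product-2",
]
REMOVAL_DATE = "2024-01-15"  # assumed removal date

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for removed_url in REMOVED_URLS:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = removed_url
    ET.SubElement(url, "lastmod").text = REMOVAL_DATE

# Serve this file only temporarily; remove it once the URLs are deindexed.
ET.ElementTree(urlset).write(
    "removed-urls-sitemap.xml", encoding="utf-8", xml_declaration=True
)
```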