Indexing
In order for web pages to be included within search results, they must be in Google’s index. Search engine indexing is a complex topic and depends on a number of factors. Our SEO Office Hours Notes on indexing cover a range of best practices and compile indexability advice Google has released in their Office Hours sessions to help ensure your website’s important pages are indexed by search engines.
The Indexing API Will Be Restricted to Job Listings for Now
There are no immediate plans to expand the Indexing API to content other than job listings. Google will need to see how this works and how people are using it before releasing it to other elements of search.
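For job listing pages, the Indexing API is used by POSTing a small JSON notification to its publish endpoint. Below is a minimal sketch in Python that builds (but does not send) such a request; the URL and access token are placeholders, and obtaining an OAuth token for a service account is assumed to happen elsewhere.

```python
import json
import urllib.request

# Publish endpoint for Google's Indexing API (job listing URLs only, per the note above).
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

def build_notification(url: str, removed: bool = False) -> dict:
    """Build the JSON body telling Google a job listing URL was updated or removed."""
    return {"url": url, "type": "URL_DELETED" if removed else "URL_UPDATED"}

def notify(url: str, access_token: str, removed: bool = False) -> urllib.request.Request:
    """Prepare the authenticated publish request (sending it is left to the caller)."""
    body = json.dumps(build_notification(url, removed)).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",  # token acquisition not shown
        },
        method="POST",
    )
```

Passing `removed=True` sends a `URL_DELETED` notification instead, which asks Google to drop the listing from its index.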
The Signals You Provide Via JavaScript Shouldn’t Conflict with the Ones in HTML
The signals you give Google via JavaScript shouldn’t contradict the ones you’ve provided in the HTML. For example, if you add a followed link in the HTML but use JavaScript to inject a nofollow attribute, the nofollow may be applied too late, as signals will already have been passed through the link during the first wave of indexing.
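Because JavaScript-injected nofollow attributes may arrive too late, it can be worth auditing the raw HTML that Googlebot receives on its first fetch. A minimal sketch using Python’s standard-library HTML parser, with an illustrative helper name:

```python
from html.parser import HTMLParser

class LinkRelAuditor(HTMLParser):
    """Collect every <a href> and its rel attribute from raw (pre-JavaScript) HTML."""
    def __init__(self):
        super().__init__()
        self.links = []  # (href, rel) pairs as they appear before any JS runs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            if "href" in attributes:
                self.links.append((attributes["href"], attributes.get("rel", "")))

def links_missing_nofollow(raw_html: str) -> list:
    """Links that are followed in the initial HTML; JS adding nofollow later may be too late."""
    auditor = LinkRelAuditor()
    auditor.feed(raw_html)
    return [href for href, rel in auditor.links if "nofollow" not in rel]
```

Any link this audit flags will have passed signals in the first wave of indexing, regardless of what JavaScript does to it afterwards.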
URL Removal Tool Hides Pages in Search but URLs Still Crawled and Indexed
Entering a URL in the URL Removal Tool in Search Console hides that page in search results. However, the page will still be crawled and indexed, and all associated signals will be kept.
Ensure Google is Shown the Same Title When the Page is Fetched & Rendered
If Google is switching the titles shown for individual URLs, something may be wrong with the website’s back end. Google should receive the same title when it initially fetches the page as when the page is rendered.
Videos Blocking Googlebot May Still be Crawled and Indexed
Even if Googlebot is blocked from crawling a video, a video snippet may still appear in search if the video file is embedded from a different location, if some Google datacentres haven’t yet seen the updated version, or if the video URL has parameters attached.
Keyword Stuffing on Homepage Can Cause Lower-Level Pages to Rank Instead
There are instances where Google may rank a lower-level page in place of the homepage. This can happen if Google detects a lot of keyword stuffing on the homepage and can’t determine whether the page is relevant, in which case another lower-level page may rank instead.
Noindex & 410 Pages Are Removed Faster Than 404
Noindex and 410 remove pages from Google’s index at about the same speed, and both are slightly quicker than using a 404.
Google Will Index Pages Blocked in Robots.txt if They’re Linked To
If you block a page in robots.txt but someone links to it, the page could still be indexed, but without any content, because Google is blocked from seeing it. Use a noindex tag on these pages instead, and allow them to be crawled so that the tag can be seen.
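The interaction between robots.txt blocking, noindex, and inbound links can be summarised as a small decision rule. A hedged sketch of that rule as a Python helper (the function and its outcome strings are illustrative simplifications, not a Google API):

```python
def likely_index_outcome(blocked_by_robots: bool,
                         has_noindex: bool,
                         has_inbound_links: bool) -> str:
    """Rough indexing outcome under the rules described above (a simplification)."""
    if blocked_by_robots:
        # Google can't fetch the page, so any noindex tag on it is never seen.
        if has_inbound_links:
            return "may be indexed without content"
        return "unlikely to be indexed"
    if has_noindex:
        # Crawlable noindex is the reliable way to keep a page out of the index.
        return "crawled, then dropped from the index"
    return "eligible for normal indexing"
```

Note the first branch: a robots.txt block combined with a noindex tag defeats the tag, because Google never crawls the page to read it.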
Use Banners to Suggest Different Language Sites so Google Can Index Them
Banners recommending a different language version of a page are great for users who may be categorised incorrectly with geotargeting, but they also help Googlebot: the links in these banners allow Google to discover and index the content on your different country pages.
410 May Remove Page From Index Faster Than 404
In the medium and long term, a 404 error is equivalent to a 410: pages returning either status will be dropped from the index and crawled less frequently. However, a 410 may cause a page to fall out of the index slightly faster than a 404, by a couple of days or so.
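Serving a 410 Gone for permanently removed pages, rather than letting them fall through to a generic 404, is a small server-side change. A minimal sketch using Python’s built-in HTTP server; the set of removed paths is hypothetical:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical set of paths whose content has been permanently removed.
GONE_PATHS = {"/old-product", "/retired-article"}

class GoneAwareHandler(BaseHTTPRequestHandler):
    """Answer 410 for permanently removed pages, 404 for everything else unknown."""

    def do_GET(self):
        if self.path in GONE_PATHS:
            self.send_response(410)  # Gone: a stronger removal hint than 404
        else:
            self.send_response(404)  # Not Found: could be a temporary miss
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet
```

In production the same idea is usually expressed as a web server or CMS rule rather than a standalone handler, but the status-code distinction is identical.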