Indexing
For web pages to appear in search results, they must first be in Google’s index. Search engine indexing is a complex topic that depends on a number of different factors. Our SEO Office Hours Notes on indexing cover a range of best practices and compile the indexability advice Google has shared in its Office Hours sessions, to help ensure your website’s important pages are indexed by search engines.
Dynamic Rendering Can be Used to Show Googlebot Fully Rendered Pages
You can use dynamic rendering to serve Googlebot pages that are already fully rendered, so there is no gap between initial indexing and rendering.
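As a rough illustration of the idea, the sketch below decides which version of a page to serve based on the User-Agent header. The crawler token list, the function names and the "prerendered"/"client" labels are all assumptions made for this example, and a real setup would also need a headless browser or pre-rendering service to produce the static HTML.

```python
import re

# Hypothetical list of crawler tokens; extend it for the bots you care about.
BOT_PATTERN = re.compile(r"googlebot|bingbot", re.IGNORECASE)


def is_crawler(user_agent: str) -> bool:
    """Return True when the User-Agent header contains a known crawler token."""
    return bool(BOT_PATTERN.search(user_agent or ""))


def choose_variant(user_agent: str) -> str:
    """Pick which version of a page to serve.

    "prerendered" stands for static HTML rendered ahead of time;
    "client" stands for the normal JavaScript-driven page.
    """
    return "prerendered" if is_crawler(user_agent) else "client"
```

The pre-rendered version should carry the same content that users see once JavaScript has run; only the point at which rendering happens differs.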
There Will Continue to be a Delay Between Indexing & Rendering Due to Resource Issues
John explained that, for the foreseeable future, there will continue to be a delay between the initial indexing of HTML and rendering, because rendering JavaScript requires resources and can’t happen immediately with the current system.
Google Has Removed Its Public URL Submission Feature
Google’s public URL submission tool has been removed, but URLs can still be submitted through Search Console and sitemaps.
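For the sitemap route, a minimal sketch of generating a sitemap file with Python’s standard library might look like the following. The function name and example URLs are placeholders, and only the required `<loc>` element from the sitemaps.org protocol is included.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string listing each URL in a <loc> element."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped on output
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Example with placeholder URLs.
print(build_sitemap(["https://www.example.com/", "https://www.example.com/products/"]))
```

The resulting file can then be submitted in Search Console or referenced from robots.txt with a `Sitemap:` line.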
Robots.txt Files Don’t Need to be Indexed by Google
Robots.txt files need to be machine-readable, but they don’t need to be indexable for Google to be able to process them.
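To illustrate what “machine-readable” means in practice, the sketch below parses a robots.txt file with Python’s built-in `urllib.robotparser` and checks two paths. The rules and paths are invented for the example, and Python’s parser is not Google’s: its matching behaviour can differ in edge cases, so treat this as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, user_agent="Googlebot"):
    """Return True if the parsed rules allow user_agent to fetch path."""
    return parser.can_fetch(user_agent, path)


print(is_allowed("/products/"))        # allowed by "Allow: /"
print(is_allowed("/checkout/basket"))  # blocked by "Disallow: /checkout/"
```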
Google Doesn’t Have a Separate Way of Prioritising Pages for Rendering
Google has no separate way of prioritising the order in which pages are rendered in the second wave of indexing; it uses the same prioritisation as for regular crawling and indexing.
Google’s Indexing Systems Are More Patient With Rendering Than Live Testing Tools
When using Google’s testing tools to investigate JavaScript rendering issues, you won’t get a truly accurate view of how Googlebot renders and indexes pages, because the tools have a much stricter timeout limit in order to give webmasters quick results.
Google Treats Meta Refresh as a Redirect, Meaning the Wrong Content Might be Indexed
Google treats a meta refresh as a redirect, which may mean the wrong page is indexed. For example, if a product listing page has a meta refresh to a payment page, the payment page is indexed rather than the actual content.
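One way to audit for this is to scan page HTML for meta refresh tags and list where they point. The sketch below uses Python’s standard `html.parser`; the class and function names are made up for this example, and it only handles the common `content="N; url=..."` form.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect the target URLs of <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=/payment".
        _, _, rest = (attr.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.targets.append(rest[4:].strip("'\" "))


def find_meta_refresh(html):
    """Return a list of meta refresh target URLs found in the HTML."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    return finder.targets
```

Any page that returns a non-empty list is worth reviewing to confirm that the refresh target is the page you actually want indexed.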
It Will Take Years to Switch All Sites Over to Mobile-first Indexing
John confirmed that it could take years to switch all sites to mobile-first indexing because many sites aren’t ready yet. Google is assessing how best to provide more information to help people get their sites ready for mobile-first indexing.
Different Signals Determine Google’s Canonical Selection
John confirmed that rel=canonical, redirects, internal linking, URL parameters and sitemaps are all signals Google uses to decide which page is the canonical from a group of pages that have been folded together.
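Because several signals feed into this decision, it can help to check whether your own signals agree with each other. The sketch below is a simple audit helper under that assumption: the signal names, URLs and majority-count approach are invented for illustration and are not a model of how Google weighs these signals.

```python
from collections import Counter


def summarise_signals(signals):
    """Report the URL most signals point to, plus the signals that disagree.

    signals maps a signal name (e.g. "sitemap") to the URL it points to.
    This is a consistency check only, not Google's selection logic.
    """
    votes = Counter(signals.values())
    preferred, _ = votes.most_common(1)[0]
    conflicting = sorted(name for name, url in signals.items() if url != preferred)
    return preferred, conflicting


# Hypothetical signals gathered from a crawl of one group of duplicate pages.
signals = {
    "rel_canonical": "https://www.example.com/shoes",
    "redirect_target": "https://www.example.com/shoes",
    "sitemap": "https://www.example.com/shoes",
    "internal_links": "https://www.example.com/shoes?ref=nav",
}
preferred, conflicting = summarise_signals(signals)
print(preferred, conflicting)
```

Here the internal links point to a parameterised URL while the other signals agree, which is the kind of mixed messaging worth cleaning up.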
Only Index Original Language Content if You Use Auto-Translate
Auto-translated content is seen as auto-generated content by Google. If you want to do any auto-translating, let Googlebot index the original language content and give a ‘translate this page’ option to users with JavaScript.