Indexing
In order for web pages to be included within search results, they must be in Google’s index. Search engine indexing is a complex topic and is dependent on a number of different factors. Our SEO Office Hours Notes on indexing cover a range of best practices and compile indexability advice Google has released in their Office Hours sessions to help ensure your website’s important pages are indexed by search engines.
Low Proportion of Indexed Pages Points to Technical Issue
If a site has a low proportion of indexed pages, this usually points to a technical issue rather than a quality issue. Compare the sitemap index counts with the Index Status report and look for differences. Try splitting up the sitemap file, checking indexed pages using the info: query, and verifying that rel canonicals match the URLs in the sitemap file, along with hreflang and internal linking. Uppercase versus lowercase and trailing slashes all matter. Then check crawl stats to get an idea of the crawl rate and whether it is reasonable.
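As a rough illustration of the canonical-versus-sitemap check, the sketch below compares each sitemap URL with the rel canonical found on that page and labels the kind of mismatch. The function names and the example URLs are hypothetical, and the input mapping is assumed to come from your own crawl data.

```python
from typing import Dict, Optional


def mismatch_reason(sitemap_url: str, canonical_url: str) -> Optional[str]:
    """Return why two URLs differ, or None if they are exactly equal."""
    if sitemap_url == canonical_url:
        return None
    # Same characters apart from letter case.
    if sitemap_url.lower() == canonical_url.lower():
        return "case"
    # Same URL apart from a trailing slash.
    if sitemap_url.rstrip("/") == canonical_url.rstrip("/"):
        return "trailing slash"
    return "different URL"


def audit(pairs: Dict[str, str]) -> Dict[str, str]:
    """Map each sitemap URL to its mismatch reason, skipping exact matches.

    `pairs` maps a sitemap URL to the canonical URL declared on that page
    (hypothetical input gathered from a crawl).
    """
    report = {}
    for sitemap_url, canonical_url in pairs.items():
        reason = mismatch_reason(sitemap_url, canonical_url)
        if reason is not None:
            report[sitemap_url] = reason
    return report


if __name__ == "__main__":
    print(audit({
        "https://example.com/Shoes": "https://example.com/shoes",
        "https://example.com/hats/": "https://example.com/hats",
        "https://example.com/bags": "https://example.com/bags",
    }))
```

Any URL that appears in the report is one where the sitemap and the canonical disagree, which is worth fixing before looking for other causes.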
Align Linking & Rel Canonical If Want Particular Page Indexed
Ensure internal links and the rel canonical both point to the preferred page so that you aren’t giving Google conflicting signals about which page should be indexed.
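One way to spot conflicting signals is to parse a page, read its rel canonical, and list internal links that point at a near-duplicate of the preferred URL. This is a minimal sketch using Python's standard library. It assumes absolute link URLs, and the names SignalParser, loose_key and conflicting_links are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class SignalParser(HTMLParser):
    """Collect the rel=canonical target and all anchor hrefs from one page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")
        elif tag == "a" and attr.get("href"):
            self.links.append(attr["href"])


def loose_key(url):
    """Reduce a URL to host + path, ignoring case, query and trailing slash."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path.lower().rstrip("/"))


def conflicting_links(html, preferred):
    """Return (canonical, links that target a variant of `preferred`)."""
    parser = SignalParser()
    parser.feed(html)
    variants = [
        href for href in parser.links
        if href != preferred and loose_key(href) == loose_key(preferred)
    ]
    return parser.canonical, variants
```

If the returned canonical differs from the preferred URL, or the variants list is not empty, the page is sending mixed signals about which URL should be indexed.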
Only Change URLs When Absolutely Necessary as Can Cause Drop in SERPs
John recommends against removing old-fashioned URL suffixes, like .html, as Google will treat the changed URLs as new ones and will have to recrawl and reindex them while learning the new structure. This will lead to a significant dip in SERPs until the URLs have been recrawled and reindexed.
For A/B Testing Show Googlebot Version Most Users Will See
When A/B testing, Google recommends showing Googlebot the version that most users are seeing. If doing 50/50 testing, it is up to webmasters which version to show to Googlebot, but Google recommends against randomly varying the displayed version as this makes it difficult for Google to index the page.
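The idea can be sketched as a variant picker that gives Googlebot the majority version every time and gives each visitor a stable bucket. This is an illustrative sketch only. The function names, the user-agent substring check and the percentage weights are assumptions, not a method Google prescribes.

```python
import hashlib


def majority_variant(weights):
    """Return the variant with the largest traffic share.

    On a tie (e.g. 50/50) the first variant listed wins, so the choice
    is stable rather than random.
    """
    return max(weights, key=weights.get)


def variant_for(visitor_id, user_agent, weights):
    """Pick a variant; `weights` maps names to percentages summing to 100."""
    # Assumption: a simple substring check identifies Googlebot.
    if "googlebot" in user_agent.lower():
        return majority_variant(weights)
    # Hash the visitor ID into a stable bucket in [0, 100).
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    threshold = 0
    for name, share in weights.items():
        threshold += share
        if bucket < threshold:
            return name
    return majority_variant(weights)
```

Because the bucket comes from a hash of the visitor ID rather than a random draw, the same visitor sees the same version on every request, and Googlebot never sees the version flip between crawls.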
Submit Sitemap With Updated Last Modification Date For Faster Crawling of Updated Pages
Submit a sitemap file with an updated last modification date to speed up the crawling and indexing of pages that have changed.
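As a minimal sketch of what such a file contains, the snippet below builds a sitemap in the sitemaps.org format with a lastmod value for each URL. The build_sitemap helper and the example page are hypothetical. In practice the dates would come from your CMS or file system.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, last_modified_date)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the W3C date format, e.g. YYYY-MM-DD.
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([("https://example.com/pricing", date(2024, 5, 1))]))
```

Only change lastmod when the page content has actually changed, so the date remains a useful signal.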
Site: Search Operator Isn’t True Indicator of All Indexed Pages
The site: search operator isn’t a true indicator of all the pages that are indexed on a site. Use a sitemap file to submit the URLs you care about.
Disallowed Pages May Take Time to be Dropped From Index
Disallowed pages may take a while to be dropped from the index if they aren’t crawled very frequently. For critical issues, you can temporarily remove URLs from search results using Search Console.
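Before waiting for a page to drop out, it can help to confirm that the URL really is disallowed. The sketch below uses Python's standard urllib.robotparser for this. The is_disallowed helper and the sample rules are made up for illustration, and the standard-library parser is not guaranteed to match Googlebot's own rule handling in every case.

```python
from urllib.robotparser import RobotFileParser


def is_disallowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if `robots_txt` blocks `user_agent` from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /private/\n"
    print(is_disallowed(rules, "https://example.com/private/report.html"))
    print(is_disallowed(rules, "https://example.com/public/page.html"))
```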
Google Recognises Differences Between Synonyms
Google folds together synonyms but sometimes recognises subtle differences between them and ranks them differently.
Add Self-Referential Canonical Tags
Add self-referential rel canonicals to pages, as this gives Google a clear indication of which page is to be indexed. Even if there is just one page, it may be reachable through different variations, such as URLs with parameters, and a rel canonical will clean these up.
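A self-referential canonical can be generated from the requested URL by dropping the parts that create variations. The sketch below assumes that every query parameter and fragment is non-essential, which will not hold for sites where a parameter selects the content. The canonical_tag name is hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(requested_url):
    """Return a rel=canonical link tag for the parameter-free URL."""
    parts = urlsplit(requested_url)
    # Assumption: query string and fragment never change the content.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="{}">'.format(escape(clean, quote=True))


if __name__ == "__main__":
    print(canonical_tag("https://example.com/shoes/?utm_source=news#top"))
```

With this in the page head, the tracking-parameter version and the clean version both declare the same canonical URL.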
Incorrectly Configured Mobile Sites Show in Desktop Search