Crawling
Before a page can be indexed (and therefore appear in search results), it must first be crawled by search engine crawlers like Googlebot. There are many things to consider in order to get pages crawled and to ensure they adhere to the correct guidelines. These are covered in our SEO Office Hours notes, along with further research and recommendations.
For more SEO knowledge on crawling and to optimize your site’s crawlability, check out Lumar’s additional resources:
Googlebot Can Handle Two Versions of Navigation for Responsive Sites
Having two separate sets of navigation coded in the HTML for desktop and mobile on responsive sites doesn’t cause an issue for Googlebot.
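As an illustration, a responsive page might include both menus in the markup and simply show or hide them with CSS; the class names and breakpoint below are hypothetical:

```html
<!-- Both navigations are present in the HTML that Googlebot fetches;
     CSS alone decides which one is visible at a given viewport width. -->
<nav class="nav-desktop">
  <a href="/category-a/">Category A</a>
  <a href="/category-b/">Category B</a>
</nav>
<nav class="nav-mobile">
  <a href="/category-a/">Category A</a>
  <a href="/category-b/">Category B</a>
</nav>
<style>
  .nav-mobile { display: none; }
  @media (max-width: 768px) {
    .nav-desktop { display: none; }
    .nav-mobile { display: block; }
  }
</style>
```

Because both sets of links sit in the same HTML response, Googlebot sees the full navigation regardless of which version is displayed.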
Make Sure Content is Equivalent for Long-Term A/B Testing
Long-term A/B testing doesn’t cause any issues for Google as long as the content is equivalent between what the user and Googlebot see. Short-term A/B testing with significant content changes also doesn’t cause issues.
Separate Mobile Sites Need to be Crawlable at the Same Speed as Desktop
Separate mobile sites need to be crawlable at the same speed as the desktop site. If your separate mobile site is hosted on a slower server, this will affect Google’s ability to rank it once mobile-first indexing rolls out.
Googlebot Can Recognise Faceted Navigation & Slow Down Crawling
Googlebot understands URL structures well; it can recognise faceted navigation and will slow down crawling once it works out where the primary content is and where it has strayed from it. This is aided by parameter handling settings in Google Search Console.
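To make the distinction concrete, the illustrative URLs below show a primary category page alongside the faceted variants that Googlebot learns to crawl less often:

```
# Primary category page that holds the main content:
https://www.example.com/shoes/

# Faceted variants generated by filters and sorting (parameters are illustrative):
https://www.example.com/shoes/?colour=red
https://www.example.com/shoes/?colour=red&size=9&sort=price_asc
```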
Google Recrawls All URLs Following Major Structural Changes to Understand Context
Google needs to recrawl and reprocess all URLs following major structural changes in order to understand the new context of these pages on the website.
GSC URL Parameters Are Signals, Not Rules, For Crawling
Rules set in the URL Parameters tool in Search Console are used by Google as a signal for what it shouldn’t crawl, but they aren’t obeyed in the same way as robots directives.
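By contrast, a disallow rule in robots.txt is treated as a directive rather than a hint. A minimal sketch, assuming a hypothetical sessionid parameter you never want crawled:

```
User-agent: *
# Block any URL carrying the (hypothetical) sessionid parameter,
# whether it is the first or a later query parameter.
Disallow: /*?sessionid=
Disallow: /*&sessionid=
```

Unlike a parameter setting in Search Console, URLs matched by these patterns won’t be crawled by Googlebot at all.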
Sitemaps Submitted Through GSC Will be Remembered for Longer
Google remembers sitemaps submitted through Google Search Console for longer. Sitemaps that are submitted through robots.txt or are pinged anonymously are forgotten once they are removed from the robots.txt file, or if they haven’t been pinged for a while.
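For reference, the robots.txt submission route mentioned above is a single line in the file pointing at the sitemap; the URL is a placeholder:

```
# Sitemap discovery via robots.txt (placeholder URL). Google drops this
# reference once the line is removed from the file.
Sitemap: https://www.example.com/sitemap.xml
```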
Significant On-Page Changes Can Impact Google’s Ability to Crawl a Site
Rankings in search can change after a migration even if the URLs haven’t changed. This can happen if the layout of a page changes significantly, meaning that Google’s ability to crawl the site can get better or worse. Pay particular attention to changes in internal linking and anchor text, which can impact Google’s ability to crawl a site.
Sitemaps Are More Critical for Larger Sites with High Churn of Content
Sitemaps are more useful for larger websites that have a lot of new and changing content. It is still best practice to have sitemaps for smaller sites whose content largely stays the same, but they are less critical for helping search engines find new pages.
Static Sitemap Filenames Are Recommended
John recommends using static sitemap filenames that don’t change every time they are generated, so that Google doesn’t waste time crawling sitemap URLs which no longer exist.
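A minimal sketch of what that looks like in a sitemap index, with placeholder URLs: the child filenames stay the same on every regeneration (rather than carrying a date or build number), and only their contents and lastmod values change.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- sitemap_index.xml: filenames are stable across regenerations; values are illustrative. -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://www.example.com/sitemap-products-1.xml</loc>
    <lastmod>2023-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-categories.xml</loc>
    <lastmod>2023-01-15</lastmod>
  </sitemap>
</sitemapindex>
```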