Crawling
Before a page can be indexed (and therefore appear within search results), it must first be crawled by search engine crawlers like Googlebot. There are many factors to consider in getting pages crawled and ensuring they follow the relevant guidelines. These are covered within our SEO Office Hours notes, along with further research and recommendations.
For more SEO knowledge on crawling and to optimize your site’s crawlability, check out Lumar’s additional resources:
Infinite Scroll Sites Should Have Distinct URLs and Pagination
Sites using infinite scroll should update the URL as content loads, so that each section has a distinct, valid URL that can be accessed and crawled directly. John also recommends providing pagination links, so that even if a search engine doesn't support infinite scroll it can still crawl through those pages.
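As a rough sketch of this pattern (the /articles?page= URL structure, element IDs, and fetch endpoint below are illustrative assumptions, not a prescribed implementation), the browser URL is updated with history.pushState each time a new segment of the feed loads, so every segment maps to a distinct URL that can also be requested directly. The same paginated URLs should additionally be exposed as plain anchor links in the server-rendered HTML for crawlers that never trigger scrolling.

```typescript
// Sketch: give each infinite-scroll segment its own crawlable URL.
// Assumes a paginated URL structure like /articles?page=2 and a
// hypothetical endpoint that returns an HTML fragment for one page.

let currentPage = 1;
let loading = false;

async function fetchSegment(page: number): Promise<string> {
  // Hypothetical fragment endpoint; replace with your own.
  const res = await fetch(`/articles?page=${page}&fragment=true`);
  return res.text();
}

async function loadNextSegment(container: HTMLElement): Promise<void> {
  if (loading) return; // avoid firing repeatedly while a segment is in flight
  loading = true;
  currentPage += 1;
  const html = await fetchSegment(currentPage);
  container.insertAdjacentHTML("beforeend", html);

  // Update the address bar so the segment the user is viewing has a
  // distinct URL that can also be fetched and crawled on its own.
  history.pushState({ page: currentPage }, "", `/articles?page=${currentPage}`);
  loading = false;
}

// Load the next segment when the user nears the bottom of the list.
window.addEventListener("scroll", () => {
  const nearBottom =
    window.innerHeight + window.scrollY >= document.body.offsetHeight - 200;
  if (nearBottom) {
    void loadNextSegment(document.getElementById("article-list")!);
  }
});
```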
Presence of Internal Links More Important Than Placement
The placement of internal links is less important than Googlebot being able to crawl a site's internal links and navigation at all. Sometimes Googlebot isn't able to crawl a site properly because the internal navigation isn't accessible to it.
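As a hedged illustration of navigation a crawler can follow versus navigation it can't, the snippet below contrasts a link built purely from a click handler (with no href to discover) with a standard anchor element. The URL and the navigateTo router call are hypothetical examples.

```typescript
// Hypothetical client-side router call, for illustration only.
declare function navigateTo(path: string): void;

// Hard for a crawler: there is no href for Googlebot to discover,
// only a script-driven click handler.
const jsOnlyLink = document.createElement("span");
jsOnlyLink.textContent = "Category";
jsOnlyLink.addEventListener("click", () => navigateTo("/category/shoes"));

// Crawlable: a standard anchor with a resolvable URL in its href,
// which can be extracted without executing the click handler.
const crawlableLink = document.createElement("a");
crawlableLink.href = "/category/shoes";
crawlableLink.textContent = "Category";
```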
Changing Content Every Time Page is Refreshed Makes it Difficult for Google to Understand
Changing landing page content on every refresh is not recommended, as it makes it difficult for Google to understand the page, both in terms of its importance and how it relates to other pages on the site.
Google Uses Time Taken to Load and Render to Assess Page Speed
Google doesn’t use the time taken to crawl a page as a measure of speed. Instead, it tries to understand what a user would see, rather than what Googlebot sees, by looking at how long pages take to load and render.
Google AdsBot Crawling Doesn’t Impact Crawl Budget For Organic Search
If Google AdsBot is crawling millions of ad pages, this won’t eat into your crawl budget for organic search. John recommends checking for tagged URLs in any ad campaigns to reduce unnecessary ad crawling.
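One way to check for this, sketched below under illustrative assumptions (the example URLs and the idea of grouping by landing page path are not from the original note), is to take your ad destination URLs and count how many distinct tracking-parameter variants point at the same landing page, since each variant is another URL for AdsBot to fetch.

```typescript
// Sketch: flag ad destination URLs whose tracking parameters create many
// variants of the same landing page. Input data is illustrative.

const adDestinations: string[] = [
  "https://example.com/shoes?utm_campaign=spring&utm_source=google",
  "https://example.com/shoes?utm_campaign=summer&utm_source=google",
  "https://example.com/shoes",
];

const variantsPerPage = new Map<string, Set<string>>();

for (const raw of adDestinations) {
  const url = new URL(raw);
  const page = url.origin + url.pathname; // landing page without parameters
  const variants = variantsPerPage.get(page) ?? new Set<string>();
  variants.add(url.search); // each distinct query string is one more URL variant
  variantsPerPage.set(page, variants);
}

for (const [page, variants] of variantsPerPage) {
  if (variants.size > 1) {
    console.log(`${page} has ${variants.size} tagged variants`);
  }
}
```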
Google Found No Additional Value Gained From Crawling Outside US
Google has tested crawling from different countries but decided this didn’t provide any additional value, except for a very small number of locations. Crawling sites from many different countries would also put more stress on a site’s server.
Googlebot Smartphone Uses Nexus 5X For Crawling
At the moment Googlebot Smartphone uses the Nexus 5X for crawling. However, Google changes the device over time to reflect commonly used devices.
Google Differentiates Page Types And Weighs Them Differently
Google differentiates between page groups (e.g. irrelevant pages, cruft URLs and parameters) and weighs them differently. These pages will be looked at occasionally but won’t be treated as a primary part of the site.
Setting A Higher Crawl Rate Doesn’t Guarantee Google Will Crawl More
Setting your crawl rate to a higher value means Google can crawl more, but that doesn’t mean it will.
Crawl Stats Include Fetches for All Content
The Crawl Stats report in Google Search Console includes fetches for all content on your site, including AdWords landing page quality score checks, images and JavaScript requests.