Crawling
Before a page can be indexed (and therefore appear in search results), it must first be crawled by search engine crawlers such as Googlebot. There are many factors to consider in getting pages crawled and ensuring they adhere to the relevant guidelines. These are covered in our SEO Office Hours notes below, along with further research and recommendations.
For more SEO knowledge on crawling and to optimize your site’s crawlability, check out Lumar’s additional resources.
Crawling Might Be Blocked by IP Blacklisting, Network Configuration, or Bot Protection
Google doesn’t block the crawling of specific domains, but your site could be hosted on an IP address that has been blacklisted, or there may be a network issue. Check whether other sites hosted on the same server are experiencing similar problems. Bot protection on the server can also cause issues by blocking or misidentifying crawlers.
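If you suspect bot protection is misidentifying Googlebot, one documented way to confirm that a request really comes from Googlebot is a forward-confirmed reverse DNS lookup. Below is a minimal Python sketch of that check; the example IP address is a placeholder, so substitute an address from your own server logs.

```python
import socket

def is_verified_googlebot(ip: str) -> bool:
    """Forward-confirmed reverse DNS check for Googlebot.

    1. Reverse-resolve the IP address to a hostname.
    2. Require a googlebot.com / google.com hostname.
    3. Forward-resolve that hostname and confirm it maps back to the IP.
    """
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        _, _, forward_ips = socket.gethostbyname_ex(hostname)
        return ip in forward_ips
    except (socket.herror, socket.gaierror):
        return False

# Placeholder address; replace with one seen in your own logs.
print(is_verified_googlebot("66.249.66.1"))
```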
Google Sometimes Makes Requests with If-Modified-Since Headers
Googlebot sometimes makes requests with an If-Modified-Since header, in which case responding with a 304 Not Modified is fine.
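As an illustration, here is a minimal sketch of a handler that honours If-Modified-Since, using only Python’s standard library; the timestamp and page body are placeholders.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder: when the page content last changed.
LAST_MODIFIED = datetime(2024, 1, 15, tzinfo=timezone.utc)

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        ims = self.headers.get("If-Modified-Since")
        if ims:
            try:
                # Unchanged since the client's copy: an empty 304 is fine.
                if parsedate_to_datetime(ims) >= LAST_MODIFIED:
                    self.send_response(304)
                    self.end_headers()
                    return
            except (TypeError, ValueError):
                pass  # Malformed date header: fall through to a full 200.
        body = b"<html><body>Page content</body></html>"
        self.send_response(200)
        self.send_header("Last-Modified", format_datetime(LAST_MODIFIED, usegmt=True))
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), Handler).serve_forever()
```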
‘Hamburger’ Menus Don’t Affect Crawling
From a crawling perspective, ‘hamburger’-style menus are fine, since the links are still present in the HTML whether or not the menu is visually expanded.
Googlebot Doesn’t See Content Changes Based on Session Information
Googlebot loads every page without any session information, so content such as titles that are generated from session state (for example, via the HTML5 History API) won’t be seen.
Mobile Interstitial Penalty Is Calculated on Recrawl
The mobile interstitial penalty is calculated in real time as pages are crawled, so changes take effect when a page is recrawled.
Whitelisting Googlebot for First Click Free Isn’t Cloaking
With First Click Free, you can whitelist the Googlebot user agent so it can see full articles, and this won’t be considered cloaking.
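A hypothetical sketch of what that whitelisting might look like in a paywall handler follows; the preview length, click allowance, and user agent check are all illustrative assumptions, and for robust Googlebot detection you would combine this with the reverse DNS check shown earlier.

```python
PREVIEW_CHARS = 300   # hypothetical preview length for paywalled readers
FREE_CLICKS = 3       # hypothetical First Click Free allowance

def article_body(user_agent: str, full_text: str, clicks_from_search: int) -> str:
    """Serve the full article to Googlebot (and to a reader's first
    few clicks from search); serve a truncated preview otherwise."""
    if "Googlebot" in user_agent or clicks_from_search < FREE_CLICKS:
        return full_text
    return full_text[:PREVIEW_CHARS] + "..."

googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
print(len(article_body(googlebot_ua, "word " * 500, clicks_from_search=10)))   # full text
print(len(article_body("Mozilla/5.0", "word " * 500, clicks_from_search=10)))  # preview
```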
Google Doesn’t Crawl with a Referrer or Cookies
Googlebot doesn’t include a referrer URL when crawling, and doesn’t use cookies.
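A simple way to check what this means for your pages is to fetch them the same way: with no Referer header and no cookies. Below is a minimal sketch; the URL is a placeholder, and note that sending Googlebot’s user agent string does not make the request a real Googlebot fetch.

```python
import urllib.request

# Fetch a page the way Googlebot does: no Referer header, no cookies.
req = urllib.request.Request(
    "https://example.com/",  # placeholder URL
    headers={
        "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    },
)
with urllib.request.urlopen(req) as resp:
    html = resp.read().decode("utf-8", errors="replace")

# Anything that only appears when a cookie or referrer is present
# will be missing from this response.
print(html[:200])
```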
Use Nofollow on Links to Noindex Pages to Reduce Crawling
You can add nofollow to links pointing at noindex pages to reduce the likelihood of them being crawled.
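As one way to apply this, here is a small audit sketch using Python’s standard-library HTML parser; the set of noindex paths is a hypothetical example, and in practice you would build it from your own pages’ robots directives.

```python
from html.parser import HTMLParser

# Hypothetical: paths on your site whose pages carry a noindex directive.
NOINDEX_PATHS = {"/search", "/login", "/cart"}

class NofollowAudit(HTMLParser):
    """Flags internal links to noindex pages that lack rel="nofollow"."""
    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href", "")
        rel_tokens = (attr_map.get("rel") or "").split()
        if href in NOINDEX_PATHS and "nofollow" not in rel_tokens:
            print(f'Consider rel="nofollow" on the link to {href}')

# The second link is already nofollowed, so only the first is flagged.
NofollowAudit().feed('<a href="/search">Search</a> <a href="/cart" rel="nofollow">Cart</a>')
```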
404 Content Isn’t Seen by Google
Google doesn’t look at the content of a 404 page.
Don’t Prevent Embedded File Caching
If you prevent caching of JavaScript, CSS, and image files, for example with a no-cache header, Google will need to keep re-requesting those files for rendering, which may slow down crawling of the site.
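As an illustration, here is a minimal sketch of a static file server that allows caching of embedded assets rather than forbidding it; the max-age value and file extensions are placeholders to adjust for your own site.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

class CachingStaticHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory, adding a cache-friendly
    header to static assets instead of a no-cache directive."""

    def end_headers(self):
        if self.path.endswith((".js", ".css", ".png", ".jpg", ".webp")):
            # Let Googlebot (and browsers) reuse fetched subresources when
            # rendering, instead of re-requesting them on every page.
            self.send_header("Cache-Control", "public, max-age=86400")
        super().end_headers()

if __name__ == "__main__":
    ThreadingHTTPServer(("", 8000), CachingStaticHandler).serve_forever()
```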