Indexing

For web pages to appear in search results, they must be in Google’s index. Search engine indexing is a complex topic that depends on many factors. Our SEO Office Hours Notes on indexing cover a range of best practices and compile the indexability advice Google has shared in its Office Hours sessions, to help ensure your website’s important pages are indexed by search engines.

Mixed Migrations May Cause Google to Index HTTP or HTTPS URLs

Forgetting to update your sitemap files following an HTTPS migration could cause some pages to be indexed with the HTTP URL and others with the HTTPS URL.
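
One way to catch this is to scan the sitemap for leftover HTTP entries after the migration. The snippet below is a minimal sketch, assuming a standard sitemaps.org `urlset` file; the function name `find_http_urls` and the example.com URLs are illustrative, not taken from the session.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def find_http_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value in the sitemap that still uses http://."""
    root = ET.fromstring(sitemap_xml)
    locs = (el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text)
    return [url for url in locs if url.lower().startswith("http://")]


# Hypothetical sitemap with one entry that was missed during the migration.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>http://www.example.com/old-page</loc></url>
</urlset>"""

for url in find_http_urls(sample):
    print("Still HTTP:", url)
```

Any URL this reports would need updating to its HTTPS equivalent before the sitemap is resubmitted.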

3 Nov 2017

Privacy Policy and Terms of Service Pages Should be Indexable

Privacy policy and terms of service are normal content that people might want to find in search, so they should be indexable.

3 Nov 2017

Geo-targeting Doesn’t Restrict Pages to a Specific Country

Geo-targeting in Search Console tells Google that a page is more relevant for a specific country, so it may rank higher for local search queries. It does not mean the page will be removed from results in other countries.

31 Oct 2017

Old Pages Can Still Rank If the Content Is Useful

Old sites can still be useful and rank in search even if they haven’t been updated in years, as long as they are still relevant. Pages can still appear in search even if they aren’t mobile friendly.

31 Oct 2017

Show Paywalled Content to Googlebot Based on User Agent & IP Lookup

It’s OK to serve Googlebot your paywalled pages, with the relevant class names and schema markup, based on user agent. You can also combine that with an IP lookup to recognise when it is Googlebot looking at a page as opposed to another crawler.
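
The user agent plus IP lookup check can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes the reverse DNS then forward DNS confirmation approach, treats hostnames ending in `.googlebot.com` or `.google.com` as Google's, and only handles IPv4. The function name and the injectable `reverse_lookup` / `forward_lookup` parameters are hypothetical, added so the logic can be exercised without network access.

```python
import socket
from typing import Callable

# Assumed hostname suffixes for Google's crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(
    user_agent: str,
    ip: str,
    reverse_lookup: Callable[[str], str] = lambda addr: socket.gethostbyaddr(addr)[0],
    forward_lookup: Callable[[str], list] = lambda host: socket.gethostbyname_ex(host)[2],
) -> bool:
    """Check the user agent first, then confirm the IP via DNS."""
    # Step 1: cheap user agent check.
    if "googlebot" not in user_agent.lower():
        return False
    try:
        # Step 2: reverse DNS must resolve to a Google hostname.
        host = reverse_lookup(ip)
        if not host.lower().endswith(GOOGLE_SUFFIXES):
            return False
        # Step 3: forward DNS on that hostname must return the same IP.
        return ip in forward_lookup(host)
    except OSError:
        # DNS failures are treated as "not verified".
        return False
```

A request that claims to be Googlebot in its user agent but fails the DNS steps would be treated like any other visitor and shown the paywall.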

20 Oct 2017

May Take Time to Index Content for Single Page App Setup While Google Picks up JS Rendered Version

Google indexes the HTML version of a page first, then the rendered version. John says that in future these two steps will be done more or less at the same time. The difference is likely to be more noticeable with a single page app setup, where the same HTML file with no content is served for every URL and the content is only picked up later through JavaScript rendering.
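
To see whether a page falls into this category, you can look at how much visible text the raw HTML contains before any JavaScript runs. The snippet below is a rough sketch using only Python's standard library; the class and function names and the sample markup are hypothetical, and it says nothing about how Google itself evaluates pages.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text outside of non-visible elements."""

    SKIPPED = {"script", "style", "noscript", "template", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(raw_html: str) -> str:
    """Return the text a non-rendering crawler would see in the raw HTML."""
    parser = VisibleTextParser()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical single page app shell: all content arrives via /app.js.
spa_shell = (
    '<html><head><title>Shop</title><script src="/app.js"></script></head>'
    '<body><div id="root"></div></body></html>'
)
# Hypothetical server-rendered page with content in the initial HTML.
server_rendered = "<html><body><h1>Blue widgets</h1><p>In stock.</p></body></html>"

print(repr(visible_text(spa_shell)))
print(repr(visible_text(server_rendered)))
```

An empty result for the raw HTML suggests the page's content depends entirely on the rendered version being picked up.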

20 Oct 2017

Tabbed Content Loaded On-Click Won’t Be Indexed

Content in tabs is fine for mobile as long as it is loaded when the page is loaded, not when the tab is clicked on. Content that only loads on click won’t be indexed.

20 Oct 2017

Google Mainly Uses GET Request For Normal Crawling & Indexing

Google pretty much only uses GET requests for normal crawling and indexing. That doesn’t mean you’ll never see POST and HEAD requests from Google in your server logs, but they’re probably a lot rarer.
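
If you want to check this against your own logs, a simple tally of request methods is enough. The sketch below assumes access logs in the common combined format and matches Googlebot on the user agent string alone, without verifying the IP; the function name and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the quoted request line, e.g. "GET /page HTTP/1.1".
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) \S+ HTTP/[\d.]+"')


def googlebot_method_counts(log_lines):
    """Count HTTP methods for log lines whose user agent mentions Googlebot."""
    counts = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group("method")] += 1
    return counts


GOOGLEBOT_UA = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'

# Hypothetical log lines in combined log format.
sample_log = [
    f'66.249.66.1 - - [17/Oct/2017:10:00:00 +0000] "GET /page HTTP/1.1" 200 512 "-" {GOOGLEBOT_UA}',
    f'66.249.66.1 - - [17/Oct/2017:10:00:05 +0000] "GET /other HTTP/1.1" 200 734 "-" {GOOGLEBOT_UA}',
    f'66.249.66.1 - - [17/Oct/2017:10:00:09 +0000] "HEAD /page HTTP/1.1" 200 0 "-" {GOOGLEBOT_UA}',
    '203.0.113.5 - - [17/Oct/2017:10:00:11 +0000] "POST /login HTTP/1.1" 302 0 "-" "Mozilla/5.0"',
]

for method, count in googlebot_method_counts(sample_log).most_common():
    print(method, count)
```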

17 Oct 2017

Prevent Infinite Scroll Content Being Indexed by Blocking the Onscroll Script with Robots.txt

If you need to prevent content loaded on scroll from being indexed, as with pages using infinite scroll, put the script that is executed by the onscroll event behind a robots.txt block.
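
As a quick sanity check, you can confirm that the rule blocks the script and nothing else. The sketch below uses Python's standard `urllib.robotparser`; the `/js/infinite-scroll.js` path, the robots.txt contents, and the helper name are assumptions for illustration. Python's parser does not replicate every detail of how Google matches rules, so treat the result as indicative for simple rules like this one.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the script run on scroll.
ROBOTS_TXT = """\
User-agent: *
Disallow: /js/infinite-scroll.js
"""


def can_googlebot_fetch(robots_txt: str, url: str) -> bool:
    """Return True if the given robots.txt allows Googlebot to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# The onscroll script should be blocked; other scripts should stay crawlable.
print(can_googlebot_fetch(ROBOTS_TXT, "https://www.example.com/js/infinite-scroll.js"))
print(can_googlebot_fetch(ROBOTS_TXT, "https://www.example.com/js/app.js"))
```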

17 Oct 2017

Most HTTPS Migrations Take a Day to Change in Index

An HTTPS migration is easier for Google to process than most other types of migration because it keeps the same domain and the same URLs. If a site is restructured with changes to internal linking or the domain name, Google has a lot more to think about. However, HTTPS is still a big change and takes time for Google to process. Most migrations take a day or so to switch over in Google’s index.

3 Oct 2017
