Canonicalization
Canonicalization is a method used to help prevent duplicate content issues and manage the indexing of URLs in search engines. Using canonicals appropriately can be hugely helpful for SEO.
Adding the canonical link attribute “rel=canonical” to a page signals to search engines which page is preferred for indexing. In most cases the signal will be followed, provided it is correctly implemented and points to an equivalent page.
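As a rough illustration of what a correctly implemented canonical looks like in the page source, the sketch below pulls the canonical URL out of a page's HTML using only the Python standard library. The sample markup, the example.com URL, and the names `CanonicalFinder` and `find_canonicals` are assumptions made for this example; they are not taken from the Office Hours notes.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before checking.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return a list of canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical page that declares a clean URL as its canonical.
sample_html = """
<html><head>
<title>Blue widgets</title>
<link rel="canonical" href="https://example.com/widgets/blue">
</head><body></body></html>
"""

print(find_canonicals(sample_html))
```

A page that returns more than one entry from `find_canonicals` is sending conflicting signals, which is worth checking before assuming the canonical will be followed.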
The collected SEO Office Hours notes below provide detailed information and best practices (straight from Google’s own search experts) for using canonicals on your website.
Add Hreflang Links to Canonical URLs Only
Hreflang links should point to canonical URLs only. Canonicalised URLs won’t be indexed, and won’t be crawled often enough for Google to notice the connection between the canonical URL and the URLs specified in the hreflang tags.
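One way to audit this is to compare each hreflang target against the canonical that the target page declares. The following is a minimal sketch under stated assumptions: the crawl data is already available as two dictionaries, and the function name, the dictionaries, and the example.com URLs are all hypothetical.

```python
def hreflang_targets_not_canonical(hreflang_links, canonical_of):
    """Return hreflang targets whose declared canonical is a different URL.

    hreflang_links: dict mapping language code -> hreflang target URL
    canonical_of:   dict mapping URL -> the canonical URL that page declares
    """
    problems = {}
    for lang, url in hreflang_links.items():
        # A page with no recorded canonical is treated as self-canonical here.
        declared = canonical_of.get(url, url)
        if declared != url:
            problems[lang] = (url, declared)
    return problems


# Hypothetical crawl data for one set of alternates.
hreflang_links = {
    "en-gb": "https://example.com/uk/shoes",
    "en-us": "https://example.com/us/shoes?ref=nav",
    "de": "https://example.com/de/schuhe",
}
canonical_of = {
    "https://example.com/uk/shoes": "https://example.com/uk/shoes",
    "https://example.com/us/shoes?ref=nav": "https://example.com/us/shoes",
}

for lang, (target, canonical) in hreflang_targets_not_canonical(
    hreflang_links, canonical_of
).items():
    print(f"{lang}: hreflang points to {target}, which canonicalises to {canonical}")
```

In this sample only the en-us entry is flagged, because its hreflang target is a parameterised URL that canonicalises elsewhere.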
Hreflang Pages Can Be Canonicalised to One Version
If you add hreflang to a set of pages and canonicalise them to one version, Google may show the different URLs, but all with the title and description of the canonical version. However, John advises against this configuration.
Parameter Handling Improves Crawl Efficiency
Parameter handling prevents URLs from being crawled, so it is better than canonical tags for crawl efficiency.
Google Learns Which URL Parameters Return Irrelevant Pages
Google learns which parameters are returning irrelevant pages partly based on canonicalised URLs.
The ‘Parameter Doesn’t Change Content’ Setting is Similar to Canonical
The ‘Parameter Doesn’t Change Content’ setting is similar to canonical as it will aggregate signals to the cleaned URL.
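To make the idea of a "cleaned URL" concrete, the sketch below strips parameters that do not change content, so that several URL variants collapse to one. It only illustrates the grouping and is not a description of how Google implements the setting. The parameter names in `IGNORED_PARAMS` and the example.com URLs are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that are assumed not to change page content.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium"}


def cleaned_url(url):
    """Return the URL with content-neutral parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


variants = [
    "https://example.com/shoes?colour=red&sessionid=abc123",
    "https://example.com/shoes?utm_source=news&colour=red",
]

# Both variants reduce to the same cleaned URL, where signals would aggregate.
print({cleaned_url(url) for url in variants})
```

The `colour` parameter is kept because, in this example, it is assumed to change what the page shows.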
Structured Data and Hreflang Need to be Added to Mobile Pages
When Google moves to mobile-first indexing, the rel=alternate and canonical tags won’t need to be changed, but the mobile pages will need their own structured data and hreflang tags.
AMP Pages Don’t Affect Panda Unless they are Canonical
If you use AMP pages as your canonical pages, they will affect Panda, but if they are canonicalised to another page they won’t.
Google May Choose a Redirect URL Instead of the Target
The selection of a canonical URL is also based on redirects, internal and external links, and Sitemaps, but even in the case of a redirect, Google might still choose to index the redirect source instead of the target.
Content and Links on Canonicalised Pages Won’t be Seen
Any unique content and links found on paginated pages which have been canonicalised to the first page won’t be found.
Use Noindex or Canonical on Faceted URLs Instead of Disallow
John recommends against using a robots.txt disallow to prevent facet URLs from being crawled, as they may still be indexed. Instead, allow them to be crawled and use a noindex or canonical tag, unless they are causing a server performance issue.
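A noindex or canonical tag can only be read on a page that is allowed to be crawled, so it can help to check which facet URLs a robots.txt file blocks. The sketch below uses Python's standard `urllib.robotparser` for that check. The robots.txt rules, the `/shop/filter/` path, and the `blocked_from_crawl` helper are hypothetical, and the standard-library parser may not match Googlebot's rule matching in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that disallows a faceted navigation path.
robots_txt = """
User-agent: *
Disallow: /shop/filter/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def blocked_from_crawl(url, user_agent="Googlebot"):
    """Return True if the parsed robots.txt rules disallow crawling this URL."""
    return not parser.can_fetch(user_agent, url)


urls = [
    "https://example.com/shop/filter/colour-red",
    "https://example.com/shop/shoes",
]

for url in urls:
    if blocked_from_crawl(url):
        print(f"{url}: blocked, so a noindex or canonical tag on it cannot be seen")
    else:
        print(f"{url}: crawlable, so a noindex or canonical tag can be read")
```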