
Duplicate Content

What is duplicate content? Duplicate content occurs when the same (or very similar) content appears in multiple places on a website.

There are several SEO issues that can occur when a website has duplicate content, including crawl budget issues, search engine indexing issues, index bloat, keyword cannibalization, and canonical tag issues.

Our SEO Office Hours recaps below compile best practices Google has recommended for websites dealing with duplicate content issues.

(See Lumar’s full guide to duplicate content for even more actionable tips on how SEOs can address duplicate content issues.)

For even more on website content best practices for SEO, read our Guide to Optimizing Website Content for Search, or explore our Website Intelligence Academy resources on SEO & Content.

Avoid Having Domains Additionally Accessible as CDN Subdomains

If the same content exists on a main domain and on a CDN subdomain, the two versions can be indexed separately. This also means Google has to crawl more URLs to see the same amount of content. Use redirects, canonical tags, internal linking and sitemaps to set a preferred version.

21 Aug 2018
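As a rough illustration of setting a preferred version for the CDN case above, the sketch below maps a URL requested on a duplicate host to the same path on the main domain and builds a canonical link tag for it. The hostnames and the function names `preferred_url` and `canonical_link_tag` are hypothetical and not taken from the recap; the same mapping could equally drive a redirect.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hostnames, used only for illustration.
PREFERRED_HOST = "www.example.com"
DUPLICATE_HOSTS = {"example.cdn-provider.example"}


def preferred_url(url: str) -> str:
    """Return the URL on the preferred host if it was requested on a duplicate host."""
    parts = urlsplit(url)
    if parts.hostname in DUPLICATE_HOSTS:
        # Keep path, query and fragment; swap only the scheme and host.
        parts = parts._replace(scheme="https", netloc=PREFERRED_HOST)
    return urlunsplit(parts)


def canonical_link_tag(url: str) -> str:
    """Build a rel=canonical link element pointing at the preferred URL."""
    href = escape(preferred_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

URLs that are already on the main domain pass through unchanged, so the same helper can be used for every page template.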

Copyright Violations & Duplicate Content Affect How Google Assesses the Rest of Your Site

If the majority of your content is flagged for something like DMCA copyright violations, Google may decide that the rest of your content isn’t high enough quality to show to users either.

24 Jul 2018

Combine Duplicate Pages Across Owned Sites Into One Page

If you have duplicate pages across different sites, try grouping them into one page and listing the different locations where that service or product is available so you have one strong page to rank with.

24 Jul 2018

GSC May Not Show Data For Your Other Same Language Sites if Content is Identical

If the content is identical across a collection of same-language sites (e.g. UK and US), hreflang data may appear for only one of those sites in Search Console. Use the ‘Inspect URL’ tool to check for issues like this.

26 Jun 2018

Google Folds Together Different Country Versions in Search Unless Content is Unique

When different country versions of a site sit on different ccTLDs, Google will fold them together in search unless they have unique content. John recommends providing localised content on these ccTLDs to make them as relevant as possible to users, and consulting experts in this area.

12 Jun 2018

Canonicalise Duplicate Pages Between Your Sites so They’re Not Seen as Doorway Pages

Use the canonical tag if you are offering the same products on lots of different sites so Google doesn’t suspect that these are doorway pages.

29 May 2018

Make Same Language, Different Country Page Versions Unique to Avoid Being Folded Together

International sites with different country versions in the same language can be problematic if Google folds them together in the index, e.g. German and Austrian sites with the same content. John recommends making the content on these versions as different as possible, although this isn’t always feasible, for example with product pages. Webmasters can check whether pages are being folded together by using an info: query to check the canonical version.

13 Apr 2018

Use Canonicalization Instead of Noindex for Duplicate Content

John recommends using rel=canonical rather than noindex to deal with duplicate content. This way the signals from both page versions can be combined, rather than all the signals from the noindexed page being dropped.

3 Apr 2018
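To make the canonical-versus-noindex recommendation above concrete, here is a minimal sketch that reads a page's HTML and reports which of the two signals a duplicate page carries. It uses only the standard library's `html.parser`; the class name `HeadSignals`, the function `duplicate_handling` and the returned labels are assumptions made for this example, not part of any Google tooling.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the rel=canonical target and any robots noindex directive."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            if "noindex" in (a.get("content") or "").lower():
                self.noindex = True


def duplicate_handling(html_text: str) -> str:
    """Classify how a duplicate page is handled (labels are illustrative)."""
    parser = HeadSignals()
    parser.feed(html_text)
    if parser.noindex:
        return "noindex"      # this page's signals are dropped
    if parser.canonical:
        return "canonical"    # signals can be combined with the target
    return "unhandled"
```

Running this over a list of known duplicate URLs would flag pages that still rely on noindex and could be switched to rel=canonical.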

Duplicate PDFs Are Seen as Duplicate Content

Duplicate PDFs are treated in the same way as any other duplicate content. For duplicate PDFs, Google would pick one to show in the search results.

6 Mar 2018
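One simple way to find candidate duplicate PDFs on your own site is to group files by a hash of their bytes, as sketched below. This only catches byte-identical files, which is an assumption of the example and much narrower than whatever Google uses to detect duplicates; the function name `group_identical_files` and the sample paths are hypothetical.

```python
import hashlib
from collections import defaultdict


def group_identical_files(files: dict[str, bytes]) -> list[list[str]]:
    """Group file paths whose contents are byte-for-byte identical.

    `files` maps a path (or URL) to the file's raw bytes.
    Only groups with more than one member are returned.
    """
    by_digest = defaultdict(list)
    for path, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


# Usage with in-memory stand-ins for downloaded PDFs:
sample = {
    "/docs/manual.pdf": b"%PDF-1.4 same bytes",
    "/downloads/manual-copy.pdf": b"%PDF-1.4 same bytes",
    "/docs/other.pdf": b"%PDF-1.4 different bytes",
}
duplicate_groups = group_identical_files(sample)
```

Each group returned is a set of URLs that could be consolidated to a single preferred PDF.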

Google Can Proactively Assume Duplicate Pages Before Crawling Them

Google will sometimes assume that pages are duplicates before crawling them. This can happen when your URLs carry multiple parameters that don’t actually change the content being served.
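The parameter case above can be illustrated with a small URL normaliser: if two URLs collapse to the same string once content-neutral parameters are removed, they serve the same content and should point to one preferred version. The parameter names in `IGNORED_PARAMS` and the function name `normalize_url` are assumptions for this sketch; which parameters are safe to ignore depends entirely on the site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the content served.
IGNORED_PARAMS = {"sessionid", "ref", "utm_source"}


def normalize_url(url: str) -> str:
    """Drop content-neutral parameters and sort the rest for a stable form."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For example, `normalize_url("https://www.example.com/shoes?ref=nav&color=red&sessionid=abc")` and `normalize_url("https://www.example.com/shoes?sessionid=xyz&color=red")` both reduce to the same URL with only `color=red` left, which is the version a canonical tag or internal links would reference.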