Duplicate Content
What is duplicate content? Duplicate content occurs when the same (or very similar) content appears in multiple places on a website.
There are several SEO issues that can occur when a website has duplicate content, including crawl budget issues, search engine indexing issues, index bloat, keyword cannibalization, and canonical tag issues.
Our SEO Office Hours recaps below compile best practices Google has recommended for websites dealing with duplicate content issues.
(See Lumar’s full guide to duplicate content for even more actionable tips on how SEOs can address duplicate content issues.)
For even more on website content best practices for SEO, read our Guide to Optimizing Website Content for Search — or explore our Website Intelligence Academy resources on SEO & Content.
Duplicate Content On Its Own Doesn’t Mean That Site is Low Quality
A website should be able to stand on its own and be somewhere users go specifically to find content. This usually means providing unique content, but the presence of duplicate content alone doesn’t make a website low quality.
Report Sites Scraping Your Content to Google on Page-By-Page Basis
If your site has been scraped, you can submit a DMCA takedown request to the scraping website’s hosting service and to Google, whose legal team can investigate. This has to be done at page level and cannot be done at site level.
Add Unique Information to Individual Branch Pages
For regional branch pages, add information about the products and/or services that are unique to that branch, include its telephone number and opening times, and implement relevant structured data.
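As a rough illustration of the structured data point, the sketch below builds a schema.org LocalBusiness description for a single branch and wraps it in a JSON-LD script element. The helper names (branch_jsonld, to_script_tag) and every business detail are hypothetical placeholders, and the set of properties shown is only one reasonable choice, not a list Google prescribes.

```python
import json


def branch_jsonld(name, telephone, street, city, postal_code, country, opening_hours):
    """Build a schema.org LocalBusiness description for one branch page."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "postalCode": postal_code,
            "addressCountry": country,
        },
        # schema.org's compact opening-hours text format, e.g. "Mo-Fr 09:00-17:30"
        "openingHours": opening_hours,
    }


def to_script_tag(data):
    """Serialize the dictionary as a JSON-LD script element for the page's HTML."""
    # Escape "</" so a value can never close the script element early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical branch details, for illustration only.
example = branch_jsonld(
    name="Example Co. (Leeds branch)",
    telephone="+44 113 496 0000",
    street="1 Example Street",
    city="Leeds",
    postal_code="LS1 1AA",
    country="GB",
    opening_hours=["Mo-Fr 09:00-17:30", "Sa 10:00-16:00"],
)
print(to_script_tag(example))
```

Generating the markup from each branch's own record keeps the phone number, address and opening times on every page specific to that branch rather than copied from a shared template.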
Google Filters Identical Duplicates During Indexing, and Near Duplicates From Search Results Pages
When Google recognises identical pages, it will choose one version to index, and when pages are similar, only one may show up in search results. Google looks at factors such as rel canonicals, redirects and internal and external linking when identical pages are crawled to decide which one to index.
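To check your own pages for the two cases described above, a simple audit can separate identical pages from near duplicates. The sketch below is not Google's method: it assumes a content hash for exact matches and Jaccard similarity over three-word shingles for near matches, and the 0.8 threshold and all function names are illustrative choices.

```python
import hashlib
import re


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def fingerprint(text):
    """SHA-256 hash of the normalized text; equal fingerprints mean identical content."""
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def shingles(text, size=3):
    """Set of overlapping word sequences ("shingles") of the given size."""
    words = normalize(text).split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Shared shingles divided by all shingles across both texts (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def classify(a, b, near_threshold=0.8):
    """Label a pair of page texts; the threshold is an assumed, tunable value."""
    if fingerprint(a) == fingerprint(b):
        return "identical"
    if jaccard(a, b) >= near_threshold:
        return "near-duplicate"
    return "distinct"


page_a = "our store sells red running shoes in all sizes with free delivery across the country today"
page_b = "our store sells red running shoes in all sizes with free delivery across the country now"
print(classify(page_a, page_a.upper()))
print(classify(page_a, page_b))
```

Pairs flagged as identical are candidates for a redirect or a rel canonical, while near duplicates may be better handled by consolidating the pages or making each one more distinct.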
Affiliate Based Sites Require Unique Content
Affiliate links do not affect a site’s quality, but Google requires you to have some unique value.
Google Ignores Keyword Stuffing
Google tries to ignore keyword stuffing in content rather than apply a penalty.
Google Replaces Duplicate and Long Titles on a Per Query Basis
Google will rewrite titles on a per-query basis if they are too long or contain duplication. If this is happening, it might be a sign that you should improve your titles.
Manual Action Penalties Can Be Applied to Thin, Spun or Aggregated Content
Thin content penalties can be applied to sites manually by the web spam team where the entire site seems to be thin, ‘spun’, or aggregated from other sources without any unique additional value.
Duplicate Content Filtering is Query Dependent
Duplicate content may still be indexed but filtered out of search results for queries where it would result in an identical snippet.
Google Considers Amount of Unique Content Per Page and Number of Pages with Unique Content
Google looks at how much text on each page is unique, and how many pages have unique content.
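One way to get a feel for both measures on your own site is to estimate them from crawled page text. The sketch below is an assumption-laden approximation, not how Google calculates anything: it treats each line of text as a block, counts a block as unique when it appears on only one page, and uses a hypothetical 50% cut-off for "a page with unique content".

```python
from collections import Counter


def split_blocks(text):
    """Split page text into normalized, non-empty lines (a stand-in for paragraphs)."""
    return [" ".join(line.lower().split()) for line in text.splitlines() if line.strip()]


def unique_share_per_page(pages):
    """Map each URL to the fraction of its words found in blocks that appear on no other page."""
    # Count how many pages each block appears on (at most once per page).
    page_counts = Counter()
    for text in pages.values():
        page_counts.update(set(split_blocks(text)))

    shares = {}
    for url, text in pages.items():
        blocks = split_blocks(text)
        total = sum(len(block.split()) for block in blocks)
        unique = sum(len(block.split()) for block in blocks if page_counts[block] == 1)
        shares[url] = unique / total if total else 0.0
    return shares


def count_pages_with_unique_content(shares, min_share=0.5):
    """Number of pages whose unique share meets the assumed cut-off."""
    return sum(1 for share in shares.values() if share >= min_share)


# Hypothetical crawl output: URL -> extracted body text.
pages = {
    "/a": "Shared footer text here\nUnique words about branch a",
    "/b": "Shared footer text here\nUnique words about branch b",
    "/c": "Shared footer text here",
}
shares = unique_share_per_page(pages)
print(shares)
print(count_pages_with_unique_content(shares))
```

Pages with a low unique share are the ones to prioritise for additional original content or consolidation.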