
How to manage duplicate content for SEO

I’m coming to you today with an important question: have you ever come across the same content (duplicate images, text and more) on the same site or across different sites?

To picture it, imagine you have the same bottles of rum in the holds of two different ships.

Because of this duplicate content, your site’s ranking could deteriorate, or the site itself could even be removed from Google’s index, i.e. from the search results.

For a thousand whales! Imagine what a catastrophe could happen if your ship was no longer visible to adventurers.

But don’t worry, because today I’m going to help you understand why and how to manage duplicate content for SEO. Are you ready to set sail on this new adventure?

Why manage duplicate content

Before we begin, I’d like to make an important distinction.

  • External duplicate content – these are pages whose content is the same as on other sites. Imagine eCommerce sites selling the same products and reusing the descriptions from their suppliers.
  • Internal duplicate content – this derives from technical causes. Imagine a site that has separate versions of its URLs with “www” (www.buzzynerd.com) and without (buzzynerd.com). The same content is then reachable at two different URLs, resulting in duplicate content.

But I also want to reassure you and dispel a myth: Google does not apply an actual penalty for the mere presence of duplicate content.

However, when identical (or very similar) content appears at different URLs on the Internet, Google has to choose which version is more relevant. To do so, the algorithm considers the indexing date and other factors, including the authority of the site. As a result, the search engine may show the less suitable resource in the search results instead of the one you want.

Second, for the Google crawler (the search engine software responsible for visiting each page of the site, copying its content and indexing it), scanning identical pages is a waste of the crawl budget it would otherwise have available to index pages with more interesting and, above all, different content.

If you’re still not convinced, remember that users prefer different content to the exact same content repeated on different domains!

The goal is for your ship to differentiate itself and outperform all others, understand?

If you’re not ready to go it alone yet, hire experts who have been navigating SEO for many, many years. Hire my crew!


How to detect duplicate content

Before managing duplicate content, you need to identify it and understand its causes.

Duplicate content, as we have already seen, does not only result from intentional copying of text. Very often it also comes from technical causes related to how the CMS works or, as is the case with eCommerce, from the way the product catalog is managed.

1. Discover the causes of duplicate content

When it comes to duplicate content, there can be several causes.

  • Plagiarism – this is the misuse of images found on the Web or the copying of thoughts or text from other sites without citing the source.
  • Thin content – content that is excessively short or has nothing original about it, but which repeats sections of the site already published in other URLs.
  • Boilerplate content – content found in headers, footers and sidebars, which for many sites represents much of the text on the page. Since it is present in every URL, it can become a problem.
  • Different versions of the site – this happens when no 301 redirect is implemented between the different versions of the site, such as HTTP/HTTPS or with and without “www”. For example, without a redirect, a crawler could access the same page via 4 different URLs:
    • http://buzzynerd.com
    • https://buzzynerd.com
    • https://www.buzzynerd.com
    • http://www.buzzynerd.com
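
To see how these variants boil down to one and the same page, here is a minimal Python sketch. The function name `canonical_key` and the choice of HTTPS without “www” as the preferred form are assumptions made for illustration; pick whichever version suits your ship.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_key(url: str) -> str:
    """Map scheme/www variants of a URL to one preferred form.

    Assumption: the preferred version is https without "www".
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path or "/"
    # Drop the fragment, keep the query string as it is.
    return urlunsplit(("https", host, path, parts.query, ""))


variants = [
    "http://buzzynerd.com",
    "https://buzzynerd.com",
    "https://www.buzzynerd.com",
    "http://www.buzzynerd.com",
]

# All four variants reduce to a single key, i.e. they are one page.
unique_pages = {canonical_key(u) for u in variants}
print(unique_pages)
```

If the set contains a single entry, the URLs are just different doors into the same cabin.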

2. Identify duplicate content

There are several tools for identifying duplicate content, whether it’s internal or external to your site. Here are a few!

  • Google Search Console HTML Improvements – this feature of Google’s free tool allows you to identify duplicate title tags and meta descriptions, which can be a useful signal of the presence of duplicate content.
  • Siteliner – free tool that allows you to identify duplicate content internal to your site, among other things.
  • Copyscape – easy-to-use tool that allows you to find any copies of your web pages online for free.
  • Duplicate content checker from SEO Review Tools – this tool also allows you to check for duplicate content inside or outside your site.

In general, my advice is to use more than one tool so you can get the most comprehensive results possible for your search.
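
If you are curious about what such tools do under the deck, here is a rough Python sketch of one common approach: splitting each text into overlapping word sequences (“shingles”) and measuring how many they share (Jaccard similarity). This is an illustration only, not how any of the tools above is actually implemented, and the sample texts are invented.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences ("shingles") in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of two texts: 1.0 means identical shingle sets."""
    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    if not union:
        return 0.0  # no words to compare
    return len(sa & sb) / len(union)


# Invented sample texts: two copies of a supplier description and an original review.
page_a = "Aged Caribbean rum in a 70 cl bottle, matured for seven years in oak casks."
page_b = "Aged Caribbean rum in a 70 cl bottle, matured for seven years in oak casks."
page_c = "Our crew tasted this rum on deck and wrote down what we really thought of it."

print(similarity(page_a, page_b))  # identical descriptions
print(similarity(page_a, page_c))  # original text
```

A score close to 1 suggests a copy, while a score close to 0 suggests genuinely different content. Where to draw the line is a judgment call.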

Manage duplicate content

In the case of content copied from other sites, the optimal solution is to customize the content itself. In the case of duplicate content internal to the site, it may be necessary to tell Google which page is the most important and correct one, and to fix the underlying technical issues.

Let’s take a closer look at what to do!

1. Customize your content

When taking content from other websites, the best solution is to customize your text as much as possible so that it differs from what is already online.

This allows you to differentiate yourself by offering additional value and creating original content. This also allows you to use a descriptive tone and style that reflects your brand identity (arr! Just like ours!).

2. DMCA Dashboard

If, on the other hand, it is other sites that have taken your content, for example a photo, the DMCA Dashboard, a free tool offered by Google, is what you need.

This tool allows copyright holders to report plagiarism to Google so that it can remove a particular page from search results.

3. Canonical Link

The canonical link allows you to specify the official version of a page, signaling to Google which URL to index instead of any variants it might find when crawling the site.

More simply, if you find that you have several versions of the same content on your site, you can choose the one that is the canonical version for you and insert this tag in the <head> of the non-canonical pages:

<link href="https://buzzynerd.com/articolo" rel="canonical">

Choose the page you think is most important and could help you increase visits to the site, or simply the one that already gets the most traffic.
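
To check which canonical URL a page actually declares, a small script can help. The following is a minimal sketch using only Python’s standard library; the class and function names (`CanonicalFinder`, `find_canonical`) and the sample page are my own, made up for the example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, or be missing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Sample page, invented for the example.
page = """
<html><head>
  <title>Article</title>
  <link href="https://buzzynerd.com/articolo" rel="canonical">
</head><body><p>Content</p></body></html>
"""

print(find_canonical(page))
```

Running it over each variant of a page should return the same URL every time. If one variant answers with `None` or with a different address, that page is worth a closer look.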

4. 301 Redirect

In many cases, the best way to solve the problem of duplicate content is to set up a 301 redirect. In fact, when multiple identical pages are combined into a single page, they not only stop competing with each other for the same ranking, but allow the main page to rank better.

If you’re not yet an expert like me, you can use WordPress plugins like Redirection.
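
For the curious, here is what a 301 redirect looks like at the application level. This is a minimal sketch of a Python WSGI app, assuming that `https://buzzynerd.com` (HTTPS, no “www”) is the preferred version. The constants, the `app` function and the `fake_request` helper are invented for this example, and on a real site the redirect is usually configured in the web server or the CMS instead.

```python
CANONICAL_SCHEME = "https"        # assumption: HTTPS is the preferred scheme
CANONICAL_HOST = "buzzynerd.com"  # assumption: the non-www host is preferred


def app(environ, start_response):
    """Tiny WSGI app: 301 to the preferred URL, otherwise serve the page."""
    scheme = environ.get("wsgi.url_scheme", "http")
    host = environ.get("HTTP_HOST", "").lower()
    path = environ.get("PATH_INFO", "") or "/"
    query = environ.get("QUERY_STRING", "")

    if scheme != CANONICAL_SCHEME or host != CANONICAL_HOST:
        target = f"{CANONICAL_SCHEME}://{CANONICAL_HOST}{path}"
        if query:
            target += "?" + query
        # 301 tells browsers and crawlers that the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Welcome aboard!"]


def fake_request(scheme, host, path="/"):
    """Call the app without a server and return (status, Location, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {"wsgi.url_scheme": scheme, "HTTP_HOST": host, "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"].get("Location"), body


print(fake_request("http", "www.buzzynerd.com", "/articolo"))
print(fake_request("https", "buzzynerd.com", "/articolo"))
```

The first call is sent to the preferred address with a 301, while the second one is already in the right place and gets the page itself.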

5. Rel Alternate

When there are different versions of the site, such as multilingual or mobile versions, you should use rel="alternate":

<link rel="alternate" href="http://example.com/article-fr" hreflang="fr-fr" />

<link rel="alternate" href="http://example.com/article-it" hreflang="it-it" />

This way, the crawler will know that it is not duplicate content, but different versions of the same page.
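
When a page has many language versions, writing these tags by hand is error-prone. Here is a small Python sketch that builds them from a mapping of hreflang codes to URLs. The `hreflang_tags` function and the `versions` dictionary are hypothetical names used only for this example.

```python
from html import escape


def hreflang_tags(versions: dict) -> str:
    """Build one <link rel="alternate"> tag per language version.

    `versions` maps an hreflang code (e.g. "fr-fr") to that version's URL.
    """
    lines = []
    for code, url in sorted(versions.items()):
        lines.append(
            f'<link rel="alternate" href="{escape(url, quote=True)}" '
            f'hreflang="{escape(code, quote=True)}" />'
        )
    return "\n".join(lines)


versions = {
    "fr-fr": "http://example.com/article-fr",
    "it-it": "http://example.com/article-it",
}

print(hreflang_tags(versions))
```

The same full set of tags should appear on every language version of the page, including a tag that points to the page itself.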

Our journey is over. Now all you have to do is roll up your sleeves and put my valuable advice into practice!
