This post is about duplicate content: how to identify and mitigate this major SEO problem, and how to implement technical solutions at the core of your website. By the end, you will be able to define duplicate content, explain why it can create problems for search engine optimization, identify common instances of it, and list three potential solutions.
What is duplicate content?
What is duplicate content, and how does it affect a page’s SEO? Well, first off, it’s the most common problem you’re going to run into as an SEO. Here’s the basic idea. You have two pages or two pieces of content that are identical. When a search engine comes to those two pieces of content, it has no idea which one is the original, or which one should rank well in search results. So it may not rank either of them, or it will arbitrarily pick one version. You don’t want that as an SEO, because you have no influence over which one it picks.
What does duplicate content look like?
There are a couple of examples of this. One of the more common is when a blog post is written, and then another blog, or a scammer blog, copies the same content and posts it on its own website. Now you have two articles that are exactly the same, word for word, and the search engines have to figure out who posted it first and who owns that content. For a lot of reasons, that can be very difficult. Another example is duplicate content on your own website, created by the site’s programming or by tracking parameters. Say you’ve published a blog post, or a new product page, and you’ve linked to it from an email and also from social media.
So the URL might be slightly different, but from the page perspective and the search engine perspective, the content is completely identical. Again, you run into the problem where the search engine doesn’t know which is the most important or the original page, so it will either rank one or neither. Both are big problems, and both are extremely common.
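To make the tracking-parameter case concrete, here is a minimal Python sketch that strips tracking parameters so that the email link and the social media link collapse to the same URL. The example.com URLs, the function name strip_tracking, and the list of parameter names are assumptions made for illustration; your own site may use different parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameter names treated as tracking-only. This list is an assumption
# for the sketch; adjust it to whatever your own campaigns append.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def strip_tracking(url: str) -> str:
    """Return the URL without tracking parameters or a fragment."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.lower().startswith(TRACKING_PREFIXES)
        and name.lower() not in TRACKING_NAMES
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Hypothetical links to the same product page from two campaigns.
email_url = "https://example.com/new-product?utm_source=newsletter&utm_medium=email"
social_url = "https://example.com/new-product?utm_source=twitter"

print(strip_tracking(email_url) == strip_tracking(social_url))
```

Parameters that actually change what the page shows, such as a product ID, are kept, so only URLs that differ purely by tracking collapse together.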
Duplicate Content by CMS
Now, many times a CMS or content management system can produce duplicated content. You see, when you create a page, it’s added to a database, and then it’s published through instructions that pull from that database. These instructions can cause the same page to be published in different areas of the website. This is a problem when the same content is accessible at two or more different URLs.
Here’s why. First, the search engine does not know which page is the primary page, so it has to choose which to rank and which to ignore. Second, if both pages are being linked to from outside sources, you’re now dividing the link benefit. Ideally, you want all of the links going to a single page so that the content gains relevancy.
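One rough way to spot the same content living at two or more URLs is to fingerprint the text of each page and group the URLs that share a fingerprint. The sketch below assumes you already have the page text in memory; the URLs, the sample text, and the function names are hypothetical, and it only catches pages that match exactly once whitespace and letter case are normalized.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose page text has the same fingerprint."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical crawl result: URL -> extracted page text.
pages = {
    "https://example.com/blog/my-post/": "Our new product is here.",
    "https://example.com/?p=123": "Our new  product is here.\n",
    "https://example.com/about/": "About our company.",
}

for group in find_duplicates(pages):
    print(group)
```

Each group printed is a set of URLs that a search engine would have to choose between.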
Now that we have some understanding of what duplicate content is, let’s explain more of where it surfaces. One of the common examples of this is with WordPress. WordPress is a content management system that is used for a large number of websites on the internet. The problem is the default settings with WordPress create a lot of duplicate content.
You can write a blog post, or create a page that advertises your new product or service, and with the default settings of WordPress that content will show up on tag pages, on archive pages, on author pages, and on the homepage, in addition to the blog post’s own page. So it’s going to exist in multiple different places. When a search engine comes to it, it’s going to have no idea which is the most important version, so again, it may just arbitrarily pick one. Sometimes your competition in the search results is within your own website.
Subcategory of a Parent Category
The author section and the category section can all get very confusing, very fast. This also happens with sorted pages. On an e-commerce site, for example, you’ll have a list of cars, and then you want to sort it: show me the cars that are red. What happens is you get a subset of the master list.
Now it has the cars that are red, but all of the information describing the cars on that page is identical to the page before, just with less data. So from a search engine perspective, you have all the same information again, just expressed in a slightly different way and accessible from a different page. Functionally, it’s correct, and it shows users the information they need, but it can work against you with search engines and how they deal with duplicated content.
Fixing Duplicate Content
Now that we’ve identified how difficult and common a problem duplicate content is, let’s talk about some of the solutions. I need to warn you, some of these are quite technical. If you’ve identified a problem, you’re going to have to work with an IT team or a skilled programmer to fix it.
Remove and Block
The first method is to remove and block. Let’s take the WordPress example.
Say you have a blog post you just launched. You’re excited about it, you’re quickly moving along, and then you realize it exists in different places: it’s also on your author page, category page, archive page, and probably more. What you’re going to want to do is remove it from those other pages, or block those pages entirely from being spidered by the search engines.
Block Access to the Author’s Page or Content
This can be accomplished in a few ways. First, depending upon the content management system, you can talk to your IT team or provider about this issue. Second, you can block access to the author pages or content. If you’re using WordPress, you can get plugins to help manage this. Blocking access is usually done through the robots.txt file: you add the directories that contain duplicated content to the Disallow instructions.
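As a rough illustration of the Disallow instructions, the Python sketch below holds a small robots.txt as a string and uses the standard library’s robots.txt parser to check which URLs a crawler would be allowed to fetch. The /author/ and /tag/ directories and the example.com URLs are assumptions; use whichever directories hold duplicated content on your own site.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that blocks author and tag listings.
ROBOTS_TXT = """\
User-agent: *
Disallow: /author/
Disallow: /tag/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url: str, user_agent: str = "*") -> bool:
    """Report whether the rules above allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/author/jane/"))
print(is_crawlable("https://example.com/my-new-post/"))
```

Checking a few URLs like this before you publish the file is a cheap way to make sure you have not blocked the original post along with the duplicates.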
Rel=Canonical
Now, the next fix is a code instruction called rel=canonical. This is an instruction that the search engines created to help remedy this exact problem. If you have duplicate content, you can use this instruction to tell the search engines that the page where they found this content is not the original, and it includes a link to the primary, original page. The rel=canonical solution is a better solution than the next option, which is called noindex, follow.
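For reference, the instruction is a link element placed in the head of the duplicate page. Here is a minimal Python sketch that builds that element for a given primary URL; the function name and the example.com URL are hypothetical, and in practice your CMS or a plugin would usually generate this tag for you.

```python
from html import escape


def canonical_link_tag(primary_url: str) -> str:
    """Build the link element that points a duplicate page at its original."""
    return f'<link rel="canonical" href="{escape(primary_url, quote=True)}">'


# Hypothetical primary URL for a product page that also exists
# under tracking or sorting variations.
print(canonical_link_tag("https://example.com/new-product"))
```

The same tag goes on every duplicate version of the page, always pointing at the one URL you want to rank.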
Noindex, Follow
Noindex, follow is a code snippet that, again, your IT team or a programmer would have to implement. The general idea is that you add this code to the page to say that this page is not important, so don’t index it, but do follow any links on it, as you still want those to count. Again, this is a code addition, so don’t just do it if you’re not confident. The point here is to let you know that there are solutions to this issue. Because these problems are created through technical programming, they require just as much technical programming expertise to remedy.
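The snippet in question is a robots meta tag in the head of the page. A minimal Python sketch that builds it is below; the function name and its two flags are hypothetical, and this is only meant to show what the tag looks like, not how your CMS should emit it.

```python
def robots_meta_tag(index: bool, follow: bool) -> str:
    """Build a robots meta tag from two yes/no choices."""
    directives = [
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ]
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'


# Keep the page out of the index, but let its links count.
print(robots_meta_tag(index=False, follow=True))
```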
Key points from this post
- Duplicate content is the same content accessible from different URLs.
- Search engines find it difficult to identify which version of duplicate content is the correct one to rank in the search results.
- An example of duplicate content is one article or blog post with different URLs, because it is available in different locations like the homepage, author page, and category page.
- Some solutions for duplicate content are remove and block, rel=canonical, and noindex, follow.