rel=canonical: the ultimate guide

https://yoast.com/rel-canonical/

A canonical URL lets you tell search engines that certain similar URLs are actually the same. Sometimes you have products or content that can be found on multiple URLs, or even on multiple websites. By using canonical URLs (HTML link tags with the attribute rel=canonical), you can have these on your site without harming your rankings.

The rel=canonical element, often called the “canonical link”, is an HTML element that helps webmasters prevent duplicate content issues. It does this by specifying the “canonical URL”, the “preferred” version of a web page – the original source, even. Using it well improves a site’s SEO.

The idea is simple: if you have several similar versions of the same content, you pick one “canonical” version and point the search engines at it. This solves the duplicate content problem where search engines don’t know which version of the content to show in their results. This article takes you through how and when to use canonical URLs, and how to avoid common mistakes.

The SEO benefit of rel=canonical

Choosing a proper canonical URL for every set of similar URLs improves the SEO of your site. This is because the search engine knows which version is canonical, so it can count all the links pointing at all the different versions as links to the canonical version. Setting a canonical is similar in concept to a 301 redirect, only without actually redirecting.

History of rel=canonical

In February 2009 Google, Microsoft and Yahoo! introduced the canonical link element. If you want to learn about its history, Matt Cutts’ post gives the clearest explanation. While the idea is simple, the specifics of how to use it are often complex.

The process of canonicalization

When you have several choices for a product’s URL, canonicalization is the process of picking one of them. In many cases, it’ll be obvious: one URL will be a better choice than others. In some cases, it might not be as obvious, but even then it’s still pretty simple: just pick one! Not canonicalizing your URLs is always worse than canonicalizing your URLs.

Ironic side note

The term Canonical comes from the Roman Catholic tradition, where a list of sacred books was created and accepted as genuine and named the canonical Gospels of the New Testament. The irony is it took the Roman Catholic church about 300 years and numerous fights to come up with the canonical list, and they eventually chose four versions of the same story…

How to set canonical URLs

A correct example of using rel=canonical

Let’s assume you have two versions of the same page, each with exactly – 100% – the same content. The only difference is that they’re in separate sections of your site and because of that the background color and the active menu item are different – that’s it. Both versions have been linked to from other sites, so the content itself is clearly valuable. So which version should search engines show in results?

For example, these could be their URLs:

  • https://example.com/wordpress/seo-plugin/
  • https://example.com/wordpress/plugins/seo/

This is what rel=canonical was invented for and, unfortunately, this happens fairly often, especially in a lot of e-commerce systems. A product can have several different URLs depending on how you got there. In this case, you would apply rel=canonical as follows:

  1. Pick one of your two pages as the canonical version. This should be the version you think is the most important. If you don’t care, pick the one with the most links or visitors, and if all else is equal, flip a coin. You just need to choose.
  2. Add a rel=canonical link from the non-canonical page to the canonical one. So if we picked the shorter URL as our canonical URL, the other URL would link to it in the <head> section of the page, like this:
    <link rel="canonical" href="https://example.com/wordpress/seo-plugin/" />

    That’s it. Nothing more, nothing less.

What this does is “merge” the two pages into one from a search engine’s perspective. It’s a “soft redirect”, without redirecting the user. Links to both URLs now count towards the single, canonical version of the URL.
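As a minimal sketch of the two steps above, the snippet below builds the link element that each duplicate page would output. The helper name `canonical_link_tag` and the `DUPLICATES` list are assumptions made for illustration, not part of any CMS or plugin API.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Return the <link> element a page would carry in its <head>."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


# Hypothetical setup: both example URLs from the text, one chosen as canonical.
CANONICAL = "https://example.com/wordpress/seo-plugin/"
DUPLICATES = [
    "https://example.com/wordpress/seo-plugin/",
    "https://example.com/wordpress/plugins/seo/",
]

# Every version of the page emits the same tag, pointing at the chosen URL.
tags = {url: canonical_link_tag(CANONICAL) for url in DUPLICATES}

if __name__ == "__main__":
    for url, tag in tags.items():
        print(url, "->", tag)
```

The important property is that the tag depends only on the chosen canonical URL, never on which duplicate is being rendered.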

Setting the canonical URL in Yoast SEO

Our Yoast SEO WordPress plugin lets you change the canonical URL of several page types in the plugin settings. You only need to do this if you want to change the canonical to something different from the current page’s URL. Yoast SEO already renders the correct canonical URL for almost any page type in a WordPress install.

For posts, pages, and custom post types, you can edit the canonical URL in the advanced tab of the Yoast SEO metabox:

For categories, tags and other taxonomy terms, you can change the canonical URL in the same place in the Yoast SEO metabox too. If you have other advanced use cases, you can also use the wpseo_canonical filter to change the Yoast SEO output.

When should you use canonical URLs?

301 redirect or canonical?

If you are unsure whether to do a 301 redirect or set a canonical, what should you do? The answer is simple: you should always do a redirect, unless there are technical reasons not to. If you can’t redirect because that would harm the user experience or be otherwise problematic, then set a canonical URL.

Should a page have a self-referencing canonical URL?

In the example above, we link the non-canonical page to the canonical version. But should a page set a rel=canonical for itself? This question is a much-debated topic amongst SEOs. At Yoast, we strongly recommend having a canonical link element on every page and Google has confirmed that’s best. That’s because most CMSs will allow URL parameters without changing the content. So all of these URLs would show the same content:

  • https://example.com/wordpress/seo-plugin/
  • https://example.com/wordpress/seo-plugin/?isnt=it-awesome
  • https://example.com/wordpress/seo-plugin/?cmpgn=twitter
  • https://example.com/wordpress/seo-plugin/?cmpgn=facebook

The issue is that if you don’t have a self-referencing canonical on the page that points to the cleanest version of the URL, search engines may treat each of these parameterized URLs as a separate page with duplicate content. Even if you never create such URLs yourself, someone else could link to your pages with arbitrary parameters and cause a duplicate content issue, so adding a self-referencing canonical to URLs across your site is a good “defensive” SEO move. Luckily, our Yoast SEO plugin does this for you.
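To illustrate how the four URLs above collapse onto one clean URL, here is a small audit-style sketch. The function name `clean_canonical` and the `keep_params` allowlist are assumptions for this example. Dropping every parameter is only safe when parameters never change the content, and in production the canonical should come from the content's own stored URL rather than from the incoming request, as discussed under common issues in this guide.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def clean_canonical(url: str, keep_params: tuple = ()) -> str:
    """Strip the fragment and every query parameter not listed in keep_params."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in keep_params
    ]
    # Rebuild the URL with only the allowed parameters and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


VARIANTS = [
    "https://example.com/wordpress/seo-plugin/",
    "https://example.com/wordpress/seo-plugin/?isnt=it-awesome",
    "https://example.com/wordpress/seo-plugin/?cmpgn=twitter",
    "https://example.com/wordpress/seo-plugin/?cmpgn=facebook",
]

if __name__ == "__main__":
    for variant in VARIANTS:
        print(variant, "->", clean_canonical(variant))
```

A check like this can be run over crawled URLs to confirm that every variant carries the same canonical as its clean version.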

Cross-domain canonical URLs

Perhaps you have the same piece of content on several domains. There are sites or blogs that republish articles from other websites on their own, as they feel the content is relevant for their users. In the past, we had websites republishing articles from Yoast.com as well (with express permission), but if you had looked at the HTML of every one of those articles you’d have found a rel=canonical link pointing right back to our original article. This means all the links pointing to their version of the article count towards the ranking of our canonical version. They get to use our content to please their audience, and we get a clear benefit from it too. Everybody wins.

Faulty canonical URLs: common issues

There are many examples out there of how a wrong rel=canonical implementation can lead to huge issues. I’ve seen several sites where the canonical on their homepage was pointed at an article, only to see their home page disappear from search results. There are other things you should never do with rel=canonical. Here are the most important:

  • Don’t canonicalize a paginated archive to page 1. The rel=canonical on page 2 should point to page 2. If you point it to page 1, search engines will actually not index the links on those deeper archive pages…
  • Make them 100% specific. For various reasons, many sites use protocol-relative links, meaning they leave the http / https bit off their URLs. Don’t do this for your canonicals. You have a preference, so show it.
  • Don’t base your canonical on the request URL. If you use variables like the domain or request URL used to access the current page while generating your canonical, you’re doing it wrong. Your content should be aware of its own URLs. Otherwise, you could still have the same piece of content on, for instance, example.com and www.example.com and have each of them canonicalize to themselves.
  • Don’t put multiple rel=canonical links on a page. When we encounter this in WordPress plugins, we try to reach out to the developer doing it and teach them not to, but it still happens. And when it does, the results are wholly unpredictable.
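Two of the issues above, multiple canonical links and protocol-relative canonicals, can be detected mechanically. The following is a rough sketch using only Python's standard library. The names `CanonicalCollector` and `audit_canonicals` are made up for this example, and the check does not cover pagination or request-URL problems.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before matching.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.hrefs.append(attr.get("href") or "")


def audit_canonicals(html: str) -> list:
    """Return a list of human-readable problems found in the page's canonicals."""
    collector = CanonicalCollector()
    collector.feed(html)
    problems = []
    if not collector.hrefs:
        problems.append("no canonical link found")
    if len(collector.hrefs) > 1:
        problems.append("multiple canonical links found")
    for href in collector.hrefs:
        parts = urlsplit(href)
        # A fully specific canonical names both the protocol and the host.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"canonical is not an absolute URL: {href!r}")
    return problems
```

An empty list means none of these particular problems was found. It does not prove the canonical points at the right page.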

rel=canonical and social networks

Facebook and Twitter honor rel=canonical too, and this might lead to weird situations. If you share a URL on Facebook that has a canonical pointing elsewhere, Facebook will share the details from the canonical URL. In fact, if you add a ‘like’ button on a page that has a canonical pointing elsewhere, it will show the like count for the canonical URL, not for the current URL. Twitter works in the same way.

Advanced uses of rel=canonical

Google also supports a canonical link HTTP header. The header looks like this:

Link: <https://www.example.com/white-paper.pdf>; rel="canonical"

Canonical link HTTP headers can be very useful when canonicalizing files like PDFs, so it’s good to know that the option exists.
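As a small sketch of how a server-side script might produce that header for a file such as a PDF, the helper below returns the header name and value as a pair. The name `canonical_link_header` is an assumption for illustration. How the pair is attached to a response depends on your web server or framework.

```python
def canonical_link_header(canonical_url: str) -> tuple:
    """Return the (name, value) pair for a canonical Link HTTP header."""
    # Angle brackets would terminate the URL part of the header early.
    if "<" in canonical_url or ">" in canonical_url:
        raise ValueError("canonical URL must not contain angle brackets")
    return ("Link", f'<{canonical_url}>; rel="canonical"')


if __name__ == "__main__":
    name, value = canonical_link_header("https://www.example.com/white-paper.pdf")
    print(f"{name}: {value}")
```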

Using rel=canonical on not so similar pages

While I wouldn’t recommend this, you can definitely use rel=canonical very aggressively. Google honors it to an almost ridiculous extent, where you can canonicalize a very different piece of content to another piece of content. However, if Google catches you doing this, it will stop trusting your site’s canonicals and thus cause you more harm…

Using rel=canonical in combination with hreflang

We also talk about canonical in our ultimate guide to hreflang. That’s because it’s very important that when you use hreflang, each language’s canonical points to itself. Make sure that you understand how to use canonical well when you’re implementing hreflang, as otherwise, you might kill your entire hreflang implementation.
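The rule that each language version's canonical points to itself is easy to verify once you have a mapping of page URL to declared canonical. The sketch below assumes such a mapping already exists, for example from a crawl. The function name and the example URLs are hypothetical.

```python
def hreflang_canonical_conflicts(canonicals: dict) -> list:
    """Return the URLs whose canonical points somewhere other than themselves."""
    return [url for url, canonical in canonicals.items() if canonical != url]


# Hypothetical language versions: the German page wrongly points at the English one.
pages = {
    "https://example.com/en/": "https://example.com/en/",
    "https://example.com/de/": "https://example.com/en/",
}

if __name__ == "__main__":
    for url in hreflang_canonical_conflicts(pages):
        print("canonical does not point to itself:", url)
```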

Conclusion: rel=canonical is a power tool

Rel=canonical is a powerful tool in an SEO’s toolbox, but like any power tool, you should use it wisely as it’s easy to cut yourself. For larger sites, the process of canonicalization can be very important and lead to major SEO improvements.

The post rel=canonical: the ultimate guide appeared first on Yoast.