September 20, 2021

The 2021 Beginner’s Guide to Canonical Tags

SEO is important for every website, whether it’s an online store, social media hub, or a place to do business. As websites get created and landing pages fill with content, some of that content can get duplicated along the way, confusing search engines as they work to crawl and index pages for better search rankings.

But what happens when those duplications happen, and what does that mean for your website as a whole? In this article, we’ll go over the fix for these duplications, a process known as canonicalization, with tips on what you can do to ensure your website maintains a positive user and search engine experience.

What is canonicalization?

Put simply, canonicalization is the process of declaring that one page or URL is the “primary” version among several duplicates. For instance, let’s say you have an eCommerce website dedicated to marketing your fashion brand. You’ve created pages for all the colors of a top-selling dress, but each of those pages contains roughly the same content and product description. The only difference is the color you’re presenting.

While this might seem like the logical thing to do, try to imagine how many pages your website could have if every item has two, three, or ten separate pages for colors. Now try to imagine a search crawler having to determine which of those pages is the most important for indexing. Sounds a bit much, right? Well, it’s the same for search crawlers, which can get confused as they work to figure out where to focus their attention. To them, it looks as if you’re padding your site with duplicate content. As a result, they may crawl your site less often, miss some of your more unique content, rank your content lower, or choose the wrong “original” URL for that content altogether. In short, it’s an SEO nightmare.

Why is canonicalization important?

Canonicalization is important because it provides an SEO-friendly answer to duplicate content. It tells Google and other search engines that while you have multiple pages for similar content, you’d like them to focus on one primary page or URL for crawling and indexing purposes. In other words, it says “These are all important pages, but this one is the most important.”

Going back to the example from before, let’s pretend a search crawler comes across the main category page for your store’s top-selling dress, plus variations of that page for the dress’s different color and size options.

Likewise, let’s pretend this is the URL of that dress’s main category page:
www.yourfashionstore.com/clothing/dress/topsellingdress

Now, let’s filter for the color green, keeping note of the parameter added:
www.yourfashionstore.com/clothing/dress/topsellingdress?color=green

If we filter for a size 10, we can see another parameter added as well:
www.yourfashionstore.com/clothing/dress/topsellingdress?size=10&color=green

Likewise, if a product category has multiple pages, it’s ideal to have the canonical be the main category page to prevent duplicate content. That URL could look something like this:
www.yourfashionstore.com/clothing/dress/topsellingdress?page=2

To a human, these URLs all represent a single page. To a search crawler, however, each of these URLs represents a unique “page.” Even in this limited example, it’s easy to understand how a search crawler could get confused, decide to stop crawling, or pick the wrong URL as the “primary.” After all, the content is only marginally different. It also makes metrics harder to track: signals such as links and traffic for a single product or topic end up split across several URLs instead of being consolidated on one.
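To make the crawler’s view concrete, here is a minimal Python sketch that collapses the example URLs above into one canonical target by dropping the query string. It assumes an https:// prefix (the URLs above are written without one) and assumes that every parameter on this page is a filter that can be safely ignored, which won’t be true for every site.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment so filter variants share one URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# The four example URLs, with an assumed https:// scheme added.
base = "https://www.yourfashionstore.com/clothing/dress/topsellingdress"
variants = [
    base,
    base + "?color=green",
    base + "?size=10&color=green",
    base + "?page=2",
]

# Four URLs as the crawler sees them, one canonical target after cleanup.
unique_targets = {canonical_url(u) for u in variants}
print(len(variants), "variants ->", len(unique_targets), "canonical URL")
```

A real store would likely keep an allow-list of parameters that do change the content, rather than stripping everything.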

This same kind of duplication happens in other URL types, from search parameters and session IDs to www variants, https variants, and more. Because a search crawler could pick the wrong “original” URL in any of these scenarios, these are exactly the cases where you want to employ canonicalization, aka a canonical tag.

What does a canonical tag look like?

In a nutshell, canonical tags are the snippets of HTML code that define the main URL among duplicate, near-duplicate, or similar web pages. They’re the concrete form of the process we’ve discussed above, and you can recognize them by this attribute: rel="canonical"

Placed within the <head> section of a page, they use simple and consistent syntax, making them an easy-to-use solution to problems associated with duplicate content. As an added bonus, they work especially well for syndicated content across multiple domains, as they help to consolidate page ranking to your preferred URL. This means that similar or duplicate content won’t have to compete for traffic or ranking in search engines.

How can I implement canonical tags on my website?

HTML Tag:

The HTML tag is the most obvious way to implement a canonical, and it’s also the simplest. All you need to do is add the following code to the <head> section of any duplicate page. Here’s the code for our fashion store example: <link rel="canonical" href="https://www.yourfashionstore.com/canonical-page/" />
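If your pages are built from templates rather than written by hand, the tag can be generated. The following is a small Python sketch, not a feature of any particular CMS; the function name canonical_link_tag is made up for this example. It escapes the URL so that characters such as & stay valid inside the href attribute.

```python
from html import escape


def canonical_link_tag(url: str) -> str:
    """Return a <link rel="canonical"> element for the given absolute URL."""
    # Escape &, <, > and quotes so the URL is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


tag = canonical_link_tag("https://www.yourfashionstore.com/canonical-page/")
print(tag)
```

The output goes inside the <head> section of each duplicate page, exactly like the handwritten tag.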

HTTP Header:

Canonicals can also be set in a page’s HTTP response headers by sending a Link header with rel="canonical". This is the method to use for documents like PDFs, which don’t contain a <head> section and therefore have nowhere to put the HTML tag.
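As a rough illustration, the Python sketch below builds the value of such a Link header. The PDF URL and the function name are hypothetical, and how you actually attach the header depends on your web server or framework, which this example doesn’t cover.

```python
def canonical_link_header(url: str) -> str:
    """Build the value of an HTTP Link header that marks `url` as canonical."""
    # Format: the target URL in angle brackets, then the relation type.
    return f'<{url}>; rel="canonical"'


# Hypothetical PDF that exists at several URLs on the example store.
header_value = canonical_link_header(
    "https://www.yourfashionstore.com/downloads/size-guide.pdf"
)
print("Link:", header_value)
```

The server would send this header with every duplicate copy of the PDF, pointing each one at the preferred URL.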

Sitemap:

Google has made it clear that when it comes to sitemaps, only canonical URLs should be listed. Because a sitemap is a useful way to tell Google which pages you deem the most important on your site, listing only your preferred URLs there is a simple way of signaling canonicals for larger websites.
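Here is one way that could look in practice: a short Python sketch that writes a sitemap containing only canonical URLs. The page-to-canonical mapping is invented for the example; on a real site it would come from your CMS or a crawl.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls) -> str:
    """Return sitemap XML with one <url><loc> entry per canonical URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in canonical_urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical mapping of every known page URL to its canonical URL.
dress = "https://www.yourfashionstore.com/clothing/dress/topsellingdress"
page_to_canonical = {
    dress: dress,
    dress + "?color=green": dress,
    "https://www.yourfashionstore.com/": "https://www.yourfashionstore.com/",
}

# List each canonical once; the filtered variant never reaches the sitemap.
sitemap_xml = build_sitemap(sorted(set(page_to_canonical.values())))
print(sitemap_xml)
```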

Internal Links:

Internal links also play a role in canonicalization, acting as signals when you link from one page of your site to another, so it helps to link consistently to the canonical URL rather than to its variants. Beyond links, Google also prefers HTTPS URLs over HTTP and tends to favor cleaner, “prettier” URLs.

What are the SEO best practices for canonical tags?

As with any website and SEO strategy, there are some ground rules when it comes to applying canonical tags to your pages. These are as follows:

Use absolute URLs: Avoid using relative paths in the rel="canonical" link element. In the above store example, this means your canonical should include the protocol and domain, like so: <link rel="canonical" href="https://www.yourfashionstore.com" />.
Use lowercase URLs: Keep in mind that Google may treat uppercase and lowercase URLs as two different URL paths. It’s therefore good practice to use lowercase URLs in your canonical tags and throughout your site.
Use self-referential canonical tags: While not mandatory, self-referential canonical tags provide an SEO benefit in that they give a clear message as to which page you want to have indexed. Though most modern CMS platforms add these automatically, you’ll want to hardcode them in yourself if your CMS happens to be custom.
Use the correct domain protocol (HTTP vs. HTTPS): One of the more common mistakes people make when swapping from HTTP to HTTPS is that they forget to fix their canonical tags. In other words, if you’re running your website on HTTPS, the canonical tag could still be telling the search engine to look at the HTTP version instead. If you’ve set up your 301 redirects correctly, the canonical points Google to the HTTP version and the redirect sends it straight back to HTTPS, in a never-ending loop.
Use only one canonical tag per page: Another common mistake is placing more than one canonical tag on a page. This typically happens when a webmaster copies a page template without thinking to change the target of rel="canonical", and it runs the risk of Google ignoring all of those tags.
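Several of these rules can be checked automatically. The Python sketch below, built on the standard library’s HTML parser, flags pages with a missing or repeated canonical tag, a relative URL, uppercase characters, or a non-HTTPS target. It assumes your site runs on HTTPS, and it doesn’t check whether a tag is self-referential, since that requires knowing the page’s own URL. The class and function names are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values:
            self.hrefs.append(attr_map.get("href") or "")


def check_canonical(html: str) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    collector = CanonicalCollector()
    collector.feed(html)
    problems = []
    if len(collector.hrefs) != 1:
        problems.append(f"expected 1 canonical tag, found {len(collector.hrefs)}")
    for href in collector.hrefs:
        parts = urlsplit(href)
        if not (parts.scheme and parts.netloc):
            problems.append(f"not an absolute URL: {href}")
        elif parts.scheme != "https":  # assumes the site is served over HTTPS
            problems.append(f"not HTTPS: {href}")
        if href != href.lower():
            problems.append(f"contains uppercase characters: {href}")
    return problems


page = (
    '<head><link rel="canonical" href="/Dress" />'
    '<link rel="canonical" href="http://www.yourfashionstore.com/" /></head>'
)
for problem in check_canonical(page):
    print(problem)
```

Running a check like this across your templates after a redesign or an HTTPS migration is a quick way to catch the mistakes described above before a search engine does.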

Want to Know More About Canonical Tags? Chat with the Web Development Experts at VELOX Media.

While they might seem complicated at first, canonical tags are a valuable and easy-to-implement part of SEO for your website. That being said, should you have any further questions about how they work or how to use them, we’d love to help! Our web development experts work with a variety of clients daily, bringing search-informed data structure and intelligent UX to websites across multiple industries.

Contact VELOX Media to learn more about how we can help improve website performance for your business today.