It's not that often that something truly new comes along in the world of Search Engines, especially something that the Search Engine companies themselves promote. Whenever it happens, there's always a flurry of activity and a lot of false information being thrown about.
Something new just came along at the SMX conference, the 'Canonical Link Element'. You've probably already read about this, but if you haven't, it's a tag you can add to the header of your webpages to tell the Search Engines what the 'Canonical' (also Normalized) Url is. Canonical or Normalized Urls are the 'one true Url' for that page.
Background on Canonical Urls and Duplicate Content
Canonical Urls are the opposite of duplicate content. Duplicate content is the result of having one distinct page of content returned for different urls. In DotNetNuke, consider all these Urls for a mythical 'domain.com' dnn site:
www.domain.com/
domain.com/
domain.com/default.aspx
domain.com/default.aspx?tabid=38
domain.com/home/tabid/38/default.aspx
domain.com/tabid/38/default.aspx
domain.com/home.aspx
Every single one of the above Urls returns the same page of content : the home page of the site.
The Problem with Duplicate Urls
Many people don't understand what the issue is with duplicate Urls. As I've posted before, there really are two main issues with this:
1. Incorrect Links : If you accept that your site's ranking is the net result of quality sites linking to yours, then the exact Url used in each link is important. If one person links to domain.com and someone else links to domain.com/default.aspx, each of those Urls has one link. It would have been better if domain.com had two links. You can see this in action on big DNN sites like dotnetnuke.com : dotnetnuke.com/default.aspx has a Google PR of 3, while the main domain of dotnetnuke.com has a PR of 8. All of the links currently going to dotnetnuke.com/default.aspx are not helping the main dotnetnuke.com domain. Perhaps it could be a PR of 9 if those extra links were redirected, or not created in the first place.
2. Enforcing a single Url : I've written recently, in Why you should have Long Urls, about why search engines place weight on the Url link text. It's because it can't be spammed, and has to be chosen carefully. If we could all pick different urls for the same page of content, pretty soon someone like me would be writing a utility which allowed you to use 1,000 different urls, all containing different keywords, for the one page. By forcing people to choose a single Url, the search engines force you to carefully define what's on the page.
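The first issue above comes down to simple arithmetic. Here's a minimal Python sketch of it, assuming a made-up list of inbound links and a made-up mapping of duplicate Urls to the preferred one (neither comes from a real site):

```python
from collections import Counter

# Hypothetical inbound links: each entry is the Url another site linked to.
inbound_links = [
    "http://domain.com/",
    "http://domain.com/default.aspx",
]

# Hypothetical mapping of duplicate Urls to the one Url we want to rank.
CANONICAL = {
    "http://domain.com/default.aspx": "http://domain.com/",
}

def count_links(links, canonical_map=None):
    """Count inbound links per Url, optionally folding duplicates together."""
    canonical_map = canonical_map or {}
    return Counter(canonical_map.get(url, url) for url in links)

split = count_links(inbound_links)              # one link per Url
merged = count_links(inbound_links, CANONICAL)  # both links on one Url
print(split)
print(merged)
```

Without the mapping, each Url is credited with one link; with it, the site root is credited with both.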
Canonical Urls
A Canonical Url is simply the one unique Url for a web page. No other Url on the web should show that same content. All the search engines want to encourage you to have canonical Urls for your pages, which is another way of saying they don't want you showing duplicate content.
The Canonical Link Element
To help webmasters define Canonical Urls for their web pages, the Search Engines (Google, Yahoo, Microsoft et al) have come up with the concept of the Canonical Link Element. This is a very simple tag placed in the head of the page:
<head>
<link rel="canonical" href="http://domain.com/">
</head>
So all we have to do is include this link on our pages, and the duplicate content is fixed forever, right?
Well, yes and no. In his recreation of his SMX Canonical Link Element Presentation, Matt Cutts from Google first suggests that you take these actions (the following points are taken from the slides in the linked presentation):
- Change your Content Management System (CMS) to generate only the Urls you want. "Normalize" Urls.
- Pick one "canonical" url and ensure you link consistently within your site
- Make all the non-canonical urls do a permanent (301) HTTP redirect to the canonical/preferred Url
- Google Webmaster Tools : specify www vs non-www
- Break ties in Google by submitting your preferred url in a Sitemaps file
Matt also raises some problems with Duplicate Content:
- Can't help how people choose to link to you
- Uppercase/lowercase paths
- Session Ids, tracking codes, analytics (utm parameters), landing pages
- Sorting by ascending / descending links
In the presentation, Matt urges that you first try to clean up your content by using 301 redirects and avoiding duplicate content Urls in the first place, and only after that use the Canonical link tag. I assume this is because it's easier for search engines to keep track of urls when they are redirected with 301 Permanent Redirects than with the Canonical Link element.
I'm basing this assumption on a further piece of information : like the sitemaps protocol, the Canonical Link element is a hint, not a directive. Compare this to the 301 redirect, which is a directive : the search engine crawler has no choice but to go to the redirected page. With the Canonical Link element, the Search Engine can choose to ignore the link.
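Part of the difference is where each signal lives: a 301 arrives in the HTTP response itself, while the canonical link only turns up once the page markup has been parsed. As a rough sketch (not how any search engine actually does it), this is how a script could read the hint out of a page using Python's standard html.parser module. The class and function names are my own, made up for illustration:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")

def find_canonical(html_text):
    """Return the canonical Url declared in html_text, or None."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical

page = '<head><link rel="canonical" href="http://domain.com/"></head>'
print(find_canonical(page))
```

Whatever the script finds is still only a suggestion; nothing forces the reader of the page to act on it, which is exactly the hint-versus-directive distinction.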
Implementing the Canonical Link Element in DNN
It's very simple to put this link on your DotNetNuke site. You can follow the instructions posted on seablick.com : Using the Canonical Link Tag in DNN
But be aware that a DNN page is not always a single page of content. The DNN page that this blog is on, for example, always uses the same header information - yet there is a unique Url for each blog entry. So you can't use the link header as Tom describes on pages that have modules like the Blog module, which extends the Urls with querystring paths. This applies to most advanced content modules for DotNetNuke, such as product catalogs, blog postings, video and image galleries and social networking pages, because these generally use querystring paths in the Url to show different content.
Getting your Urls Correct before implementing the Canonical Link Element
Revisiting the recommendations from the Google webmaster presentation linked above, and giving them a DotNetNuke slant, you should be doing these actions first:
1. Get DotNetNuke to generate only the Urls you want
2. Pick one canonical Url and use that throughout your site
3. Make all the non-canonical Urls 301 redirect to the Canonical Url
4. Specify www vs non-www
5. Eliminate mixed case issues where possible
6. Submit a sitemap file to specify the preferred Url
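Steps 3, 4 and 5 boil down to one decision per request: is this the canonical Url, and if not, where should the 301 go? Here's a minimal Python sketch of that decision. The preferred host, the list of home page variations and the function name are all assumptions for the mythical domain.com site, and this is not how any particular DNN module implements it:

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.domain.com"  # assumption: www is the chosen form
HOST_ALIASES = {"domain.com", "www.domain.com"}

# Assumption: these (path, querystring) pairs all show the home page.
HOME_VARIANTS = {
    ("/", ""),
    ("/default.aspx", ""),
    ("/default.aspx", "tabid=38"),
    ("/home/tabid/38/default.aspx", ""),
    ("/tabid/38/default.aspx", ""),
    ("/home.aspx", ""),
}

def redirect_target(url):
    """Return the Url to 301 to, or None if url is already canonical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in HOST_ALIASES:
        host = PREFERRED_HOST            # step 4: www vs non-www
    path = (parts.path or "/").lower()   # step 5: no mixed-case paths
    query = parts.query
    if (path, query.lower()) in HOME_VARIANTS:
        path, query = "/", ""            # all home page variations -> site root
    # The fragment is dropped; it never reaches the server anyway.
    canonical = urlunsplit((parts.scheme, host, path, query, ""))
    return None if canonical == url else canonical

print(redirect_target("http://domain.com/Default.aspx?tabid=38"))
print(redirect_target("http://www.domain.com/"))
```

A request handler would send a 301 with the returned Url in the Location header, and serve the page as normal when the function returns None. Note that the sketch only lower-cases the host and path, since querystring values may be case-sensitive.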
This is shameless self promotion, but I solved problems 1, 2, 3, 4 and 5 for my DotNetNuke sites with the creation of the Url Master Module, which performs automatic 301 redirects of all the 'unfriendly' versions of DotNetNuke Urls back to a single, canonical Url. It can also generate only lower-case urls, redirect www to non-www (or vice versa) and eliminate all the various home page variations to show a simple site root Url (www.domain.com/).
Further on point 2 : stay away from the LinkClick.aspx handler. If you use the FCK editor, it's very tempting to use the in-built link chooser to generate the internal links for your site. This generates internal links using a url of 'linkclick.aspx'. Don't be tempted. This quite obviously introduces duplicate Urls (which you'll have to fix one day) but it also fails to place relevant link text into the Url, and invites others to copy those links and use them on their websites, which encapsulates duplicate problems nicely. Just say no to LinkClick.aspx! Learn how to craft your own <a> html and your efforts will pay off.
Problem 6 is solved with my Google Sitemaps Module, which will generate a sitemap file of all the urls in a DotNetNuke site. Unlike the standard 'core' sitemap generator, it allows for other providers to be plugged in, so that the sitemap can generate the full list of urls for modules like the Blog Module, or the popular News Articles module.
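For anyone curious what a sitemap file actually contains, here is a minimal Python sketch that writes one in the sitemaps.org format from a list of canonical Urls. It only illustrates the file format; it is not how the module (or the core generator) builds its output, and the function name and example Urls are made up:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML with one <url><loc> entry per Url."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical canonical Urls for the mythical domain.com site.
sitemap = build_sitemap([
    "http://www.domain.com/",
    "http://www.domain.com/blog.aspx",
])
print(sitemap)
```

The important part is that only the preferred Url for each page goes into the list, never the duplicates.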
Problem Solved, right?
Even after taking care of the points above, there are other times when you need to output a Canonical Link tag. This is particularly important when it comes to complicated DNN Modules that use multiple querystring options for displaying content. I have seen several modules where you can display the same content using slightly different querystring values.
There is another, larger, reason for using Canonical Links : when you are using affiliate, campaign or tracking links. In many cases, website owners would like to track the performance of a particular ad or link. DotNetNuke even has built-in affiliate ID tracking (the 'vendor' features). However, if you create external links by way of banner ads or other text links, those links are going to generate duplicate content problems. That's when Canonical Links are very useful, and indeed, one of the major reasons they were created in the first place.
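To show the idea, here is a minimal Python sketch that strips tracking values from a Url and writes the Canonical Link tag for what's left. The parameter names in TRACKING_PARAMS are assumptions (utm_* are the usual analytics names, 'affiliateid' is just a made-up example), so you'd swap in whatever your own campaigns use:

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: querystring names that only exist for tracking purposes.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "affiliateid"}

def strip_tracking(url):
    """Return url with the tracking parameters removed from its querystring."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the cleaned-up Url."""
    href = escape(strip_tracking(url), quote=True)
    return '<link rel="canonical" href="%s">' % href

print(canonical_link_tag("http://www.domain.com/products.aspx?utm_source=banner&id=7"))
```

The 'id' value stays because it identifies the content, while the banner tracking value is dropped, so every tracked variation of the link points back to the same Url.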
Canonical Links the Easy way with the iFinity Canonical Linker Module
If we can't solve all the requirements of the Canonical Links with existing software, then more software is needed! To this end, I've created the iFinity Canonical Linker module. This DotNetNuke module installs like a regular module, and you can use it on one page or all pages of your site. It contains advanced configuration settings so that it will work with DotNetNuke content modules that use Urls to identify the content (like the Blog Module, or the Forum Module).
It also has filtering so that you can supply tracking/affiliate link names to the module, and it will strip this information from the Url, so that search engines will index only the base content, without the tracking codes.
The module also shows administrators (only) what the link for the current page looks like, without having to go in and view the Html source.
Best of all, it's quick and easy to use and install, and you can have Canonical Links across your site in a couple of minutes. Like all iFinity modules, it's available as a free trial to download and try, and comes with a range of licensing options. To download a trial, just go to the iFinity Canonical Linker product page.