I tend to receive a lot of emails, enquiries and forum posts about Url Rewriting and Friendly Urls in DotNetNuke. Generally the questions are related to improving the Urls on a site for the purposes of Search Engine Optimisation.
This post is intended to be a go-to resource for those questions, as the answer generally involves a bit of education on how the whole Url scheme in DNN fits together. It's important to understand the basics before trying to improve upon them. The aim is to educate on the issues and constraints at hand.
I'll use some terms throughout this post, so to clear up confusion, this is what I mean:
Url Rewriting : This is the process whereby the Url for a page is intercepted by the website platform, and modified internally, so that the rest of the processing sees the modified Url, rather than the original Url. In DotNetNuke, this is typically the process of taking a pagename and converting it to a tabid value, so that the rest of the modules in the installation can identify the current page by TabId. The person viewing the page does not see the change in Url, so the page appears to be located at the original url location.
Friendly Urls : Urls that have very few or no 'database' id's in them. As a starting point for DNN, this means having no 'TabId' value in the Url. Friendly Urls also mean not having a query string for relatively static pages.
Querystring Path : This refers to the part of a DNN Url which forms part of a querystring in a rewritten Url. Many popular DNN modules use a querystring path, such as the blog module, forums module, and others. Generally, the Querystring Path contains key/value pairs like EntryId/34/ or /forum/11/post/12. These are rewritten to look like entryid=34 or forum=11&post=12.
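To make the Querystring Path idea concrete, here is a minimal Python sketch of the key/value pairing described above. DNN itself is an ASP.NET application, so this is not DNN's code; the function name querystring_path_to_query is hypothetical and exists only for illustration.

```python
def querystring_path_to_query(path: str) -> str:
    """Turn a path of key/value segments into a querystring.

    Hypothetical helper for illustration only; not part of DNN.
    """
    # Split on "/" and drop the empty segments left by leading/trailing slashes.
    segments = [s for s in path.split("/") if s]
    # Pair the segments up as key, value, key, value ...
    # A trailing unpaired segment is simply ignored in this sketch.
    pairs = zip(segments[0::2], segments[1::2])
    return "&".join(f"{key.lower()}={value}" for key, value in pairs)


print(querystring_path_to_query("EntryId/34/"))        # entryid=34
print(querystring_path_to_query("/forum/11/post/12"))  # forum=11&post=12
```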
SEO Friendly Urls
For the purposes of this entry, I'm going to be making the assumption that an SEO friendly Url has the following features:
- limited amount of path depth (not too many "/"'s in the url)
- maximum amount of relevant keywords in the Url
- minimum amount of non-keyword 'noise' in the Url (chiefly this refers to removing database Id's and productId, TabId, etc)
- all words spaced with a choice of path separation (for example, using +, _ or - to separate keywords in the Url)
The three-legged stool of DNN Urls : Overview of the DNN Url Scheme
I tend to use this analogy when talking about DNN Urls : the whole scheme sits on a 3-legged stool, with each leg being an important part of the system.
Leg 1 is Friendly Url Generation : the process of programmatically creating Friendly Urls for use wherever a page of dynamic content is generated. The most important use of generated Friendly Urls in DNN is in the Menu system. All menu providers dynamically build up the menu using a series of calls to the Friendly Url API in DNN, which returns the destination Urls for the relevant pages. This process is also used for generating other links, like breadcrumbs, blog posts - anywhere a module generates links automatically. Importantly, the only place Friendly Url Generation isn't used is when the editor of a website inserts a link manually into some Html text.
The Standard DNN Friendly Url Provider generates links that look like one of these three options:
/tabid/45/default.aspx : 'search friendly' urls
/pagename/tabid/45/default.aspx : 'friendly' urls
/pagename.aspx : 'human friendly' urls.
The three options are chosen by modifying the Friendly Url Provider 'urlFormat' attribute. All standard DNN installs will use the /pagename/tabid/45/default.aspx scheme.
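The three Url shapes above can be sketched as a single generation function. This is a rough Python illustration, not the provider's real API: the function name friendly_url and the style labels passed to it are assumptions taken from the descriptions in this post, not the actual 'urlFormat' values.

```python
def friendly_url(page_name: str, tab_id: int, style: str) -> str:
    """Build a page Url in one of the three shapes described above.

    Hypothetical helper; the style labels are illustrative, not real
    configuration values.
    """
    if style == "search friendly":
        return f"/tabid/{tab_id}/default.aspx"
    if style == "friendly":
        return f"/{page_name}/tabid/{tab_id}/default.aspx"
    if style == "human friendly":
        return f"/{page_name}.aspx"
    raise ValueError(f"unknown style: {style!r}")


# A menu provider would make one call like this per page in the menu.
for style in ("search friendly", "friendly", "human friendly"):
    print(friendly_url("pagename", 45, style))
```

The point of the sketch is that every generated link goes through one function, which is why changing the scheme in one place changes menus, breadcrumbs and module links together.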
Leg 2 is Url Rewriting : after Friendly Urls are generated by the system for use in internal site links, they will be clicked on by visitors and come back to the server as requests. These requests don't map to any specific file on the server : all requests for DNN pages are handled by the /default.aspx page. As such, the requests need to be rewritten to be handled by the /default.aspx page, and to contain any other information in a querystring.
The Standard DNN Url Rewriter rewrites these urls in either of two ways:
1) using the FriendlyUrls.config file regex rules to identify a match and associated rewrite pattern. This is how the pagename/tabid/45/default.aspx Urls get rewritten. If you look through the rules in the config file, you'll see a pattern matching /tabid/xx and rewriting it to /default.aspx?tabid=xx.
2) using a lookup on the tabs table by using the 'human friendly' name, and rewriting the Url that way. So 'pagename' is looked up, identified as tabid 45, and the Url is rewritten as default.aspx?tabid=45
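The two rewriting routes can be sketched together as follows. This is a minimal Python illustration under stated assumptions: the regex patterns are written in the spirit of the /tabid/xx rule rather than copied from any config file, and the TABS dictionary is a hypothetical stand-in for the tabs table.

```python
import re
from typing import Optional

# Hypothetical stand-in for the tabs table: page name -> TabId.
TABS = {"pagename": 45}

# Assumed patterns for illustration; not the real config file rules.
TABID_RULE = re.compile(r"/tabid/(\d+)/", re.IGNORECASE)
PAGE_RULE = re.compile(r"^/([^/]+)\.aspx$", re.IGNORECASE)


def rewrite(path: str) -> Optional[str]:
    """Return the internal Url for a requested path, or None if no rule applies."""
    # Route 1: a regex rule picks the TabId straight out of the path.
    match = TABID_RULE.search(path)
    if match:
        return f"/default.aspx?tabid={match.group(1)}"
    # Route 2: look the 'human friendly' page name up in the tabs table.
    match = PAGE_RULE.match(path)
    if match:
        tab_id = TABS.get(match.group(1).lower())
        if tab_id is not None:
            return f"/default.aspx?tabid={tab_id}"
    return None


print(rewrite("/pagename/tabid/45/default.aspx"))  # /default.aspx?tabid=45
print(rewrite("/pagename.aspx"))                   # /default.aspx?tabid=45
```

Note that the visitor never sees the returned value: it only replaces the Url internally, which is what separates a rewrite from a redirect.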
Leg 3 is Url Redirection : identifying requests that are in a format not preferred by the Url rewriter, and redirecting them to the preferred Url. A redirect differs from a rewrite in that the visitor to the site will see the Url change, and their browser will go to the new location. Often this is too fast for the visitor to detect, but they will see that the Url in the address bar of their browser doesn't match the Url that they clicked on.
Redirects in the standard DNN Url Rewriter are used for the following purposes (amongst others)
Handling child portal requests : a 'child' portal shares the top-level domain name with the 'parent' portal. When you create a child portal in DNN, you can see a new directory is created on the webserver, with the same name as the child portal. DNN copies a special 'default.aspx' into this directory, and this 'default.aspx' file simply contains a redirect from domain.com/childportal to domain.com?alias=domain.com/childportal
Identifying the correct portal : because a TabId is unique across an entire DNN installation, it's possible to request domain1.com/tabid/45/default.aspx, where tabid 45 actually belongs to the portal called domain2.com. In this case, the DNN Url Rewriter will redirect your request from the incorrect combination (domain1.com/tabid/45/default.aspx) to the correct combination (domain2.com/tabid/45/default.aspx).
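The portal check described above could be sketched like this. It is an assumption-laden Python illustration, not the DNN implementation: TAB_PORTAL_ALIAS is a hypothetical mapping of TabId to the owning portal's domain, and the return value uses the same scheme-less notation as the examples in this post.

```python
from typing import Optional

# Hypothetical mapping: TabId -> domain of the portal that owns the tab.
TAB_PORTAL_ALIAS = {45: "domain2.com"}


def portal_redirect(host: str, tab_id: int) -> Optional[str]:
    """Return the Url to redirect to, or None if the request is already correct."""
    owner = TAB_PORTAL_ALIAS.get(tab_id)
    # Unknown tab, or the request already arrived on the owning portal.
    if owner is None or owner == host.lower():
        return None
    # Wrong portal for this tab: send the browser to the owning portal.
    return f"{owner}/tabid/{tab_id}/default.aspx"


print(portal_redirect("domain1.com", 45))  # domain2.com/tabid/45/default.aspx
print(portal_redirect("domain2.com", 45))  # None
```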
I use the three-legged stool example to clear up confusion amongst many people as to how they can improve the Urls on their DNN site. The metaphor fits in my mind, because if all three legs aren't sound and working together, the stool falls over, and what is sitting on top of it falls over as well. In this case, it's the whole DNN site sitting on the three legged stool. This metaphor gives us the tools to evaluate different DNN url schemes to see if they are a complete solution.
Achieving Advanced Url Rewriting for SEO
With the basics covered, I'll move onto the more involved area of DNN Url handling - having descriptive Urls that drive content on DNN modules. I receive a lot of requests that are something like this:
"I've got a forum/product catalog/blog/photo gallery/etc module, and the urls are like this : /pagename/tabid/45/thingId/345/default.aspx : how can I change it so that it says /page-name/my-shiny-product.aspx?"
This is an example of the 'querystring path' detailed earlier on. There's a combination of problems to solve in order to achieve the outcome:
- the standard DNN Friendly Url Provider / Url Rewriter, even when in 'humanFriendly' mode, reverts to this more verbose pattern whenever a query string path is added into the Url.
- there are no keywords in the Url, so it's very difficult to make it into a Friendly Url.
- whether it's an existing site being upgraded (and thus needing 301 Redirects to migrate old Urls in search engine indexes) or if it is for a new site, where
There's no clear answer to this question, as it depends on a lot of factors, including whether or not the person has written the code themselves, or can change the code, and the way the module in question is structured.
If you don't have access to the code, the best that you can hope for is removal of the 'tabid/45' and 'default.aspx' portions of the Url. This at least removes as much noise as possible from the Url. But, without keywords in the Url, there's a limited amount to what you can do with Friendly Url Generation and Url Rewriting.
Using the Standard DNN Url Rewriter / Friendly Url Provider
The standard DNN url rewriter includes a couple of features that most people don't know about and don't use.
You can add a rule to the FriendlyUrls.config file to interpret a Url like pagename/tabid/45/thingid-35-the-thing-description.aspx : this is relatively trivial for a confident regex author. However, this is working on only one leg of the Url Stool : the other two legs remain unchanged. As such, it is of little or no help unless you're creating friendly urls in custom module code or perhaps with a Html/Url replacement module like the iFinity Inline Link Master or even PageBlaster.
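As a rough idea of what such a rule has to do, here is a Python sketch of a regex that interprets the example Url. The pattern and the function name rewrite_thing_url are assumptions for illustration; the rule syntax used in the config file itself is not shown here.

```python
import re
from typing import Optional

# Assumed pattern: /tabid/<n>/thingid-<n>-<any-description>.aspx
THING_RULE = re.compile(
    r"/tabid/(\d+)/thingid-(\d+)(?:-[^/]*)?\.aspx$", re.IGNORECASE
)


def rewrite_thing_url(path: str) -> Optional[str]:
    """Rewrite a descriptive 'thing' Url to its internal form, or return None."""
    match = THING_RULE.search(path)
    if not match:
        return None
    tab_id, thing_id = match.groups()
    # The description text is noise to the server; only the ids are kept.
    return f"/default.aspx?tabid={tab_id}&thingid={thing_id}"


print(rewrite_thing_url("/pagename/tabid/45/thingid-35-the-thing-description.aspx"))
# /default.aspx?tabid=45&thingid=35
```

The sketch only covers the rewriting leg: nothing here generates the descriptive Url in the first place, or redirects the old form to it.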
You can prop up the redirections leg with the DotNetNuke request filter, which gives you the ability to perform custom redirections based on regex patterns. This would at least allow you to
Using a Generic ASP.NET Url Rewriting tool
DotNetNuke runs on top of ASP.NET, and there is a rich set of Url Rewriting tools for ASP.NET : new ones are added all the time to the available set. Some of these are open source/free tools (UrlRewriter.net, UrlRewriting.net) and there are proprietary tools on the market as well. Most of these tools concentrate on closely replicating the functionality of the Mod_Rewrite tool available on Apache.
So, can you use one of these tools to achieve better Url Rewriting with DNN?
The answer is 'no'. There are two reasons : One, you'd need to keep an entire listing of all tabid / tabname values consistent in the set of rules for the site. This is OK for a site that is very static, but as soon as you start adding new pages, you impede the functionality of DNN which automatically recognises new pages as they are added. Two, it's only working on two of the legs of our three-legged stool. While you can rewrite and redirect the Urls with one of these tools, you're missing out on the all-important Friendly Url Generation.
While theoretically you could use a hybrid of a Url-replacement scheme and a custom url rewriter to gain the 'three legs', it would be a cumbersome implementation and a mix of products.
What you need for Advanced Url Rewriting & Friendly Urls
To achieve the desired 'page-name/my-shiny-product.aspx' style Urls, there are really two ways of going about it.
1) Change the module code so that it uses a unique product name lookup instead of a product ID based lookup (or gallery id, or forum id, etc). In this case, 'my-shiny-product' would be the unique Url name that the database could use to look up and find an individual record. Obviously this is more difficult than using automatic database-created numeric Id values. I never promised it would be easy! You'll also notice that there is no key/value pair here : just a single name. That's OK, you can use either a null key or a null value in the querystring - and just look up the value based on its position. As long as you handle the cases where the value is mistyped or incorrect, it will work just fine.
2) Change the module code to output some type of text into the Url, and then use a custom rewriting solution to abstract away the unwanted parts of the Url. An example of this is the Blog module release 3.5 : it started to include the blog title as part of the Url. This still leaves the entryid/xx section of the path, but the overall SEO value of the blog post increases enormously with relevant keywords in the Url. Abstracting away the unwanted parts of the Url falls under some advanced Url parameter handling, which I'll cover in another blog post. You can see this in action at the Seablick Blog : the simple Urls on that site are a result of a custom Url handling scheme implemented with the Ventrian News Articles module.
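Both approaches depend on turning a title into Url-safe text, and the first also needs a lookup by that text. Here is a minimal Python sketch of the two steps. The names slugify, PRODUCTS and find_product are hypothetical, and the in-memory dictionary stands in for whatever database table a real module would query.

```python
import re
from typing import Optional


def slugify(title: str) -> str:
    """Lower-case a title and join its words with hyphens."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# Hypothetical product table keyed by its unique Url name.
PRODUCTS = {
    slugify("My Shiny Product"): {"id": 345, "name": "My Shiny Product"},
}


def find_product(url_name: str) -> Optional[dict]:
    """Look a product up by Url name; None signals a mistyped or unknown name."""
    return PRODUCTS.get(url_name.strip().lower())


print(slugify("My Shiny Product"))        # my-shiny-product
print(find_product("my-shiny-product"))   # the matching record
print(find_product("my-shinny-product"))  # None
```

The None case is the one to handle carefully in a real module, for example by showing a 'not found' page rather than failing.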
As can be seen with the Seablick blog, the Url Master module has extensive features to achieve many of the desired outcomes, but in a lot of cases, the module author may need to modify the module-specific portion of the Url scheme to get the best result. And that will be the topic for a future post.