Case Study: Installing Url Master on dotnetnuke.com
Tuesday, July 26, 2011 1:37 PM
One of the projects I have been working on recently was the implementation of the Url Master software on the official http://www.dotnetnuke.com site. This has been a long time ‘in-progress’ – I first spoke to people on the DotNetNuke Corp team about this back in 2009, and the go-ahead on the project was given in about March this year. This was a joint project between myself, R2i, and DNN Corp. It’s been a busy time with the release of DotNetNuke 6.0 (for me as well) but with that out, I thought it was time to share this story.
The objectives of the implementation were much the same as the reasons most people implement the Url Master module:
- to give better, clearer Urls on all pages
- to provide a migration path for old Urls that may no longer be used
- to clean up duplicate content in search engine indexes
- to give the ability to move pages of content around in the site and create customised Urls
The dotnetnuke.com site is huge in terms of Urls, compared with the average DNN site. The exact number of Urls isn’t really known but Google has well over 300,000 links indexed from the site. The main domain has something over 3 million incoming links. The dotnetnuke.com site runs on DotNetNuke Professional (now upgraded to DotNetNuke 6.0), and there are a lot of visitors, members, forums, blog posts and everything else under the sun. While I don’t know exactly which site is the largest that uses the Url Master software, the dotnetnuke.com site is the largest that I have done an implementation on personally.
Apart from the technical considerations, the dotnetnuke.com site is the highest profile DotNetNuke site on the web. Every single person that has ever read something about DNN has probably visited at least once. Any failures, slowdowns or errors are immediately noticed by a large (and vocal) number of people.
With all this background, I put a lot of time into making sure that this implementation was going to go well. I definitely wanted a blog post like this to be a happy tale, not one of ‘aw shucks, I broke the site, sorry’. While those with a smaller site can probably take a suck-it-and-see approach, it’s still a good idea to review what the project plan was and see if any of the ideas can be taken on-board.
The Project Plan
The project was broken into two phases: testing and implementation.
Testing
The two aims of the testing phase were accuracy and performance. Accuracy refers to making sure that clicking on a Url results in you arriving at the right location, regardless of the process to get you there. Performance means making sure that the overall performance of the site is not materially affected.
- Create a testing environment by copying the live site down to a test server.
- Build a list of all Urls (or a representative set) as they are currently linked within the site, and as they are currently indexed on search engines
- Obtain a list of all the rewritten* results for these Urls
- Do performance testing to establish a baseline for the performance of the website.**
- Install the module on the test site
- Submit all Urls to the site in a testing script, and check the rewritten results against the before list.
- Check to make sure any redirects have gone to the correct location
- Do a performance test for the performance of the website after the module has been installed, and compare with the baseline test.
* By rewriting, I mean the native DotNetNuke Url format, which, in the majority of cases for pages, is example.com/default.aspx?tabid=xx. With any type of Url Rewriting scheme, the Urls must be rewritten from a ‘friendly’ format back into this base format. The friendly format can be anything – but no matter what type of Friendly Url solution is used, all DotNetNuke components and modules expect the /default.aspx?tabid=xx style of Url. Because of this, it’s possible to compare the results of two different Friendly Url solutions just by checking the rewritten results. If everything is correct, these values will always match.
** Even though the test site is unlikely to perform the same as the live site, as long as the tests are performed on the exact same environment, it’s possible to check that an adverse performance impact can be avoided just by comparing back-to-back results.
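The before/after comparison described above can be sketched in a few lines of Python. This is only an illustration, not the tooling used on the project: the function names are made up, and it assumes that the path and the query-string key names (such as tabid) can be compared case-insensitively, while parameter values are compared exactly.

```python
from urllib.parse import urlsplit, parse_qsl


def normalise_rewritten(url):
    """Reduce a rewritten Url to a comparable form: the lower-cased path
    plus a sorted tuple of (lower-cased key, value) query pairs."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = sorted((key.lower(), value) for key, value in pairs)
    return parts.path.lower(), tuple(query)


def compare_rewrites(before, after):
    """Return the requested Urls whose rewritten result changed or is missing.

    `before` and `after` each map a requested Url to the rewritten Url
    recorded for it in that test run.
    """
    mismatches = {}
    for requested, old in before.items():
        new = after.get(requested)
        if new is None or normalise_rewritten(old) != normalise_rewritten(new):
            mismatches[requested] = (old, new)
    return mismatches
```

An empty result means every Url in the ‘before’ list still rewrites to the same underlying page after the change.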
Implementation
The implementation plan is deceptively simple. As I’m not an employee of DNN Corp, I don’t get access to the knobs and switches of the site. So the plan had to be literally implemented as a plan to be followed by the team behind the dotnetnuke.com site.
- Backup site
- Install the module, and copy configuration across manually via the web.config file
- Re-generate all Blog permalinks (see here for more information on regenerating blog permalinks)
- Run ‘smoke tests’ to see if the site is behaving properly
- Run sample Urls via SiteTester program and check results
- Declare a keep/rollback point and take action accordingly.
Creating the Test Environment
Creating a test environment is very simple. This can be done on a local computer, or can be done on a separate webserver. You just copy down the file system of your site, backup your live database and restore it to a new location, then add a new portal alias for your test location, and you’re ready to go. It’s important to copy your live site instead of just creating a test site, because you want to test out that all the Urls are going to work OK, so you need the full set of actual Urls, which will only differ by the domain name in your test environment. In the case of the dotnetnuke.com site, the DNN Corp team set up an exact clone of the live site on a test server, which was important as it contained all of the content and links that the live site has.
Finding Urls and Testing them for Accuracy
There are several items in the list above which are difficult and time-consuming to do. Finding out all the Urls on a site is difficult. Working out the value of the rewritten Url is impossible without some custom code. Testing and verifying before and after would take a long time using a browser and typing in Urls.
Because of these difficulties, and the number of Urls to test, I decided to write some test tools for this project. The two main tools written are called the RewritingReporter component, and the SiteTester application.
The RewritingReporter is a very simple component that is implemented as a Http Module in a site. It then injects a summary of the Rewritten Url into the response header of each Http request/response handled by DNN. This means you can check the rewriting result of any Url rewriting scheme used in DotNetNuke (or any ASP.NET application, for that matter).
The SiteTester is a Windows application and has three main functions:
- Crawling a site to find all available Urls (and their rewritten equivalents) to create Urls for testing
- Building a list of those Urls and editing them to set up a test list, including expected values.
- Submitting the list of Urls against the target site, and recording the results for comparison.

Above : The SiteTester application.
You can download the SiteTester and RewritingReporter from the Free Downloads page on this site. The SiteTester is a WinForms app that has an installer, or you can download the source code and just run it via Visual Studio.
The SiteTester has a built-in Html help file, but is covered briefly in this blog post : How to test DotNetNuke Urls for Rewrite and Redirects.
With the dotnetnuke.com project, I built up a test file which contained more than 1500 example Urls from throughout the dotnetnuke.com site. These included samples of forum and blog urls, plus all of the shortcut Urls that the dotnetnuke.com site uses (like blogs.dotnetnuke.com for the blogs page). It also contained many Urls which would have been redirected to a friendly version.
I iteratively ran these tests on a complete copy of the live site until I could be assured of a clean run where all of the Urls ended up as expected. This assurance comes both from the Url working (returning a 200, 301, etc) and from the underlying rewritten Url not having changed. The entire project is about making sure that, while the ‘Friendly Urls’ change, the underlying rewritten Urls do not change one bit. That’s how you ensure that the site works as it did before: by making sure that all the rewritten results are exactly the same. The combination of the RewritingReporter and the SiteTester allowed me to work through and verify that although the friendly Urls would change, the underlying rewritten Urls were no different, and thus the site would work exactly the same.
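For readers who want to try the same kind of check without the SiteTester, here is a minimal Python sketch of the idea: each test case carries an expected status code and, where relevant, an expected rewritten Url. The header name X-Rewritten-Url is hypothetical (use whatever header your reporting component actually emits), and the `fetch` function is left for you to supply, for example as a thin wrapper around an HTTP client that does not follow redirects.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical header name, chosen for illustration only.
REWRITE_HEADER = "X-Rewritten-Url"


@dataclass
class UrlTest:
    url: str
    expected_status: int
    # None when no rewritten Url is expected, e.g. for a 301 redirect.
    expected_rewrite: Optional[str] = None


def run_tests(tests, fetch):
    """Run each test through `fetch` and return a list of (url, reason) failures.

    `fetch(url)` must return (status_code, headers), where headers is a
    dict-like object keyed by the exact header name.
    """
    failures = []
    for test in tests:
        status, headers = fetch(test.url)
        rewritten = headers.get(REWRITE_HEADER)
        if status != test.expected_status:
            failures.append((test.url, f"status {status}, expected {test.expected_status}"))
        elif test.expected_rewrite is not None and rewritten != test.expected_rewrite:
            failures.append((test.url, f"rewritten to {rewritten!r}, expected {test.expected_rewrite!r}"))
    return failures
```

A clean run is simply an empty list of failures. Passing `fetch` in as a parameter keeps the checking logic separate from the HTTP client, so the same list of tests can be pointed at the ‘before’ and ‘after’ sites.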
Performance Testing
The first thing that I thought of when discussing this project was ‘how can I make sure the site performance is not affected?’. The dotnetnuke.com site is quite snappy these days – but there was a time in the past when it was getting a bit unbearably slow. This is quite common for a site or company going through a rapid growth phase, and it’s OK as long as the site keeps getting faster. So the last thing I wanted to do was introduce a change which made the site slower again.
I’ve already covered performance testing of Url Master in this blog before, so I had plenty to work on. I decided early on to use the WCAT tool to submit performance testing runs to the staging environment set up for the project. The main problem with using this tool is the usability of the tool. It’s a cumbersome use of scripts and commands (there is no GUI). What I have found is that using the Fiddler WCAT Extension is the best way forwards through this. The WCAT Extension for Fiddler uses a set of Fiddler Urls as an input for the WCAT tool. What is needed is a set of Urls within Fiddler in order to submit a representative set of tests against the staging environment.
The answer that I came up with was to use the IIS logfiles from the dotnetnuke.com site itself. By importing a snapshot of a logfile, you get a representative cross-section of the type of traffic that the live site receives. Once I had hold of a couple of these logfiles, I set about writing a script that would convert the logfile into something that Fiddler could import. This turned out to be a technical blind alley, as there were just too many obstacles to getting a good result.
Instead, further research showed that Fiddler from version 2.3 allowed you to create your own custom import/export extensions. Armed with this knowledge, I built a Fiddler extension that imported the raw log files from IIS into Fiddler. And once there, they can be submitted using the Fiddler WCAT tool. This was very successful, but for one small hitch – the raw IIS log format from IIS7 installs shows the rewritten Url, and not the originally requested Url. So all the traffic was in /default.aspx?TabId=xx format, instead of the /tabid/xx/default.aspx format that the dotnetnuke.com site was running with. There was also the slight problem of all of the logfiles being for www.dotnetnuke.com instead of the staging domain. This required a further step, which was the writing of a quick script to parse out the log file, replace the domain and convert the rewritten Url back into a Friendly Url. I used 5 minutes worth of traffic from a weekday as the input for the testing script, and set it up to run with multiple connections and a very heavy load.
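To give an idea of what such a conversion involves, here is a rough Python sketch. It is not the script used on the project. It assumes the log is in the W3C extended format with a #Fields: directive that includes cs-uri-stem and cs-uri-query, that any extra query parameters fold into the path as /key/value pairs, and that staging.example.com stands in for the real staging domain.

```python
from urllib.parse import parse_qsl


def friendly_path(stem, query):
    """Turn /default.aspx?TabId=xx (plus other parameters) into
    /tabid/xx/key/value/default.aspx; return anything else unchanged."""
    has_query = query not in ("", "-")
    pairs = parse_qsl(query, keep_blank_values=True) if has_query else []
    tab_ids = [value for key, value in pairs if key.lower() == "tabid"]
    convertible = (
        stem.lower() == "/default.aspx"
        and len(tab_ids) == 1
        and all(value for _, value in pairs)  # skip requests with empty values
    )
    if not convertible:
        return f"{stem}?{query}" if has_query else stem
    segments = ["tabid", tab_ids[0]]
    for key, value in pairs:
        if key.lower() != "tabid":
            segments += [key, value]
    return "/" + "/".join(segments) + "/default.aspx"


def convert_log(lines, staging_host):
    """Yield one staging Url per request line of a W3C extended log."""
    fields = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        stem = row.get("cs-uri-stem")
        if stem is None:
            continue
        path = friendly_path(stem, row.get("cs-uri-query", "-"))
        yield f"http://{staging_host}{path}"
```

Requests that are not in the /default.aspx?TabId=xx form (images, scripts and so on) pass through with only the domain changed, so the mix of traffic stays the same as in the original log.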
If people are interested in my Fiddler IIS Logfile import tool (and domain-changing script) then let me know via the comments, and I’ll see about packaging it up for consumption.
With that done, it was just a case of doing a set of performance tests against the staging site in the ‘before change’ and ‘after change’ states, and comparing the pair. These tests used live traffic patterns and did a good job of replicating the type of Urls that get requested from the live site. I also created a ‘new Urls’ version of this test script, which changed the Urls that would get redirected into the ‘Friendly’ version. This reflected what the site would experience ongoing, as the old Urls get phased out and the bulk of the traffic would get requested on the ‘new’ Urls. It’s important to do this because the performance figures will get skewed with a lot of redirects : it takes a lot less server resources to send back a 301 header than a full page of content.
The results were exactly in line with previous testing in this area:

Above : Relative Performance Testing Results
Important : This information is from a staging site, which doesn’t have the same hardware or software configuration of the live site. This is about relative testing, rather than absolute testing. It’s to ensure that the changes haven’t made the site slower than it was before.
As you can see, overall response time increased slightly (up about 10%), which reflects the extra work that the Url Master module does, but bear in mind we’re talking about an average increase of about 50 milliseconds, and the median response is about the same. Minimum response isn’t shown because both show 0 ms as the fastest response. By the time you factor in everything else, visitors will not detect this. The interesting result is the middle column, which shows the effect that returning redirects instead of full pages has on average performance (WCAT doesn’t ‘follow’ a 301 redirect). This is why the third column is the most important: it shows a more like-for-like testing result.
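If you want to do the same kind of relative comparison on your own test runs, the arithmetic is simple enough to sketch in Python. The function names are made up for this example, and it assumes you have the individual response times (in milliseconds) from each run available as a list.

```python
from statistics import mean, median


def summarise(times_ms):
    """Mean and median response time for one test run."""
    return {"mean": mean(times_ms), "median": median(times_ms)}


def relative_change(baseline_ms, candidate_ms):
    """Percentage change of each summary figure relative to the baseline run.

    Positive numbers mean the candidate run was slower.
    """
    base = summarise(baseline_ms)
    cand = summarise(candidate_ms)
    return {key: (cand[key] - base[key]) * 100 / base[key] for key in base}
```

Comparing the median as well as the mean is useful because a handful of slow requests can move the average without changing what a typical visitor experiences.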
Testing Outcomes
With the Accuracy of the new Urls and redirects checked via the SiteTester tool, and the performance of the module checked via WCAT, the installation of the module could go ahead. The testing showed that the module would work as expected in all of the sample Urls, and that the performance of the site would not be materially affected.
Implementation – the final phase
The first attempt at installation actually didn’t go well. After doing the install, and doing the ‘smoke test’ (which was just a html file with a set of links to check) the site began behaving strangely, so the decision was made to roll back the change right there and then. A site restore wasn’t required as the module only affects the web.config file, so the ‘old’ web.config file was simply put back in place. The error was actually a strange compilation error to do with resource files. As it turned out, it wasn’t anything to do with the Url Master module, but right at the point where the site isn’t working properly isn’t the time to be doing this type of research. It’s always the correct call to go back, re-examine things and re-schedule another attempt.
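A rollback of this kind is easy to rehearse ahead of time. The Python sketch below shows one way to keep a copy of web.config before a change and put it back afterwards. The function names and the backup file naming are my own invention for the example, and it assumes the site root is a normal folder that the script can write to.

```python
import shutil
from pathlib import Path


def backup_config(site_root, label):
    """Copy web.config to web.config.<label>.bak and return the backup path."""
    source = Path(site_root) / "web.config"
    backup = source.with_name(f"web.config.{label}.bak")
    shutil.copy2(source, backup)
    return backup


def rollback_config(site_root, backup):
    """Overwrite the current web.config with a previously saved backup."""
    shutil.copy2(backup, Path(site_root) / "web.config")
```

Take the backup immediately before the install, and the rollback becomes a single file copy rather than a full site restore.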
On the second attempt, things went much smoother. The install went without a hitch and everything looked OK. The tests were run and all the Urls checked out OK except for one small problem. The decision was made to go ahead and leave the module in place. This was all done very quietly without alerting visitors to the fact that their Urls had changed. I watched Twitter and the DotNetNuke forums for a while to see if anyone started complaining about the site not working, but nothing came in.
The next day it was found that some more adjustments had to be made to the configuration to suit the webfarm better, and then a new 404 handling page was created. You might have stumbled across the new 404 page if you entered a link incorrectly. Ever since then the module has been running without further adjustment.
Watching the Results
Whenever an important change like this is made to an important site, it’s imperative to keep an eye on both traffic and the ‘webmaster’ statistics. It’s important to keep an eye on the main keywords that feed traffic to your site, and on any keywords you’d like to target but so far haven’t done well in. After all, the main rationale for modifying the Urls of a site is to improve usability and search engine rankings (which are really tied to each other in many ways).
As is typical of a change on a site of this size, there was an initial drop in rankings for dotnetnuke.com on Google as the googlebot literally came across thousands of redirects for Urls which had existed for a long time. This is typical, and it’s at this point it’s important you’ve done your testing and can rely on your method, as it’s not a time you want to panic and undo the changes, further muddying the waters. Thankfully, within about 10 days, the rankings began to return as Google re-indexed the site and updated the Urls it has cached. I haven’t checked in with the team in the recent weeks (there’s this thing called DotNetNuke 6.0 out) but I believe there was no loss in rankings for existing terms, and some improvement in rankings for other terms where DotNetNuke wasn’t already number 1.
Now, a bit over 6 weeks later, if you scan through the dotnetnuke.com search engine results you’ll see that the vast majority of Urls indexed on the domain are now in the Friendly Url format familiar to those who already use the Url Master module.
Conclusions
I’m not normally involved at the coalface of implementing the module in a site of this size. I know the module has been used on big sites now and in the past, but this was a first for me to go through and individually collate and check a massive amount of Urls. It taught me a lot about the best processes for working with the module in a real situation, as opposed to throwing it at test platforms which I normally do. And it was a good chance to work with many of the DNN Corp staff on a specific project, which showed me how many good people are on their team, and how making sure the dotnetnuke.com is a showcase of the platform itself is taken quite seriously. A big thanks to all the people I worked with to make this a reality.
If you’ve implemented the module on a large site, please let me know via the comments. If you’ve noticed the improved Urls on dotnetnuke.com, let me know as well!
3 comment(s) so far...
Re: Case Study: Installing Url Master on dotnetnuke.com
Thanks for sharing this very interesting case. Exciting to read about an assignment this big. And congrats to a job well done!
By Pontus Österlin on
Thursday, July 28, 2011 3:53 AM
Re: Case Study: Installing Url Master on dotnetnuke.com
Thanks for this information I am currently implementing the URL Master into a portal signup site and have found several issues while working my way through it with many third party modules. but this is a fairly new site and I am working in a live environment while also having everyday people I know test it. Can you tell me why DNN has kept the .aspx at the end of the urls is there an advantage to this or not.
Thanks for all your hard work and well done.
By Paul Turner on
Friday, September 02, 2011 11:34 AM
Re: Case Study: Installing Url Master on dotnetnuke.com
@Paul The advantage is that only the .aspx (and .axd, .ashx and a few other) Urls get processed through the ASP.NET runtime, and IIS handles the rest. Removing the .aspx was discussed at the early stages of the project, but was ruled out for a number of reasons, which included concentrating on performance (removing the .aspx does have some performance impact), keeping the number of redirects down to a minimum (by removing the .aspx, all Urls would have to be redirected) and also the strong association between ASP.NET, DotNetNuke and the .aspx extension.
Each site should go through the process of weighing up the advantages and disadvantages and coming up with an answer. For a new site I would probably lean towards running without .aspx, but for a large, complex and existing site I would lean towards keeping the .aspx.
By Bruce Chapman on
Friday, September 02, 2011 11:47 AM