WordPress 2.3: Canonical URLs

Canonical URLs is one of the features that I worked on for WordPress 2.3. It’s sort of a geeky concept, but the end result has benefits that a non-geek can appreciate, so I’m going to break it down for you.

WordPress has traditionally been very lenient in the URLs that it will accept.

For instance, say your blog is hosted on http://www.example.com/blog/.

You can likely access the front page of your blog via these alternative URLs:

And those are just the “sane” ones. Try this one on for size:

That’s the front page. We have additional issues for other views. For example, consider if you are using “fancy” permalinks and have a post up at http://www.example.com/blog/2007/09/17/dont-tase-me-bro/ with a post ID of 17. The following alternative URLs will work:

The following issues comprise the majority of incorrect alternative WordPress URLs.

  • Old URL structure when using “fancy” permalinks
  • www.example.com vs. example.com
  • “Fancy” permalinks with /index.php/ (called “PATH_INFO permalinks”) vs “fancy” permalinks without (“mod_rewrite permalinks”)
  • URLs with trailing slashes vs. URLs without trailing slashes
  • /page/1/ (always redundant)
  • ?paged=4 vs. /page/4/

So, what’s the problem with this? The URLs are all showing the exact same content, so why should it matter? Well, search engines can’t assume that all of these alternative URLs represent the same resource. So they don’t automatically get condensed into a single resource. As a result, you can actually end up competing against yourself in search engine rankings. So to avoid confusing search engines and to consolidate your rankings for your content, there should only be one URL for a resource. We call this URL the canonical URL. Canonical means “standard” or “authoritative.” It’s the one that WordPress generates, and it’s the one that you want everyone to use.
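To make the idea concrete, here is a minimal sketch of collapsing a few of the path-level variants listed above (missing trailing slash, redundant /page/1/, and ?paged=4 vs. /page/4/) into one spelling. It is written in Python purely for illustration and is not WordPress's code: the function name `normalize_path_and_query` is hypothetical, and it assumes the canonical form always ends in a trailing slash, which depends on your permalink settings.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_path_and_query(url: str) -> str:
    """Collapse a few equivalent URL spellings into one form (illustrative only)."""
    parts = urlsplit(url)
    path = parts.path or "/"

    # Pull a numeric ?paged=N out of the query string, keeping everything else.
    paged = None
    kept = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == "paged" and value.isdigit():
            paged = int(value)
        else:
            kept.append((key, value))

    # ?paged=4 becomes /page/4/; page 1 is the default view, so it is dropped.
    if paged is not None and paged > 1:
        path = path.rstrip("/") + f"/page/{paged}/"

    # /page/1/ is always redundant.
    path = re.sub(r"/page/1/?$", "/", path)

    # Assumption: the canonical form uses a trailing slash.
    if not path.endswith("/"):
        path += "/"

    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))
```

With this sketch, `http://www.example.com/blog/?paged=4` and `http://www.example.com/blog/page/4` both come out as `http://www.example.com/blog/page/4/`, so a search engine would only ever see one address for that view.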

Since version 2.2, WordPress-generated rules have been very well standardized. I personally invested a lot of time making sure things like trailing slashes were consistently standardized. So that’s one piece of the puzzle: making sure that WordPress isn’t working against you by generating non-canonical URLs. But of course, you can’t control who links to you, and third parties can make errors when typing or copy-pasting your URLs. This canonical URL degeneration has a way of propagating. That is, Site A links to your site using a non-canonical URL. Then Site B sees Site A’s link, and copy-pastes it into their blog. If you allow a non-canonical URL to stay in the address bar, people will use it. That usage is detrimental to your search engine ranking, and damages the web of links that makes up the Web. By redirecting to the canonical URL, we can help stop these errors from propagating, and at least generate 301 redirects for the errors that people may make when linking to your blog.
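As a rough illustration of the redirect step, the sketch below decides whether a request that differs from the blog's address only by the www. prefix should be answered with a 301 pointing at the preferred host. It is Python for illustration only, not the WordPress implementation; `canonical_redirect` and its parameters are hypothetical names, and it deliberately handles nothing but the www vs. non-www case.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit


def _bare(host: str) -> str:
    """Host name without a leading 'www.' (hypothetical helper)."""
    return host[4:] if host.startswith("www.") else host


def canonical_redirect(requested_url: str, home_url: str) -> Optional[Tuple[int, str]]:
    """Return (301, location) if the request uses the wrong www/non-www host, else None."""
    requested = urlsplit(requested_url)
    home = urlsplit(home_url)
    req_host = requested.hostname or ""
    home_host = home.hostname or ""

    # Already on the preferred host: nothing to do.
    if req_host == home_host:
        return None

    # Same site apart from the www. prefix: send a permanent redirect.
    if _bare(req_host) == _bare(home_host):
        location = urlunsplit(
            (requested.scheme, home.netloc, requested.path, requested.query, "")
        )
        return 301, location

    # A different site entirely: leave it alone.
    return None
```

Because the status is 301 (permanent) rather than 302, a visitor who copies the address out of the browser after the redirect ends up with the canonical URL, which is what keeps the mistake from spreading.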

My goal for WordPress 2.3 was to cover the majority of canonical URL issues that people have and make WordPress automatically redirect those requests to the correct (canonical) URL for that resource. Early tries at this functionality had issues with being too aggressive. I rewrote the functionality multiple times, until I settled upon the current incarnation. I’m quite happy with it.

Ideally, you shouldn’t even be aware of the feature. You might have issues, however, if you have enabled your own form of canonical URL redirection that isn’t redirecting to the URLs that WordPress thinks are the canonical version. For instance, if your blog is http://www.example.com/blog/ but you have a line in your .htaccess that redirects people to http://example.com/blog/, you’re not going to be able to access your site, as the two redirects will “fight” each other in an infinite loop until the browser gives up. You’ll also have issues if your server is generating a non-standard $_SERVER['REQUEST_URI'] value. For this reason, the feature has been disabled for IIS. WordPress can set a correct $_SERVER['REQUEST_URI'] for some IIS incarnations, but fails on others. This is an issue that I hope we’re able to fix in the future. That said, the vast majority of WordPress blogs are not running on IIS, so you’ll likely be fine.
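The “fight” between two redirects can be simulated in a few lines. The sketch below is a toy model in Python, not anything WordPress or Apache actually runs: `htaccess_strip_www` stands in for a hypothetical server rule that removes www., `app_force_www` stands in for an application that insists on www., and `follow` plays the browser, giving up when it sees a URL it has already visited.

```python
from typing import Callable, List, Optional
from urllib.parse import urlsplit, urlunsplit

Rule = Callable[[str], Optional[str]]


def set_host(url: str, host: str) -> str:
    """Return url with its host replaced."""
    p = urlsplit(url)
    return urlunsplit((p.scheme, host, p.path, p.query, p.fragment))


def htaccess_strip_www(url: str) -> Optional[str]:
    """Hypothetical server rule: redirect www.example.com to example.com."""
    host = urlsplit(url).netloc
    return set_host(url, host[4:]) if host.startswith("www.") else None


def app_force_www(url: str) -> Optional[str]:
    """Hypothetical application rule: redirect example.com to www.example.com."""
    host = urlsplit(url).netloc
    return None if host.startswith("www.") else set_host(url, "www." + host)


def follow(url: str, rules: List[Rule], max_hops: int = 20) -> List[str]:
    """Follow redirects like a browser; raise if the rules bounce us in a circle."""
    chain = [url]
    for _ in range(max_hops):
        # The first rule that wants to redirect the current URL wins.
        target = next(
            (t for t in (rule(chain[-1]) for rule in rules) if t is not None), None
        )
        if target is None:
            return chain  # no rule objects: we have settled on a final URL
        if target in chain:
            raise RuntimeError("redirect loop: " + " -> ".join(chain + [target]))
        chain.append(target)
    raise RuntimeError("too many redirects")
```

With only one of the two rules active, `follow` settles after a single hop. With both active, every URL is rejected by one rule or the other, so the chain can never end, which is exactly the situation described above.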

If you’re having issues with infinite redirects, please open a ticket. And in the meantime, you can use this one-line plugin to disable the feature.

309 thoughts on “WordPress 2.3: Canonical URLs”

  1. mg says:

    Ah, finally, a way to force people to drop the ‘www.’ in the URL. For the sake of consistent cookie-ing, at the very least. Much ta :)

  2. Trust me, Michael — you don’t want an .htaccess rule that counteracts WordPress redirects. Neither rule is aware of the other, and they will bounce the user back and forth between them until the browser gives up.

  3. I’m rewriting my Page permalinks using the ‘page_link’ filter, so they appear on a different domain.

    If is_page(), filtering ‘option_home’ lets me change the Page permalinks your code generates, so they match the ones I craft via ‘page_link’, thus enforcing my desired canonically.

    Nice stuff Mark!

  4. I’ve never truly understood the difference between using the www. and not using it. Some places it doesn’t matter, some places it does. You are trying to break it down, thank you, but it still sounds like technospeak… lol

  5. Trust me, Michael — you don’t want an .htaccess rule that counteracts WordPress redirects. Neither rule is aware of the other, and they will bounce the user back and forth between them until the browser gives up.

    Mark, I just found out why wordpress didn’t bounce me back and forth even if I have rewrite rules to remove www prefix from my domain. I put the codes in server config instead of .htaccess. I guess that’s the reason.

  6. What’s with two different WP-Installation under the same domain? I had the problem where a new 2.3 installation (23.domain.tld) “overwrote” an old 2.2 installation (domain.tld).

  7. Cool! I was actually looking at adding something to my .htaccess when 2.3 was released because of issues with AJAX and having the leading www (vs not having). Works great!

  8. Oh man, you really helped me out! I was sitting frustrated in front of my computer, wondering why things didn’t work as usual after I upgraded to version 2.3.

    Thank you very much, very useful plugin!

    Greetings from Germany

  9. I love you, this is just lovely. I mean it’s just pure loveliness. It’s this kind of polish that makes all the alternatives to WordPress pale into miserable insignificance.

  10. Pingback: New Underpinnings
  11. I’m not sure if this is the cause of my problem. I had set up a subdomain earthcomm.mysciencespace.com and pointed it to mysciencespace.com/wordpress. As soon as I upgraded to 2.3, the subdomain redirected to the root domain instead of the wordpress directory.

    What can I do to fix this?

  12. Wakefield says:

    Wow, this is purely awesome, great work. This is something that should be more prominently featured and announced by WordPress, as it may dramatically mess with people who had modded .htaccess and were getting infinite loops, being forced to use default to old url structure and losing incoming links in the process.

  13. Pingback: WordPress 2.3
  14. In earlier versions of WordPress, I used the Permalink Redirect plug-in (http://fucoder.com/code/permalink-redirect/). While the new WordPress feature makes this plug-in mostly obsolete, there’s one feature of Permalink Redirect that doesn’t seem to have made it to 2.3.

    Specifically, you could enter your Old Permalink Structure, and have Permalink Redirect handle the situation where you had changed your permalink structure, i.e., pretty URLs from both the _old_ and the _new_ permalink structures would work.

    (Yes, I know. “Don’t do that. ;-)” But I needed to change the structure. Now I need to make sure the old links still work.)

    More pressingly (for me), this feature of Permalink Redirect no longer works, probably because the way permalinks are built/handled has changed. Is there a new WordPress 2.3-friendly way to preserve an old permalink structure?

  15. Hey Mark,

    This is no doubt a great idea, but it kills the ability to make complex queries that skip permalinks. So while I usually want my category url to look like:

    site.com/~/general/

    I use a plugin to generate archive urls like:

    site.com/?cat=2&m=200710

    When I go to the second url it forwards me to the first and doesn’t filter it at all for dates.

    If I give the normal pretty url for a category and append the m=200710 to it just forgets the category entirely and takes me to the month page.

    I imagine similar problems with other variables passed in queries.

    Is there a way to avoid this breaking while still using permalinks for normal pages?

    It seems like you’d need either to add in exceptions for times when there are multiple variables being passed in the non-pretty method, or else make sure that all sane combinations of variables can be passed using complex pretty-urls like

    site.com/~/general/2007/10

    That one of course doesn’t work. Do you know a method to get categories and dates to coexist in a pretty url? Otherwise do you think that one or the other of my solutions is viable? We’ll have to turn off the filter for now, but i’d be happy to keep the other features without breaking multiple url variables.

  16. We have our site set up at http://www.capitaplus.com using an old version of WordPress (version 2.0), and I’m afraid of updating because I might mess up the URL structure. Can you please advise what I should do so I don’t lose my search engine rankings? We buy invoices and our current rankings are very important.

    Thanks

  17. Unfortunately I am one of those few WordPress people running WordPress on IIS (I’m using IIS 6).

    I guess this excellent feature is not enabled for me. In which case I’m using the www-redirect plugin (located here: http://www.justinshattuck.com/wordpress-www-redirect-plugin/). It’s probably not as robust and all-encompassing as what you describe, but it’ll fix some of the basic ones which are most often the issue.

    I know IIS is not the popular choice, but I have to use it for business reasons. And I’ve learned quite a few tricks for getting WordPress working for IIS 6 users.

    If anyone needs help you can email me. You’ll find my email under the “About Ponder Place” section of my site (which is running off IIS6) http://www.ponderplace.com

  18. This feature caused me to spend the last three or four hours trying to complete my upgrade to 2.3. I couldn’t get any of my permalinks to work and I couldn’t figure out why, so my whole website was inaccessible unless I changed the style of permalinks I use, thus ruining my traffic from anyone linking in. I almost gave up trying to pursue blogging as a career because this was too difficult for me to figure out. I’m glad there was that disable plugin. I just wish I had it with the upgrade so that I could have saved all the deleting and data loss that ensued. Yeesh. I can appreciate optimization. But when there’s a teeny 0.001% bug, please leave the disable fix in the download with the upgrade or something.

  19. My host is Yahoo. Does this have to do with all the problems I am having? I am using the Customizable Permalinks plugin from Yahoo, but when I activate it I can’t get to the article pages. I downloaded the “Disable Canonical URL Redirection” plugin and that fixes the problem. But activating that plugin creates other problems, like messing up my connection to Feedburner. How can I fix this problem?

  21. g1smd says:

    Finally, someone takes this issue seriously!!

    Hurrah!!

    I have been writing about these bugs for about 4 or 5 years. Until now, every forum, blog, cart and CMS system has suffered from these issues.

    It is nice to see it addressed and (as far as initial reports go) mainly fixed. I’ll be running some tests and I’ll file a bug report if I find any major holes still lurking.

  22. I don’t consider this a “feature”… we’ve been receiving complaints from our web hosting customers for the last two weeks and we have spent countless hours trying to figure out why many blogs stopped working.

    It was all due to this… such altering features are not supposed to be enabled by default. There has to be a transition period with plenty of warnings for about a year or two, then enable it by default.

    Sort of the same way as when the PHP team decided to turn register_globals off…

  23. Mark,

    Your plugin fixes the issue of redirects, but kills my feed to feedburner and my BuzzBoost stops working. How can I fix this?

    Thanks!

  24. I can’t open a trac ticket without admin privileges, but we’re having this exact issue — infinite recursion in redirects. I tried adding a functions.php file with the code for the plugin, which works on our development server but not in production.

    How do I get privs to open a trac ticket?

    And I sort of agree with the above poster who said that such changes in upgrade should be better advertised, or at least opt-in for the first revision.

  25. sara says:

    I’m so glad I finally found this. I was using yahoo’s customizable permalinks plugin, and when I upgraded wordpress my website completely broke down. none of the page links worked. I used the plugin you have listed and it appears to be working now.

  26. Goodness… I spent hours on this issue, but your plugin fixed it in a jiffy! I don’t have .htaccess or anything… and my WP is hosted on a Debian box, so I doubt that it’s running on IIS. So what could have gone wrong and is there a real fix?

  27. I think the duplicate content issue is a lot of hype. You still get in the search engines whether it’s duplicate content or not. The way you differentiate yourself from the pack is using your own text on other pages. As far as competing with yourself I don’t see how that happens the hits/visitors all go to your site so I don’t see that as being possible.

  28. Thanks so much for this. I finally switched over to using custom permalinks after almost one and a half years of posts and it was a completely painless process.

  29. that’s right – no “www” on that url :)

    Thanks for the plugin. I use my WP site as an OpenID and have a link rel=’openid.delegate’ pointing to myopenid.com (an OpenID service). That new WP URL canonicalization “feature” essentially locked me out of all my accounts (all the ones that used OpenID authentication).

    Thanks again!

  30. jan says:

    My current blog has over 500 PR4, PR3 and PR2 pages with my permalink being

    /%category%/%postname%/

    Because of Google News I have to change it into

    /%category%/%month%/%day%/%postname%/

    Will I lose all my PageRank, or does the Canonical URLs feature automatically redirect Google to the new pages so I won’t lose PageRank on the next update?

  31. With the proliferation of blogs and other websites, and then different tools that don’t necessarily operate consistently, it is highly likely that there will be duplicates and concatenated content and links. It would be useful to have a consistent practice that could ensure some regularity. Maybe you could write an ebook on it?
    Regards
    Ron

  32. Robert Mathews says:

    I work for a hosting company, and this feature causes us no end of support headaches. People frequently want to move the location of their WordPress installation, and this causes major problems when they do so. We hear from lots of customers unable to access their site.

    It would be much better if this was a preference that could be turned on or off, and if it didn’t affect the admin URLs (so that people could login and turn off the preference if one was available).

  33. Great post. I understand, if search engine spider visit those links, they think that it was duplicate content, you can get banned from google. but it will be no problem if you didn’t give that canonical link to website, google will not spider it!

  34. Hello,

    I am running WP 2.3.3. The site is hosted on Yahoo. The Yahoo custom permalinks plugin is installed and activated.

    I have also installed the plugin to disable canonical URLS. I still cannot get permalinks to work correctly. The site in question is http://www.showukare.com.

    An example of the problem is http://www.showukare.com/faq.

    I have a page with the page slug of “faq” but it will not load. The page will load if you use the default link structure:
    http://showukare.com/?page_id=2

    Is there anything I can do to get this to work with this setup?

  35. Pingback: WordPress 2.3
  36. I can’t thank you enough for your efforts. I’d previously rolled back a word press upgrade because of this issue… and its interaction with Yahoo’s Permalinks. When I decided to move to 2.3.3, I’d mostly forgotten about this issue (or at least assumed it’d been resolved). I had a sick feeling in my gut when my permalinks broke all over again. Thankfully, I came across your plugin… and got my blog back up and running.

    THANK YOU!!!!!

    Anthony

  37. Dan Butcher says:

    Thank you! this solved my problem of endless redirects because of some weird settings on my server (that are beyond my control).

  38. ceti says:

    Hmm… This should have been front and center news for WordPress upgraders, as an infinite redirect is definitely grounds for panic.

    Granted, some servers are poorly implemented, however users don’t have any other choice but to use them. I had to install the anti-canonical redirect plugin to restore access to the site, but not before a bit of panic. ugh.

  41. It’s great, but there is one more redirect part that’s missing. Consider the following scheme:

    /%post_id%-%postname%/

    which is great for those long urls which can be truncated in email

    let’s say the URL is:

    /blog/5-my-very-long-url/

    if it somehow gets truncated and person reaches:

    /blog/5-my-very

    it still works (i.e. fetches the article #5), which is perfect! however wordpress (at least 2.5.1) will not automatically rewrite the broken url to:

    /blog/5-my-very-long-url/

    and now someone might start using:

    /blog/5-my-very

    in bookmarks, and elsewhere, which creates “duplicated content”.

    So if you could teach wp to handle that issue it’d be golden.

    Thank you Mark for doing a great job.

  43. This sure does break a LOT of things. It breaks
    any site which uses cookies, for example, since
    cookies for http://www.site.com won’t be sent back to
    site.com. It breaks PHP sessions for the same
    reason. It breaks standard basic authentication,
    as a browser which has saved the password for
    http://www.site.com won’t send it to site.com or
    vice versa. I could go on, but I think you get
    the point. In my opinion this should definitely
    not be the default. Generally, anyone who comes to us with WordPress problems, 80% of the time
    the problem traces back to this canonical URL
    misfeature.

  44. Hi Mark,

    I am having trouble with URL redirects in the WP 2.6.3 admin panel. My setup is such that whatever comes to http://www.mydomain.com/blog gets redirected to blog.mydomain.com/blog. This is internal, and all I expect from WordPress is to use the siteurl from wp-options (or homeurl) to create the prefix for all the links. But this is not happening, and the admin panel links end up having http://www.mydomain.com/wp-admin etc. (the blog part is omitted). Do you know how I can fix this?

    Thanks
    Maddy

  45. Pingback: The Undeading
  46. wp3423 says:

    Hi Mark,

    I have a thread going at the WP forums regarding no-www and www canonicalization issues: http://wordpress.org/support/topic/224976?replies=5

    I launched a site a few weeks ago with the URL structure http://domain.com & Google has already indexed some pages.

    I updated the URL structure to http://www.domain.com and updated all links on my site. It looks like everything without www is redirecting to http://www.domain.com.

    Is this a global 301 or 302 redirect now from http:// to http://www ? Are there any other steps you recommend taking?

    I am currently using 2.6.1, but will be upgrading to 2.7 soon.

    • wp3423 — All versions of WordPress since 2.3 will do 301 redirects based on the domain you’ve set as your “Blog URL” in Settings → General. You’re all set. Google will follow those redirects and consolidate your “Google juice.”

  47. Hi Mark,

    I’ve installed and uninstalled WordPress 2.7 on godaddy.com three times. It’s set up as http://domain.com/blog.

    I get the infinite Redirect loop because of my .htaccess file.

    Here’s the Firefox 3.0.5 error message:

    Redirect Loop

    Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

    The browser has stopped trying to retrieve the requested item. The site is redirecting the request in a way that will never complete.

    ————————————
    I disabled canonical redirection (domain.com –> http://www.domain.com) in .htaccess and so WordPress works fine.

    However, I really don’t want to disable canonical redirection for the site as a whole just for WordPress.

    The plugin that started this thread… is it still valid? And if so, which file should I modify so that it’s included?

    Any advice would be greatly appreciated!

    Cheers,

    Wayne

    • Wayne,

      Yes, the plugin can still be used to disable canonical URL redirection. Upload it to wp-content/plugins/ and activate it from the Plugins page in your wp-admin.

      Or just remove the contradictory line from your .htaccess

  48. I have several blogs hosted on different accounts. On some accounts, I have no errors while running WordPress, while on others I have a ton. I cannot figure out why I keep getting these redirect issues even after uploading your plugin. Does it have anything to do with my changing the permalinks? I have too many blogs set up and grounded to have to uproot them, so I am hoping for a better solution.

  49. davidsanger says:

    I am just migrating to WP and there seems to be a basic misunderstanding of canonical URLs. There should be only ONE proper representation. This works:

    http://markjaquith.wordpress.com/snufflebunk/wordpress-23-canonical-urls/ does a proper 301 to

    http://markjaquith.wordpress.com/2007/09/25/wordpress-23-canonical-urls/

    a case with categories eg.

    http://www.gl3nnx.net/snornpoogle/nature-shine-wordpress-theme.htm returns a 200 instead of

    http://www.gl3nnx.net/wordpress-themes/nature-shine-wordpress-theme.htm

    this can really mess up google (duplicate content) and everyone else and should never happen.

    It’s not my site, but the same thing happens on mine. Perhaps I am missing a setting.

  50. Kate says:

    Another thumbs up from someone who was using Yahoo hosting with infinite loop issues on archive links. This plugin fixes the issues! Thank you!

  51. We’ve just got ourselves in the loop Mark mentions: the blog is at domain.com and the main site is at http://www.domain.com. We redirected in the .htaccess and the blog stopped working, so we have to go back and disable the canonical redirect in WP admin, but I can’t access the admin because the login seems to loop!

    Bizarrely, we have deleted the .htaccess redirect, yet the site is still redirecting. Is there a way to access my blog’s admin panel while the redirect is still in place?

  52. Marko says:

    Mark,

    I have just checked my WP site and it does 302 redirect, not 301. You said all versions after 2.3 do 301? I have WP 2.8.2.

    Should I be worried because it’s 302 and not 301? Will Google juice come with 302 as it comes with 301?

    Thanks!

  53. Greetings! The disable-canonical-redirects plugin solves the infinite redirect problem on WP 2.9.1 when accessing over SSL, but of course it also removes the canonical redirect goodness that we so adored over standard http. How to have our proverbial cake and eat it too? :-o

  54. Ahh, wait a sec. This might be all we need:

    if ($_SERVER['HTTPS'] != "on") {
        remove_filter('template_redirect', 'redirect_canonical');
    }

    Y’think? :-D

  56. Hi Mark,

    I use /%post_id%/%postname%/ as my permalink structure.

    Going to mydomain.com/blog/17/ does not redirect to
    mydomain.com/blog/17/postname

    Is there any reason for this? it seems especially odd since starting with post_id comes recommended on the WordPress codex.

    Thanks for any light you can shed on this!
    Cheers,
    Frank

  57. This is all very swell, but gave me a full-day headache as I tried to 301 redirect from an old domain to a new while keeping the same permalink structure.

    Should I use the 1-line plugin to disable this otherwise nice feature so I can do the redirect?

  59. Hello from Germany! May i quote a post a translated part of your blog with a link to you? I’ve tried to contact you for the topic WordPress 2.3: Canonical URLs « Mark on WordPress, but i got no answer, please reply when you have a moment, thanks, Sprueche

  60. Jodi says:

    hi mark

    this may sound silly but i grabbed your php and activated it in wp-admin, and it worked beautifully for the index.php page, however, as soon as i clicked on the ‘about’ page in the default template it jumped back to my live index.html page – i assumed that your plugin would work for all of the wp pages – am i missing something obvious? thank you very much, jodi

Comments are closed.