July 12, 2008

Redirecting Blogger posts to WordPress

Filed under: Tech — Chris @ 3:48 pm

My move from Blogger to WordPress was made possible by this tutorial. However, the post redirection widget has some bugs: it doesn’t handle posts with “a”, “an”, or “the” in the title (seriously!) or titles with non-ASCII characters in them.

Here’s an upgraded widget, which works for a broader class of posts. Basically, I just translated the WordPress PHP code for generating a permalink from a title into JavaScript. I cut some corners, but it’s good enough to handle 99% of the posts in my archive.

<b:widget id='Redirector' locked='true' title='Blog Posts' type='Blog'>
<b:includable id='main'>
<b:if cond='data:blog.pageType == "item"'>
<b:loop values='data:posts' var='post'>
<div id='redirectorTitle' style='visibility:hidden'><data:post.title/></div>
<script type='text/javascript'>
var new_domain = 'YOUR_BLOG_URL_HERE';

// Percent-encode runs of non-ASCII characters, leaving ASCII untouched.
function utf8_uri_encode( str ) {
  var high_code = /[\u0080-\uffff]+/;
  var new_str = str;
  var m;
  while( ( m = high_code.exec( new_str ) ) !== null ) {
    new_str = new_str.replace(m[0], encodeURIComponent(m[0]));
  }
  return new_str;
}

var title = document.getElementById('redirectorTitle').innerHTML;
// [INCOMPLETE] Keep percent signs that aren't part of an octet?
title = title.replace(/&lt;[^&gt;]*?&gt;/g,''); // remove tags
title = title.replace(/&amp;.+?;/g,''); // remove entities
title = utf8_uri_encode(title); // handle UTF-8 characters
title = title.toLowerCase();
title = title.replace(/[^%a-z0-9 _-]/g,''); // remove punctuation
title = title.replace(/\s+/g,'-'); // turn spaces into hyphens
title = title.replace(/-+/g, '-'); // collapse runs of hyphens
title = title.replace(/^-+/g,''); // remove prefixed hyphens
title = title.replace(/-+$/g,''); // remove suffixed hyphens
// Blogger emits MM/DD/YYYY; the WordPress permalink wants YYYY/MM/DD.
var timestamp = '<data:post.timestamp/>';
timestamp = timestamp.split('/');
timestamp = timestamp[2]+'/'+timestamp[0]+'/'+timestamp[1];
var new_page = new_domain + '/' + timestamp + '/' + title + '/';
document.location.href = new_page;
</script>
</b:loop>
</b:if>
</b:includable>
</b:widget>
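
To check what slug these steps produce for a given title without deploying the widget, the same logic can be wrapped in a plain function and run in Node or a browser console. This is only a sketch: `titleToSlug`, `utf8UriEncode`, and `stripStrayPercents` are hypothetical names, the regexes are written unescaped because the code is not inside a Blogger template, and the percent-sign step is a guess at the unfinished part of the widget, assuming WordPress drops any `%` that is not followed by two hex digits.

```javascript
// Hypothetical helper: percent-encode each run of non-ASCII characters.
function utf8UriEncode(str) {
  return str.replace(/[\u0080-\uffff]+/g, function (run) {
    return encodeURIComponent(run);
  });
}

// Assumption: a "%" survives only when it starts an octet such as %C3.
function stripStrayPercents(str) {
  return str
    .replace(/%([a-fA-F0-9]{2})/g, '---$1---')    // protect real octets
    .replace(/%/g, '')                            // drop stray percent signs
    .replace(/---([a-fA-F0-9]{2})---/g, '%$1');   // restore the octets
}

// Mirrors the widget's steps; input is the title as returned by innerHTML.
function titleToSlug(title) {
  var slug = title;
  slug = slug.replace(/<[^>]*?>/g, '');        // remove tags
  slug = slug.replace(/&.+?;/g, '');           // remove entities
  slug = stripStrayPercents(slug);             // assumed extra step (see above)
  slug = utf8UriEncode(slug);                  // handle non-ASCII characters
  slug = slug.toLowerCase();
  slug = slug.replace(/[^%a-z0-9 _-]/g, '');   // remove punctuation
  slug = slug.replace(/\s+/g, '-');            // turn spaces into hyphens
  slug = slug.replace(/-+/g, '-');             // collapse runs of hyphens
  slug = slug.replace(/^-+/, '');              // remove leading hyphens
  slug = slug.replace(/-+$/, '');              // remove trailing hyphens
  return slug;
}

console.log(titleToSlug('The Quick &amp; Dirty Guide'));
console.log(titleToSlug('Café au lait'));
```

Comparing this output against the permalinks WordPress actually generated for a few imported posts is a quick way to find titles that will redirect to the wrong place.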


  • Timestamps on posts must be in MM/DD/YYYY format. This is easily changed in the Blogger control panel via “Settings -> Formatting”.
  • You should set the correct time zone for your blog in WordPress before you import the posts. Otherwise, Blogger and WordPress won’t always agree on the date of a post. I didn’t do this and, as a consequence, about one in five of my archived posts have bad redirect links. (This can be fixed by manually editing the post’s timestamp, but that is a big pain.)
  • This mostly handles Unicode (see, e.g., this post), but there is a bug in there somewhere. I had to manually change the permalink on this post from

    एक्ष्केल्लेन्त्-वर्क-टो/

    so that it redirected to the right place. (Can any Hindi readers help me out with that? Is there punctuation in there? I don’t remember what the title was supposed to say. And I can’t figure out how to back-transliterate it into English.)

  • Blogger does some weird things with the widget code after you save. The code will disappear from the “Edit Template” text box, replaced by a tag like:

    <b:widget id='Blog2' locked='true' title='Blog Posts' type='Blog'/>

    (the tag with id Blog1 is your actual posts; don’t delete it). If the widget doesn’t work, or you want to remove it, just remove that tag. If you want to fiddle with the widget, clicking “Expand Widget Templates” will reveal the underlying code (and a lot else besides). The widget will also show up in the Blogger “Layout -> Page Elements” as a mysterious second “Blog Posts” box, with all kinds of spurious configurable elements. Just ignore that.
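
The timestamp note above matters because the widget simply splits the date on slashes. A slightly more defensive version of that step might look like the sketch below. `timestampToPath` is a hypothetical helper, not part of the widget. It assumes the WordPress permalink structure uses a four-digit year and two-digit month and day, so it zero-pads single digits, and it returns `null` for anything that is not in MM/DD/YYYY form so the caller can skip the redirect.

```javascript
// Hypothetical helper: turn a Blogger "MM/DD/YYYY" timestamp into the
// "YYYY/MM/DD" path segment used in the redirect URL.
function timestampToPath(timestamp) {
  var parts = timestamp.trim().split('/');
  if (parts.length !== 3) {
    return null; // not slash-separated, e.g. "Saturday, July 12, 2008"
  }
  var month = parts[0], day = parts[1], year = parts[2];
  if (!/^\d{1,2}$/.test(month) || !/^\d{1,2}$/.test(day) || !/^\d{4}$/.test(year)) {
    return null; // unexpected field widths or non-numeric fields
  }
  // Assumption: the target permalinks use two-digit month and day.
  function pad(s) {
    return s.length < 2 ? '0' + s : s;
  }
  return year + '/' + pad(month) + '/' + pad(day);
}

var path = timestampToPath('07/12/2008');
if (path !== null) {
  console.log('YOUR_BLOG_URL_HERE' + '/' + path + '/some-post-title/');
}
```

This only guards against a wrong date format. It cannot detect the time zone mismatch described above, since the date it receives is already the one Blogger computed.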
