WordPress Migration: Tidying up the comments

This is the seventh in a series of articles on my migration to WordPress. In this post, I’ll talk about how I did some cleaning up of the comments after I imported them.

Clean? What’s to clean

After importing the comments, I noticed, or rather, I was reminded, that Phorum does some things that WordPress doesn’t. Specifically tokens.

Tokens? What tokens?

When I copied over the comments from my old system, I took the raw text. In hindsight now, as I type this, I think I could have injected the token values into the comments instead of just taking the raw text. Now, I have this in a comment:

Thanks for this summary!
One thing I have to remind myself evey time I go to create further client certificates on my FreeBSD server is to start a bash shell before running those commnads as my default shell is out of the box csh (I think) – it may be that others hit this (simple) block…
Simply start the shell (if installed from ports) by typing ‘bash’ before proceeding.

[%sig%]

See that [%sig%]? That’s the Phorum token for the user’s signature. I want to get rid of that and replace it with something less obtrusive.

In my case, I want to change my signature to ‘The Man Behind The Curtain‘.

Very carefully!

I have to do this very carefully. I will be updating a MySQL database which has no transactions. This is one basic reason why I wish WordPress used PostgreSQL, which, by default, has transactions.

Here is the SQL I came up with to verify what I wanted to do:

SELECT comment_ID,
       user_id,
       replace(comment_content, '[%sig%]', '-- \nThe Man Behind The Curtain')
  FROM wp_comments
 WHERE comment_content LIKE '%\[\%sig\%\]%'
   AND user_id = 1154;

Checking the output, I saw the expected results. So now I moved onto the UPDATE command:

UPDATE wp_comments\
   SET comment_content = replace(comment_content, '[%sig%]', '-- \nThe Man Behind The Curtain')
 WHERE comment_content LIKE '%\[\%sig\%\]%'
   AND user_id = 1154;

That’s it. That’s all my comments updated. As you can see, I had user id 1154.

Now we need to update all the other users… Sadly, they will just lose their signatures.

UPDATE wp_comments\
   SET comment_content = replace(comment_content, '[%sig%]', '')
 WHERE comment_content LIKE '%\[\%sig\%\]%';

The HTML tags

In previous posts, I thought I had some <HTML> tags to deal with. Not all comments have them. Some do. Fortunately, it’s an easy fix. At one point, I thought, what if there are multiple tags in the comment? So I tried an query to detect that situation. First, we count to check that removing removes just one such tag per comment. I feared that some comments might contain nested tags, which were relevant to the comment.

SELECT comment_ID, 
       user_id,
       char_length(replace(comment_content, '&lt;HTML&gt;', '')) - char_length(comment_content) <> -12
  FROM wp_comments
 WHERE comment_content LIKE '&lt;HTML&gt;%';

If the third column is all zeroes, we’re good to go. And we were. Now let’s check the closing tag:

SELECT comment_ID, 
       user_id,
       char_length(replace(comment_content, '&lt;/HTML&gt;', '')) - char_length(comment_content) <> -13
  FROM wp_comments
 WHERE comment_content LIKE '&lt;HTML&gt;%';

And we’re still good to go. On with the query!

UPDATE wp_comments
   SET comment_content = replace(comment_content, '&lt;/HTML&gt;', '')
 WHERE comment_content 
  LIKE '&lt;HTML&gt;%';Query;

UPDATE wp_comments
   SET comment_content = replace(comment_content, '&lt;HTML&gt;', '')
 WHERE comment_content 
  LIKE '&lt;HTML&gt;%';Query

Notice that I first removed the tag, then the tag. This was because I used the same WHERE clause on both updates.

OK. What’s next?

There may be more to come in this series. I know I want to do a URL test to ensure all the old URLs still work. Perhaps I’ll do that next. Unless something else comes to mind first.

Website Pin Facebook Twitter Myspace Friendfeed Technorati del.icio.us Digg Google StumbleUpon Premium Responsive

1 thought on “WordPress Migration: Tidying up the comments”

Leave a Comment

Scroll to Top