I just finished migrating ISKsense to PostgreSQL. This was more of a move for convenience’s sake, as MySQL on my server is basically deprecated because PostgreSQL rocks so hard. ISKsense is the last major app to move, and by far the trickiest. The move comes as I prepare to move the server to Ruby 1.9.1 from Ruby 1.8.7, a major change; I had the ISKsense source checked out on my staging server for 1.9.1 testing and decided to go ahead and move the db.
This is kinda a documentation post to describe the move and how you, too, can migrate your databases with a relative amount of sanity remaining intact, unlike me.
Export from MySQL
This is the easy bit.
mysqldump -B yourdb --compatible=postgresql --skip-opt --quick --extended-insert -t > dump.sql
Easy. Except that –compatible=postgresql doesn’t help much.
Working around MySQL’s crap
OK, so now we have some issues to deal with. Let’s get the simple ones out of the way first. Fire up vi and run these:
:%s/,'/,E'/g :%s/\\0/0/g
That does proper string escaping and removes null bytes, which PostgreSQL can’t handle as it uses C-style string termination internally. You may face consistency issues if you’re going from, say, LATIN1 to UTF8 in your migration; if that’s the case, add this to the top of your dumpfile:
SET CLIENT_ENCODING TO 'LATIN1';
Do you use booleans in MySQL? Heavily? You’re going to want to stab a MySQL Core Developer in a moment, so make sure the knives are somewhere you can’t find them when you’re angry. I’ll wait.
OK. MySQL doesn’t have a boolean. It’s actually TINYINT(1), with 1 being a true and 0 being a false. Which makes a sort of sense. But MySQL dump doesn’t export ‘true’ or ‘false’, it exports 1s and 0s. PostgreSQL expects true or false values, and won’t cast integers to booleans. Welcome to hell.
The easiest(!) solution myself, Makurid and the combined efforts of #mysql and #postgresql on Freenode could come up with was to set the column types in PostgreSQL to be character varying, load the dataset, rename the columns as c_columnname, make a new column which is the boolean column columnname and run the following SQL:
UPDATE table SET columnname=(CASE WHEN c_columnname::int=1 THEN true ELSE false END)
Repeat for each column, in each table. Once done, you’re good. Pat yourself on the back and go get the knives out again.
Importing back into PostgreSQL
Now you’re on the home straight.
However, one note. If you’re like me and use an insanely long password for the database user (more than 100 characters), you’ve got a problem! psql accepts passwords as a hidden entry field, but it silently cuts off at around 48 characters. There’s a workaround, though, and this is the command you’d use to import back in:
PGPASSWORD=longpasshere psql -U username database < dump.sql
The PGPASSWORD envvar passes the password in directly, no matter how long. Problem solved! You should now have a database full of data. You may also now have some issues with sequences not being correctly set. For each table:
ALTER SEQUENCE table_id_seq RESTART WITH (SELECT MAX(id) FROM table);
This will set the current value of the sequence to be the current maximum ID in that table, meaning the next assigned id will be that, plus one. Problem solved, again! You now have a database ready to face the world. Feel free to get the knives out again and find that MySQL core developer.