I have multiple massive (multi-gigabyte) datasets I need to import into a Rails application. The datasets are currently each in their own database on my development machine, and I need to read from them and create rows in tables in my Rails database based on the information they contain. The tables in my Rails database will not be exactly the same as the tables in the source databases.
What is the best approach to take here?
I considered migrations, but I'm not quite sure how to connect the migration to the databases, and even if that's possible, is that going to be absurdly slow?
Without seeing the schemas or knowing the logic you want to apply to each row: the quickest way to import this data is to create a view on the table you want to export, in the column order you want (doing any processing in SQL), and then do a SELECT INTO OUTFILE on that view. You can then take the resulting file and import it into the target db.
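As a rough sketch of that approach (MySQL syntax; the legacy table `old_users` and all column names here are invented for illustration), the view and the export statement could be kept as SQL strings in a Ruby script and fed to the source database:

```ruby
# Hypothetical: reshape a legacy table into the column order the
# Rails `users` table expects, then dump it to a flat file.
create_view = <<~SQL
  CREATE VIEW users_export AS
  SELECT full_name AS name, email_address AS email, signed_up_on AS created_at
  FROM old_users;
SQL

export = <<~SQL
  SELECT * INTO OUTFILE '/tmp/users.csv'
  FIELDS TERMINATED BY ',' ENCLOSED BY '"'
  LINES TERMINATED BY '\\n'
  FROM users_export;
SQL

puts create_view
puts export
```

On the target side you would load `/tmp/users.csv` with MySQL's `LOAD DATA INFILE`, which bypasses the Rails stack entirely and is about as fast as a bulk import gets.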
This will not allow you to use any Rails model validations on the imported data, though.
Otherwise, you have to go the slow way and create a model for each source db/table to extract the data (http://programmerassist.com/article/302 tells you how to connect to a different db for a given model) and import it that way. This is going to be quite slow, but you could set up an EC2 monster instance and let it run as fast as possible.
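For that model-per-source-table route, the connection piece is ActiveRecord's `establish_connection` (shown in comments below, since it needs a live database to run), while the per-row mapping is plain Ruby. All table and column names here are made up for the sketch:

```ruby
# In the real app, a legacy model would point at the other database:
#
#   class LegacyUser < ActiveRecord::Base
#     establish_connection :legacy      # an entry in config/database.yml
#     self.table_name = "old_users"
#   end
#
# The per-row mapping from the legacy columns to the new schema is
# just a Ruby method (hypothetical column names):
def transform_row(legacy)
  {
    name:       legacy["full_name"],
    email:      legacy["email_address"].to_s.strip.downcase,
    created_at: legacy["signed_up_on"],
  }
end

# The import loop would batch reads to keep memory flat, e.g.:
#   LegacyUser.find_each(batch_size: 1000) do |u|
#     User.create!(transform_row(u.attributes))  # runs model validations
#   end
sample = { "full_name" => "Ada Lovelace",
           "email_address" => " ADA@EXAMPLE.COM ",
           "signed_up_on" => "2009-06-01" }
p transform_row(sample)
```

Going through `create!` is what buys you the validations mentioned above, and it is also exactly what makes this path slow compared to a raw file import.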
Migrations would work for this, but I wouldn't recommend it for something like this.
Since georgian suggested it, I'll post my comment as an answer:
If the changes are superficial (column names changed, columns removed, etc.), then I would just manually export them from the old database into the new one, and then run a migration to change the columns.
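That follow-up migration might look like the sketch below. It is printed as a string here because a migration only runs inside a Rails app; the table and column names are invented examples:

```ruby
# Hypothetical migration cleaning up columns after the manual import.
migration = <<~RUBY
  class CleanUpImportedColumns < ActiveRecord::Migration
    def self.up
      rename_column :users, :full_name, :name
      remove_column :users, :fax_number
    end

    def self.down
      rename_column :users, :name, :full_name
      add_column :users, :fax_number, :string
    end
  end
RUBY
puts migration
```

Since a rename or drop touches only the table metadata (or rewrites the table once), this is far cheaper than pushing every row through Ruby.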