I'm a newcomer to Informix, and as part of my testing activity I need to create 2TB+ of data for Oracle, Informix & Sybase. Is there a database-neutral way of doing this?
I'm also looking for any freeware or free tools; I was able to find a couple for Oracle but almost none for Informix & Sybase.
I have done this kind of thing many times with a simple Python, Perl or Ruby script, either to generate SQL statements or a CSV-style file that the database-specific tool can import.
Two terabytes is a lot, though. You might want to do it in batches.
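As a rough sketch of the scripted approach above, the following Python writes the data as a series of CSV files, one per batch, so each file can be loaded separately with the database's own import tool. The column layout, the sample values and the names write_batch and write_batches are all made up for illustration; swap in your own schema.

```python
import csv
import random

# Hypothetical sample values; replace with values that fit your schema.
FIRST_NAMES = ["Ann", "Bob", "Carla", "Dev"]
LAST_NAMES = ["Smith", "Jones", "Lee", "Khan"]


def write_batch(path, start_id, row_count, seed=0):
    """Write row_count CSV rows to path; ids start at start_id.

    Returns the next unused id so batches can be chained.
    """
    rng = random.Random(seed)  # seeded so a batch can be regenerated
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        for row_id in range(start_id, start_id + row_count):
            writer.writerow(
                [row_id, rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES)]
            )
    return start_id + row_count


def write_batches(prefix, batches, rows_per_batch):
    """Write several numbered CSV files and return their paths."""
    paths = []
    next_id = 1
    for n in range(batches):
        path = f"{prefix}_{n:04d}.csv"
        next_id = write_batch(path, next_id, rows_per_batch, seed=n)
        paths.append(path)
    return paths
```

Each file can then be fed to the bulk loader of the target database, and a failed batch can be regenerated and reloaded on its own.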
- You have to decide how repetitive the data can be.
- Load your distinct data into whatever tables are needed.
Grow that data exponentially with
INSERT INTO the_table SELECT * FROM the_table
If you need unique primary key fields, replace those with sequences in the relevant inserts for Oracle, and whatever the equivalent is for other DBs.
If your hardware can't handle the load of doubling 100G+ of data, do it in smaller batches. Use WHERE rownum < 100000... for Oracle, and whatever the equivalent is for other DBs.
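To illustrate the self-insert idea, here is a small sketch that uses Python's built-in sqlite3 module purely as a stand-in for Oracle, Informix or Sybase. The table layout and the name grow_table are assumptions for the example. SQLite's LIMIT plays the role of the Oracle rownum filter, and the INTEGER PRIMARY KEY column stands in for a sequence-generated key.

```python
import sqlite3


def grow_table(conn, target_rows, batch_rows):
    """Repeatedly copy the_table into itself until it has target_rows rows.

    Each INSERT copies at most batch_rows rows, so no single statement
    is larger than the hardware can cope with.
    """
    count = conn.execute("SELECT COUNT(*) FROM the_table").fetchone()[0]
    if count == 0:
        raise ValueError("the_table needs some seed rows first")
    while count < target_rows:
        n = min(batch_rows, target_rows - count)
        # The id column is omitted so the database assigns fresh keys.
        conn.execute(
            "INSERT INTO the_table (name) SELECT name FROM the_table LIMIT ?",
            (n,),
        )
        conn.commit()
        count = conn.execute("SELECT COUNT(*) FROM the_table").fetchone()[0]
    return count


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO the_table (name) VALUES (?)", [("a",), ("b",), ("c",)]
)
total = grow_table(conn, target_rows=1000, batch_rows=400)
```

The table doubles on each pass until the batch limit kicks in, after which it grows by a fixed amount per statement.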
Working across multiple databases in this way is a non-trivial task. Frankly, 2TB is right at the top end of what's achievable with these products (unless you're using Sybase IQ, which you didn't mention). If you're doing data warehousing or reporting off this data then you might want to reconsider your product choices.
It would be easier to give you advice if you explained why you need to load 2TB of test data. Also, why these databases? "Tricks" that work for loading in Oracle will be different for Sybase. Anyway, here's my generic advice…
First, review your DDL and completely remove all constraints and auto-incrementing values. The DB spends a lot of CPU and IO cycles checking these values whenever you do any kind of insert, so get rid of them. It will be faster to re-apply them later anyway, if needed.
Second, create a 1-column table for each column that you want to have in your final table. For example, if this is an address table, you might have:
First_Name, Last_Name, Address_1, etc.
Populate all of these tables with a small sample of the values you expect in the real data, say 10 rows per table.
Now for the magic: you cross join all of these 1-column tables together in a cartesian product. This gives you 1 row for every possible combination of your 1-column tables and thus "inflates" them to the size you require.
Example query (syntax may vary per DB):
SELECT * FROM First_Name CROSS JOIN Last_Name CROSS JOIN Address_1 … CROSS JOIN Post_Code
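A scaled-down sketch of the cross join, again using Python's sqlite3 module only as a convenient stand-in for the real databases. The sample values, the single column name val, the target table name Address and the helper build_inflated are all assumptions for illustration.

```python
import sqlite3

# Hypothetical 1-column sample tables and their values.
SAMPLES = {
    "First_Name": ["Ann", "Bob", "Carla"],
    "Last_Name": ["Smith", "Jones"],
    "Post_Code": ["10001", "94105", "60601", "73301"],
}


def build_inflated(conn, samples):
    """Create one 1-column table per entry, then cross join them all.

    Returns the row count of the resulting Address table.
    """
    for table, values in samples.items():
        conn.execute(f"CREATE TABLE {table} (val TEXT)")
        conn.executemany(
            f"INSERT INTO {table} (val) VALUES (?)", [(v,) for v in values]
        )
    names = list(samples)
    cols = ", ".join(f"{t}.val AS {t}" for t in names)
    joins = " CROSS JOIN ".join(names)
    # No join condition, so every combination of sample values appears once.
    conn.execute(f"CREATE TABLE Address AS SELECT {cols} FROM {joins}")
    return conn.execute("SELECT COUNT(*) FROM Address").fetchone()[0]


conn = sqlite3.connect(":memory:")
row_count = build_inflated(conn, SAMPLES)
print(row_count)  # 3 * 2 * 4 = 24
```

With 3, 2 and 4 sample rows the result has 24 rows; adding tables or sample rows multiplies the total accordingly.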
You can calculate how much data will be generated by multiplying the row counts.
10 tables w/ 10 rows = 10^10 = 10,000,000,000 = 10 billion rows
Then multiply your row count by the average row size to get a total data volume, excluding DB overhead.
(128 byte rows * 10 billion rows) / 1024^4 (Terabyte) = 1.164 Terabytes of sample data.
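The same arithmetic as a few lines of Python, handy for trying other table counts and row sizes before committing to a load. The function name estimate_volume is made up for this example, and the result ignores DB overhead just like the figure above.

```python
import math


def estimate_volume(row_counts, avg_row_bytes):
    """Return (total rows, size in terabytes of 1024^4 bytes)."""
    rows = math.prod(row_counts)  # cartesian product size
    return rows, rows * avg_row_bytes / 1024**4


rows, terabytes = estimate_volume([10] * 10, 128)
print(rows, round(terabytes, 3))  # 10000000000 1.164
```

Raising the sample tables from 10 to 11 rows each, for instance, already gives about 2.6 times as many rows, so small changes to the samples are enough to pass 2TB.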
The Easiest Way
Download a trial copy of Benchmark Factory from Quest. This will allow you to push several different benchmark data sets into your database and run them. Expensive, though, if you want to keep using it.