I am going to write some example programs and accompanying documents evaluating methods for accessing information stored in relational databases. To demonstrate real-world needs, I have to include a realistic dataset of hundreds of thousands of items.
Is anybody aware of publicly available, free datasets of this size: either datasets of human names with realistic variation, or hierarchical datasets such as large business hierarchies or large, categorized product catalogues?
If you are, please point me in the right direction.
Part 1, human names: http://timecenter.cs.aau.dk/software.htm
Part 2, hierarchical data: no answer yet
The Wikipedia dump is fairly massive: obligatory Wikipedia link.
Your own PC's directory tree is a large hierarchical structure with a lot of items. You most likely have a few thousand "items", each with a file name, modification date, size, extra OS info, etc.
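As a rough illustration of the directory-tree idea, here is a minimal sketch that walks a directory and loads it into a SQLite table as an adjacency list (each row points at its parent). The table name `node`, its columns, and the helper names `load_tree` and `subtree_count` are made up for this example; adjust them to whatever schema you are evaluating.

```python
import os
import sqlite3


def load_tree(root, conn):
    """Walk `root` and store every directory and file as one row.

    Hypothetical schema: an adjacency list where parent_id points at
    the containing directory (NULL for the root). Returns the root's id.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS node (
               id         INTEGER PRIMARY KEY,
               parent_id  INTEGER REFERENCES node(id),
               name       TEXT NOT NULL,
               is_dir     INTEGER NOT NULL,
               size_bytes INTEGER,
               mtime      REAL
           )"""
    )

    def insert(parent_id, path, is_dir):
        try:
            st = os.lstat(path)  # lstat: do not follow symlinks
            size, mtime = st.st_size, st.st_mtime
        except OSError:
            size, mtime = None, None  # unreadable entry: keep the name only
        name = os.path.basename(path) or path
        cur = conn.execute(
            "INSERT INTO node (parent_id, name, is_dir, size_bytes, mtime) "
            "VALUES (?, ?, ?, ?, ?)",
            (parent_id, name, int(is_dir), size, mtime),
        )
        return cur.lastrowid

    root = os.path.abspath(root)
    ids = {root: insert(None, root, True)}  # directory path -> row id
    # os.walk is top-down by default, so a directory's row always exists
    # before its children are visited.
    for dirpath, dirnames, filenames in os.walk(root):
        parent_id = ids[dirpath]
        for d in dirnames:
            full = os.path.join(dirpath, d)
            ids[full] = insert(parent_id, full, True)
        for f in filenames:
            insert(parent_id, os.path.join(dirpath, f), False)
    conn.commit()
    return ids[root]


def subtree_count(conn, node_id):
    """Count a node plus all of its descendants with a recursive CTE."""
    row = conn.execute(
        """WITH RECURSIVE sub(id) AS (
               SELECT id FROM node WHERE id = ?
               UNION ALL
               SELECT n.id FROM node n JOIN sub s ON n.parent_id = s.id
           )
           SELECT COUNT(*) FROM sub""",
        (node_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = sqlite3.connect("tree.db")  # file name is arbitrary
    root_id = load_tree(os.path.expanduser("~"), conn)
    print(subtree_count(conn, root_id), "rows loaded")
```

An adjacency list is only one way to store a hierarchy in a relational table; it is used here because it is the simplest to load, not because it is necessarily the best fit for your comparison.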
If that is not big enough, find a server that you can log in to. That will be bigger.
Still not big enough? Get a web crawler and start crawling a large site. That can be as big as you have the patience to crawl.