I am importing data into a future database that will have a single, static MyISAM table (it will only be read from). I chose MyISAM because, as far as I understand, it is faster for my needs (I am not very experienced with MySQL / SQL at all).
The table will have various columns such as ID, Title, Gender, Phone, Status... and Country, City, Street columns. My question is: should I create separate tables (e.g. Country: Country_ID, Country_Title) for the last 3 columns and refer to them in the main table by ID (normalize...[?]), or just store them as VARCHAR in the main table (with duplicates, obviously)?
My main concern is speed. Since the table will not be written to, data integrity is not a priority. The only operations will be selecting a specific row or searching for rows that match certain criteria.
Would searching by the Country, City and/or Street columns (and possibly other columns in the same search) be faster if I simply use VARCHAR?
EDIT: The table has about 30 columns and about 10m rows.
It can be faster to search if you normalize, because the database only has to compare an integer instead of a string. The table data will also be more compact, which makes searching faster because more of it can be loaded into memory at once.
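To make the two layouts concrete, here is a minimal sketch of the flat and the normalized variants side by side. It uses Python's built-in sqlite3 module only so that it runs without a server; it is not MySQL, and it says nothing about which variant is faster on a 10m-row MyISAM table. The table names person_flat, person_norm and country, the index names, and the sample rows are made up for illustration; Country_ID and Country_Title follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Flat variant: the country name is stored as text in every row.
    CREATE TABLE person_flat (
        id      INTEGER PRIMARY KEY,
        title   TEXT,
        country TEXT
    );
    CREATE INDEX idx_flat_country ON person_flat (country);

    -- Normalized variant: a small lookup table plus an integer key.
    CREATE TABLE country (
        country_id    INTEGER PRIMARY KEY,
        country_title TEXT UNIQUE
    );
    CREATE TABLE person_norm (
        id         INTEGER PRIMARY KEY,
        title      TEXT,
        country_id INTEGER REFERENCES country (country_id)
    );
    CREATE INDEX idx_norm_country ON person_norm (country_id);
""")

# Hypothetical sample data.
rows = [(1, "Alice", "France"), (2, "Bob", "Germany"), (3, "Carol", "France")]
conn.executemany("INSERT INTO person_flat VALUES (?, ?, ?)", rows)

# Store each distinct country once, then keep only its integer key per person.
for name in sorted({r[2] for r in rows}):
    conn.execute("INSERT INTO country (country_title) VALUES (?)", (name,))
conn.execute("""
    INSERT INTO person_norm (id, title, country_id)
    SELECT f.id, f.title, c.country_id
    FROM person_flat AS f
    JOIN country AS c ON c.country_title = f.country
""")

# Flat search: compare strings directly in the main table.
flat = conn.execute(
    "SELECT title FROM person_flat WHERE country = ? ORDER BY id", ("France",)
).fetchall()

# Normalized search: resolve the name in the lookup table, join on the integer.
norm = conn.execute("""
    SELECT p.title
    FROM person_norm AS p
    JOIN country AS c ON c.country_id = p.country_id
    WHERE c.country_title = ?
    ORDER BY p.id
""", ("France",)).fetchall()

print(flat, norm)
```

Both queries return the same people; the difference is only in what the main table stores and what gets compared per row.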
If your tables are indexed correctly, it will be very fast either way; you probably won't notice a difference.
You might also want to look at full text search if you are writing LIKE '%foo%', because the latter cannot use an index and will result in a full table scan.
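The effect of the leading wildcard can be seen in a query plan. The sketch below again uses Python's sqlite3 as a stand-in, so it calls SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the exact plan wording depends on the SQLite version. The person table and the idx_person_city index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, title TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_person_city ON person (city)")

def plan(sql):
    """Return the human-readable detail column of the query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Equality on an indexed column: the planner can seek through the index.
exact_plan = plan("SELECT * FROM person WHERE city = 'Sacramento'")

# Leading wildcard: there is no prefix to seek on, so every row is visited.
wildcard_plan = plan("SELECT * FROM person WHERE city LIKE '%foo%'")

print(exact_plan)
print(wildcard_plan)
```

The first plan should mention the index, while the second should report a scan of the whole table.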
I'll try to give you something more than an "It depends" answer.
#1 - Everything is fast for small N. If you have fewer than 100,000 rows, just load it flat, index it as you need to, and move on to something of higher priority.
Keeping everything flat in one table is faster for reading everything (all columns), but to seek or search into it you usually need indexes. If your data is very large with redundant City and Country information, it might be better to have surrogate foreign keys into separate tables, but there is no hard and fast rule.
This is why some form of data modeling principles is almost always used: either traditional normalized (e.g. Entity-Relationship) or dimensional (e.g. Kimball). The rules or methodologies in both cases are designed to help you model the data without having to anticipate every use case. Obviously, knowing all the usage patterns will bias your data model towards supporting them, so a lot of aggregation and analysis is a strong indicator to use a denormalized dimensional model.
So it really depends a lot on your data profile (row width and row count) and usage patterns.
I don't have much more than an "It depends" answer, unfortunately.
Go with as much normalization as you need for the searches you actually do. If you never actually search for people who live on Elm Street in Sacramento or on Walnut Avenue in Colorado, any effort to normalize those columns is pretty much wasted. Ordinarily you would normalize something like that to avoid update errors, but you've stated that data integrity is not a concern.
Watch your slow query log like a hawk! That will tell you what you need to normalize. Run EXPLAIN on those queries and determine whether you can add an index to improve them or whether you need to normalize.
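As a rough sketch of that workflow, the snippet below checks the plan of one slow-looking query, adds an index, and checks again. It uses Python's sqlite3 and SQLite's EXPLAIN QUERY PLAN purely so it can run anywhere; on MySQL you would prefix the query with EXPLAIN instead, and the output format is different. The person table, the query, and the idx_person_city_street index are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person ("
    "id INTEGER PRIMARY KEY, title TEXT, city TEXT, street TEXT)"
)

# A query of the kind that might show up in a slow query log.
QUERY = "SELECT * FROM person WHERE city = 'Sacramento' AND street = 'Elm Street'"

def plan(sql):
    """Return the human-readable detail column of the query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Without an index, the only option is to visit every row.
before = plan(QUERY)

# A composite index covering both search columns.
conn.execute("CREATE INDEX idx_person_city_street ON person (city, street)")

# With the index in place, the planner can seek on both columns.
after = plan(QUERY)

print(before)
print(after)
```

If the plan still shows a full scan after indexing the columns you filter on, that is a hint the query shape or the schema needs to change rather than the index.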
I've worked with some data models that we called "hyper-normalized." They were in all the proper normal forms, but often for things that just didn't need it for how we used the data. Those kinds of data models are difficult to understand at a casual glance, and they can be very annoying.