I have read that one of the advantages of normalization is reduced redundancy within the DB. But I am wondering: what if you end up referencing all of the columns in the target table?

For instance, if I have a Video table that references a Genre table, the Genre table would most likely consist of a single column having a dozen fairly static values like 'Horror' 'Sci-Fi' 'Romance' etc.

In a situation like this, does it save any space to split up the two, or is the only real benefit that you can update all referencing rows in one place?

Right, space-saving is ONE of the benefits, but not the only one.

In the situation you described, no, you will save no space if you are using that one column as the PK, which is fine.

You could abstract that table with an autonumber/sequence and use that as the PK, making the existing column a candidate key (so it stays unique).
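A minimal sketch of that surrogate-key layout, using Python's built-in sqlite3 module purely for illustration. The column names Video_ID, Title and Name are assumptions, not something from the question, and SQLite's INTEGER PRIMARY KEY stands in for whatever autonumber/sequence your database offers.

```python
import sqlite3

# Illustrative schema only: Video_ID, Title and Name are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute(
    "CREATE TABLE Genre ("
    " Genre_ID INTEGER PRIMARY KEY,"  # surrogate key, assigned automatically
    " Name TEXT NOT NULL UNIQUE)"     # original column kept as a candidate key
)
conn.execute(
    "CREATE TABLE Video ("
    " Video_ID INTEGER PRIMARY KEY,"
    " Title TEXT NOT NULL,"
    " Genre_ID INTEGER NOT NULL REFERENCES Genre(Genre_ID))"
)
conn.executemany("INSERT INTO Genre (Name) VALUES (?)",
                 [("Horror",), ("Sci-Fi",), ("Romance",)])

# The UNIQUE constraint keeps the genre name from being entered twice.
try:
    conn.execute("INSERT INTO Genre (Name) VALUES (?)", ("Horror",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Videos store only the small integer key.
sci_fi_id = conn.execute(
    "SELECT Genre_ID FROM Genre WHERE Name = ?", ("Sci-Fi",)).fetchone()[0]
conn.execute("INSERT INTO Video (Title, Genre_ID) VALUES (?, ?)",
             ("Alien", sci_fi_id))

# Renaming the genre touches one row; every referencing video follows.
conn.execute("UPDATE Genre SET Name = ? WHERE Genre_ID = ?",
             ("Science Fiction", sci_fi_id))
genre_of_alien = conn.execute(
    "SELECT g.Name FROM Video v JOIN Genre g ON g.Genre_ID = v.Genre_ID"
    " WHERE v.Title = ?", ("Alien",)).fetchone()[0]
```

With this layout the join is needed to read the genre name back, which is the trade-off against keeping the text value as the key.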

But leaving your design just as you've laid it out, the advantage is in consistency. You will have only those 12 values... you will not accidentally enter something like "Horrer" or "PSY-Fi".
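As a rough illustration of that consistency benefit, here is a sketch with the single-column Genre table used as-is, written with Python's sqlite3 module. The column names Name and Title are assumed for the example.

```python
import sqlite3

# Illustrative schema only: Name and Title are assumed column names.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE Genre (Name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE Video ("
    " Title TEXT NOT NULL,"
    " Genre TEXT NOT NULL REFERENCES Genre(Name))"
)
conn.executemany("INSERT INTO Genre (Name) VALUES (?)",
                 [("Horror",), ("Sci-Fi",), ("Romance",)])


def try_insert(title, genre):
    """Return True if the row was accepted, False if the FK rejected it."""
    try:
        conn.execute("INSERT INTO Video (Title, Genre) VALUES (?, ?)",
                     (title, genre))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("Alien", "Sci-Fi")      # a known genre goes in
rejected = try_insert("The Thing", "Horrer")  # the typo is refused
```

No space is saved here, since Video still stores the full text, but the misspelled value never reaches the table.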

Saving space is one benefit of separating the two tables. As was said before, storing a Genre_ID instead of an actual value such as "Horror" or "Adventure" saves space.

For me, the bigger part of doing this is to enforce integrity. If you put the text values in the Video table, what prevents you from changing a value accidentally? Now some rows might have "Adventure" or "Action/Adventure" and so on. By having two tables and referencing with a foreign key, you will have better control over what values can be a genre.

To sum up, don't worry about the fact that you reference all of the columns, especially if a table has very few columns. Whether you decide to add an ID field or simply keep the one-column table as a list of "acceptable values", your goal should be to enforce integrity first, and save space or I/O costs second.

I'd use surrogate keys (Autonumber, Identity, etc.) and use those for the foreign key join rather than the actual value.

The idea is more about data quality than about reducing space.

In many DBs an INT will be more compact than a VARCHAR2(20).

Yes, it will save space if you have a surrogate key (int) that you use in the Video table rather than the varchar(20) or whatever the genre column would be.

But you've hit the issue yourself there:

single column having a dozen fairly static values like 'Horror' 'Sci-Fi' 'Romance' etc.

With surrogate keys and normalized tables, you have "Horror" stored only once in the database, while its ID number is stored in many places (a simple number is more compact than the text most of the time, so it does save space). It not only improves the maintainability of the database, it really does save raw space.

What happens if you want to make sure that your rows in the Video table have valid/predetermined values for Genre? Without a foreign key constraint you'd need an enum for that column in the Video table, and then you would have to alter the schema every time you add a new Genre rather than just adding a new row to a Genre table.
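The difference can be sketched side by side with Python's sqlite3 module. SQLite has no enum type, so a CHECK constraint with a fixed value list stands in for one here; the table name VideoEnum and the column names are made up for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default

# Option A (assumed stand-in for an enum): the allowed values live in the schema.
conn.execute(
    "CREATE TABLE VideoEnum ("
    " Title TEXT NOT NULL,"
    " Genre TEXT NOT NULL CHECK (Genre IN ('Horror', 'Sci-Fi', 'Romance')))"
)

# Option B: the allowed values live as rows in a Genre table.
conn.execute("CREATE TABLE Genre (Name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE Video ("
    " Title TEXT NOT NULL,"
    " Genre TEXT NOT NULL REFERENCES Genre(Name))"
)
conn.executemany("INSERT INTO Genre (Name) VALUES (?)",
                 [("Horror",), ("Sci-Fi",), ("Romance",)])


def insert_ok(sql, params):
    """Return True if the statement succeeded, False on a constraint error."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# A new genre is refused by the enum-style table until the schema is changed.
enum_accepts_western = insert_ok(
    "INSERT INTO VideoEnum (Title, Genre) VALUES (?, ?)",
    ("Unforgiven", "Western"))

# With the Genre table, adding the genre is an ordinary INSERT.
conn.execute("INSERT INTO Genre (Name) VALUES (?)", ("Western",))
fk_accepts_western = insert_ok(
    "INSERT INTO Video (Title, Genre) VALUES (?, ?)",
    ("Unforgiven", "Western"))
```

Both options reject bad values; only the second lets you extend the list with plain data changes.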

In cases like this, your key values plus their indexes could be substantially bigger than the data itself. Another way of handling simple codes like this is to have a table of codes and then put a check constraint in place to validate them. That eliminates the join needed to get the genre data out. Which way you do it is kind of a toss-up and would depend on what your application's queries usually look like.

Data modification anomalies

  • What if you add a new genre?
  • Is Sci-Fi the same as SciFi?
  • Is Sci-Fi the same as Sci-fi?

It gets worse if you have another table, say, "Books", that has the same Genres.
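To make the anomaly concrete, here is a small sketch in Python's sqlite3 module of two tables that store genre as free text with nothing constraining it. The titles and column names are invented for the example.

```python
import sqlite3

# Free-text genre columns with no shared Genre table (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Video (Title TEXT, Genre TEXT)")
conn.execute("CREATE TABLE Books (Title TEXT, Genre TEXT)")
conn.executemany("INSERT INTO Video VALUES (?, ?)",
                 [("Alien", "Sci-Fi"), ("Solaris", "SciFi")])
conn.executemany("INSERT INTO Books VALUES (?, ?)",
                 [("Dune", "Sci-fi"), ("Neuromancer", "Sci-Fi")])

# UNION removes exact duplicates, so what remains is every distinct spelling
# of what was meant to be one genre.
spellings = [row[0] for row in conn.execute(
    "SELECT Genre FROM Video UNION SELECT Genre FROM Books ORDER BY Genre")]
```

Here `spellings` ends up holding three variants of the same genre, and any report grouped by genre would split them. Pointing both tables at one Genre table with a foreign key rules this out.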