Before I describe my problem, let me get a couple of things out of the way:
- I am an experienced (though not expert) database designer. I believe I have a good grasp of the relational model.
- I don't have such a firm grasp of the relational model that I know what to do in every situation. I'm still learning.
Let's say we receive an Excel spreadsheet each month from a bank, though not always the same bank. The spreadsheet has just six columns: bank name, account number, balance, customer (accountholder) name, customer SSN and accountholder address. Each row has a different account number, and no account number appears in more than one row. We want to import this spreadsheet into a database and, at any later time, ask, "What was John Smith's address on October 13, 2010?"
For simplicity, let's say that every customer has only one address and that every customer can have zero or more accounts. And for a moment, let's pretend that we only do one Excel sheet import EVER, which is a silly premise, but bear with me. If that's the case, the following design would suffice:
    bank
    --------
    id
    name

    account
    --------
    id
    bank_id
    customer_id
    number
    balance

    customer
    --------
    id
    name
    ssn
    address
    city
    state_id
    zip

    state
    --------
    id
    name
The rest of my question is based on the premise that you agree this schema is "correct", so hopefully you're fine with it.
Now, that would be fine if we only ever did one import, but we'll do 12 imports per bank per year. Here's how I thought of accounting for that:
    bank
    --------
    id
    name

    account
    --------
    id
    import_id
    bank_id
    customer_id
    number
    balance

    customer
    --------
    id
    name
    ssn
    address
    city
    state_id
    zip

    state
    --------
    id
    name

    import
    --------
    id
    date
    excel_file (blob)
Now every account is tied to an import, and we know with certainty things like "Account 12345 came from import 572 on 10/13/10." It gets a bit more ambiguous when you look at, say, the customer table. Since there are fewer rows in the customer table than in the account table (because some customers have multiple accounts), we don't have that one-to-one relationship between customers and imports like we do for accounts and imports. I know there's no loss of data and no loss of data integrity, but it still feels like some kind of sacrifice somehow.
My question is (and it may be too open-ended): Do you think this is a good way to store the data? Would you have done it differently?
Edit: there's an important way of thinking about these entities that you need to be aware of. Don't think of an account as one account that exists over time. Think of an account as a snapshot of an account at a certain time. Therefore, account 12345 with balance $100 is not the same account as account 12345 with balance $150. Yes, both records are tied to the same bank account in the real world, but what I'm storing is a snapshot of the account at a certain time. Similar (though not identical) situation with customers.
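To make the snapshot idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the schema above, but the trimmed column list, the sample rows, import 571 and the ISO date strings are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "import" (
        id   INTEGER PRIMARY KEY,
        date TEXT NOT NULL            -- ISO 8601 import date (assumed format)
    );
    CREATE TABLE account (
        id        INTEGER PRIMARY KEY,
        import_id INTEGER NOT NULL REFERENCES "import"(id),
        number    TEXT NOT NULL,
        balance   NUMERIC NOT NULL,
        UNIQUE (import_id, number)    -- an account number appears once per import
    );
""")

# Two monthly imports that both contain real-world account 12345.
conn.executemany(
    'INSERT INTO "import" (id, date) VALUES (?, ?)',
    [(571, "2010-09-13"), (572, "2010-10-13")],
)
conn.executemany(
    "INSERT INTO account (import_id, number, balance) VALUES (?, ?, ?)",
    [(571, "12345", 100), (572, "12345", 150)],
)

def snapshots(number):
    """Return (import date, balance) for every snapshot of one account number."""
    return conn.execute(
        "SELECT i.date, a.balance FROM account a "
        'JOIN "import" i ON i.id = a.import_id '
        "WHERE a.number = ? ORDER BY i.date",
        (number,),
    ).fetchall()

print(snapshots("12345"))
```

Each row in account is one snapshot, so the same account number legitimately appears once per import, and the join to import tells you when each snapshot was taken.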
I'm sorry, but I can't reconcile the statements "each customer has only one address" and "we want to ask 'What was John Smith's address on October 13, 2010?'". Are you suggesting that on each import, you'll create a new customer record for each person found in the import? If so, how would you know that the John Smith in one import is the same John Smith from another import if the account numbers are different?
And if you reuse the same customer record for the same customer (which seems right to me), where would you find prior address information?
[After comments and edits by the poster]
Okay, you're almost there. You need to add the customer address to the Account table (which should be renamed AccountImports or something like that), since each import may have a different address.
Storing the address in AccountImports is slightly denormalized if the address usually stays the same from import to import. If so, you can add a CustomerAddressHistory table. During each import, check the latest address for that SSN in CustomerAddressHistory and, if it differs from the one in the import, add the new address as a new record in that table.
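The CustomerAddressHistory idea could be sketched roughly as follows, again with Python's built-in sqlite3 module. The column names (ssn, address, valid_from), the helper functions record_address and address_on, and the sample data are all hypothetical. The sketch also assumes that imports are processed in chronological order and that dates are stored as ISO strings, so they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CustomerAddressHistory (
        id         INTEGER PRIMARY KEY,
        ssn        TEXT NOT NULL,
        address    TEXT NOT NULL,
        valid_from TEXT NOT NULL   -- ISO date of the import that first saw it
    )
""")

def record_address(ssn, address, import_date):
    """Add a history row only if the address differs from the latest one."""
    latest = conn.execute(
        "SELECT address FROM CustomerAddressHistory "
        "WHERE ssn = ? ORDER BY valid_from DESC LIMIT 1",
        (ssn,),
    ).fetchone()
    if latest is None or latest[0] != address:
        conn.execute(
            "INSERT INTO CustomerAddressHistory (ssn, address, valid_from) "
            "VALUES (?, ?, ?)",
            (ssn, address, import_date),
        )

def address_on(ssn, date):
    """Return the latest address seen on or before `date`, or None."""
    row = conn.execute(
        "SELECT address FROM CustomerAddressHistory "
        "WHERE ssn = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (ssn, date),
    ).fetchone()
    return row[0] if row else None

# Three monthly imports; the address only changes in the third one.
record_address("123-45-6789", "1 Elm St", "2010-09-13")
record_address("123-45-6789", "1 Elm St", "2010-10-13")
record_address("123-45-6789", "9 Oak Ave", "2010-11-13")

print(address_on("123-45-6789", "2010-10-13"))
```

Only two rows end up in the table for the three imports, because the unchanged October address is not stored again. The answer to "what was the address on date X" is then the most recent address seen in an import on or before X, which is as precise as monthly data allows.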