I have a data-intensive problem that requires a lot of massaging and data manipulation, and I'm putting it out there to see if anyone has an idea about how to tackle it.
In its simplest form: I have a large number of tables that can be joined together to give us a price list for dentists, showing how much each one charges for a procedure.
So we have multiple tables that look like this.
Dentist | Procedure1 | Procedure2 | Procedure3 | .........| Procedure?
John | 500 | 342 | 434 | .........| 843
Dave | 343 | 434 | 322 | NULLs....|
Mary | 500 | 342 | 434 | .........| 843
Linda | 500 | 342 | Null | .........| 843
Dentists can have a different number of procedures and different prices for each procedure. But there are a lot of dentists who have the same number of procedures and the same rates to go with them. Internally, we create a unique ID for each of these so-called fee schedules.
For example, John would be 001 and Dave would be 002, but Mary would also be fee schedule 001, and Linda would be 003.
It wouldn't be so bad if I only had to deal with this data once, but these fee schedules come in as flat files (CSVs) that we basically have to DTS up to a SQL Server to work with, and they arrive on a monthly basis. The prices could change monthly for each dentist, which in turn would put them under a different unique ID internally.
Can someone shed some light on the proper way to approach this problem so that it's as efficient as possible to process monthly, without having to do a lot of data manipulation?
- What's a good way of finding the duplicates among the fee schedules?
- How do I keep track of updating a dentist's fee schedule in case they change their rates the following month? If Mary decides to charge a different fee for Procedure2, she would then have a different unique ID internally. How do I keep track of that on a monthly basis without having to delete everything and re-insert?
- There are a couple of million fee schedules that I'm dealing with. Some have standard rules that are based on zip codes and some are simply unique fee schedules. What's the approach here?
- I could write some kind of ad-hoc .NET program to work with it, but it's a lot of data and working directly in SQL Server would be easier for me.
Any help would be great, thanks guys.
You probably need to unpivot the data to normalize it, so that you end up with:
Doctor: DoctorID, DoctorDetails...
FeeSchedule: DoctorID, ScheduleID, EffectiveDate, OtherDetailAtThisLevel...
FeeScheduleDetail: ScheduleID, ProcedureCode, Fee, OtherDetailAtThisLevel...
When the data comes in for a doctor, it is unpivoted, a new schedule is created, and the detail rows are created from the unpivoted data.
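To make the unpivot step concrete, here is a minimal sketch in Python rather than SSIS or T-SQL. It assumes each incoming CSV row has already been read into a dict of strings, that missing fees show up as empty strings, None, or the literal text "Null", and that fees are whole numbers as in the sample table. The function name `unpivot_row` and the tuple layout are made up for illustration.

```python
from typing import Dict, List, Optional, Tuple


def unpivot_row(
    row: Dict[str, Optional[str]], key_column: str = "Dentist"
) -> List[Tuple[str, str, int]]:
    """Turn one wide row into (dentist, procedure_code, fee) tuples.

    Columns with no fee (None, empty, or the text "null") produce no
    detail row, which is what normalizing the table is meant to achieve.
    """
    dentist = row[key_column]
    details: List[Tuple[str, str, int]] = []
    for column, value in row.items():
        if column == key_column:
            continue  # the key column is not a procedure
        if value is None or value.strip() == "" or value.strip().lower() == "null":
            continue  # missing fee: skip instead of storing a NULL
        details.append((dentist, column, int(value)))
    return details


# Linda's row from the sample table (truncated to three procedures).
linda = {"Dentist": "Linda", "Procedure1": "500", "Procedure2": "342", "Procedure3": "Null"}
linda_details = unpivot_row(linda)
```

Each tuple would become one FeeScheduleDetail row once a ScheduleID has been assigned.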
SSIS has an unpivot component that works fine: you would load the schedule first and then the detail. If the format varies a lot, you may need a custom data source, or you may simply want to avoid SSIS.
This technique would keep track of new schedules for doctors. If the schedule is unchanged for a doctor, you can simply not insert it.
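One possible way to spot identical or unchanged schedules, which is not something the approach above prescribes, is to compute a fingerprint over the sorted (procedure, fee) pairs and compare fingerprints instead of comparing every column. The sketch below assumes the fees have already been unpivoted into a dict per doctor and that procedure codes do not contain `|` or `=`; the names `schedule_fingerprint` and `assign_schedule_ids` are hypothetical.

```python
import hashlib
from typing import Dict


def schedule_fingerprint(fees: Dict[str, int]) -> str:
    """Hash the sorted procedure/fee pairs so equal schedules hash equally."""
    canonical = "|".join(f"{code}={fee}" for code, fee in sorted(fees.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assign_schedule_ids(
    doctor_fees: Dict[str, Dict[str, int]], known: Dict[str, int]
) -> Dict[str, int]:
    """Map each doctor to a schedule ID, reusing IDs for known fingerprints.

    `known` maps fingerprint -> schedule ID and is updated in place, so it
    can be carried over from one monthly load to the next.
    """
    assignments: Dict[str, int] = {}
    for doctor, fees in doctor_fees.items():
        fingerprint = schedule_fingerprint(fees)
        if fingerprint not in known:
            known[fingerprint] = len(known) + 1  # new schedule: next free ID
        assignments[doctor] = known[fingerprint]
    return assignments


known_schedules: Dict[str, int] = {}
month_one = {
    "John": {"Procedure1": 500, "Procedure2": 342, "Procedure3": 434},
    "Dave": {"Procedure1": 343, "Procedure2": 434, "Procedure3": 322},
    "Mary": {"Procedure1": 500, "Procedure2": 342, "Procedure3": 434},
    "Linda": {"Procedure1": 500, "Procedure2": 342},
}
first_ids = assign_schedule_ids(month_one, known_schedules)

# Next month Mary changes one fee; only she gets a new schedule ID.
month_two = dict(month_one, Mary={"Procedure1": 500, "Procedure2": 360, "Procedure3": 434})
second_ids = assign_schedule_ids(month_two, known_schedules)
```

In a database the same idea could be a fingerprint column on the schedule table, with a new FeeSchedule row and EffectiveDate written only when a doctor's fingerprint differs from the previous month's.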
If the logic is extensive, you can load the data into staging tables (with SSIS or whatever) and do all of this in SQL (T-SQL also has an UNPIVOT operator). That can have advantages in that the code is all in one place and can do its operations in sets.
Regarding the zip codes: if the doctor doesn't have a fee, are these something like usual and customary (UCR) fees? That could simply be determined from the zip code in the doctor row. In this case you have a couple of options. You can overlay the doctor fee schedule on the zip code fee schedule:
ZipCodeSchedule: ZipScheduleID, ZipCode, EffectiveDate
ZipCodeScheduleDetail: ZipScheduleID, ProcedureCode, Fee
Or you could store this in the regular fee schedule (potentially with some kind of flag indicating that it was defaulted to the UCR).
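As a rough sketch of the overlay idea, the function below starts from the zip code schedule and lets the doctor's own fees override it, recording where each fee came from. This assumes both schedules are already available as procedure-to-fee dicts; the function name, the "DOCTOR"/"UCR" labels, and the zip code fee values are invented for the example.

```python
from typing import Dict, Tuple


def effective_fees(
    doctor_fees: Dict[str, int], zip_fees: Dict[str, int]
) -> Dict[str, Tuple[int, str]]:
    """Overlay a doctor's fees on the zip code schedule.

    Returns procedure -> (fee, source), where source is "DOCTOR" when the
    doctor has their own fee and "UCR" when the zip code default is used.
    """
    result = {code: (fee, "UCR") for code, fee in zip_fees.items()}
    for code, fee in doctor_fees.items():
        result[code] = (fee, "DOCTOR")  # the doctor's own fee wins
    return result


# Linda has no fee for Procedure3, so the zip code default fills the gap.
linda_fees = {"Procedure1": 500, "Procedure2": 342}
zip_defaults = {"Procedure1": 480, "Procedure2": 350, "Procedure3": 400}  # made-up values
linda_effective = effective_fees(linda_fees, zip_defaults)
```

The source label plays the role of the flag mentioned above if the defaulted fees are stored in the regular fee schedule instead of being resolved at query time.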