I have a database schema for an integration project in which I need to be able to query for records that have changed, but only based on a given set of fields within that record.

So, for example, here's a sample table:


  • ID
  • Title
  • Phone
  • Fax
  • Balance

I need to be able to query for records whose Title, Phone, or Fax fields have changed. Other fields should not be taken into account, i.e. if only the Balance field changes, my query should not pull that record in (thus, a timestamp field that updates automatically whenever the record is modified does not work).

Also, this needs to run on a number of databases and platforms, so TRIGGERS or something similar are not really an option unless they will run on MySQL, PostgreSQL, SQL Server, and SQLite.

The fields are modified by a third-party application that I can't modify, so I can't just add a flag and have the third-party application set the flag to TRUE whenever it modifies a relevant field.

My initial approach to solving this is to calculate a HASH of the relevant fields and store it in a new field, 'LastHash' or something like that. Then I can calculate the hash of the relevant fields for the data currently in the record, and if it doesn't match the stored LastHash, I know it has changed.

That seems pretty messy... but it seems like it would work. Is there a better way? If not, is there a good way to implement that hash so that it's efficient and not too time-consuming to extract the changed records?
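To make the LastHash idea concrete, here is a minimal sketch in Python using the standard `sqlite3` and `hashlib` modules. The table name `Records`, the helper names, and the choice of MD5 are assumptions made for illustration only; the hash is computed in application code so that the same logic could run against any of the databases mentioned.

```python
import hashlib
import sqlite3

TRACKED = ("Title", "Phone", "Fax")  # fields whose changes matter; Balance is left out


def tracked_hash(row):
    """Hash only the tracked fields of a row."""
    parts = []
    for name in TRACKED:
        value = row[name]
        # Length-prefix each value so ("ab", "c") and ("a", "bc") hash differently;
        # NULL gets its own marker so it differs from an empty string.
        parts.append("-" if value is None else f"{len(str(value))}:{value}")
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


def store_hashes(conn):
    """Record the current hash of every row in LastHash."""
    rows = conn.execute("SELECT * FROM Records").fetchall()
    for row in rows:
        conn.execute(
            "UPDATE Records SET LastHash = ? WHERE ID = ?",
            (tracked_hash(row), row["ID"]),
        )


def changed_ids(conn):
    """Return IDs whose tracked fields no longer match LastHash (a full scan)."""
    cursor = conn.execute("SELECT * FROM Records ORDER BY ID")
    return [row["ID"] for row in cursor if row["LastHash"] != tracked_hash(row)]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE Records (ID INTEGER PRIMARY KEY, Title TEXT, Phone TEXT,"
        " Fax TEXT, Balance REAL, LastHash TEXT)"
    )
    conn.executemany(
        "INSERT INTO Records (ID, Title, Phone, Fax, Balance) VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Acme", "555-0100", "555-0101", 10.0),
            (2, "Globex", "555-0200", "555-0201", 20.0),
        ],
    )
    store_hashes(conn)
    # Simulate the third-party application touching both rows.
    conn.execute("UPDATE Records SET Balance = 99.0 WHERE ID = 1")      # should be ignored
    conn.execute("UPDATE Records SET Phone = '555-0299' WHERE ID = 2")  # should be detected
    return changed_ids(conn)
```

In this sketch `demo()` reports only record 2, since record 1 changed nothing but its Balance. Note that `changed_ids` reads and hashes every row, which is the cost being asked about.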


Some clarifications: Both my application and the other application update and insert into these tables. I can make my application calculate the initial hash. I can't make the other application calculate it, though.

Timestamp columns that automatically update whenever a record changes are doable; those are easy to replicate in all of the database systems using different column types or very simple triggers.


If hashing is the way to go... is there any kind of efficient hash algorithm that won't take forever to calculate on all of these records? MD5 or SHA1 might work, but they seem like they'd be sllloowwww.
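Whether MD5 or SHA1 is actually too slow is something that can be measured rather than guessed. The sketch below times MD5, SHA1, and CRC32 over synthetic rows using Python's standard library; the row format and function names are made up for illustration, and the numbers will vary by machine. CRC32 is included only as a cheap checksum for comparison: at 32 bits it is far more collision-prone than either hash.

```python
import hashlib
import timeit
import zlib


def sample_rows(n):
    """Build n synthetic 'Title|Phone|Fax' byte strings (made-up data)."""
    return [f"Title {i}|555-{i:04d}|555-{i:04d}".encode("utf-8") for i in range(n)]


def time_hashes(rows):
    """Return the seconds taken to hash every row once with each algorithm."""
    return {
        "md5": timeit.timeit(lambda: [hashlib.md5(r).digest() for r in rows], number=1),
        "sha1": timeit.timeit(lambda: [hashlib.sha1(r).digest() for r in rows], number=1),
        "crc32": timeit.timeit(lambda: [zlib.crc32(r) for r in rows], number=1),
    }


if __name__ == "__main__":
    for name, seconds in time_hashes(sample_rows(100_000)).items():
        print(f"{name}: {seconds:.3f} s for 100,000 rows")
```

Running this against row sizes and counts close to the real tables gives a better basis for choosing an algorithm than intuition about which one "seems slow".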

This is a tough one. You are still going to have to table scan (or index scan), since you have to calculate the new hash and compare it to the old hash stored.

If triggers aren't possible because of cross-platform concerns, you may be able to have the database engine calculate the current hash (i.e. a persisted computed column - effectively the same as a trigger). This has the same cross-platform problem, though! Then if you index the current hash and your stored hash, it's a relatively easier search.

Can you at least use the timestamp field to reduce the number of hashes you have to check?
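A rough sketch of that idea, in Python with `sqlite3`: only rows modified since the last check are read and hashed, and the hash then filters out rows where only untracked fields changed. The `Records` table, the `LastModified` column (set by hand here, standing in for an auto-updating timestamp), and the helper names are all assumptions for illustration.

```python
import hashlib
import sqlite3


def tracked_hash(title, phone, fax):
    """MD5 over the tracked fields, joined with a separator unlikely to appear in data."""
    joined = "\x1f".join("" if v is None else str(v) for v in (title, phone, fax))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def changed_since(conn, last_check):
    """Hash only rows touched after last_check; return IDs whose tracked fields changed."""
    cursor = conn.execute(
        "SELECT ID, Title, Phone, Fax, LastHash FROM Records"
        " WHERE LastModified > ? ORDER BY ID",
        (last_check,),
    )
    return [
        rid
        for rid, title, phone, fax, last_hash in cursor
        if tracked_hash(title, phone, fax) != last_hash
    ]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Records (ID INTEGER PRIMARY KEY, Title TEXT, Phone TEXT,"
        " Fax TEXT, Balance REAL, LastHash TEXT, LastModified INTEGER)"
    )
    # An index on the timestamp lets the database skip untouched rows.
    conn.execute("CREATE INDEX idx_records_modified ON Records (LastModified)")
    rows = [
        (1, "Acme", "555-0100", "555-0101", 10.0),
        (2, "Globex", "555-0200", "555-0201", 20.0),
        (3, "Initech", "555-0300", "555-0301", 30.0),
    ]
    for rid, title, phone, fax, balance in rows:
        conn.execute(
            "INSERT INTO Records VALUES (?, ?, ?, ?, ?, ?, ?)",
            (rid, title, phone, fax, balance, tracked_hash(title, phone, fax), 100),
        )
    # After a check at time 100, the other application touches rows 1 and 2 at time 200.
    conn.execute("UPDATE Records SET Balance = 99.0, LastModified = 200 WHERE ID = 1")
    conn.execute("UPDATE Records SET Fax = '555-0299', LastModified = 200 WHERE ID = 2")
    return changed_since(conn, 100)
```

Here rows 1 and 2 pass the timestamp filter, row 3 is never hashed, and only row 2 is reported because row 1 changed nothing but its Balance.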

Another thing to bear in mind is that there is no such thing as a perfect hash function, so you may end up with false negatives (an accidental hash collision results in a change not being detected). Is that (astronomically small) risk worth taking?

I'd standardize the way your application checks for a difference, not the way the database implements it. Try something like using a view with a particular column that indicates a change. Then use the appropriate methods available in each database to make that indicator a reality. Code that depends on checking for this difference would then be the same everywhere, using the same view and column.
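As an illustration of what that could look like on one backend, here is a sketch for SQLite using Python's `sqlite3`. The view name `RecordChanges`, the column `HasChanged`, and the SQL function `tracked_hash` are hypothetical names; SQLite has no built-in MD5, so the function is registered from Python with `create_function`, whereas other databases would define the view with their own hash functions. How a boolean column is represented may also differ between databases.

```python
import hashlib
import sqlite3


def tracked_hash(title, phone, fax):
    """MD5 over the tracked fields; exposed to SQL as tracked_hash(Title, Phone, Fax)."""
    joined = "\x1f".join("" if v is None else str(v) for v in (title, phone, fax))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def connect():
    """Open an in-memory database with the SQLite-specific side of the view set up."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("tracked_hash", 3, tracked_hash)
    conn.execute(
        "CREATE TABLE Records (ID INTEGER PRIMARY KEY, Title TEXT, Phone TEXT,"
        " Fax TEXT, Balance REAL, LastHash TEXT)"
    )
    conn.execute(
        "CREATE VIEW RecordChanges AS"
        " SELECT ID,"
        " (LastHash IS NULL OR LastHash <> tracked_hash(Title, Phone, Fax)) AS HasChanged"
        " FROM Records"
    )
    return conn


def changed_ids(conn):
    """The database-independent part: the application only ever queries the view."""
    cursor = conn.execute("SELECT ID FROM RecordChanges WHERE HasChanged = 1 ORDER BY ID")
    return [row[0] for row in cursor]


def demo():
    conn = connect()
    conn.executemany(
        "INSERT INTO Records (ID, Title, Phone, Fax, Balance) VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Acme", "555-0100", "555-0101", 10.0),
            (2, "Globex", "555-0200", "555-0201", 20.0),
        ],
    )
    conn.execute("UPDATE Records SET LastHash = tracked_hash(Title, Phone, Fax)")
    conn.execute("UPDATE Records SET Balance = 99.0 WHERE ID = 1")             # ignored
    conn.execute("UPDATE Records SET Title = 'Globex Corp' WHERE ID = 2")      # detected
    return changed_ids(conn)
```

Only `connect()` would need a per-database variant; `changed_ids` stays the same as long as every backend exposes a view with the same name and column.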