I have seen several database cache engines, all are pretty dumb (i.e.: keep this query cached for X minutes) and require that you simply by hand remove the entire cache repository following a INSERT / UPDATE / DELETE query continues to be performed.

About two or three years back I developed an alternate DB cache system for any project I had been focusing on, the concept was essentially to make use of regular expressions to obtain the table(s) involved with a specific SQL query:

$query_patterns = array
(
    'INSERT' => '/INTO\s+(\w+)\s+/i',
    'SELECT' => '/FROM\s+((?:[\w]|,\s*)+)(?:\s+(?:[LEFT|RIGHT|OUTER|INNER|NATURAL|CROSS]\s*)*JOIN\s+((?:[\w]|,\s*)+)\s*)*/i',
    'UPDATE' => '/UPDATE\s+(\w+)\s+SET/i',
    'DELETE' => '/FROM\s+((?:[\w]|,\s*)+)/i',
    'REPLACE' => '/INTO\s+(\w+)\s+/i',
    'TRUNCATE' => '/TRUNCATE\s+(\w+)/i',
    'LOAD' => '/INTO\s+TABLE\s+(\w+)/i',
);

I understand these regexs most likely possess some defects (my regex abilities were pretty eco-friendly in those days) and clearly don't match nested queries, consider I never rely on them that is not an issue for me personally.

Anyway, after locating the involved tables I'd alphabetically sort them and make up a new folder within the cache repository using the following naming convention:

+table_a+table_b+table_c+table_...+

Just in case of the SELECT query, I'd fetch the outcomes in the database, serialize() them and store them within the appropriate cache folder, so for example the outcomes from the following query:

SELECT `table_a`.`title`, `table_b`.`description` FROM `table_a`, `table_b` WHERE `table_a`.`id` <= 10 ORDER BY `table_a`.`id` ASC;

Could be saved in:

/cache/+table_a+table_b+/079138e64d88039ab9cb2eab3b6bdb7b.md5

The MD5 being the query itself. Upon a consequent Choose query the outcomes could be trivial to fetch.

Just in case of any other kind of write query (INSERT, REPLACE, UPDATE, DELETE and so forth) I'd glob() all of the folders which had +matched_table(s)+ within their title all remove all of the file contents. By doing this it can't be essential to remove the entire cache, only the cache utilized by the affected and related tables.

The machine labored pretty much and also the difference of performance was visible - even though project had a lot more read queries than write queries. Since that time I began using transactions, FK CASCADE UPDATES / DELETES rather than had time to master the machine to really make it use these functions.

I have used MySQL Query Cache previously but I have to admit the performance does not even compare.

I am wondering: shall we be held the only person who sees beauty within this system? Can there be any bottlenecks I might not be conscious of? So why do popular frameworks like CodeIgniter and Kohana (I am unaware of Zend Framework) have such rudimentary DB cache systems?

More to the point, would you check this out like a feature worth going after? If so, can there be anything I possibly could do / use to allow it to be even faster (my primary concerns are disk I/O and (p)serialization of query results)?

I appreciate all input, thanks.

I can tell the wonder within this solution, however, I belive it only is employed by a very specific group of programs. Situations where it's not relevant include:

  • Databases which utilize cascading down removes/updates or any type of triggers. E.g., your Remove to table A could cause a Remove from table B. The regex won't ever catch this.

  • Being able to access the database from points that do not undergo you cache invalidation plan, e.g. crontab scripts etc. Should you ever choose to implement replication across machines (introduce read-only slaves), this may also disturb the cache (because it doesn't undergo cache invalidation etc.)

Even when these situations aren't realistic for the situation it will still answer the question of why frameworks don't implement this type of cache.

Regarding if this sounds like worth going after, everything is dependent in your application. You may choose to supply more details?