Suppose, for the sake of illustration, that you're managing a library using a simple MySQL "books" table with three columns (a rough table definition is sketched below the list):

(id, title, status)

  • id is the primary key
  • title is the title of the book
  • status is an enum describing the book's current status (e.g. AVAILABLE, CHECKEDOUT, PROCESSING, MISSING)
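
For concreteness, the table might be defined roughly like this (the column types and sizes here are my assumptions, not a given):

CREATE TABLE books (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title  VARCHAR(255) NOT NULL,
    status ENUM('AVAILABLE', 'CHECKEDOUT', 'PROCESSING', 'MISSING') NOT NULL
) ENGINE = InnoDB;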

A simple query to report the number of books in each status is:

SELECT status, COUNT(*) FROM books GROUP BY status

and to specifically find the number of books that are available:

SELECT COUNT(*) FROM books WHERE status = 'AVAILABLE'

However, when the table grows to millions of rows, these queries take several seconds to complete. Adding an index on the "status" column doesn't seem to make a difference for me.
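
For reference, adding such an index would look something like this (the index name idx_status is just a placeholder):

ALTER TABLE books ADD INDEX idx_status (status);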

Aside from periodically caching the results or explicitly updating summary information in a separate table whenever a book changes status (via triggers or some other mechanism), are there any techniques for speeding up these kinds of queries? It seems like the COUNT queries end up looking at every row, and (without knowing more details) I'm a bit surprised this information can't somehow be determined from the index.
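
One way to check whether the index is even being considered is EXPLAIN (a sketch below); as far as I understand, even when the optimizer scans only the index, it still has to count every matching entry, because InnoDB doesn't keep per-value row counts in its indexes:

EXPLAIN SELECT COUNT(*) FROM books WHERE status = 'AVAILABLE';
EXPLAIN SELECT status, COUNT(*) FROM books GROUP BY status;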

UPDATE

Using the sample table (with an indexed "status" column) populated with two million rows, I benchmarked the GROUP BY query. Using the InnoDB storage engine, the query takes 3.0 - 3.2 seconds on my machine. Using MyISAM, the query takes 0.9 - 1.1 seconds. There is no significant difference between count(*), count(status), or count(1) in either case.
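
For anyone repeating the benchmark, switching the test table between storage engines and confirming which engine is in use can be done along these lines (a sketch; rebuilding a two-million-row table this way takes a while):

ALTER TABLE books ENGINE = MyISAM;   -- or ENGINE = InnoDB for the other run
SHOW TABLE STATUS LIKE 'books';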

MyISAM is of course a bit faster, but I was curious to see whether there is a way to make an equivalent query run much faster (e.g. 10-50 ms -- fast enough to be called on every web page request for a low-traffic site) without the mental overhead of caching and triggers. It sounds like the answer is "there's no way to run the direct query that quickly", which is what I expected - I just wanted to make sure I wasn't missing an easy alternative.

So the question is:

are there any techniques for speeding up these kinds of queries?

Well, not really. A column-based storage engine would probably be faster for those SELECT COUNT(*) queries, but it would be less performant for pretty much every other query.

Your best bet is to maintain a summary table via triggers. It doesn't have much overhead, and the SELECT part will be instantaneous no matter how big the table is. Here's some boilerplate code:

DELIMITER //

-- a new book adds one to the count for its status
CREATE TRIGGER ai_books AFTER INSERT ON books
FOR EACH ROW
    UPDATE books_cnt SET total = total + 1 WHERE status = NEW.status
//

-- a deleted book subtracts one from the count for its status
CREATE TRIGGER ad_books AFTER DELETE ON books
FOR EACH ROW
    UPDATE books_cnt SET total = total - 1 WHERE status = OLD.status
//

-- a status change moves one count from the old status to the new one
CREATE TRIGGER au_books AFTER UPDATE ON books
FOR EACH ROW
BEGIN
    IF (OLD.status <> NEW.status) THEN
        UPDATE books_cnt
        SET total = total + IF(status = NEW.status, 1, -1)
        WHERE status IN (OLD.status, NEW.status);
    END IF;
END
//

DELIMITER ;
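
The triggers above assume a books_cnt summary table with one row per status. A minimal sketch of creating and seeding it (the column types are my assumptions), followed by the now-instant lookup:

CREATE TABLE books_cnt (
    status ENUM('AVAILABLE', 'CHECKEDOUT', 'PROCESSING', 'MISSING') NOT NULL PRIMARY KEY,
    total  INT NOT NULL DEFAULT 0
) ENGINE = InnoDB;

-- seed one row per status (including statuses with zero books, so the triggers always have a row to update)
INSERT INTO books_cnt (status, total) VALUES
    ('AVAILABLE', 0), ('CHECKEDOUT', 0), ('PROCESSING', 0), ('MISSING', 0);

-- fill in the current counts from the existing data
UPDATE books_cnt
SET total = (SELECT COUNT(*) FROM books WHERE books.status = books_cnt.status);

-- the per-status count is now a primary-key lookup
SELECT total FROM books_cnt WHERE status = 'AVAILABLE';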