I'm attempting to program a wordpress plugin to bbPress (outdoors source forum software) which will work much like Hacker News (http://news.ycombinator.com/).

Particularly, I wish to sort an order of forum-threads (bbPress calls them "subjects") while using following formula:

sort_value = (p - 1) / (t + 2)^1.5
where p = total votes for each topic from users
t = time since submission of each topic in hours

I must have the ability to sort subjects with this calculated sort_value using MySQL.

The appropriate fields within the topics table looks something similar to this:

topic_id            bigint(20)
topic_start_time    datetime

This really is up in mid-air, but I believed you will see another table that stores individual votes by customers so we'll have the ability to know whether a person has chosen already. And the other table will keep current election-totals for every subject. Maybe you will see another area for the reason that table storing the most recent calculated sort_Value?

To become 100% accurate, the type_value ought to be up-to-date after each new election. This could drink too much load towards the database server, though, particularly if we attempted to update All of the subjects. If we must, we're able to limit the dataset by only calculating the typeworth during the last X # of subjects. We're able to also limit the burden by only upgrading the typeworth periodically (e.g. every a few minutes using a cron job).

These cutting corners will make the burden acceptable, however i would rather a far more elegant solution that may scale better.

How does one structure this? :-)

You will find numerous tradeoffs to think about within this. You've suggested their way already inside your question. Timeliness and Exactness versus Load and Scale.

Batching the information is the easiest method to decrease Load and increase Scale if Timeliness and Exactness aren't necessary and also the system encounters a higher load of creates.

You need to kind of examine the the useage from the system and discover what areas you have to optimize for. Optimizing for Write has different constraints than optimizing for Reads. Same for timeliness or Exactness from the data.

Pick which ones are most significant for the application making the right tradeoff.

OK, this really is my idea. I'll begin by creating an old_table which has X rows of subjects having a sort_value area.

I wish to avoid a lot of UPDATE claims on one table, so I'll periodically replace that old table having a freshly calculated table. So far as I am aware, MySQL doesn't support a "replace table" syntax, so every Y minutes, via cron, I'll create an up-to-date version of the table known as new_sort_value. Then I'll do that sequence of instructions:

  • DROP old_table
  • Relabel new_table to old_table

Performs this appear just like a valid approach?