I have a database table with hundreds of thousands of forum posts, and I need to find out which hour-long period contains the largest number of posts.
I could crawl forward one minute at a time, keeping an array of timestamps and tracking which hour had the most posts in it, but I feel like there's a much better way to do this. I'll be running this operation on a year of posts, so checking every minute in a year seems pretty awful.
Ideally there would be a way to do this in a single database query.
Given a table Minutes, filled with every minute in the year you are interested in, and a table Posts with a Time column:
    select top 1 minutes.time, count(posts.time)
    from Minutes
    left join posts
        on posts.time >= minutes.time
        and posts.time < dateadd(hour, 1, Minutes.Time)
    group by minutes.time
    order by count(posts.time) desc
To avoid having to build the Minutes table, you can use a function like ufn_GenerateIntegers to generate it on the fly. The query then becomes:
    select top 5 minutes.time, count(posts.time)
    from (
        select dateadd(minute, IntValue, '2008-01-01') as Time
        from ufn_GenerateIntegers(525600)
    ) Minutes
    left join posts
        on posts.time >= minutes.time
        and posts.time < dateadd(hour, 1, Minutes.Time)
    group by minutes.time
    order by count(posts.time) desc
I just did a test run with about 5000 random posts and it took 16 seconds on my machine. So, not trivial, but not ridiculous for the occasional one-off query. Fortunately, this is a data point you can calculate once a day or even once a month and cache if you want to display the value frequently.
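For anyone who wants to try the same idea outside the database, here is a rough Python sketch of the minute-grid approach. It is not the query above, only an in-memory imitation of it: the function name, the sample timestamps, and the use of binary search in place of the join are all my own assumptions. Ties go to the earliest window here, whereas the SQL makes no promise about tie order.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def busiest_hour_on_minute_grid(post_times, start, minutes):
    """Return (window_start, count) for the busiest [t, t + 1h) window,
    trying only window starts on whole minutes counted from `start`."""
    times = sorted(post_times)
    hour = timedelta(hours=1)
    best_start, best_count = None, -1
    for i in range(minutes):
        t = start + timedelta(minutes=i)
        # Posts with t <= time < t + 1h, the same bounds as the join condition.
        count = bisect_left(times, t + hour) - bisect_left(times, t)
        if count > best_count:
            best_start, best_count = t, count
    return best_start, best_count


# Hypothetical sample data, one day only.
start = datetime(2008, 1, 1)
posts = [
    datetime(2008, 1, 1, 9, 15),
    datetime(2008, 1, 1, 10, 40),
    datetime(2008, 1, 1, 10, 55),
    datetime(2008, 1, 1, 11, 20),
    datetime(2008, 1, 1, 13, 5),
]
best = busiest_hour_on_minute_grid(posts, start, 24 * 60)
print(best)
```

Sorting once and using two binary searches per minute keeps each window count cheap, which is roughly what an index on the time column would give the database.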
Have a look at lassevk's improvement.
Binning will work if you want to look at intervals such as 10:00 - 11:00. However, if you had a sudden flurry of interest from 10:30 - 11:30, it would be split across two bins, and so could be hidden by a smaller number of hits that happened to fit entirely within a single clock hour.
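A small Python sketch of that failure case, with made-up timestamps and hypothetical helper names: four posts straddle 11:00, three posts sit inside the 14:00 clock hour, and binning picks the wrong one.

```python
from collections import Counter
from datetime import datetime, timedelta


def busiest_clock_hour(post_times):
    """Bin posts by clock hour and return (hour_start, count) for the fullest bin."""
    bins = Counter(t.replace(minute=0, second=0, microsecond=0) for t in post_times)
    return bins.most_common(1)[0]


def count_in_window(post_times, window_start):
    """Count posts in the half-open hour [window_start, window_start + 1h)."""
    end = window_start + timedelta(hours=1)
    return sum(window_start <= t < end for t in post_times)


day = datetime(2008, 1, 1)
# Hypothetical flurry between 10:30 and 11:30, split across two clock hours.
flurry = [
    day.replace(hour=10, minute=40),
    day.replace(hour=10, minute=50),
    day.replace(hour=11, minute=5),
    day.replace(hour=11, minute=15),
]
# A smaller group that fits entirely inside 14:00 - 15:00.
quiet = [day.replace(hour=14, minute=m) for m in (5, 20, 45)]
posts = flurry + quiet

binned = busiest_clock_hour(posts)
straddling = count_in_window(posts, day.replace(hour=10, minute=30))
print(binned, straddling)
```

Binning reports the 14:00 hour with 3 posts, while the hour starting at 10:30 actually holds 4.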
The only way to avoid this problem is to generate a list sorted by time and step through it. Something like this:
    max = 0; maxTime = 0
    for each $item in the list:
        push $item onto queue
        while head of queue is more than an hour before $item
            drop queue head.
        if queue.count > max then max = queue.count; maxTime = $item.time
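That way you only need to hold a one-hour window in memory rather than the whole list. Note that maxTime ends up holding the time of the last post in the busiest window, so the window itself is the hour ending at that time.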
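The pseudocode above might look something like this in Python. Treat it as a sketch: the function name and the sample data are my own, and it sorts the input itself rather than assuming a pre-sorted list. It returns the time of the newest post in the best window, so the busiest hour is the one ending at that time.

```python
from collections import deque
from datetime import datetime, timedelta


def busiest_hour(post_times):
    """Return (window_end, count) for the one-hour window holding the most posts.

    window_end is the time of the newest post in that window.
    """
    hour = timedelta(hours=1)
    window = deque()  # posts within the last hour, oldest first
    best_end, best_count = None, 0
    for t in sorted(post_times):
        window.append(t)
        # Drop posts more than an hour older than the newest one.
        # Use >= instead of > for a half-open window like the SQL version.
        while t - window[0] > hour:
            window.popleft()
        if len(window) > best_count:
            best_end, best_count = t, len(window)
    return best_end, best_count


# Hypothetical sample data.
posts = [
    datetime(2008, 1, 1, 9, 15),
    datetime(2008, 1, 1, 10, 40),
    datetime(2008, 1, 1, 10, 55),
    datetime(2008, 1, 1, 11, 20),
    datetime(2008, 1, 1, 13, 5),
]
result = busiest_hour(posts)
print(result)
```

Each post is pushed once and popped at most once, so after the sort the scan is linear in the number of posts, and it is not tied to a minute grid.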