I've got a question about SQL Server indexes. I am not really a DBA and assume the reply is obvious for individuals individuals which are. I'm using SQL Server 2008.

I've got a table that's like the following (but has more posts):

CREATE TABLE [dbo].[Results](
    [ResultID] [int] IDENTITY(1,1) NOT NULL,
    [TypeID] [int] NOT NULL,
    [ItemID] [int] NOT NULL,
    [QueryTime] [datetime] NOT NULL,
    [ResultTypeID] [int] NOT NULL,
    [QueryDay]  AS (datepart(day,[querytime])) PERSISTED,
    [QueryMonth]  AS (datepart(month,[querytime])) PERSISTED,
    [QueryYear]  AS (datepart(year,[querytime])) PERSISTED,
 CONSTRAINT [PK_Results] PRIMARY KEY CLUSTERED 
(
    [ResultID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]

The key fields to see listed here are ResultID, the main key, and QueryTime the datetime where the end result was created.

I additionally possess the following index (among others):

CREATE NONCLUSTERED INDEX [IDX_ResultDate] ON [dbo].[Results] 
(
    [QueryTime] ASC
)
INCLUDE ( [ResultID],
[ItemID],
[TypeID]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 90) ON [PRIMARY]

Inside a database where I've in regards to a million rows within the table, the index can be used when you are performing a question for example:

select top 1 * from results where querytime>'2009-05-01' order by ResultID asc

In another instance of the identical database, with 50 million rows, SQL Server decides to not make use of the index because it rather does a Clustered Index Scan which eventually ends up being horribly slow. (and speed is dependent around the date). Even when I personally use query hints to really make it use IDX_ResultDate, it's still a little slow also it stays 94% of their time sorting by ResultID. I believed that by creating a catalog with both ResultID and QueryTime as sorted posts within the index, I possibly could accelerate my query.

I therefore produced the next:

CREATE NONCLUSTERED INDEX [IDX_ResultDate2] ON [dbo].[Results] 
(
[QueryTime] ASC,    
[ResultID] ASC
)
INCLUDE ( [ItemID],
[TypeID]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 90) ON [PRIMARY]
GO

I believed it would first make use of the sort by QueryTime to obtain the matching results, which may be sorted by ResultID. However, this isn't the situation because this index changes nothing in performance within the existing one.

Then i attempted the next index:

CREATE NONCLUSTERED INDEX [IDX_ResultDate3] ON [dbo].[Results] 
(
    [ResultID] ASC,
    [QueryTime] ASC
)
INCLUDE ( [ItemID],
[TypeID]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 90) ON [PRIMARY]
GO

That one creates the intended result. It seems to come back in constant time (a part of a second).

However, I'm puzzled at why IDX_ResultDate3 is effective whereas IDX_ResultDate2 does not.

I'd think that a binary search in as sorted listing of QueryTime then peeking in the first lead to it's child listing of ResultIDs may be the quickest way at obtaining the result. (Hence my primary sort order).

Side question: Must I produce a endured column using the date part of QueryTime and index on that rather (I curently have three endured posts as you can tell above)?