Simple situation: a two-column table [ID, TEXT]. The TEXT column holds phrases of 1-10 words. 300,000 rows.

Running the query:

SELECT * FROM row 
 WHERE text LIKE '%word%' 

...took 0.1 seconds. OK.
So I added a second column, and the table now has [ID, TEXT, TEXT2]. I set TEXT2 = TEXT (using UPDATE table SET TEXT2 = TEXT).

I then ran the query for '%word%' again, and it took 2.4 seconds.


This left me completely stumped, but after a lot of dead ends I ran OPTIMIZE on the table, and the query time dropped to about 0.2 seconds.

Two questions:

  1. Does anybody know how the data structure gets itself into this mess, where doubling the data increases the search time for this query by a factor of 24?
  2. Is it normal for the speed of a non-indexed search like this to depend on the state of the underlying table data structure rather than on the data in the column being searched?

Thanks!

Sounds to me like you're the victim of query caching. The second time you ran the query (after the OPTIMIZE), the answer was already cached, so the result came back almost instantly. Have you tried searching for different search terms? Try running the query with caching turned off, like so:

SELECT SQL_NO_CACHE * FROM row WHERE text LIKE '%word%'

See whether this changes the results, or try searching for different words that return a similar number of results, to make sure your server is not just returning a cached value.

The first time, it does a table scan, which sounds about right for the timing - no index involved.

Then you added the index, and the MySQL optimizer doesn't notice there's a wildcard at the front, so it scans the entire index to find the records, then needs two more reads (one to the PK, then one into the table from there) to fetch the data record as well.
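
One way to check this theory is to ask MySQL which plan it actually picks. A minimal sketch, assuming the table really is named row as in the question (the backticks are there because ROW is a reserved word in newer MySQL versions):

```sql
-- List whatever indexes currently exist on the table
SHOW INDEX FROM `row`;

-- Show the execution plan for the slow query without running it
EXPLAIN SELECT * FROM `row` WHERE text LIKE '%word%';
```

In the EXPLAIN output, type = ALL means a full table scan and type = index means a full index scan; the key column names the index that was chosen, if any.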

OPTIMIZE most likely just updates the optimizer statistics, so it knows to scan the table again.
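
If stale statistics are the suspect, that can be tested separately from a full rebuild: ANALYZE TABLE refreshes the key distribution statistics without rewriting the table data. A rough sketch, again assuming the table name row from the question, and assuming you can get the table back into its slow state first (for example by repeating the UPDATE on a copy):

```sql
-- Refresh optimizer statistics only; the table data is not reorganized
ANALYZE TABLE `row`;

-- Re-time the query with the query cache bypassed
SELECT SQL_NO_CACHE * FROM `row` WHERE text LIKE '%word%';
```

If the query is still slow after ANALYZE TABLE but fast after OPTIMIZE TABLE, statistics alone were probably not the cause.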

I'd guess that the difference is caused by the increased row length, which leaves the table fragmented on disk. OPTIMIZE will sort this problem out, bringing the search time back to normal (give or take a bit).
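
A quick way to look for evidence of this is to compare the table's storage figures before and after the OPTIMIZE. This is only a sketch, assuming the table name row from the question; how to read the numbers depends on the storage engine:

```sql
-- Before: note Data_length, Data_free and Avg_row_length
SHOW TABLE STATUS LIKE 'row';

-- Rebuild the table and reclaim unused space
OPTIMIZE TABLE `row`;

-- After: Data_free should shrink if the table was fragmented
SHOW TABLE STATUS LIKE 'row';
```

A large Data_free value relative to Data_length before the rebuild would be consistent with the fragmentation explanation.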