I have a MySQL database. I store homes in the database and perform literally just one query against it, but I need this query to run very fast: return all homes within a square box of geo latitude & longitude.

SELECT * FROM homes
WHERE geolat BETWEEN ??? AND ???
AND geolng BETWEEN ??? AND ???

What is the best way to store my geo data so that this query, which returns all homes within the geolocation box, runs the fastest?

Basically:

  • Am I using the best SQL statement to perform this query the fastest?
  • Does any other method exist, maybe not even using a database, that would let me fetch the homes within boxed geolocation bounds the fastest?

In case it helps, I've included my database table schema below:

CREATE TABLE IF NOT EXISTS `homes` (
  `home_id` int(10) unsigned NOT NULL auto_increment,
  `address` varchar(128) collate utf8_unicode_ci NOT NULL,
  `city` varchar(64) collate utf8_unicode_ci NOT NULL,
  `state` varchar(2) collate utf8_unicode_ci NOT NULL,
  `zip` mediumint(8) unsigned NOT NULL,
  `price` mediumint(8) unsigned NOT NULL,
  `sqft` smallint(5) unsigned NOT NULL,
  `year_built` smallint(5) unsigned NOT NULL,
  `geolat` decimal(10,6) default NULL,
  `geolng` decimal(10,6) default NULL,
  PRIMARY KEY  (`home_id`),
  KEY `geolat` (`geolat`),
  KEY `geolng` (`geolng`)
) ENGINE=InnoDB

UPDATE

I understand that spatial extensions will factor in the curvature of the earth, but I am most interested in returning geo data the FASTEST. Unless these spatial database packages somehow return data faster, please do not recommend spatial extensions. Thanks

UPDATE 2

Please note, no one below has truly answered the question. I'm really looking forward to any assistance I might receive. Thanks in advance.

There's a great paper on MySQL geolocation performance here:

http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL

EDIT Pretty sure this is using a fixed radius. Also, I'm not 100% certain the formula for calculating distance is the most advanced (i.e. it'll "drill" through Earth).

What matters is that the formula is cheap enough to give you a ballpark limit on the number of rows on which to run the proper distance search.
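The two-step idea above (a cheap box test to limit the rows, then a proper distance check on the survivors) can be sketched in Python roughly as follows. The function names, the `(home_id, geolat, geolng)` tuple layout, and the mean Earth radius constant are assumptions of this sketch and are not taken from the paper. The box formula also assumes the search circle does not touch a pole or cross the 180° meridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def bounding_box(lat, lng, radius_km):
    """Return (min_lat, max_lat, min_lng, max_lng) enclosing a circle of radius_km."""
    r = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(r)
    # Longitude span widens with latitude; breaks down near the poles.
    dlng = math.degrees(math.asin(math.sin(r) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lng - dlng, lng + dlng


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def homes_within(homes, lat, lng, radius_km):
    """homes is an iterable of (home_id, geolat, geolng) tuples."""
    min_lat, max_lat, min_lng, max_lng = bounding_box(lat, lng, radius_km)
    # Step 1: cheap box test, the in-memory equivalent of the BETWEEN query.
    candidates = [h for h in homes
                  if min_lat <= h[1] <= max_lat and min_lng <= h[2] <= max_lng]
    # Step 2: exact distance check only on the rows that survived step 1.
    return [h for h in candidates
            if haversine_km(lat, lng, h[1], h[2]) <= radius_km]
```

In a real setup step 1 would be the indexed SQL query and only step 2 would run in application code.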

If you really need to go for performance, you can define bounding boxes for your data, map the precomputed bounding boxes to your objects on insertion, and use them later for queries.

Check out this article, which describes this approach in detail. If the result sets are reasonably small, you can still do accuracy corrections in the application logic (easier to scale horizontally than a database) and so still serve accurate results.

There are also links to source code, such as Bret Slatkin's geobox.py, which contains great documentation.

I know the article is Google App Engine specific, but the solutions are applicable to relational databases too.
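To make the precomputed-box idea a bit more concrete, here is a minimal sketch of a fixed grid: each home is assigned to a cell when it is inserted, and a box query only looks at the cells the box overlaps before doing the exact check. This is a simplified illustration, not the algorithm from geobox.py or the article; the `GeoGrid` class, `cell_of` helper, and the 0.01 degree cell size are all hypothetical choices.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical cell size in degrees; tune for your data density


def cell_of(lat, lng, cell=CELL_DEG):
    """Map a coordinate to the (row, col) of the grid cell containing it."""
    return (math.floor(lat / cell), math.floor(lng / cell))


class GeoGrid:
    """In-memory grid index: cell key -> list of (home_id, lat, lng)."""

    def __init__(self, cell=CELL_DEG):
        self.cell = cell
        self.cells = defaultdict(list)

    def insert(self, home_id, lat, lng):
        # The cell is computed once, at insertion time.
        self.cells[cell_of(lat, lng, self.cell)].append((home_id, lat, lng))

    def query_box(self, min_lat, max_lat, min_lng, max_lng):
        lo_r, lo_c = cell_of(min_lat, min_lng, self.cell)
        hi_r, hi_c = cell_of(max_lat, max_lng, self.cell)
        hits = []
        # Visit only the cells that overlap the requested box.
        for r in range(lo_r, hi_r + 1):
            for c in range(lo_c, hi_c + 1):
                for home_id, lat, lng in self.cells.get((r, c), ()):
                    # Exact correction in application logic for edge cells.
                    if min_lat <= lat <= max_lat and min_lng <= lng <= max_lng:
                        hits.append(home_id)
        return hits
```

In a relational database the same idea would probably take the form of an extra indexed column holding the cell key, so the query becomes an equality or IN lookup instead of two range scans.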

I'd still recommend checking out PostgreSQL and PostGIS as an alternative to MySQL if you intend to run more complex queries in the foreseeable future.

The indices you are using are indeed B-tree indices and support the BETWEEN keyword in your query. This means that the optimizer is able to use your indices to find the homes within your "box". It does not, however, mean that it will always use them. If you specify a range that contains too many "hits", the indices will not be used.
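One way to see whether an index is actually chosen is to ask the database for its query plan. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere. That is an assumption of this example: SQLite's planner makes different decisions than MySQL's, and in MySQL you would prefix the query with EXPLAIN instead. The trimmed-down table, the `idx_` index names, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down version of the homes table, with one index per column.
conn.execute("CREATE TABLE homes (home_id INTEGER PRIMARY KEY, geolat REAL, geolng REAL)")
conn.execute("CREATE INDEX idx_geolat ON homes (geolat)")
conn.execute("CREATE INDEX idx_geolng ON homes (geolng)")
conn.executemany(
    "INSERT INTO homes (home_id, geolat, geolng) VALUES (?, ?, ?)",
    [(1, 40.05, -75.05), (2, 40.05, -74.0), (3, 41.0, -75.05), (4, 40.1, -75.1)],
)

BOX_QUERY = (
    "SELECT home_id FROM homes "
    "WHERE geolat BETWEEN ? AND ? AND geolng BETWEEN ? AND ? "
    "ORDER BY home_id"
)
box = (40.0, 40.1, -75.1, -75.0)  # min_lat, max_lat, min_lng, max_lng

# Run the bounding-box query itself.
rows = [r[0] for r in conn.execute(BOX_QUERY, box)]

# Ask the planner how it intends to execute the same query.
plan = conn.execute("EXPLAIN QUERY PLAN " + BOX_QUERY, box).fetchall()
for step in plan:
    print(step[-1])  # last column holds the human-readable plan detail
```

Checking the plan for both a narrow and a very wide box on your real data is a simple way to find out where the optimizer stops using the indices.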