I have a large table of geolocated points:
id  lat      lon
-------------------------
1   39.4600  110.3523410
2   39.4601  110.3523410
3   39.4605  110.3523410
4   39.4609  110.3523410
Many of these points will overlap when shown on the map, because they are very close together. How can I get a more uniform distribution of points? That is, a subset of points in which the distance between any two is greater than a given value.
For instance, the distance (in latitude) between point 1 and point 2 is 0.0001. Can I obtain a result table that contains only points separated by more than 0.0003 (or some other value)?
With a geospatial database this might be easy, but with plain SQL it does not seem to be an obvious task (for me at least).
The quickest way to do this (approximately) is to assign each location to a grid square and then keep only one point per square. This is much more efficient than the other techniques listed:
SELECT DISTINCT ROUND(lat*250, 0), ROUND(lon*250, 0) FROM sometable;
Alternatively, you can average the locations within each grid square:
SELECT AVG(lat), AVG(lon) FROM sometable GROUP BY ROUND(lat*250, 0), ROUND(lon*250, 0);
To control the granularity of the grouping, just adjust the scaling factor up or down from 250 (a factor of 250 corresponds to grid squares 0.004 degrees on a side).
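As a rough illustration of how the scaling factor changes the result, here is a minimal Python sketch that runs the averaged grid query against an in-memory SQLite database. The function name thin_by_grid and the choice of SQLite are assumptions made for this example. The table and sample rows come from the question, with the longitude column named lon as in the question's table.

```python
import sqlite3

# Sample rows from the question: (id, lat, lon).
rows = [
    (1, 39.4600, 110.3523410),
    (2, 39.4601, 110.3523410),
    (3, 39.4605, 110.3523410),
    (4, 39.4609, 110.3523410),
]

def thin_by_grid(rows, scale):
    """Return (avg_lat, avg_lon, point_count) for each occupied grid square.

    Hypothetical helper: `scale` plays the role of the factor 250 in the
    queries above, so each square is 1/scale degrees on a side.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sometable (id INTEGER, lat REAL, lon REAL)")
    con.executemany("INSERT INTO sometable VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT AVG(lat), AVG(lon), COUNT(*)
        FROM sometable
        GROUP BY ROUND(lat * :s, 0), ROUND(lon * :s, 0)
        ORDER BY 1
        """,
        {"s": scale},
    ).fetchall()
    con.close()
    return result

coarse = thin_by_grid(rows, 250)   # squares of 0.004 degrees
fine = thin_by_grid(rows, 2500)    # squares of 0.0004 degrees
print(len(coarse), len(fine))
```

With the four sample rows, a factor of 250 collapses everything into a single square, while 2500 keeps points 1 and 2 together and leaves 3 and 4 on their own. Keep in mind that a grid only approximates a minimum separation: two points on opposite sides of a square boundary can still be closer together than the square size.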
An alternative (and slower) approach is to do a CROSS JOIN so that every location is paired with every other point, and then use a distance formula to flag pairs that are closer than the minimum threshold. If the distance formula feels too complex, a simpler strategy is to limit the join to rows where
ABS(a.lon - b.lon) < 0.1 AND ABS(a.lat - b.lat) < 0.1. That will identify points that are close together.
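The join condition can be tried out with a small Python sketch using an in-memory SQLite table. This is only one possible reading of the approach: the helper names, the a.id < b.id filter (added here to skip self-pairs and mirrored duplicates), and the rule of dropping the higher id in each close pair are all assumptions made for the example. The threshold of 0.0003 is the one mentioned in the question.

```python
import sqlite3

# Sample rows from the question: (id, lat, lon).
rows = [
    (1, 39.4600, 110.3523410),
    (2, 39.4601, 110.3523410),
    (3, 39.4605, 110.3523410),
    (4, 39.4609, 110.3523410),
]

def find_close_pairs(rows, threshold):
    """Return (a_id, b_id) pairs whose lat and lon both differ by less than threshold."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sometable (id INTEGER PRIMARY KEY, lat REAL, lon REAL)")
    con.executemany("INSERT INTO sometable VALUES (?, ?, ?)", rows)
    pairs = con.execute(
        """
        SELECT a.id, b.id
        FROM sometable AS a
        CROSS JOIN sometable AS b
        WHERE a.id < b.id                  -- assumption: skip self-pairs and mirrored pairs
          AND ABS(a.lat - b.lat) < :t
          AND ABS(a.lon - b.lon) < :t
        ORDER BY a.id, b.id
        """,
        {"t": threshold},
    ).fetchall()
    con.close()
    return pairs

def drop_close_points(rows, threshold):
    """Keep only the points that have no lower-id neighbour within the threshold."""
    dropped = {b_id for _, b_id in find_close_pairs(rows, threshold)}
    return [row for row in rows if row[0] not in dropped]

print(find_close_pairs(rows, 0.0003))
print([row[0] for row in drop_close_points(rows, 0.0003)])
```

Dropping every point that has a lower-id neighbour guarantees that the remaining points are pairwise separated, but it can remove more points than strictly necessary, so treat it as a starting point and not an optimal thinning.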
Note that a cross join is an O(n**2) operation, so there may be problems if you try to scale this method to a large number of points. The solution is to pre-group the points into smaller regions and run the cross joins over just the points within each region.
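One way to picture the pre-grouping step is the following Python sketch, which buckets points into square cells and only compares points in the same or adjacent cells. The function name and the cell size (equal to the threshold) are assumptions for illustration. Checking the eight neighbouring cells is an addition not spelled out above; it is there so that close pairs straddling a cell boundary are not missed.

```python
from collections import defaultdict
from math import floor

# Sample rows from the question: (id, lat, lon).
rows = [
    (1, 39.4600, 110.3523410),
    (2, 39.4601, 110.3523410),
    (3, 39.4605, 110.3523410),
    (4, 39.4609, 110.3523410),
]

def find_close_pairs_bucketed(rows, threshold):
    """Find pairs closer than threshold in both lat and lon, comparing only nearby cells."""
    # Pre-group: each point goes into a cell whose side equals the threshold.
    cells = defaultdict(list)
    for pid, lat, lon in rows:
        cells[(floor(lat / threshold), floor(lon / threshold))].append((pid, lat, lon))

    pairs = set()
    for (cx, cy), members in cells.items():
        # Two points within the threshold are at most one cell apart on each
        # axis, so the 3x3 block of cells around (cx, cy) is enough to check.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                neighbours = cells.get((cx + dx, cy + dy), ())
                for a_id, a_lat, a_lon in members:
                    for b_id, b_lat, b_lon in neighbours:
                        if (a_id < b_id
                                and abs(a_lat - b_lat) < threshold
                                and abs(a_lon - b_lon) < threshold):
                            pairs.add((a_id, b_id))
    return sorted(pairs)

print(find_close_pairs_bucketed(rows, 0.0003))
```

The pairwise comparison is still quadratic inside each neighbourhood, but when points are spread over many cells the total work is far below comparing every point with every other point.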
If you're able to do the work outside of SQL, it might be appropriate to use a clustering algorithm.
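No particular algorithm is named here, so the following is just one simple possibility, sketched in Python: a greedy "leader" style pass in which a point is kept only if it is at least min_dist away from every point already kept. The function name is hypothetical, and the distance is computed directly on degrees as if they were planar coordinates, which is an assumption that ignores how longitude spacing shrinks with latitude.

```python
from math import hypot

# Sample rows from the question: (id, lat, lon).
rows = [
    (1, 39.4600, 110.3523410),
    (2, 39.4601, 110.3523410),
    (3, 39.4605, 110.3523410),
    (4, 39.4609, 110.3523410),
]

def greedy_thin(rows, min_dist):
    """Keep points in input order, skipping any that fall within min_dist of a kept point.

    Assumption: degrees are treated as planar coordinates, which is only
    reasonable over small areas.
    """
    kept = []
    for pid, lat, lon in rows:
        far_enough = all(
            hypot(lat - k_lat, lon - k_lon) >= min_dist
            for _, k_lat, k_lon in kept
        )
        if far_enough:
            kept.append((pid, lat, lon))
    return kept

print([pid for pid, _, _ in greedy_thin(rows, 0.0003)])
```

The result depends on the order of the input rows, and each new point is checked against every kept point, so for large tables this would need to be combined with some form of spatial bucketing.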