|
@@ -1,177 +0,0 @@
|
|
|
-[[geohashes]]
|
|
|
-=== Geohashes
|
|
|
-
|
|
|
-http://en.wikipedia.org/wiki/Geohash[Geohashes] are a way of encoding
|
|
|
-`lat/lon` points as strings. The original intention was to have a
|
|
|
-URL-friendly way of specifying geolocations, but geohashes have turned out to
|
|
|
-be a useful way of indexing geo-points and geo-shapes in databases.
|
|
|
-
|
|
|
-Geohashes divide the world up into a grid of 32 cells -- 4 rows and 8 columns --
|
|
|
-each represented by a letter or number. The `g` cell covers half of
|
|
|
-Greenland, all of Iceland and most of Great Britain. Each cell can be further
|
|
|
-divided into another 32 cells, which can be divided into another 32 cells,
|
|
|
-and so on. The `gc` cell covers Ireland and England, `gcp` covers most of
|
|
|
-London and part of Southern England, and `gcpuuz94k` is the entrance to
|
|
|
-Buckingham Palace, accurate to about 5 meters.
|
|
|
-
|
|
|
-In other words, the longer the geohash string, the more accurate it is. If
|
|
|
-two geohashes share a prefix -- `gcpuux` and `gcpuuz` -- then it implies that
|
|
|
-they are near to each other. The longer the shared prefix, the closer they
|
|
|
-are.
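
The prefix property follows directly from how a geohash is built. As a rough illustration, here is a minimal Python sketch of the standard geohash encoding (alternately halving the longitude and latitude ranges, and emitting one base-32 character for every five bits). The function name `geohash_encode` is invented for this example and is not part of any Elasticsearch API.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=12):
    """Encode a lat/lon point as a geohash string of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0         # number of bits collected for the current character
    value = 0        # value of the current 5-bit group
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < length:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1   # upper half: emit a 1 bit
            rng[0] = mid
        else:
            value = value << 1         # lower half: emit a 0 bit
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


# Entrance to Buckingham Palace, at two different precisions.
print(geohash_encode(51.501568, -0.141257, 3))  # gcp
print(geohash_encode(51.501568, -0.141257, 7))  # gcpuuz9
```

Each additional character only subdivides the cell chosen so far, which is why a shorter geohash of a point is always a prefix of a longer geohash of the same point.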
|
|
|
-
|
|
|
-That said, two locations that are right next to each other may have completely
|
|
|
-different geohashes. For instance, the
|
|
|
-http://en.wikipedia.org/wiki/Millennium_Dome[Millennium Dome] in London has
|
|
|
-geohash `u10hbp`, because it falls into the `u` cell, the next top-level cell
|
|
|
-to the east of the `g` cell.
|
|
|
-
|
|
|
-Geo-points can index their associated geohashes automatically, but more
|
|
|
-importantly, they can also index all geohash *prefixes*. Indexing the location
|
|
|
-of the entrance to Buckingham Palace -- latitude `51.501568` and longitude
|
|
|
-`-0.141257` -- would index all of the geohashes listed in the table below,
|
|
|
-along with the approximate dimensions of each geohash cell:
|
|
|
-
|
|
|
-[cols="1m,1m,3d",options="header"]
|
|
|
-|=============================================
|
|
|
-|Geohash |Level| Dimensions
|
|
|
-|g |1 | ~ 5,004km x 5,004km
|
|
|
-|gc |2 | ~ 1,251km x 625km
|
|
|
-|gcp |3 | ~ 156km x 156km
|
|
|
-|gcpu |4 | ~ 39km x 19.5km
|
|
|
-|gcpuu |5 | ~ 4.9km x 4.9km
|
|
|
-|gcpuuz |6 | ~ 1.2km x 0.61km
|
|
|
-|gcpuuz9 |7 | ~ 152.8m x 152.8m
|
|
|
-|gcpuuz94 |8 | ~ 38.2m x 19.1m
|
|
|
-|gcpuuz94k |9 | ~ 4.78m x 4.78m
|
|
|
-|gcpuuz94kk |10 | ~ 1.19m x 0.60m
|
|
|
-|gcpuuz94kkp |11 | ~ 14.9cm x 14.9cm
|
|
|
-|gcpuuz94kkp5 |12 | ~ 3.7cm x 1.8cm
|
|
|
-|=============================================
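
The dimensions in this table can be reproduced with a little arithmetic. The sketch below is an illustration only: it assumes a spherical Earth with a mean radius of 6,371 km and measures cells at the equator, and the helper name `cell_dimensions_km` is made up for this example. A geohash of length `n` carries `5n` bits, split between longitude and latitude, with longitude receiving the extra bit when `5n` is odd.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; an assumption for this sketch


def cell_dimensions_km(level):
    """Return (width, height) in km of a geohash cell, measured at the equator."""
    total_bits = 5 * level
    lon_bits = (total_bits + 1) // 2   # longitude gets the extra bit on odd levels
    lat_bits = total_bits // 2
    circumference = 2 * math.pi * EARTH_RADIUS_KM
    width = circumference / 2 ** lon_bits          # 360 degrees split lon_bits times
    height = (circumference / 2) / 2 ** lat_bits   # 180 degrees split lat_bits times
    return width, height


for level in range(1, 13):
    width, height = cell_dimensions_km(level)
    print(f"level {level:2d}: {width:.5f}km x {height:.5f}km")
```

These are equatorial figures. The north-south height of a cell is the same everywhere, but the east-west width shrinks with the cosine of the latitude, so cells over London are noticeably narrower than the table suggests.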
|
|
|
-
|
|
|
-The {ref}query-dsl-geohash-cell-filter.html[`geohash_cell` filter] can use
|
|
|
-these geohash prefixes to find locations near a specified `lat/lon` point.
|
|
|
-
|
|
|
-[[geohash-mapping]]
|
|
|
-==== Mapping geohashes
|
|
|
-
|
|
|
-The first step is to decide just how much precision you need. While you could
|
|
|
-index all geo-points with the default full 12 levels of precision, do you
|
|
|
-really need to be accurate to within a few centimeters? You can save yourself
|
|
|
-a lot of space in the index by reducing your precision requirements to
|
|
|
-something more realistic, such as `1km`.
|
|
|
-
|
|
|
-[source,json]
|
|
|
-----------------------------
|
|
|
-PUT /attractions
|
|
|
-{
|
|
|
- "mappings": {
|
|
|
- "restaurant": {
|
|
|
- "properties": {
|
|
|
- "name": {
|
|
|
- "type": "string"
|
|
|
- },
|
|
|
- "location": {
|
|
|
- "type": "geo_point",
|
|
|
- "geohash_prefix": true, <1>
|
|
|
- "geohash_precision": "1km" <2>
|
|
|
- }
|
|
|
- }
|
|
|
- }
|
|
|
- }
|
|
|
-}
|
|
|
-----------------------------
|
|
|
-<1> Setting `geohash_prefix` to `true` tells Elasticsearch to index
|
|
|
- all geohash prefixes, up to the specified precision.
|
|
|
-<2> The precision can be specified as an absolute number, representing the
|
|
|
- length of the geohash, or as a distance. A precision of `1km` corresponds
|
|
|
- to a geohash of length `7`.
|
|
|
-
|
|
|
-With this mapping in place, geohash prefixes of lengths 1 to 7 will be indexed,
|
|
|
-providing geohashes accurate to about 150 meters.
|
|
|
-
|
|
|
-[[geohash-cell-filter]]
|
|
|
-==== `geohash_cell` filter
|
|
|
-
|
|
|
-The `geohash_cell` filter simply translates a `lat/lon` location into a
|
|
|
-geohash with the specified precision and finds all locations which contain
|
|
|
-that geohash -- a very efficient filter indeed.
|
|
|
-
|
|
|
-[source,json]
|
|
|
-----------------------------
|
|
|
-GET /attractions/restaurant/_search
|
|
|
-{
|
|
|
- "query": {
|
|
|
- "filtered": {
|
|
|
- "filter": {
|
|
|
- "geohash_cell": {
|
|
|
- "location": {
|
|
|
- "lat": 40.718,
|
|
|
- "lon": -73.983
|
|
|
- },
|
|
|
- "precision": "2km" <1>
|
|
|
- }
|
|
|
- }
|
|
|
- }
|
|
|
- }
|
|
|
-}
|
|
|
-----------------------------
|
|
|
-<1> The `precision` cannot be more precise than that specified in the
|
|
|
- `geohash_precision` mapping.
|
|
|
-
|
|
|
-This filter translates the `lat/lon` point into a geohash of the appropriate
|
|
|
-length -- in this example `dr5rsk` -- and looks for all locations that contain
|
|
|
-that exact term.
|
|
|
-
|
|
|
-However, the filter as written above may not return all restaurants within 2km
|
|
|
-of the specified point. Remember that a geohash is just a rectangle, and the
|
|
|
-point may fall anywhere within that rectangle. If the point happens to fall
|
|
|
-near the edge of a geohash cell, then the filter may well exclude any
|
|
|
-restaurants in the adjacent cell.
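
The edge effect is easy to reproduce outside Elasticsearch. The following Python sketch is a simplified model, not how Elasticsearch stores terms internally: it indexes every geohash prefix for a handful of made-up restaurants (the names `A` to `D` and their coordinates are hypothetical), then looks up the single term `dr5rsk`. The `geohash_encode` helper is an illustrative implementation of the standard encoding.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, length):
    """Standard geohash encoding: interleave lon/lat bits, 5 bits per character."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < length:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid
        else:
            value = value << 1
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


# Hypothetical sample data: name -> (lat, lon)
restaurants = {
    "A": (40.7180, -73.9830),      # the query point itself
    "B": (40.7185, -73.9850),      # slightly west, same cell
    "C": (40.7180, -73.9800),      # roughly 250 m east, across the cell edge
    "D": (51.501568, -0.141257),   # London, far away
}

# Index every prefix up to length 7, mimicking `geohash_prefix` with a
# `geohash_precision` of 7.
index = {}
for name, (lat, lon) in restaurants.items():
    full = geohash_encode(lat, lon, 7)
    for n in range(1, len(full) + 1):
        index.setdefault(full[:n], set()).add(name)

# The filter resolves the query point to one term and matches it exactly.
term = geohash_encode(40.718, -73.983, 6)
matches = index.get(term, set())
print(term, sorted(matches))  # dr5rsk ['A', 'B']
```

Restaurant `C` is only a short walk from the query point, yet it is missed because the point sits close to the eastern edge of the `dr5rsk` cell.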
|
|
|
-
|
|
|
-To fix that, we can tell the filter to include the neighboring cells, by
|
|
|
-setting `neighbors` to `true`:
|
|
|
-
|
|
|
-[source,json]
|
|
|
-----------------------------
|
|
|
-GET /attractions/restaurant/_search
|
|
|
-{
|
|
|
- "query": {
|
|
|
- "filtered": {
|
|
|
- "filter": {
|
|
|
- "geohash_cell": {
|
|
|
- "location": {
|
|
|
- "lat": 40.718,
|
|
|
- "lon": -73.983
|
|
|
- },
|
|
|
- "neighbors": true, <1>
|
|
|
- "precision": "2km"
|
|
|
- }
|
|
|
- }
|
|
|
- }
|
|
|
- }
|
|
|
-}
|
|
|
-----------------------------
|
|
|
-
|
|
|
-<1> This filter will look for the resolved geohash and all of the surrounding
|
|
|
- geohashes.
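
To make "surrounding geohashes" concrete, here is one way the eight neighbors of a cell could be computed. This is a sketch under stated assumptions, not the Elasticsearch implementation: it decodes the cell to its bounding box, steps one cell width or height in each direction, and re-encodes the resulting center points. The helper names `geohash_bbox` and `geohash_neighbors` are invented for this example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, length):
    """Standard geohash encoding: interleave lon/lat bits, 5 bits per character."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < length:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid
        else:
            value = value << 1
            rng[1] = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def geohash_bbox(geohash):
    """Decode a geohash into (lat_min, lat_max, lon_min, lon_max)."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    use_lon = True
    for ch in geohash:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):       # most significant bit first
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (value >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return lat_range[0], lat_range[1], lon_range[0], lon_range[1]


def geohash_neighbors(geohash):
    """Return the (up to eight) cells surrounding the given cell."""
    lat_min, lat_max, lon_min, lon_max = geohash_bbox(geohash)
    lat_c, lon_c = (lat_min + lat_max) / 2, (lon_min + lon_max) / 2
    dlat, dlon = lat_max - lat_min, lon_max - lon_min
    result = []
    for dy in (1, 0, -1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue                      # skip the cell itself
            lat = lat_c + dy * dlat
            if not -90.0 < lat < 90.0:
                continue                      # no cell beyond the poles
            # Wrap longitude around the antimeridian.
            lon = (lon_c + dx * dlon + 180.0) % 360.0 - 180.0
            result.append(geohash_encode(lat, lon, len(geohash)))
    return result


cells = ["dr5rsk"] + geohash_neighbors("dr5rsk")
print(len(cells))  # 9
```

With neighbors included, a point just across the eastern edge of `dr5rsk`, such as `40.718, -73.980`, falls inside one of the nine cells searched.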
|
|
|
-
|
|
|
-Clearly, looking for a geohash with precision `2km` plus all the neighboring
|
|
|
-cells results in quite a large search area. This filter is not built for
|
|
|
-accuracy, but it is very efficient and can be used as a pre-filtering step
|
|
|
-before applying a more accurate geo-filter.
|
|
|
-
|
|
|
-TIP: Specifying the `precision` as a distance can be misleading. A `precision`
|
|
|
-of `2km` is converted to a geohash of length 6, which actually has dimensions
|
|
|
-of about 1.2km x 0.6km. You may find it more understandable to specify an
|
|
|
-actual length like `5` or `6`.
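
The gap between a requested distance and the resulting cell size can be explored with a short script. The rule used below (pick the shortest geohash whose equatorial cell fits within the distance) is an assumption chosen because it agrees with the two conversions quoted in this section, `1km` to length 7 and `2km` to length 6. The exact conversion Elasticsearch applies may differ for other values. The function names and the 6,371 km mean Earth radius are likewise illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; an assumption for this sketch


def cell_dimensions_km(level):
    """Return (width, height) in km of a geohash cell, measured at the equator."""
    total_bits = 5 * level
    lon_bits = (total_bits + 1) // 2
    lat_bits = total_bits // 2
    circumference = 2 * math.pi * EARTH_RADIUS_KM
    return circumference / 2 ** lon_bits, (circumference / 2) / 2 ** lat_bits


def length_for_distance(distance_km, max_length=12):
    """Shortest geohash length whose equatorial cell fits within distance_km.

    Hypothetical rule for illustration; not taken from the Elasticsearch source.
    """
    for level in range(1, max_length + 1):
        width, height = cell_dimensions_km(level)
        if max(width, height) <= distance_km:
            return level
    return max_length


for distance in (1.0, 2.0):
    level = length_for_distance(distance)
    width, height = cell_dimensions_km(level)
    print(f"{distance}km -> length {level} ({width:.2f}km x {height:.2f}km)")
```

Under this rule a request for `2km` yields a cell of roughly 1.2km x 0.6km, and a request for `1km` yields one of roughly 150m x 150m, which is why stating the length directly tends to be less surprising.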
|
|
|
-
|
|
|
-The other advantage that this filter has over a `geo_bounding_box` filter is
|
|
|
-that it supports multiple locations per field. The `lat_lon` option that we
|
|
|
-discussed in <<optimize-bounding-box>> is very efficient, but only when there
|
|
|
-is a single `lat/lon` point per field.
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|
|
|
-
|