MongoDB has supported geospatial queries since version 1.4 and has added new geospatial features in most releases since, including support for simple polygons in MongoDB 2.0, GeoJSON in MongoDB 2.4, and several new capabilities in MongoDB 2.6. In this post we'll cover a new feature in MongoDB 3.0: searching with polygons that cover more than half of the Earth's surface. I am using PHP for my examples, but of course you can do this from any other supported language too.
Finding objects in a specific area (polygon) can be done like so:
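A minimal sketch of such a query in PHP follows. The field name `loc` and the coordinates (a rough box around London) are assumptions for illustration. The array is only built and printed here; in an application it would be passed as the filter to your driver's `find()` call.

```php
<?php
// Hypothetical ring roughly enclosing London, as [longitude, latitude] pairs.
// The first and last points are identical so that the ring is closed.
$ring = [
    [-0.6, 51.2],
    [ 0.4, 51.2],
    [ 0.4, 51.8],
    [-0.6, 51.8],
    [-0.6, 51.2],
];

// $geoWithin matches documents whose geometry lies entirely within the polygon.
// "loc" is an assumed name for the field holding each document's location.
$query = [
    'loc' => [
        '$geoWithin' => [
            '$geometry' => [
                'type'        => 'Polygon',
                'coordinates' => [$ring], // a single exterior ring, no holes
            ],
        ],
    ],
];

// Print the filter as JSON to inspect what would be sent to the server.
echo json_encode($query), PHP_EOL;
```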
The GeoJSON format, which we use in this query, defines a polygon as an array of linear rings. Each linear ring is a list of coordinate pairs that begins and ends with the same point, so that it forms a closed ring. The first linear ring is required and defines the exterior boundary of the polygon. Subsequent rings, if specified, define interior boundaries (i.e. holes) within the polygon.
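As a side illustration of that ring structure, the sketch below builds a polygon with one hole and checks that every ring is closed. The coordinates and the helper name `isClosedRing` are made up for this example and are not part of any driver API.

```php
<?php
// A linear ring needs at least four positions, and its first and last
// positions must be the same point. Loose comparison (==) is used so that
// [0, 0] and [0.0, 0.0] count as the same point.
function isClosedRing(array $ring): bool
{
    return count($ring) >= 4 && $ring[0] == $ring[count($ring) - 1];
}

// Hypothetical polygon: an exterior boundary with one hole cut out of it.
$polygon = [
    'type'        => 'Polygon',
    'coordinates' => [
        // First ring: the exterior boundary.
        [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
        // Second ring: an interior boundary (a hole).
        [[4, 4], [6, 4], [6, 6], [4, 6], [4, 4]],
    ],
];

foreach ($polygon['coordinates'] as $i => $ring) {
    $role = $i === 0 ? 'exterior' : 'hole';
    echo $role, ': ', isClosedRing($ring) ? 'closed' : 'NOT closed', PHP_EOL;
}
```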
As an image, the geometry in this query looks like:
The GeoJSON format specification does not mention anything about the direction or winding of points. The application is left to determine if the overall geometry defines an inclusive or exclusive search area. Both of the following images are possible interpretations when just taking the GeoJSON specification into account:
These queries would yield very different results! Because GeoJSON does not address winding, applications have to guess which area is meant, and sometimes they guess wrong.
MongoDB deterministically chooses the area that is the "smallest of the two". For our query, it would consider the polygon as an inclusive search area, which is the first of the two interpretations shown above.
This logic works for most applications, but it falls apart when you want to find objects within a polygon that spans more than half of the Earth's surface.
In the small example of London, you would certainly expect the smallest area, the one that covers London, to be the search area. But when we consider this much larger polygon (a circle centred around 25°N, 90°E with a diameter of 115°), it is not so obvious:
When the polygon is larger than half of the Earth's surface, the "smallest of the two" is actually the area we intended to exclude. Thankfully, MongoDB supports a special coordinate reference system (CRS), which forces geospatial queries to consider the direction of the coordinates when deciding between an inclusive and an exclusive area. Consistent with KML and WKT/WKB, counter-clockwise winding implies inclusive behaviour and clockwise winding implies exclusive behaviour. This results in the following two possibilities:
Big Polygon support in MongoDB 3.0 addresses this.
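Because winding now matters, it can help to check which way a ring runs before sending it. The sketch below uses the planar shoelace formula on `[longitude, latitude]` pairs. This is an approximation, and the assumption is that the ring does not cross the antimeridian or enclose a pole. The function names are hypothetical helpers, not part of MongoDB or its drivers.

```php
<?php
// Signed area of a closed ring via the shoelace formula, treating
// [longitude, latitude] pairs as planar x/y coordinates.
// Positive means counter-clockwise, negative means clockwise.
function signedRingArea(array $ring): float
{
    $sum = 0.0;
    $n = count($ring);
    for ($i = 0; $i < $n - 1; $i++) {
        [$x1, $y1] = $ring[$i];
        [$x2, $y2] = $ring[$i + 1];
        $sum += ($x1 * $y2) - ($x2 * $y1);
    }
    return $sum / 2.0;
}

function isCounterClockwise(array $ring): bool
{
    return signedRingArea($ring) > 0;
}

// Illustrative ring: east, then north, then west, then back to the start.
$ring = [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]];

var_dump(isCounterClockwise($ring));                // bool(true)
var_dump(isCounterClockwise(array_reverse($ring))); // bool(false)
```

Reversing the order of the points with `array_reverse()` flips the winding, which under the rule above would turn an inclusive ring into an exclusive one.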
MongoDB's implementation of GeoJSON allows us to specify a custom coordinate reference system on any geometry object via a crs property. Currently, the only custom CRS that MongoDB supports is our own, which specifies strict winding. In the future, MongoDB is likely to support others as well, such as the US National Grid, UTM or the UK's Ordnance Survey National Grid. In the query, our custom CRS appears as an extra crs field in the $geometry document of the $geoWithin or $geoIntersects criteria:
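A sketch of how such a query could be assembled in PHP is shown below. The CRS name is the one MongoDB's documentation gives for strict winding, so it is worth verifying against your server version. The helper `strictWindingQuery()`, the field name `loc` and the coordinates are assumptions for illustration. As before, the filter is only built and printed, not sent to a server.

```php
<?php
// Name of MongoDB's strict-winding CRS, as given in the MongoDB documentation.
const STRICT_WINDING_CRS = 'urn:x-mongodb:crs:strictwinding:EPSG:4326';

// Hypothetical helper: builds a $geoWithin filter for a single-ring polygon
// whose winding order is to be honoured by the server.
function strictWindingQuery(string $field, array $ring): array
{
    return [
        $field => [
            '$geoWithin' => [
                '$geometry' => [
                    'type'        => 'Polygon',
                    'coordinates' => [$ring],
                    'crs'         => [
                        'type'       => 'name',
                        'properties' => ['name' => STRICT_WINDING_CRS],
                    ],
                ],
            ],
        ],
    ];
}

// Illustrative counter-clockwise ring around London ([longitude, latitude]).
$ccw = [[-0.6, 51.2], [0.4, 51.2], [0.4, 51.8], [-0.6, 51.8], [-0.6, 51.2]];

// Counter-clockwise: the area inside the ring is the search area.
$inside = strictWindingQuery('loc', $ccw);

// Clockwise (same points, reversed): everything outside the ring is the
// search area, which is far more than half of the Earth's surface.
$outside = strictWindingQuery('loc', array_reverse($ccw));

echo json_encode($inside), PHP_EOL;
echo json_encode($outside), PHP_EOL;
```

Following the winding rule described earlier, the two filters differ only in the order of the points, yet they would select complementary areas.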
This example is only relevant when searching an area that is larger than half of the Earth's surface. In most cases, the default "smallest of the two" behavior will be sufficient and you will not need to specify a custom coordinate reference system.