This example is inspired by the post Z-Order Brush D3 Example. The Z-Order is a space filling curve (SFC) and a commonly used technique in databases to store and query multidimensional point data (Tropf et al., Orenstein et al., Geo-Wave). The basic query pattern is a multidimensional rectangular query. Since SFC generates a linear order for a data sets, the query box should be split into minimal number of segments (ranges) for efficient processing of rectangular query.
For a given query rectangle, we generate a minimal set of ranges that cover all points inside. The goal is to omit range segments that lie outside the query box, therefore avoiding unnecessary database scans (Tropf et al., Orenstein et al.). Data points can be stored and indexed according to their Z-order keys in any database that supports efficient range scans (e.g. relational Database, Apache HBase or Apache Accumulo, both are open source Google Bigtable implementation and support efficient range and parallel scans).
The algorithm follows “divide & conquer” pattern. We assume bit strings as an input. This algorithm is called with a map function as parameter and a query box represented by bit strings of lower left corner and upper right corner. For the sake of simplicity, we use a bit array (bit string) as a map function. E.g. [0,1,0,1,0,1,0] is a “classical” pattern to interleave 4 bits numbers.
The main idea of this algorithm is to check if the query box consists solely of one continuous curve segment. If this true, we stop and return a segment (in worst case this can be a single cell), otherwise we “cut” the query box along the hyperplane, resulting two smaller query boxes. Then we conduct the our algorithm on both of them. After processing both sides, we check if the segments can be combined in one continuous segment. In worst case we return a set (list) of single cell addresses, that covered by the query box (e.g. consider a horizontal line query).
Splitting algorithm can be described as recursive routine with the following steps (function getRanges(hyperPlanePosition, queryRec, mapFunction, highstBit)
) :
function cutBox(hyperplaneDimension, index, queryRec, mapFunction)
) : “cuts” the box in two parts. Before we run this procedure, we compute the index of a first position where Z-Bit-String of lower left corner and Z-Bit-String of an upper right corner are different (and a suffix does not follow the pattern above). Then we define the hyperplane (dimension) for this index position and compute two new query boxes (“left” from hyperplane and “right” from hyperplane)