Using Indexes to Optimize ORDER BY Queries

Using indexes can help you improve the performance of queries that have an ORDER BY clause. This includes ORDER BY clauses with either ascending or descending sorts, as well as more than one ordering field. The same index can optimize both filter conditions and the ORDER BY clause.

To use the index for an ORDER BY query, the index's key list order and sort order must match the orderings specified in the query. If the index’s keys also match filter conditions in the query, using the index also reduces the amount of data read from the index.

Index Key List Order and Sort Order Examples

The following table provides examples of when MapR Database can and cannot use an index for ordering, based on the index key list ordering and sort ordering specified. Assume that you have a table that has a composite index on fields Address.State and FullName.LastName. You have defined both keys in ascending order. Further assume that the query has an ORDER BY on the fields Address.State and FullName.LastName, both in ascending order:

Ordering in Query Use of Index for Ordering
Address.State:ASC
Yes
Address.State:DESC
No

Sort direction does not match.

Address.State:ASC,
FullName.LastName ASC
Yes
FullName.LastName:ASC
No

Address.State must be included as a prefix in the ordering.

FullName.LastName:ASC,
Address.State:ASC
No

Sort directions match, but the order of fields does not match.

Filtering and ORDER BY Query Examples

Assume that you have a composite index defined with the following two indexed fields:
  • Address.State:ASC
  • FullName.LastName:ASC
The following table shows examples for different filtering and ORDER BY scenarios using this composite index:
Query Condition Ordering in Query Index Use
{"$eq":{"Address.State":"CA"}}
FullName.LastName:ASC
Both filtering and ordering
{"$gt":{"Address.State":"CA"}}
Address.State:ASC
Both filtering and ordering
{"$gt":{"Address.State":"CA"}}
Address.State:DESC
Only filtering, because the sort direction does not match
{
  "$and":[
    {"$eq":{"Address.State":"CA"}},
    {"$ge":{"FullName.LastName":"Smith"}}
  ]
}
FullName.LastName:ASC
Both filtering and ordering
{"$gt":{"Address.State":"CA"}}
Address.State:ASC,
FullName.LastName:ASC
Both filtering and ordering
{"$gt":{"Address.State":"CA"}}
FullName.LastName:ASC
Only filtering
{"$in":{"Address.State":["CA","TX"]}}
FullName.LastName:ASC
Only filtering
{"$ge":{"FullName.LastName":"Smith"}}
Address.State:ASC,
FullName.LastName:ASC
Only ordering, because FullName.LastName is not a prefix in the filter lookup

Index Sort Order for Complex Types

Although you can define indexes on complex data types, there are limitations in the behavior.

Arrays and Nested Documents

Indexes defined on arrays and nested documents do not have a meaningful ordering because these types do not have a defined ordering.

Container Field Paths

You cannot order on a container field path.

For example, you can define an index on the field a[].b, but you cannot order on it.

Partial Sorts with Non-Covering Indexes

MapR Database updates secondary indexes asynchronously, which can result in updates to the index lagging the parent JSON table. You can avoid this behavior in your OJAI application by setting a query option in your application. See Avoiding Partial Sorts with Secondary Indexes in OJAI for details about how to do this.

One consequence of this index update lag is the impact on queries that use non-covering indexes to provide the ordering of a query. Since the index is not fully synchronized with its parent JSON data, data read through the index might be out of date.

The following example illustrates this behavior.

  • Suppose you have a query with the following criteria:
    • Selects docField
    • Filter condition where docField >= 5
    • Order by on docField
  • You have an index where docField is an indexed field. The index optimizes both the filter condition and order by clause. The query also selects other fields, so the index is a non-covering index for the query.
  • When reading through the index, MapR Database reads a document in which the docField value is 9. The data for that field in the JSON table is also 9. The data is consistent.
  • The next entry in the index has docField set to 10. This value is in the proper sort order relative to the previous value of 9, but the data in the JSON table has changed from 10 to 6. The update is not yet reflected in the index.
  • MapR Database returns the value 6 (not 10), which is out of order, relative to data previously read from the index.

The following table and diagram illustrates this example:

Update State Query Result in docField Sort Order
Before update 5, …, 9, 10
JSON table updated, but not index 5, …, 9, 6
Index updated 5, 6, …, 9
NOTE This behavior does not occur with covering indexes. MapR Database only reads from a single data source, the index, when using covering indexes.

See Asynchronous Secondary Index Updates for a more detailed discussion of asynchronous index updates.