Preparing Clusters for Querying using Secondary Indexes on JSON Tables

Describes the tasks needed to prepare your environment so you can query MapR Database JSON tables using secondary indexes.

Installing with the MapR Installer

To install MapR using the MapR installer, follow the steps outlined at Installing with the MapR Installer.

Starting with MapR 6.0.1, you do not have to enable a separate query service to use secondary indexes. The Operational Applications with MapR Database template installs and configures the replication gateways needed to update secondary indexes in MapR Database JSON and includes the components needed to run OJAI queries.

You must enable the OJAI Distributed Query Service to use certain features. The following table summarizes the differences in the functionality of OJAI queries when you do and do not have the service enabled:

Service Not Enabled Service Enabled
  • Can run queries that use a single secondary index
  • Can sort data in your queries up to a configurable limit
  • Can run queries that use multiple secondary indexes
  • Can sort data in your queries without any limit
  • Can run queries in parallel

Selecting any of the following templates enables the OJAI Distributed Query Service:

  • Operational Applications with MapR Database and Distributed Query Service
  • MapR Converged Cluster: Batch, interactive and real-time analytics
  • Analytics with MapR Database

You can also explicitly enable the OJAI Distributed Query Service by selecting the service in the Custom Services template.

For more information about installer templates, see Auto-Provisioning Templates.

For more information about how secondary index selection and execution works in MapR Database JSON, see Selection and Execution of Secondary Indexes.

NOTE The OJAI Query Service has been renamed to the OJAI Distributed Query Service in MapR 6.0.1. All information about the OJAI Distributed Query Service applies to the OJAI Query Service, except where noted.

Installing without the MapR Installer

Other sections of the documentation describe the detailed steps for installing and configuring without the MapR installer. Generally, you need to perform the following steps:

  1. Install software.

    To install MapR without using the MapR installer, follow the steps outlined at Installing without the MapR Installer. In addition to installing MapR core packages, you also need to install MapR Drill if you want advanced secondary index selection, sorts on large data sets, and parallel query execution. When installing Drill, make sure to Configure the OJAI Distributed Query Service.

  2. Install and configure replication gateways.

    Updates are propagated from the JSON tables using the Gateways for Replicating MapR Database Tables. You need to install the replication gateways. Since the source JSON table and the secondary index are on the same volume within a cluster, configure an intracluster gateway. In this type of gateway, the source and destination clusters are the same.

    If your gateways are running on the same nodes as CLDB, then no additional configuration steps are required. See Configuring Gateways for Table and Stream Replication for details about this scenario and other options for configuring your gateways.

Upgrades

Other sections of the documentation describe the detailed steps for upgrades. Generally, you need to perform the following steps:

  1. Upgrade your MapR software by following the instructions at Upgrading MapR Core or EEP Components.
  2. Install MapR Drill, if you have not already done so and want to sort large data sets and run queries in parallel.
  3. When installing Drill, make sure to Configure the OJAI Distributed Query Service.
  4. If you are upgrading without the MapR installer, follow step 2 in the previous section to install and configure replication gateways.
  5. Enable the replication support needed to propagate index updates by running the following command:
    maprcli cluster feature enable -name mfs.feature.db.streams.v6.support

    See Step 4: Enable New Features for further details.

IMPORTANT If you are using a Manual Rolling Upgrade Description, you must upgrade all nodes running replication gateways before performing updates on tables with indexes. Otherwise, the index updates will hang.