MapR 6.0 Documentation

6.0 Development
This section contains information related to application development for ecosystem components and MapR products including MapR-DB (binary and JSON), MapR-FS, and MapR Streams.
- Application Development Process
  Before you start developing applications on MapR’s Converged Data Platform, consider how you will get the data onto the platform, the format it will be stored in, the type of processing or modeling that is required, and how the data will be accessed.
- MapR-FS and Apps
  The following sections provide information about accessing MapR-FS with C and Java applications.
- MapR-DB and Apps
  This section contains information about developing client applications for JSON and binary tables.
- MapR-ES and Apps
  MapR-ES brings integrated publish and subscribe messaging to the MapR Converged Data Platform.
- MapReduce and Apps
  This section contains information associated with developing YARN applications.
- MapR Data Science Refinery
  The MapR Data Science Refinery is an easy-to-deploy and scalable data science toolkit with native access to all platform assets and superior out-of-the-box security.
- MapR Data Fabric for Kubernetes FlexVolume Driver
  This section describes how to use and troubleshoot the MapR Data Fabric for Kubernetes FlexVolume Driver.
- Ecosystem Components
  The following sections provide information about each open source project that MapR supports.
  - MapR Ecosystem Packs
    A MapR Ecosystem Pack (MEP) provides a set of ecosystem components that work together on one or more MapR cluster versions. Only one version of each ecosystem component is available in each MEP. For example, only one version of Hive and one version of Spark are supported in a MEP.
  - AsyncHBase
  - Cascading
  - Drill
    - Drill Tutorial
    - Drill-on-YARN
    - Configuring Drill
    - Working with Drill
    - Securing Drill
      An administrator can install Drill with the default security configuration provided by MapR or manually configure custom security for Drill.
    - Drill Drivers
      MapR provides Drill ODBC and JDBC drivers that you can download. Drivers are updated periodically to include support for new functionality in Drill.
    - Drill Configuration Files
      The Drill installation includes configuration files with start-up options that you can modify prior to starting Drill.
    - Monitoring Drill Metrics
    - Optimizing Queries with Indexes
      MapR-DB provides a highly scalable key-value database platform on which you can run SQL queries using Drill. As of the 6.0 release of the MapR Converged Data Platform, MapR-DB natively supports indexes on secondary fields in JSON tables.
  - Flume
  - HBase Client and MapR-DB Binary Tables
  - HCatalog
  - Hive
  - HttpFS
  - Hue
  - Impala
  - MapR-ES Clients and Tools
  - Myriad
  - OpenStack Manila
  - Oozie
  - Pig
  - Sentry
  - Spark
  - Sqoop
  - Third Party Solutions
- Maven and MapR
  This section discusses topics associated with Maven and MapR.
- Developer's Reference
  This section contains in-depth information for the developer.
- API Documentation
  MapR supports public APIs for MapR-FS, MapR-DB, and MapR-ES. These APIs are available for application development purposes.

Drill

Drill is a low-latency distributed query engine for large-scale datasets, including structured and semi-structured or nested data. Inspired by Google’s Dremel, Drill is designed to scale to several thousand nodes and to query petabytes of data at the interactive speeds that BI and analytics environments require.

Drill includes a distributed execution environment that is purpose-built for large-scale data processing. At the core of Drill is the "Drillbit" service, which accepts requests from the client, processes the queries, and returns results to the client.

Installing Drill

You can install Drill on a single node or on multiple nodes in a cluster. When a Drillbit runs on each data node, Drill can maximize data locality during query execution, which reduces the amount of data moved over the network between nodes. Drill uses ZooKeeper to maintain cluster membership and health check information.

See Installing Drill for guidance.
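
Because Drillbits register with ZooKeeper, a JDBC client can name the ZooKeeper quorum instead of a specific Drillbit. The following is a minimal sketch of assembling a connection URL in the `jdbc:drill:zk=<hosts>/<directory>/<cluster ID>` form described in the Apache Drill documentation. The helper name `drill_jdbc_url`, the host names, the default port 5181, and the `drill` directory are illustrative assumptions; check the values configured for your own cluster.

```python
def drill_jdbc_url(zk_hosts, cluster_id, zk_port=5181, drill_dir="drill"):
    """Assemble a Drill JDBC URL that points at a ZooKeeper quorum.

    All defaults are assumptions for illustration:
    zk_port=5181 and drill_dir="drill" must match the cluster's
    actual ZooKeeper port and Drill ZooKeeper root.
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    # Each quorum member is written as host:port, separated by commas.
    quorum = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return f"jdbc:drill:zk={quorum}/{drill_dir}/{cluster_id}"


if __name__ == "__main__":
    # Hypothetical host names and cluster ID.
    print(drill_jdbc_url(["zk1.example.com", "zk2.example.com"], "my_cluster-drillbits"))
```

The resulting string is what you would pass to a JDBC driver manager or a SQL tool; the function itself does not open a connection.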

Accessing Drill

After you have installed Drill and configured connections to your data sources, you can access Drill through any of the following interfaces:

- Drill shell (SQLLine)
- Drill Web Console
- ODBC
- JDBC
- C++ API
- REST API
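
As one illustration of the REST API option, the sketch below builds a query request for Drill's `/query.json` endpoint using only the Python standard library. It assumes a Drillbit whose web port is the Apache Drill default of 8047 and that has no authentication enabled; the host name and the function names `build_drill_query_request` and `run_drill_query` are hypothetical. A secured cluster would typically also need HTTPS and a login step, which this sketch does not cover.

```python
import json
import urllib.request


def build_drill_query_request(host, sql, port=8047):
    """Build (but do not send) a POST request for Drill's REST query endpoint.

    port=8047 is the Apache Drill default web port; treat it as an assumption.
    """
    payload = {"queryType": "SQL", "query": sql}
    return urllib.request.Request(
        url=f"http://{host}:{port}/query.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(host, sql, port=8047):
    """Send the query to a Drillbit and return the decoded JSON response."""
    request = build_drill_query_request(host, sql, port)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical host; replace with a node that runs a Drillbit.
    result = run_drill_query("drill-node.example.com", "SELECT * FROM sys.version")
    print(result.get("rows"))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.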

Using Drill Documentation

You can access Drill documentation from the following locations:

- MapR-specific Drill user documentation, starting with this home page
- MapR-specific Drill release notes
- Apache Drill

Additional Resources

See the following MapR sites for more Drill information:

(Topic last modified: 2018-10-16)