Drill
Drill is a low-latency distributed query engine for large-scale datasets, including structured and semi-structured or nested data. Inspired by Google’s Dremel, Drill is designed to scale to thousands of nodes and query petabytes of data at the interactive speeds that BI and analytics environments require.
Drill includes a distributed execution environment that is purpose-built for large-scale data processing. At the core of Drill is the "Drillbit" service, which accepts requests from the client, processes the queries, and returns results to the client.
Installing Drill
You can install and run a Drillbit service on one node or on all of the nodes in a MapR cluster to form a distributed cluster environment. When a Drillbit runs on each data node in a cluster, Drill can maximize data locality during query execution, minimizing data movement over the network between nodes. Drill uses ZooKeeper to maintain cluster membership and health-check information.
See Installing Drill for guidance.
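Because Drillbits register with ZooKeeper, a JDBC client can either locate a Drillbit through the ZooKeeper quorum or connect to a single Drillbit directly. The following is a minimal sketch of how the two connection URL forms might be assembled. The helper name `drill_jdbc_url`, the host names, and the default ZooKeeper path `drill/drillbits1` are assumptions for illustration; use the ZooKeeper root and cluster ID configured for your cluster.

```python
def drill_jdbc_url(zk_hosts=None, zk_path="drill/drillbits1", drillbit=None):
    """Build a Drill JDBC connection URL (hypothetical helper).

    zk_hosts: list of "host:port" strings for the ZooKeeper quorum.
    zk_path:  "<zk root>/<cluster id>"; the default here is an assumption.
    drillbit: "host:port" of a single Drillbit for a direct connection.
    """
    if drillbit is not None:
        # Direct connection: bypasses ZooKeeper and targets one Drillbit.
        return f"jdbc:drill:drillbit={drillbit}"
    if not zk_hosts:
        raise ValueError("provide zk_hosts or drillbit")
    # ZooKeeper connection: the driver looks up a registered Drillbit.
    return f"jdbc:drill:zk={','.join(zk_hosts)}/{zk_path}"


# Example host names below are placeholders, not real cluster nodes.
zk_url = drill_jdbc_url(zk_hosts=["zk1.example.com:2181", "zk2.example.com:2181"])
direct_url = drill_jdbc_url(drillbit="node1.example.com:31010")
```

Connecting through ZooKeeper is usually preferable on a multi-node cluster, since the client does not depend on any one Drillbit being available.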
Accessing Drill
After you have installed Drill and configured connections to your data sources, you can access Drill through any of the following interfaces:
- Drill shell (SQLLine)
- Drill Web Console
- ODBC
- JDBC
  - MapR driver
  - Apache open source driver. Download apache-drill-1.10.0-src.tar.gz from the Apache Drill download site. For instructions, see the Apache Drill documentation for the embedded Apache JDBC driver.
- C++ API
- REST API
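As an illustration of the REST API option, the sketch below submits a SQL query to a Drillbit over HTTP using only the Python standard library. It assumes the Drill Web Console is listening on `http://localhost:8047` without authentication, and that the sample `cp.` storage plugin is enabled; the function names `build_query_request` and `run_query` are hypothetical helpers, not part of Drill.

```python
import json
import urllib.request


def build_query_request(sql, base_url="http://localhost:8047"):
    """Build a POST request for Drill's /query.json REST endpoint.

    base_url is an assumption: adjust host and port for your Drillbit.
    """
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, base_url="http://localhost:8047", timeout=30):
    """Send the query to the Drillbit and return the decoded JSON response."""
    request = build_query_request(sql, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a Drillbit running, a call such as ``run_query("SELECT * FROM cp.`employee.json` LIMIT 3")`` should return a dictionary whose `rows` entry holds the result rows. A secured cluster would additionally require authentication, which this sketch does not cover.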
Installing Drill Interfaces
For Drill interface installation instructions, see ODBC/JDBC Interfaces in the Apache Drill documentation.
Using Drill Documentation
- MapR-specific Drill user documentation, starting with this home page
- MapR-specific Drill Release Notes
- Apache Drill