Configuring Drill

Lists the data-fabric-specific configuration for Drill.

Drill is highly configurable. This document focuses on data-fabric-related configurations and refers to the open source Apache Drill documentation for generic information. Key things to configure are:
Drill memory
Determine the amount of heap and direct memory allocated to a Drillbit for query processing in a Drill cluster. See Configuring Drill Memory.
Parquet block size
Change the Parquet block size to match the filesystem chunk size. See Configuring the Parquet Block Size.
Resources for a shared Drillbit
Configure queues and parallelization to support multiple users sharing a Drillbit. Support separate Drillbits running on different nodes in the cluster. See Configuring Resources for a Shared Drillbit.
Multitenancy
Configure a multitenant cluster to account for resources required for Drill. See Configuring a Multitenant Cluster.
User Impersonation
Configure impersonation to allow a service to act on behalf of a client while performing the action requested by the client. See User Impersonation.
User authentication and encryption
Configure user authentication when you want to verify the identity of a user before permitting the user access to a process running on a system. See Default Security (Tickets).
SSL/TLS for Encryption
Enable and configure SSL/TLS for encryption when you need to use Plain authentication. See SSL/TLS for Encryption.
Drill impersonation with Hive authorization
Configure Drill impersonation to work with Hive impersonation to authorize access to metadata in the Hive metastore repository and data in the Hive warehouse. See User Impersonation with Hive.
Volumes to use for spooling
Use the drill.exec.spill.directories option to set MapReduce volumes or local volumes for spooling. To improve performance, stripe the data across as many disks as possible.
Persistent configuration storage
See Persistent Configuration Storage and Configuring the ZooKeeper PStore Location.
Access rights
Configure access rights if you have 777 file-level permissions on a table but a query still returns no results. See Configuring Access Rights.
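
As a rough illustration of the spooling entry in the list above, the spill locations can be listed in drill-override.conf on each Drillbit node. The fragment below is a minimal sketch, not a recommended configuration: the directory paths are hypothetical placeholders, and the fs value assumes spilling to local disks rather than to a data-fabric volume.

```
drill.exec: {
  spill: {
    # Filesystem used for spill files; "file:///" assumes local disks.
    fs: "file:///",
    # Hypothetical paths, one per physical disk, so that spill data
    # is spread across several disks.
    directories: [
      "/mnt/disk1/drill/spill",
      "/mnt/disk2/drill/spill"
    ]
  }
}
```

Merge the spill keys into the drill.exec block that already exists in the file. Drill reads drill-override.conf at startup, so restart the Drillbit after changing it.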

Drill typically runs alongside other workloads, including the following:

  • MapReduce
  • YARN
  • Hive and Pig
  • Spark

You need to plan and configure these resources for use with Drill and other workloads:

  • Memory
  • CPU
  • Disk

Configuring Access Rights

If the security policies in your organization limit access to HPE Ezmeral Data Fabric Database tables, you might have problems querying those tables. If you have 777 file-level permissions on a table but a query returns no results, you might need to add your user name to the maprcli Access Control List (ACL).