Drill 1.9.0 Release Notes

The following release notes apply to the 1.9.0 version of the Apache Drill component included in the MapR Converged Data Platform.
Version: 1.9.0
Release Date: December 9, 2016
MapR Version Interoperability: MapR Drill 1.9.0 is certified on the MapR v5.1.0 and v5.2.0 Converged Data Platform. See Interoperability Matrices and Drill Support Matrix.
Packages: See Package Names for Ezmeral Ecosystem Packs (EEPs).

Noteworthy New Features in the MapR Distribution of Drill

This release improves query performance and functionality with the following features and enhancements:
  • The Asynchronous Parquet Reader improves the performance of the Parquet Scan operator by increasing the speed at which the Parquet reader scans, decompresses, and decodes data. This feature is disabled by default.
  • Parquet Filter Pushdown improves query performance by pruning extraneous data from a Parquet file, which reduces the amount of data that Drill scans and reads when a query on a Parquet file contains a filter expression.
  • Dynamic UDFs enable users to register and unregister UDFs in a multi-tenant environment using the new CREATE FUNCTION USING JAR and DROP FUNCTION USING JAR commands.
  • Support for the variety of JOIN syntaxes generated by Tableau and other BI tools, including joins between tables with NULL column values.
  • HTTPD Format Plugin adds the capability to query HTTP web server logs natively and also includes parse_url() and parse_query() UDFs that return maps of the URL and the query string.

Additional bug fixes and enhancements are listed in the Apache Drill 1.9.0 Release Notes.
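
As a rough sketch, two of the features listed above might be exercised from SQLLine as follows. The option name store.parquet.reader.pagereader.async comes from the Apache Drill documentation rather than from these notes, and the JAR name simple-udf-1.0.jar is a hypothetical placeholder. Dynamic UDF support may itself need to be enabled, and the JAR typically has to be copied to the UDF staging area before it can be registered, so check the Drill documentation for your release before relying on this.

```sql
-- Enable the Asynchronous Parquet Reader for the current session only
-- (the feature is disabled by default in this release).
-- Assumption: option name as given in the Apache Drill documentation.
ALTER SESSION SET `store.parquet.reader.pagereader.async` = true;

-- Register the functions packaged in a JAR.
-- 'simple-udf-1.0.jar' is a hypothetical file name.
CREATE FUNCTION USING JAR 'simple-udf-1.0.jar';

-- Unregister them again when they are no longer needed.
DROP FUNCTION USING JAR 'simple-udf-1.0.jar';
```

Setting the reader option with ALTER SESSION keeps the change local to one connection, which is a cautious way to evaluate a feature that ships disabled.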

Default Configuration Changes

The default value for the store.parquet.block-size parameter is now 268435456 (256 MB), which matches the MapR filesystem chunk size. Prior to this release, the default value was 536870912 (512 MB).

The planner.enable_limit0_optimization parameter is now enabled by default to optimize LIMIT 0 queries. Prior to this release, the option was disabled by default.

You can modify parameter values using the ALTER SYSTEM|SESSION command.
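
For instance, a minimal sketch of restoring the previous defaults for the two parameters described above. The values are the pre-1.9.0 defaults quoted in this section. The choice of session versus system scope here is only illustrative and depends on your deployment.

```sql
-- Restore the previous Parquet block size (512 MB) for the current session only.
ALTER SESSION SET `store.parquet.block-size` = 536870912;

-- Turn the LIMIT 0 optimization back off for all sessions.
ALTER SYSTEM SET `planner.enable_limit0_optimization` = false;

-- Inspect the current settings in the sys.options system table.
SELECT * FROM sys.options
WHERE name IN ('store.parquet.block-size', 'planner.enable_limit0_optimization');
```

Option names contain dots and hyphens, so they are enclosed in backticks in the ALTER statements.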

Resolved Issues

The following table lists resolved issues in Drill 1.9.0:

Issue    Description
MD-1217  When you use the JDBC URL format for a direct Drillbit connection, the driver can now shuffle between the Drillbits listed in the connection string instead of always selecting the first Drillbit listed.
MD-1163  The avgwidth stat for the timestamp data type (int64) is now 8 instead of 4 bytes.
MD-540   Query profiles now include the session options used for a query.

Known Issues

The following table lists the known issues in Drill 1.9.0:

Issue    Description
MD-1256  The MAPR_TICKETFILE_LOCATION variable in drill-env.sh needs to be unset so that a user other than the mapr user can connect to Drill through SQLLine and MapR-SASL.
MD-1229  Hive table partition pruning does not prune out all unnecessary partitions in some scenarios.
MD-1226  MapR-SASL plain authentication goes through Kerberos, by default, requiring regular users to have a valid Kerberos ticket when connecting to Drill using SQLLine through ZooKeeper.
MD-1208  Parquet filter pushdown does not prune enough partitions in queries where the predicate contains the TIME data type.

Limitations

See Drill-on-YARN Limitations.