Drill 1.16.0.0-1904 (EEP 6.2.0) Release Notes

This section provides reference information, including new features, improvements, resolved issues, known issues, and limitations for Drill 1.16.0.0-1904.

These release notes contain MapR-specific information and are not necessarily cumulative in nature. For information about how to use the release notes, see Ecosystem Component Release Notes.

The following release notes apply to the 1.16.0.0 version of the Drill component:

Version	1.16.0.0
Release Date	May 2019
MapR Version Interoperability	See Component Versions for Released EEPs.
Package Names	Navigate to https://package.ezmeral.hpe.com/releases/MEP/ and select your EEP and OS to view the list of package names, for example: mapr-drill-1.16.0.0.201905231125-1.noarch.rpm, mapr-drill-internal-1.16.0.0.201905231125-1.noarch.rpm, mapr-drill-yarn-1.16.0.0.201905231125-1.noarch.rpm

New in this Release

Drill 1.16.0.0 includes new features and improvements in the following areas:

SQL

ANALYZE TABLE COMPUTE STATISTICS generates table statistics for more efficient query plans. (DRILL-1328)
REFRESH TABLE METADATA can generate metadata cache files for specific columns instead of an entire table or directory. (DRILL-7058)
CREATE OR REPLACE SCHEMA command defines schema for text files. (MD-5202, DRILL-6964)
NEARESTDATE function to facilitate time series analysis. (DRILL-7077)
By default, Drill no longer writes the profiles for SET queries to the persistent store. Setting the exec.query_profile.alter_session.skip option to false reverts this behavior.
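The statements below are a minimal sketch of how the SQL additions listed above might be invoked. The table paths dfs.tmp.`orders` and dfs.tmp.`text_table` and all column names are hypothetical, and the planner.statistics.use option is an assumption about how collected statistics are enabled; verify the syntax against the Apache Drill 1.16 documentation before relying on it.

```
-- Hypothetical Parquet table and column names are used throughout.

-- Collect statistics for selected columns from a 50 percent sample (DRILL-1328).
ANALYZE TABLE dfs.tmp.`orders` COMPUTE STATISTICS (o_custkey, o_orderdate) SAMPLE 50 PERCENT;

-- Assumption: the planner consults collected statistics only when this option is true.
ALTER SESSION SET `planner.statistics.use` = true;

-- Generate metadata cache files for two columns only (DRILL-7058).
REFRESH TABLE METADATA COLUMNS (o_custkey, o_orderdate) dfs.tmp.`orders`;

-- Define a schema for a text table (DRILL-6964).
CREATE OR REPLACE SCHEMA (id INT NOT NULL, name VARCHAR) FOR TABLE dfs.tmp.`text_table`;

-- Bucket a timestamp by quarter hour for time series analysis (DRILL-7077).
SELECT NEARESTDATE(TO_TIMESTAMP('2019-02-01 07:22:00', 'yyyy-MM-dd HH:mm:ss'), 'QUARTER_HOUR') AS ts_bucket
FROM (VALUES(1));

-- Write profiles for SET queries to the persistent store again.
ALTER SYSTEM SET `exec.query_profile.alter_session.skip` = false;
```

Using ALTER SESSION rather than ALTER SYSTEM for the statistics option keeps the change local to the current connection, which is convenient when comparing query plans with and without statistics.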

Storage

SYSLOG (RFC-5424) Format Plugin (DRILL-6582)
Format plugin for LTSV files (DRILL-7014)
A new maprdb format plugin option, readTimestampWithZoneOffset, converts timestamp values from UTC to local time zone when values are read from MapR Database. This option is disabled by default. (MD-5272)
Drill can query views defined in Hive similarly to querying Hive tables in a hive schema, for example:
```
SELECT * FROM hive.`hive_view`;
```
(DRILL-540)

Configuration Options

A new Drill configuration option, store.hive.maprdb_json.read_timestamp_with_timezone_offset, enables Drill to read timestamp values with a timezone offset when using the hive plugin with the Drill native MapR Database JSON reader enabled. This option is disabled by default. (MD-5272)
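As a sketch, the option could be enabled for a single session as shown below. The store.hive.maprdb_json.optimize_scan_with_native_reader option is assumed here to be the switch that turns on the native reader, and the table name is hypothetical.

```
-- Assumption: this option enables the Drill native MapR Database JSON reader for Hive tables.
ALTER SESSION SET `store.hive.maprdb_json.optimize_scan_with_native_reader` = true;

-- Read timestamp values with a timezone offset (disabled by default).
ALTER SESSION SET `store.hive.maprdb_json.read_timestamp_with_timezone_offset` = true;

-- Hypothetical Hive table backed by a MapR Database JSON table.
SELECT * FROM hive.`maprdb_json_table` LIMIT 10;
```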

Web UI

Several Web UI improvements.

SQLLine (Drill shell)

Upgrade to SQLLine 1.7. (DRILL-6989)

Calcite

Upgrade to Calcite 1.18.0. (MD-5050)

For a list of additional features and improvements, see the Apache Drill 1.16 release notes.

Resolved Issues

Drill 1.16.0.0 includes the following resolved issues and improvements:

MapR Tracking Number	Resolved Issue
MD-5673	DRILL-7150: Drill timestamp timezone conversion uses current daylight savings time instead of the one active during timestamp date
MD-5647	DRILL-7118: Filter not getting pushed down on MapR-DB tables.
MD-5638	DRILL-7130: IllegalStateException: Read batch count [0] should be greater than zero
MD-5630	DRILL-7113: Drill on MapRDB can not understand null value
MD-5624	DRILL-7125: REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0
MD-5623	Unable to connect to Drill 1.15 through ZK
MD-5609	DRILL-7079: Drill can't query views from the S3 storage when plain authentication is enabled
MD-5606	DRILL-7100: parquet RecordBatchSizerManager : IllegalArgumentException: the requested size must be non-negative
MD-5561	DRILL-7060: Query on audit logs fails by DATA_READ ERROR Error Parsing JSON - Unrecognized character escape 'S' (code 83)
MD-5552	DRILL-7119: Modify selectivity calculations to use histograms
MD-5550	DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option
MD-5523	Physical plan generation failure after upgrade from 1.10 to 1.14
MD-5490	DRILL-7117: Support creation of column Histograms for numeric data types
MD-5428	Include links to pre and post procedures in Drill upgrade documentation
MD-5379	DRILL-7018: Drill Query (when store.parquet.reader.int96_as_timestamp=true) on Parquet File fails with Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: 372 (expected: 0 <= readerIndex <= writerIndex <= capacity(256))
MD-5374	DRILL-6971: Display query state in query result page of Web UI
MD-5369	DRILL-7115: Improve Hive schema show tables performance
MD-5368	DRILL-4858: repeated_count on JSON array of objects (maps) implementation is missing in Drill 1.14
MD-5363	DRILL-4858: Missing function implementation: [repeated_count(LIST-REPEATED)]
MD-5356	DRILL-4858: Implement - Missing function implementation: [repeated_count(MAP-REPEATED)].
MD-5348	DRILL-6997: TPCDS queries 56, 60, 83 are slower with plan change
MD-5330	DRILL-6967: TIMESTAMPDIFF returns incorrect value for SQL_TSI_QUARTER
MD-5319	DRILL-6997: TPCDS query 95 slower with plan change
MD-5278	DRILL-6931: Drill "SHOW FILES" command duplicates empty S3 folders as subfolders
MD-5277	DRILL-6928: exec.query.return_result_set_for_ddl does not affect Web UI query results
MD-5272	DRILL-6969: Drill on maprdb native reader reads a wrong timezone comparing to hive
MD-5253	to_timestamp function is losing precision for milliseconds
MD-5251	DRILL-6894: CTAS and CTTAS are not working on S3 storage when cache is disabled
MD-5236	DRILL-7023: Tableau query fails with IndexOutOfBoundsException after upgrade from drill 1.13.0 to drill 1.14.0
MD-5226	DRILL-6918: Querying empty topics fails with "NumberFormatException"
MD-5198	DRILL-6880: TPCDS query 35 slower due to nulls
MD-5179	DRILL-6874: CTAS from json to parquet is not working on S3 storage
MD-5095	DRILL-7051: Update Drill's Jetty Server to 9.3
MD-4863	Simba JDBC driver does not return some values
MD-4862	Simba JDBC driver returns incorrect time value
MD-4826	The COALESCE function returns results when the columns referenced in the function do not exist in the files being queried. You do not have to CAST the columns to a specific data type for the COALESCE function to return results.
MD-4617	"direct.used" metrics(jvm_direct_current) doesn't catch the direct memory usage.
MD-4362	Query on data containing reserved word 'date' as column name fails to generate non-covering index plan
MD-3723	Querying HBase row_key column with non-existing column returns different results in different Drill versions
MD-1585	Need More Accurate Filter Estimation Before Running a Query
MD-1008	DRILL-7038: Performance - Queries on partitioned columns currently scan the entire datasets
MD-880	HashJoin's not fully parallelized in query plan
MD-680	DRILL-7069: Planning time unaccounted for query with longer planning time

Known Issues

Drill 1.16.0.0 has the following known issues:

MapR Tracking Number	Known Issue
MD-5792	TPCH query 5 runs 10-20% slower at sf100/sf1000, possibly due to hash join ordering
MD-5786	TPCDS query 98 is 2x slower with Statistics enabled due to hash join order for sf100 and sf1000
MD-5782	Need better error message when analyze command fails due to schema change
MD-5770	TPCH query 9 runs 18% slower at sf 100/sf1000, possibly due to hash join
MD-5758	TPCDS query 78 runs 30x slower with Statistics enabled at sf100
MD-5755	DirectScan lists all partitions in explain plan, even for full table scan
MD-5744	DRILL-7216: Auto limit is applied in the Drill Web UI even when the limit check box is unchecked
MD-5740	REFRESH TABLE METADATA does not count null values for decimal, varchar, and interval data types.
MD-5694	The first query to use a new metadata cache file may take a while to run because the first query triggers a refresh of the metadata cache file.
MD-5684	Drill timeout when querying a large number of files
MD-5676	Drill parquet file may not have statistics for decimal and varchar data types.
MD-5608	Running analyze command on a view fails correctly but the error is confusing
MD-5528	Compute stats on non existent columns fails with exception
MD-5388	Running analyze cmd on duplicate column names is resulting in IndexOutOfBoundsException
MD-5371	Error msg not clear when analyze cmd is run on table with complex types
MD-5342	DRILL-6839: regarding aggs in cross join queries
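Several of the known issues above involve queries that run slower when statistics are enabled. One way to check whether a particular query is affected is to compare its plan with statistics on and off. The sketch below assumes that the planner.statistics.use option controls whether the planner consults collected statistics; the table and column names are hypothetical, and this is not a documented workaround.

```
-- Assumption: planner.statistics.use toggles the use of collected statistics.
ALTER SESSION SET `planner.statistics.use` = true;
EXPLAIN PLAN FOR
SELECT o_custkey, COUNT(*) FROM dfs.tmp.`orders` GROUP BY o_custkey;

-- Repeat with statistics disabled and compare the join order and operators in the two plans.
ALTER SESSION SET `planner.statistics.use` = false;
EXPLAIN PLAN FOR
SELECT o_custkey, COUNT(*) FROM dfs.tmp.`orders` GROUP BY o_custkey;
```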

Fixes

None.

Limitations

None.