Drill 1.16.0.0-1904 (EEP 6.2.0) Release Notes
This section provides reference information, including new features, improvements, resolved issues, known issues, and limitations for Drill 1.16.0.0-1904.
These release notes contain MapR-specific information and are not necessarily cumulative in nature. For information about how to use the release notes, see Ecosystem Component Release Notes.
The following release notes apply to the 1.16.0.0 version of the Drill component:
Version | 1.16.0.0 |
Release Date | May 2019 |
MapR Version Interoperability | See Component Versions for Released EEPs. |
Package Names | Navigate to https://package.ezmeral.hpe.com/releases/MEP/, and select your EEP and OS to view the list of package names. |
New in this Release
Drill 1.16.0.0 includes new features and improvements in the following areas:
- SQL
- ANALYZE TABLE COMPUTE STATISTICS generates table statistics for more efficient query plans. (DRILL-1328)
- REFRESH TABLE METADATA generates metadata cache files for specific columns instead of an entire table or directory. (DRILL-7058)
- CREATE OR REPLACE SCHEMA command defines schema for text files. (MD-5202, DRILL-6964)
- NEARESTDATE function to facilitate time series analysis. (DRILL-7077)
- By default, Drill no longer writes the profiles for SET queries to the persistent store. Setting the exec.query_profile.alter_session.skip option to false reverts this behavior.
- Storage
- SYSLOG (RFC-5424) Format Plugin (DRILL-6582)
- Format plugin for LTSV files (DRILL-7014)
- A new maprdb format plugin option, readTimestampWithZoneOffset, converts timestamp values from UTC to the local time zone when values are read from MapR Database. This option is disabled by default. (MD-5272)
- Drill can query views defined in Hive in the same way that it queries Hive tables in a hive schema (DRILL-540), for example:
SELECT * FROM hive.`hive_view`;
- Configuration Options
- A new Drill configuration option, store.hive.maprdb_json.read_timestamp_with_timezone_offset, enables Drill to read timestamp values with a timezone offset when using the hive plugin with the Drill native MaprDB JSON reader enabled. This option is disabled by default. (MD-5272)
- Web UI
- Several Web UI improvements, including:
- Storage plugin management improvements (DRILL-6562)
- Query progress indicators and warnings (DRILL-6879)
- Ability to limit the result size for better UI response (DRILL-6050)
- Ability to sort the list of profiles in the Drill Web UI (DRILL-6942)
- Display query state in query result page (DRILL-6939)
- Button to reset the options filter (DRILL-6921)
- SQLLine (Drill shell)
- Upgrade to SQLLine 1.7. (DRILL-6989)
- Calcite
- Upgrade to Calcite 1.18.0. (MD-5050)
For a list of additional features and improvements, see the Apache Drill 1.16 release notes.
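The statements below are a minimal sketch of how a few of the features listed above might be invoked. The table path dfs.tmp.`orders` and the timestamp literal are hypothetical placeholders that do not come from these release notes, and setting the options at the system scope with ALTER SYSTEM is an assumption; adjust both for your environment.

```sql
-- Compute statistics for a table (hypothetical table path).
ANALYZE TABLE dfs.tmp.`orders` COMPUTE STATISTICS;

-- Map a timestamp to an hourly bucket with the NEARESTDATE function
-- (the literal value is illustrative only).
SELECT nearestDate(
         TO_TIMESTAMP('2019-02-15 07:22:00', 'yyyy-MM-dd HH:mm:ss'),
         'HOUR') AS hour_bucket
FROM (VALUES(1));

-- Revert to writing profiles for SET queries to the persistent store.
ALTER SYSTEM SET `exec.query_profile.alter_session.skip` = false;

-- Read timestamp values with a timezone offset through the hive plugin
-- (the option is disabled by default).
ALTER SYSTEM SET `store.hive.maprdb_json.read_timestamp_with_timezone_offset` = true;
```

Both option names are taken from the notes above. Check the available scope and the current value of each option in sys.options before changing it on a shared cluster.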
Resolved Issues
Drill 1.16.0.0 includes the following resolved issues and improvements:
MapR Tracking Number | Resolved Issue |
MD-5673 | DRILL-7150: Drill timestamp timezone conversion uses current daylight savings time instead of the one active during timestamp date |
MD-5647 | DRILL-7118: Filter not getting pushed down on MapR-DB tables. |
MD-5638 | DRILL-7130: IllegalStateException: Read batch count [0] should be greater than zero |
MD-5630 | DRILL-7113: Drill on MapRDB can not understand null value |
MD-5624 | DRILL-7125: REFRESH TABLE METADATA fails after upgrade from Drill 1.13.0 to Drill 1.15.0 |
MD-5623 | Unable to connect to Drill 1.15 through ZK |
MD-5609 | DRILL-7079: Drill can't query views from the S3 storage when plain authentication is enabled |
MD-5606 | DRILL-7100: parquet RecordBatchSizerManager : IllegalArgumentException: the requested size must be non-negative |
MD-5561 | DRILL-7060: Query on audit logs fails by DATA_READ ERROR Error Parsing JSON - Unrecognized character escape 'S' (code 83) |
MD-5552 | DRILL-7119: Modify selectivity calculations to use histograms |
MD-5550 | DRILL-7048: Implement JDBC Statement.setMaxRows() with System Option |
MD-5523 | Physical plan generation failure after upgrade from 1.10 to 1.14 |
MD-5490 | DRILL-7117: Support creation of column Histograms for numeric data types |
MD-5428 | Include links to pre and post procedures in Drill upgrade documentation |
MD-5379 | DRILL-7018: Drill Query (when store.parquet.reader.int96_as_timestamp=true) on Parquet File fails with Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: 372 (expected: 0 <= readerIndex <= writerIndex <= capacity(256)) |
MD-5374 | DRILL-6971: Display query state in query result page of Web UI |
MD-5369 | DRILL-7115: Improve Hive schema show tables performance |
MD-5368 | DRILL-4858: repeated_count on JSON array of objects (maps) implementation is missing in Drill 1.14 |
MD-5363 | DRILL-4858: Missing function implementation: [repeated_count(LIST-REPEATED)] |
MD-5356 | DRILL-4858: Implement - Missing function implementation: [repeated_count(MAP-REPEATED)]. |
MD-5348 | DRILL-6997: TPCDS queries 56, 60, 83 are slower with plan change |
MD-5330 | DRILL-6967: TIMESTAMPDIFF returns incorrect value for SQL_TSI_QUARTER |
MD-5319 | DRILL-6997: TPCDS query 95 slower with plan change |
MD-5278 | DRILL-6931: Drill "SHOW FILES" command duplicates empty S3 folders as subfolders |
MD-5277 | DRILL-6928: exec.query.return_result_set_for_ddl does not affect Web UI query results |
MD-5272 | DRILL-6969: Drill on maprdb native reader reads a wrong timezone comparing to hive |
MD-5253 | to_timestamp function is losing precision for milliseconds |
MD-5251 | DRILL-6894: CTAS and CTTAS are not working on S3 storage when cache is disabled |
MD-5236 | DRILL-7023: Tableau query fails with IndexOutOfBoundsException after upgrade from drill 1.13.0 to drill 1.14.0 |
MD-5226 | DRILL-6918: Querying empty topics fails with "NumberFormatException" |
MD-5198 | DRILL-6880: TPCDS query 35 slower due to nulls |
MD-5179 | DRILL-6874: CTAS from json to parquet is not working on S3 storage |
MD-5095 | DRILL-7051: Update Drill's Jetty Server to 9.3 |
MD-4863 | Simba JDBC driver does not return some values |
MD-4862 | Simba JDBC driver returns incorrect time value |
MD-4826 | The COALESCE function returns results when the columns referenced in the function do not exist in the files being queried. You do not have to CAST the columns to a specific data type for the COALESCE function to return results. |
MD-4617 | "direct.used" metric (jvm_direct_current) doesn't capture the direct memory usage. |
MD-4362 | Query on data containing reserved word 'date' as column name fails to generate non-covering index plan |
MD-3723 | Querying HBase row_key column with non-existing column returns different results in different Drill versions |
MD-1585 | Need More Accurate Filter Estimation Before Running a Query |
MD-1008 | DRILL-7038: Performance - Queries on partitioned columns currently scan the entire datasets |
MD-880 | HashJoin's not fully parallelized in query plan |
MD-680 | DRILL-7069: Planning time unaccounted for query with longer planning time |
Known Issues
Drill 1.16.0.0 has the following known issues:
MapR Tracking Number | Known Issue |
MD-5792 | TPCH query 5 runs 10-20% slower at sf100/sf1000, possibly due to hash join ordering |
MD-5786 | TPCDS query 98 is 2x slower with Statistics enabled due to hash join order for sf100 and sf1000 |
MD-5782 | Need better error message when analyze command fails due to schema change |
MD-5770 | TPCH query 9 runs 18% slower at sf100/sf1000, possibly due to hash join |
MD-5758 | TPCDS query 78 runs 30x slower with Statistics enabled at sf100 |
MD-5755 | DirectScan lists all partitions in explain plan, even for full table scan |
MD-5744 | [DRILL-7216] Auto limit is happening on the Drill Web-UI while the limit check box is unchecked |
MD-5740 | REFRESH TABLE METADATA does not count null values for decimal, varchar, and interval data types. |
MD-5694 | The first query to use a new metadata cache file may take a while to run because the first query triggers a refresh of the metadata cache file. |
MD-5684 | Drill timeout when querying a large number of files |
MD-5676 | Drill parquet file may not have statistics for decimal and varchar data types. |
MD-5608 | Running analyze command on a view fails correctly but the error is confusing |
MD-5528 | Compute stats on non existent columns fails with exception |
MD-5388 | Running analyze cmd on duplicate column names is resulting in IndexOutOfBoundsException |
MD-5371 | Error msg not clear when analyze cmd is run on table with complex types |
MD-5342 | DRILL-6839: Aggregations in cross join queries |
Fixes
None.
Limitations
None.