Resolved Issues (MapR 6.1.1)

Lists the issues that were resolved in MapR version 6.1.1.

The following MapR issues, which were reported by customers, are resolved in Version 6.1.1.

Apache Kafka

MS-791
KafkaConsumer position does not honor TTL.
MS-946
Consumer poll returns messages before a subscription or rebalance operation with the ConsumerRebalanceListener onPartitionsAssigned callback is complete.
MS-947
Kafka message headers should not be Null but can be empty.

Control System

MON-4844
Navigating to the Usage tab when editing a volume causes an OOPS error.
MON-4845
API server fails to start if the user is not mapr.
MON-4902
Default snapshot policies and times are incorrect.
MON-5075
Cannot modify existing volume quota due to insufficient permissions.
MON-5102
MCS does not allow configuring SMTP without a username and password.
MON-5131
OOPS error when adding an SMTP provider other than smtp, Office 365, and Gmail.
MON-5169
MCS call to schedule/list returns an error.
MON-5194
Incorrect scale on Y-axis for CPU Utilization graph on Overview page.
MON-5206; MON-5141
Volume modification fails when the metrics feature is not enabled.
MON-5270
Memory leak in API server.
MON-5271
Remove the limit on the number of results returned for MAC addresses when configuring a virtual IP.
MON-5272
User name should not be hardcoded as mapr in MCS.
MON-5275
MCS displays MAC addresses incorrectly.
MON-5338
I/O errors occur and the table summary page is not loaded when huge MapR-DB tables are queried.
MON-5489
Volume filter options should not be pre-defined but should allow filtering on any volume attribute.
MON-5491
Detailed alarm descriptions were missing in the alarm popup view and the volume details view.
MON-5510
The Disk Space Available column uses the wrong units in the Nodes view.
MON-5517
Enhance the alarm summary command to return the alarm occurrences for each alarm type, indicate whether the alarm is a warning or an error, and specify the total number of occurrences for each alarm type.
MON-5672
The Security Certificate Expiry Alarm (NODE_ALARM_CERTIFICATE_NEAR_EXPIRATION) is incorrectly labeled.
MON-5774
API Server crashes when the tmp directories are mounted with the noexec option.
MON-5900
Resource Manager URL is invalid on Ubuntu.

Flume

MS-770
Array Index Out of Bounds Exception.

Filesystem

CORE-387
MapR user tickets get overwritten when Kerberos authentication is enabled.
CORE-416
The configure.sh script must support environments where root activity is not allowed.
CORE-427
Local Spark shuffle volumes, if damaged, need to be automatically recreated when NodeManager starts.
CORE-472
The disk list command runs very slowly and fails to identify the right MapR-FS disk when more than one symlink points to the same disk.
CORE-476
Running configure.sh with the -R option erases the value of the MAPR_JMXAUTH variable in the env_override.sh file. This prevents NodeManager from starting.
CORE-480
Volume rack path is not updated for volumes with local path set when moving nodes.
CORE-517
The configure.sh script should not silently alter permissions in the /etc/shadow file. Added the -no-auto-permission-update option as the fix.
CORE-566
The hoststats process crashes continuously and causes cluster auditing to fail.
CORE-571
Oozie server does not start because of a missing jar file.
MFS-1984
The maprcli dashboard info command returns incorrect compression statistics.
MFS-1985
POSIX Client service (FUSE) does not auto start on system reboot on Ubuntu 16.
MFS-2015
The maprcli dashboard info command returns incorrect memory statistics.
MFS-2019
API Server hangs intermittently and fails to access CLDB servers in a multi-NIC environment.
MFS-2051
Improve the clarity of NFS logs.
MFS-2055
CLDB crashes when processing alarms.
MFS-2062
Remote mirroring fails repeatedly even after a source CLDB that went down is restarted and operational.
MFS-2143
MFS does not preserve excluded volume audit data operations on restart.
MFS-2144
Spark streaming tasks are stuck indefinitely when looking up tablets.
MFS-2209
Exception occurs when calling the getVolName() function.
MFS-2211
Name Container master freezes during resync of orphan entries and causes MFS and the resync operation to restart frequently.
MFS-2218
MFS randomly crashes if errors occur when reading data.
MFS-2260
MapR jobs fail because the file client does not check the MapR filesystem for the status of RPCs previously sent to MFS before resending them.
MFS-2266
Spark encounters RPC errors when reading files from volumes with wiresecurity enabled.
MFS-2273
Storage Pools fail randomly when MFS is restarted, and many containers go offline without a valid replica.
MFS-2275
Spark jobs fail intermittently when trying to retrieve rows from MapR-DB tables.
MFS-2294
CLDB crashes when registering NFS version 4 servers.
MFS-2298
CRC errors occur randomly in Storage Pools and cause them to go offline.
MFS-2306
Master CLDB crashes when MFS nodes are added or removed frequently.
MFS-2307
Add an internal cluster level flag to prevent storage pools from going offline when Read CRC errors are encountered.
MFS-2323
NFS Server version 4 boot script should not contain hardcoded user and group (mapr:mapr).
MFS-2343
The node list command should not display nodes that contain only the POSIX client (edge nodes). Using the node list command without the -clientsonly true or the -nfsnodes true option does not list edge nodes. To include edge nodes, use the -nfsnodes true or the -clientsonly true option.
MFS-2344
The NODE_ALARM_NO_HEARTBEAT (No Heartbeat) alarm should not be raised for POSIX clients (edge nodes). CLDB has a new parameter cldb.ignore.posix.only.hb.alarm that controls whether this alarm is raised for edge nodes.
MFS-2392
gfsck fails on secure clusters due to a missing library.
MFS-2423
POSIX-only clients should be immediately removed when marked dead.
MFS-2444
FUSE process does not remove shared memory segments, resulting in volumes failing to mount.
MFS-2462
Crash in MapR-DB when looking up role memberships.
MFS-2498
FUSE does not honor the product build value.
MFS-2573
Mirroring fails with a CLDB internal error when an invalid container ID is found on the source cluster.
MFS-2610
Persistent Volume mounts hang when tickets expire.
MFS-2628
Null Pointer Exception in CLDB Server.
MFS-2630
The mrconfig info threads command crashes the MFS process when attempting to retrieve volume ACEs when Extended Attributes are not enabled.
MFS-2631
CLDB shuts down when adding or removing NFS servers.
MFS-2632
CLDB operations fail with the Server Retry error.
MFS-2659
MFS process hangs intermittently in environments with multiple NICs.
MFS-2694
Stack overflow in MFS when deleting a container chain.
MFS-2695
Cross cluster mirroring fails after enabling the Snapshot Lite feature.
MFS-2725
NFS Server version 3 crashes randomly when trying to satisfy mount requests.
MFS-2731
MFS does not automatically retry reconnecting to CLDB after a connection reset request.
MFS-2732
RPC connections between MFS and CLDB fail intermittently with Connection Reset by Peer errors.
MFS-2757
MapR service status is displayed incorrectly due to change in systemd.
MFS-2767
File client tries to connect to the same failed CLDB node repeatedly.
MFS-4480
NFS Server version 4 crashes intermittently.
MFS-4485
FUSE client fails to work with a scoped impersonation ticket.
MFS-4531
Snapshots of mirrors are not deleted after mirroring completes.
MFS-4551
loopbacknfs does not log any messages to the loopbacknfs.log file.
MFS-4562
File stat on the FUSE mount indicates the block size as a fixed value (512) instead of the client's block size.
MFS-4585
CLDB exception occurs when an NFS heartbeat reports a failed Virtual IP.
MFS-4597
Warden and the maprcli command intermittently cannot start dependent services.
MFS-4605
CLDB shuts down when ACL size exceeds the threshold value.
MFS-4667
Replicated operations fail and cause frequent resyncs.
MFS-4776
FUSE RPCs fail intermittently.
MFS-5356
The getAces() API raises a Null Pointer Exception when called on a non-existing object.
MFS-5405
Client sends the NODE_ALARM_SERVICE_NODEMANAGER_DOWN alarm but CLDB raises the NODE_ALARM_SERVICE_OPENTSDB_DOWN alarm.
MFS-5422
The create() API does not create files with the same permissions as the parent directory.
MFS-5430
NFS Server is unable to parse lines exceeding 8K characters in the exports file.
MFS-5482
Memory leak in CLDB master instance.
MFS-5488
Jobs on random nodes fail to create FileClient.
MFS-5502
Path lookup error occurs when client nodes run a newer version of MapR than the CLDB server nodes.
MFS-5711
Unable to access files on EC tier due to I/O error and StripeletIO failure.
MFS-6585
Applications intermittently fail to detect updates to the ticket file.
MFS-6587
Avoid flooding the CLDB log with invalid snapshot ID messages.
MFS-6667
hoststats creates defunct Python processes.
MFS-6748
Automatic offload does not trigger EC offload.
MFS-6873
Cross cluster move operation fails on FUSE.
MFS-6874
Node Manager fails repeatedly during log aggregation.
MFS-8452
Memory leak in loopback NFS.
MFS-8459
Fixed volume access problems for volumes that reused the volume ID of deleted volumes.
MFS-10328
maprlogin renew (ticket renewal) fails to refresh group memberships.
MFS-10743
Node Manager fails to report container failure and loops between slave CLDBs, without contacting the new master CLDB.
MFS-10825
Cluster is unable to self-heal from the VOLUME_ALARM_DEGRADED_EC_STRIPES (Warm-Tier Data Node Down) alarm, and rebuild does not occur.
MFS-10845
Volume creation fails to honor the credentials of the impersonated user while creating the parent directory.
MFS-11002
fsck crashes due to an inode reservation issue.
MFS-11109
Filesystem crashes due to a leak in orphanage reservation.
MFS-11171
Restrict the tenant ticket so that it cannot mount non-tenant volumes in POSIX.
MFS-11221
Drill query crashes in MapR client due to an unexpected exception during fragment initialization.
MFS-11243
Memory leak in FUSE. Added a FUSE tunable (fuse.max.cache.pages) to limit the amount of memory that each FUSE process can use when working with a large number of open files.
MFS-11295
Offloading fails when mastgateway is stuck in compaction state.
MFS-11442
FUSE client does not honor the location of the cluster configuration file as defined by the parameter fuse.cluster.conf.location.
MFS-11609
Hadoop distcp jobs fail when using CLDB hostname and port.
MFS-11647
SlowOPs trace function does not work for NFSv3.
MFS-11674
Enable gfsck to perform CRC checks without blocking the operations on the EC frontend volume. See gfsck for the new -D|--crc option.
MFS-11682
CLDB volume dump fails with an RPC error due to an unknown session key.
MFS-11729
The mrconfig info threads command crashes the filesystem when hardlinks are not enabled.
MFS-11731
Suppress redundant incorrect build version alarms.
MFS-11740
Filesystem crashes when compacting memory.
MFS-11779
MFS dumps core due to stack overflow.
MFS-11823
Service ticket renewal does not honor duration.
MFS-11838
Jobs fail with the "Too many open files" error.
MFS-15415
Volume dump restore fails with error 20020 (ENOTICKET) despite having a ticket and using it successfully for a long time.

Hadoop

MAPRHADOOP-61
Kerberos fails for services when a custom ticket location is set in the env.sh file.
MAPRHADOOP-83
Upgrade Tomcat servers to their latest version or remove them if they are not needed.
MAPRHADOOP-102
Error occurs in ACEs when the Hive resource downloader internally copies files from the MapR filesystem to the local filesystem.
MAPRHADOOP-131
Update Jersey to its latest 1.X version.

MapR-DB

MAPRDB-1236
The Tiny Bucket Flush alarm is raised even when the node has sufficient memory.
MAPRDB-1589
Incorrect key sorting when using the orderby clause with conditions.
MAPRDB-1719
DB server crashes when columnset is used without initialization.
MAPRDB-1732
Inserting data into MapR-DB fails intermittently with an Invalid Argument error.
MAPRDB-1889
The Java API findById() intermittently fails to retrieve complete projection details from JSON documents.
MAPRDB-1985
The mapr dbshell find command crashes when run on a table with a huge number of tablets.
MAPRDB-1995
MapR-DB raises intermittent false VOLUME_ALARM_TABLE_REPL_LAG_HIGH alarms for replicated streams.
MAPRDB-2062
Failed to scan table on a remote secure cluster using the mapr dbshell utility because of a wrong ticket that was sent to ZooKeeper.
MAPRDB-2072
Data Access Gateway (DAG) fails to fetch indexes as it queries indexes as the mapr user instead of the impersonated user.
MAPRDB-2091
MapR-DB hangs due to inodes being recycled even when they are in use.
MAPRDB-2092
In MapR-DB, adding a table index or replicating a table fails if the cluster administrator (MAPR_USER) does not have write access to the parent volume of the table.
MAPRDB-2098
MapR-DB crashes when multiple threads modify the size_ variable while calculating the serialized JSON document size.
MAPRDB-2103
PUT operations on binary tables fail when the values of the wireSecurityEnabled field vary between the FileClient and the FileServer.
MAPRDB-2120
Drill query on MapR-DB intermittently fails with a DB Scan exception.
MAPRDB-2125
OJAI APIs fail to connect to ZooKeeper.
MAPRDB-2159
DB Autosetup, Indexing, and Replication fail as MFS is unable to reach the local Gateway.
MAPRDB-2201
Memory leak in BaseJsonTable caused by a dangling reference of MetaTable.
MAPRDB-2254
AppendStream fails when Gateway closes the inactive stream, and raises the replication lag alarm.
MAPRDB-2267
DB crashes during heap memory allocation.
MAPRDB-2303
The Replication Lag alarm does not display the actual lag value.
MAPRDB-2315
MapR-DB scan fails on large tables.
MAPRDB-2323
The Table Replication (VOLUME_ALARM_TABLE_REPL_ERROR) alarms are missing information such as the actual bucket FID that produced the alarm, if applicable, and the error code and description of the replication error.

Performance

MS-560
MapR cluster nodes experience high network traffic from mapr-stream clients.
MAPRDB-1727
Delay in data retrieval on MFS nodes with a large number of outstanding active buckets and high usage of DB memory.
MAPRDB-2113
MapR-DB needs to select the most appropriate index in cases where more than one index has been defined over the same field of a MapR-DB table.
MAPRDB-2156
When running queries with a set timeout, the number of threads on the MapR client increases up to 500, exhausting the Thread Pool, and causing the client to stop responding completely, even after all queries time out.
MAPRDB-2250
Too many parallel BatchGet operations on a table that has secondary indexes cause MapR-DB to crash. Added a parameter mfs.db.max.concurrent.internal.ops to regulate the number of parallel BatchGet operations.
MAPRMR-8
Reduce the number of input splits that are generated when a job is processed through the CombineFileInputFormat() function. Added the parameter mapreduce.input.fileinputformat.split.maxblocknum that determines the number of blocks that can be added to one split.
MFS-2078
Speed up FUSE path lookups. Added the fuse.negative.timeout parameter to cache negative lookup results.
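As a rough illustration of what caching negative lookup results means, the following Python sketch remembers paths that were not found and skips the backend call until a timeout elapses. It is not the MapR FUSE implementation: NegativeLookupCache, lookup, and backend_lookup are hypothetical names, and expressing the timeout in seconds is an assumption.

```python
import time


class NegativeLookupCache:
    """Remembers paths that were recently looked up and not found (hypothetical)."""

    def __init__(self, negative_timeout_s, clock=time.monotonic):
        self._timeout = negative_timeout_s
        self._clock = clock
        self._misses = {}  # path -> time at which the miss was recorded

    def record_miss(self, path):
        self._misses[path] = self._clock()

    def is_known_missing(self, path):
        recorded = self._misses.get(path)
        if recorded is None:
            return False
        if self._clock() - recorded >= self._timeout:
            # The cached miss has expired, so the backend must be asked again.
            del self._misses[path]
            return False
        return True


def lookup(path, cache, backend_lookup):
    """Return the backend result, or None without calling the backend on a cached miss."""
    if cache.is_known_missing(path):
        return None
    result = backend_lookup(path)
    if result is None:
        cache.record_miss(path)
    return result
```

A longer timeout saves more lookups for paths that stay missing, at the cost of a delay before a newly created path becomes visible through the cache.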
MFS-2082
Optimize directory lookup and traversal to avoid overwhelming MFS with RPCs.
MFS-2324
Optimize disk space reserved for tiering operations.
MFS-2608
Priority of child threads does not change when the priority of their parent process is changed.
MFS-2638
Avoid re-sorting results in CLDB for the default output of the maprcli alarm list -sortby alarmtype command.
MFS-2691
Optimize fetching of Muted and Raised Alarms.
MFS-2711
Optimize removal of expired snapshots to free up CLDB CPU from background activity.
MFS-2749
The Alarm History feature needs to be disabled on large clusters as it can degrade performance. Added a parameter cldb.disable.alarm.history to disable alarm history.
MFS-3291
File client does not honor the number of flusher threads set in the core-site.xml file. See the fs.mapr.threads parameter.
MFS-4532
Jobs fail with I/O error or are very slow to complete.
MFS-4670
CLDB process consumes 78.8 GB approximately every 6-7 hours and triggers CLDB failover very often.
MFS-4687
Memory leak in CLDB.
MFS-4750
Disk Balancer is unable to move containers from full Storage Pools as they fail the Volume underweight check. Added a tunable, prevent.volume.skew.by.diskbalancer, to let the Disk Balancer allow or prevent volume skew.
MFS-4805
Fixed memory leak in NFS Server version 3 that occurs when profiling memory. Added the following entities in /opt/mapr/conf/nfsserver.conf:
  • MemDebugEnable - Set to true to enable Memory Debugging.
  • HighMemLimitMB - Sets the maximum amount of memory that the NFS Server can use.
MFS-5227
NFS Server hangs or is very slow and causes replication and resync failures.
MFS-5700
CLDB master failover time is very high.
MFS-6539
NFS Version 3 Server on edge nodes does not have the ulimit setting, as warden is not available on these nodes.
MFS-5724
The MFS configuration parameter mfs.max.restore.count is not honored, causing mirror resync operations to be delayed due to the lack of sufficient restore slots.
MFS-6547
Massive delay in mounting the configured mount points after starting the NFS service.
MFS-6666
NFS server should throttle RPCs to avoid overwhelming CLDB.
MFS-6785
The response from the mrconfig info containers rw command is slow on a cluster with a large number of volumes.
MFS-6869
FUSE client should limit the number of RPCs to prevent overwhelming CLDB.
MFS-7181
FileClient defaults to 8KB reads instead of 512KB.
MFS-8475
The createsystemvolumes.sh script took hours to complete when adding a new node to a cluster with a large number of volumes.
MFS-11111
Queuing and CLI RPC processing are slow in CLDB.

Security

COMSECURE-331
Security vulnerability in the JNDI-bindable DataSources library.
COMSECURE-334
Security vulnerability in the DOM4j XML framework.
COMSECURE-335
Security vulnerability in the Jasper library.
CORE-290
The /opt/mapr directory contains files and directories with insecure permissions.
CORE-293
After upgrading system security packages, mapr-zookeeper and mapr-warden are not properly started with systemd. The ps command reports them as started, while systemd reports errors when trying to start these services.
CORE-384
Remote Code Execution vulnerability in the ZooKeeper Java JMX server. Added a parameter JMXDISABLE to enable or disable loading ZooKeeper JMX parameters.
MAPRDB-2251
Standardize JMX handling for Java processes to prevent vulnerabilities.
MAPRDB-2255
Stream ACE u:mapr | has the potential to lock out the administration of the stream.
MAPRHADOOP-63
Security vulnerability in jackson-databind.
CORE-562, MAPRHADOOP-123
Security vulnerabilities in MapR 6.x JAR files.
MAPRHADOOP-58, MAPRHADOOP-64, MAPRHADOOP-136, MAPRHADOOP-137
Multiple security vulnerabilities in Hadoop.
MAPRYARN-241
Remote Code Execution vulnerability in the YARN Java JMX Server.
MFS-2336
File Client impersonation does not honor the permissions of the actual user.
MFS-2493
The /tmp/cldbinfo/unreachableCldbs is created with insecure permissions.
MFS-2551
Local Privilege Escalation vulnerability in the maprexecute command.
MFS-2645
Fixed a buffer overflow in NFS Server version 3.
MFS-2661
Snapshot creation fails due to a permission error.
MFS-2685
The maprcli commands use the wrong ticket to communicate with ZooKeeper in secure, cross cluster environments.
MFS-2700
FUSE kernel sends the wrong user credentials to the MapR FUSE Process.
MFS-2708
Disk failure related log files have insecure permissions.
MFS-3310
Need an alert to warn about expiring SSL certificates. Added the Security Certificate Expiry Alarm.
MFS-5229
Remote Code Execution vulnerability in the MAST Gateway JMX Server.
MFS-5234
Remote Code Execution vulnerability in the CLDB JMX Server.
MFS-5235
Remote Code Execution vulnerability in the Gateway JMX Server.
MFS-5236
Remote Code Execution vulnerability in the Warden JMX Server. Added a new parameter warden.enable.jmxremote that must be explicitly set to true to enable the Warden JMX Server.

MapR Streams

MS-762
Customer Streams face cursor commit failures.
MS-557
Commits fail for MapR Streams on volumes that were previously mirror volumes but are now standard volumes.

Upgrade

MS-925
After upgrade to EEP 6.2 (Spark 2.4.0), Kafka/MapR Streams cannot be consumed.
MFS-2079
After upgrading to MapR version 6.1.0, the volume Name Container hangs when assigning volume names for volumes created with MapR versions earlier than 4.0.1.
MFS-2469
After upgrading MapR from version 5.2.2 to 6.1.0, slave CLDB nodes are stuck during initialization.
MFS-2553
Ecosystem jobs using ZooKeeper fail after upgrading to the MapR 6.1 EBF patch.
MFS-2560
CLDB on a MapR 5.2.2 cluster gets overwhelmed with RPC calls when mirroring from a MapR 6.1.x cluster.
MFS-2561
The VOLUME_ALARM_DATA_UNDER_REPLICATED alarm is generated frequently after upgrading MapR from version 3.0.1 to version 6.1.0.
MFS-2675
Certify MapR version 6.1.0 on RHEL 7.7.
MON-3922
When upgrading to MapR 6.x, ensure that volumes created prior to MapR version 6.0, which lack volume ACEs, are handled gracefully after upgrade.
MON-4862
After upgrading from MapR 5.2.1 to MapR 6.1, API server fails to start with an M5 license without tables support installed.
MON-4892
Snapshot tab in MCS indicates that a license upgrade is needed after upgrading from MapR 5.2.1 to MapR 6.1 with an M5 license installed.

YARN

MAPRMR-4
With centralized logging, YARN does not populate stderr and stdout logs.
MAPRMR-19
Applications fail with the Jobstatus not available exception. The ApplicationMaster has already finished processing each job but the Job History Server has not yet updated job statuses. This causes the failures. Two options have been added to YARN to retry fetching job statuses.
  • yarn.app.mapreduce.job.update-status-max-retries - The number of times to retry.
  • yarn.app.mapreduce.job.update-status-retry-interval - The interval to wait before each retry attempt.
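The retry behavior that these two options describe can be pictured with a small sketch. The following Python is illustrative only and is not the YARN implementation: fetch_status, max_retries, and retry_interval_s are hypothetical names standing in for the job status lookup and the two options above, and a return value of None is assumed to mean that the status is not yet available.

```python
import time


def fetch_job_status_with_retries(fetch_status, max_retries, retry_interval_s,
                                  sleep=time.sleep):
    """Call fetch_status(), retrying up to max_retries times while it returns None."""
    status = fetch_status()
    attempts = 0
    while status is None and attempts < max_retries:
        # Wait for the configured interval before each retry attempt.
        sleep(retry_interval_s)
        status = fetch_status()
        attempts += 1
    if status is None:
        raise RuntimeError("job status not available after %d retries" % max_retries)
    return status
```

With a nonzero retry count, a status that shows up a moment after the ApplicationMaster finishes is picked up on a later attempt instead of failing the application immediately.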
MAPRYARN-127
Resource Manager fails with a Concurrent Modification Exception.
MAPRYARN-155
Containers fail to launch if property names contain a dash (-) in the launch_container.sh script.
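The notes do not say why a dash breaks the launch. One likely factor, stated here as an assumption, is that POSIX shell variable names may contain only letters, digits, and underscores and must not start with a digit, so a name containing a dash cannot be exported by a shell script. The Python sketch below checks names against that rule; is_valid_shell_name is a hypothetical helper, not part of YARN.

```python
import re

# POSIX shell variable names: letters, digits, underscores; no leading digit.
_SHELL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_shell_name(name):
    """Return True if name could be used as a shell variable name."""
    return _SHELL_NAME.fullmatch(name) is not None


def invalid_names(names):
    """Return the names that a shell script could not export as variables."""
    return [n for n in names if not is_valid_shell_name(n)]
```

For example, invalid_names(["JAVA_HOME", "my-prop"]) returns ["my-prop"].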
MAPRYARN-161
Deletion of History Server logs is stopped when an invalid application directory is found within the log aggregation directory.
MAPRYARN-171
YARN preemption does not occur with Fair and DRF scheduling policies.
MAPRYARN-191
YARN API requests via CLI do not return any result when the cluster has Label-Based-Scheduling enabled.
MAPRYARN-192
MapReduce jobs fail if their labels contain the logical operator (&&).
MAPRYARN-193
Resource Manager crashes when sorting Collections using the FairShare comparator.
MAPRYARN-195
Resource Manager exits with a FATAL error.
MAPRYARN-203
Resource preemption fails and returns a Null Pointer Exception.
MAPRYARN-210
Use per-node local volumes for YARN log aggregation instead of a single volume. Added the Local Log Aggregation Feature.
MAPRYARN-221
Containers hang in LOCALIZING state. Added two options:
  • yarn.nodemanager.timeout-localizing-container - The maximum time to wait to localize resources for containers.
  • yarn.nodemanager.check-interval-localizing-container.ms - The frequency at which the ApplicationMaster checks the running time of the localizing container.
MAPRYARN-223
Maximum idle time of the Jetty connection should be configurable.
MAPRYARN-244
Resource Manager hangs when trying to shut down after CLDB failover.
MAPRYARN-246
Resource Manager hangs when there is a space in the name of the queue in the fair-scheduler.xml file.
MAPRYARN-249
Resources needed to preempt should not have negative vcore values.
MAPRYARN-250
Job History Server (JHS) hangs under heavy load when scanning MFS for job history files. Added a parameter, mapreduce.jobhistory.intermediate-done-scan-timeout, to set the timeout in milliseconds for rescanning the done_intermediate user directory.
MAPRYARN-258
Publish system metrics in batches so as to avoid overloading the Application Timeline Server (ATS).
MAPRYARN-261
Administrator users who are not part of the mapr group are not able to view the logs of the running jobs of another user.
MAPRYARN-276
Resource Manager crashes with a Null Pointer Exception.
MAPRYARN-284
YARN kills the container, but the task process is not killed.
MAPRYARN-287
When the ClientRMService processes an application kill request, the application diagnostics should report the user and the host that issued the kill request.
MAPRYARN-291
On RHEL 8.2, Warden must run Node Manager with umask 022 on MapR Core 6.1.0.