Issues resolved in 5.2.2 release
The following customer-reported issues, observed in version 5.2.0 or 5.2.1, are resolved in version 5.2.2.
Product | Number | Description | Resolution |
CLDB | 23685 | When an SP went offline, the change was not immediately reflected in the "Number of Storage Pools Offline" metric because it took about 60 minutes for CLDB to declare an SP offline. | With this fix, the "Number of Storage Pools Offline" metric is updated within 2 minutes of a storage pool ceasing to heartbeat. The newly introduced "Number of Storage Pools To Rereplicate" metric is updated only after an hour, to reflect the CLDB view of offline SPs. |
CLDB | 24885 | In rare cases, querying for information about the nodes in the cluster while nodes were being removed from the cluster caused the CLDB to shut down or fail over. | Concurrent processing of these operations is now handled correctly without causing the CLDB to fail. |
CLDB | 25800 | Restoring the mapr.cldb.internal volume from a dump failed. | With this fix, CLDB restore from a volume dump succeeds. |
CLDB | 26335 | When a volume was deleted, the associated snapshots were also deleted, but the snapcids in CLDB were not deleted immediately. | With this fix, CLDB will now purge all snapshots when a volume is deleted. |
CLDB | 26705 | If the dump replicationmanagerqueueinfo command was run while CLDB was checking for under-replicated containers, CLDB would sometimes crash (resulting in a failover) because of a race condition. | With this fix, CLDB will no longer crash when the dump replicationmanagerqueueinfo command is run. |
CLDB | 27298 | CLDB crashed with an NPE when the fileserver reported an empty feature list to CLDB. | With this fix, CLDB will no longer crash when the fileserver reports an empty feature list. |
CLDB | 27475 | Sometimes CLDB marked a container invalid immediately after asking it to become the master. | With this fix, CLDB will no longer mark a container invalid after asking it to become the master. |
CLDB | 28002 | Sometimes, CLDB threw an exception and failed over during schedule update because of a race condition. | With this fix, CLDB will no longer fail over during schedule update operation. |
Configuration/FS | 26558 | NodeManager fails to start correctly with the error "true: integer expression expected" due to an invalid string-comparison operator in | |
DB | 26714 | MapR-DB returned expired rows when retrieving from the value cache. | With this fix, expired rows are no longer returned. |
DB | 27018 | Jobs performing put operations and locking keys could get stuck. | With this fix, put operations complete correctly and no longer get stuck. |
DB | 27300 | During a bulk load into a MapR-DB table, when at least one snapshot existed of the volume where the table resides, the bulk load could hang indefinitely. | With this fix, the bulk load into the MapR-DB table will complete. The load may be delayed up to two minutes, but it will recover automatically. |
DB | 28209 | A GET call on a binary table copy returned rows that had been deleted from the copy of the table. | GETs against binary table copies work correctly and do not return rows that were previously deleted. |
DB | 28511 | The MFS log showed continuous high MFS memory usage alarms, and MFS restarted. | With this fix, the system runs normally and high MFS memory usage alarms are not raised. |
DB-JSON | 27705 | Multi-threaded applications doing high-frequency inserts through the OJAI API to JSON tables could periodically hang, dump core in mapr::fs::DbClntPutBuffer::shouldFlush, and need to be killed. | With 5.2.2, inserts under this type of workload no longer cause client applications to crash or hang. |
DB-JSON | 27878 | Performing an _id projection with a non-existing id could result in MapR-DB ceasing to work. | Non-existing _id projection will no longer cause MapR-DB failures. |
DB-JSON | 28378 | For tables created with a non-default column family on an array field path, find and findById with a condition on an array element did not return the correct result. | With this fix, find and findById work correctly when a non-default column family is on an array field path. |
Fileclient | 26781 | The hadoop mfs -rmr command caused a crash if parallel delete/rmdir commands were being executed while the rmr command was running. | With this fix, the command will not cause a crash. |
FileClient | 27757 | A YARN distributed shell application failed when an RM failover happened. | RM failover no longer causes YARN distributed shell applications to fail. |
FileClient | 27940 | User credentials were not set correctly in the read and adviseFile APIs from the Java client, resulting in Drill queries failing intermittently with permission-denied errors in the logs. | With this fix, user credentials are set correctly in the read and adviseFile APIs from the Java client. |
FileClient | 28528 | The FsAction.READ_EXECUTE operation on a directory was throwing an AccessControlException. | The FsAction.READ_EXECUTE operation on a directory will no longer result in an AccessControlException. |
Fileserver | 24898 | When bringing an SP online, MFS crashed because of a memory double-free issue. | With this fix, MFS will not crash when bringing an SP online. |
FileServer | 25538 | In case of simultaneous failures on multiple nodes, in some very rare circumstances, concurrent setattr operations resulted in inconsistent attributes being propagated. | With this fix, the race condition is properly handled. |
FileServer | 27223 | CLDB crashed every 4-6 hours with a core dump. | With this fix, CLDB no longer crashes with a core dump. |
FileServer | 27440 | The chown command was failing on immutable directories if groups were not specified with the command, even when the command was run by the owner of the directory. | With this fix, the chown command will complete successfully when run by the owner, with or without the group information. |
FS-Fuse | 24595 | The FUSE client stores libMapRClient in /tmp. | The location for MapR libraries is now configurable in fuse.conf file. See "Configuring the MapR FUSE-Based POSIX Client". |
FS-Fuse | 26570 | The gethostbyname() call was returning h_addr_list as NULL causing MFS to crash. | In this patch, the obsolete gethostbyname API has been replaced with the getaddrinfo API. |
FS-Fuse | 26757 | If there was data mutation (but not the length) resulting in the file size being the same, but mtime was different, the FUSE kernel was consuming data from page cache instead of initiating a new READ operation. | With this fix, on kernels greater than version 3.6, the metadata change will invalidate the kernel page cache and trigger the POSIX client to initiate a new READ operation. |
FS-Fuse | 26961 | The FUSE-based POSIX client crashed when a file was renamed with the sticky bit set on the parent directory. | With this fix, the FUSE-based POSIX client allows file renames when the sticky bit is set on the parent directory. |
FS-Fuse | 26991 | Read requests beyond the file size from the POSIX client were being sent to an incorrect file descriptor. | With this fix, read requests beyond the file size will not go to an incorrect file descriptor. |
FS:Mirror2RW | 27605 | The mirror stop request failed to stop the mirror when the source cluster was down. | With this fix, the request to stop a mirror works correctly. |
FS:resync | 26092 and 26096 | The snapshots created for resync (or mirroring) operations were not getting deleted after resync. | With this fix, snapshots created for a resync operation will be deleted after the resync operation. |
Installer:Stanzas | IN-315 | Ubuntu upgrade using stanzas fails from 5.1 to 5.2.0 or 5.2.1. | Ubuntu upgrades from 5.1 to 5.2.x work correctly. |
MapR Streams | 26966 | Unable to delete a connector when using Kafka Connect for MapR Streams in distributed mode. The Kafka Connect API indicated that the connector had been deleted, but the connector was still listed as an active connector. | Connectors can now be deleted: DELETE /connectors/(string:name)/ executes correctly. |
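As a sketch of the now-working call, the delete can be issued with curl against the Kafka Connect REST interface; the connector name my-connector and the default Connect REST port 8083 are assumptions for illustration:

```shell
# Hypothetical connector name; 8083 is the default Kafka Connect REST port.
curl -X DELETE http://localhost:8083/connectors/my-connector

# Confirm removal: the deleted connector should no longer appear in the active list.
curl http://localhost:8083/connectors
```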
MapR Streams | 25747 | Distributed mode was not available in version 2.0.1-1611 of Kafka Connect for MapR Streams. | Distributed mode is available for MEP 2.0.1 and MEP 3.0.0. |
Multi-MFS | 22964 | Multi-MFS allowed the disk balancer to place multiple replicas of one container on the same node, resulting in the loss of multiple copies during a node failure. | Multiple replicas of one container are not allowed on the same node, providing better fault tolerance during node failures. |
RPC | 26718 | On a multi-NIC cluster, sometimes the resync operation failed with intermittent errors due to multiple RPC bindings from the same host. | With this fix, resync will no longer fail on a multi-NIC cluster. |
Yarn | 26839 | Network issues could cause the ResourceManager (RM) web UI to hang until it was restarted. | With this fix, network issues no longer cause an outage of the RM web UI. |
Yarn | 26898 | The yarn queue -status <queuename> command does not show information about the queue labels because certain Apache properties for retrieving the information are not supported. | With this fix, the MapR "Label" and "Label Policy" properties are provided as outputs to the yarn queue -status <queuename> command in lieu of the unsupported Apache properties. |
Issues resolved in 5.2.1 release
The following customer-reported issues, observed in version 5.2.0, are resolved in version 5.2.1.
Product | Number | Description | Resolution |
AWS SDK jar | 24566 | An older version of the aws-sdk jar was bundled with MapR. | With this fix, MapR upgraded the aws-sdk jar from version 1.7.4 to 1.7.15. |
Build | 24992 | Installing a MapR patch caused jar files to be removed from under the drill/drill-1.4.0/jars/ directory. | Jar files are no longer incorrectly removed. |
CLDB | 14105 | When nodes attempted to register with duplicate IDs, CLDB did not register the nodes or log meaningful error messages. | With this fix, when nodes attempt to register with duplicate IDs, CLDB will log appropriate error messages. |
CLDB | 24413 | CLDB was crashing when volume replication was greater than 3. | With this fix, CLDB will not crash when the volume replication factor is greater than 3. |
CLDB | 24647 | On a node with multiple host IDs, CLDB crashed and failed over to a new CLDB when a stale host ID was removed. | With this fix, CLDB will not crash and fail over when a stale host ID is removed. |
CLDB | 24651 | CLDB threw an exception and failed over when the snapshots list was iterated over while snapshots were being created. | With this fix, CLDB will no longer fail over when snapshots list is iterated over while new snapshots are being created. |
CLDB | 24662 | Intermittently, CLDB was shutting down because of race between initialization and use of license. | With this fix, the license will be completely initialized before being used. |
CLDB | 24770 | Under high load, sometimes CLDB would be caught up in a deadlock when updating volume info and volume snapshot count simultaneously. | With this fix, there will no longer be a deadlock when updating volume info and volume snapshot count. |
CLDB | 25708 | After a rolling upgrade of certain nodes to 4.1 or later, operations from these nodes to nodes running 4.0.2 and prior versions of MapR were getting stalled because MapR 4.0.2 and older versions did not process the new RPC introduced with MapR 4.1. | With this fix, operations on nodes running MapR 4.0.2 will not be stalled. |
CLDB | 26214 | During rolling upgrade, if slave CLDB is upgraded before master CLDB, the slave CLDB may crash when accessing new KvStore tables. | With this fix, slave CLDB will not crash on reading new tables, even if they are non-existent. |
CLDB | 26335 | When snapshots were deleted as part of volume remove, CLDB tables, which store snapshot info, were not purged at a fast rate. As a result of this, the cid 1 container grew in size gradually. | With this fix, snapshot tables will now be purged properly when volumes are removed and the size of cid 1 container will not grow. |
DB | 24745 | An assertion failure occurs in MapR-FS due to zero (0) length field names in OJAI documents. | With this fix, the assert failure will no longer occur. |
DB | 24807 | The run time of MapR tasks with counters was slow for file output commits. Because MapR-DB uses a time-based trigger for bucket flushing, any buckets unused for 5 minutes are flushed. These unused buckets are flushed every 2 seconds in batches of 12. If there are a lot of buckets, regardless of size, that are unused for that period of time, the load caused by the flushing impacts performance. | The time-based bucket flush was disabled, preventing slow performance. |
DB | 25241 | In MapR-DB, when using the HBase Java FuzzyRowFilter filter, the wrong result was returned. This occurred because the mask preprocessing converts 1 to 2 and 0 to -1. | With this fix, the correct results are returned. |
DB | 25333 | An exception occurs when a JSON document is re-inserted into the same row after a table’s time-to-live has expired. | With this fix, the exception will no longer occur and re-insertion will complete successfully. |
DB | 25401 | The cumulative cost becomes a negative value when a MapR-DB table has more than 2147483647 rows. | With this fix, the return type for the Java getNumRows() API is changed to long and the correct value is preserved. |
DB-JSON | 26338 | When using an "In" condition for QueryCondition API on only the _id field, the last record is omitted. | With this fix, all records are returned. |
DB-Marlin | 24408 | When running multiple producers as separate threads within a process, with a very small value for buffer.memory (say 1KB), some producers can stall. This is due to a lack of buffer memory. | With this fix, the default value for minimum buffer memory is increased to 10kB. |
FileClient | 24053 | During client initialization, the client crashed if there was an error during initialization. | With this fix, the client will not crash if there is an error during initialization. |
FileClient | 25471 | The readdir operation was returning incorrect entries when the child entries were volumes, because of an issue with volume attributes on the client side. | With this fix, volume attributes will be set correctly for lookup and readdir operations. |
FileClient | 28528 | The FsAction.READ_EXECUTE operation on a directory was throwing an AccessControlException. | The FsAction.READ_EXECUTE operation on a directory will no longer result in an AccessControlException. |
MapR-FS | 12856 | When the hadoop fs -rmr command is run, it reads the entire directory contents into memory before starting to delete anything, resulting in an Out Of Memory error. | This fix includes a new hadoop mfs -rmr path command that: |
MapR-FS | 20644 | Sometimes, when mirroring a large number of containers, the volume mirror thread was crashing, resulting in a CLDB failover. | With this fix, the mirroring process will be resilient to a large number of containers. |
MapR-FS | 22044 | The CLDB logs were growing to a large size with stdout and stderr messages when a user's ticket expired. | With this fix, the CLDB logs will not grow to a large size with stdout and stderr messages when a user's ticket expires because the log level of messages related to ticket expiration has now been changed to Debug. |
MapR-FS | 23652 | The POSIX loopbacknfs client did not automatically refresh renewed service tickets. | With this fix, the POSIX loopbacknfs client will: |
MapR-FS | 23975 | In version 5.1, MFS was failing to start on some Docker containers because it was trying to determine the number of NUMA nodes from /sys/devices/system/node. | With this fix, MFS will work on Docker containers. |
MapR-FS | 24022 | Mirroring of a volume on a container which does not have a master container caused the mirror thread to hang. | With this fix, mirroring will not hang when the container associated with the volume has no master. |
MapR-FS | 24139 | If limit spread was enabled and the nodes were more than 85% full, CLDB did not allocate containers for IOs on non-local volumes. | With this fix, CLDB will now allocate new containers to ensure that the IO does not fail. |
MapR-FS | 24155 | Disk setup was timing out if running trim on flash drives took some time. | With this fix, disk setup will complete successfully and the warning message (“Starting Trim of SSD drives, it may take a long time to complete”) is entered in the log file. |
MapR-FS | 24159 | The mtime was updated whenever a hard link was created. Also, when a hard link was created from the FUSE mount point, although the ctime was updated, the update timestamp only showed the minutes and seconds and not the nanoseconds. | With this fix, mtime will not change on the hard link and when a hard link is created from the FUSE mount point, the timestamp for ctime will include nanoseconds. |
MapR-FS | 24249 | When running map/reduce jobs with older versions of the MapR classes, a system hang or other issues occurred because the older classes linked to the native library installed on cluster nodes that were updated to a newer MapR version | With this fix, the new fs.mapr.bailout.on.library.mismatch parameter detects mismatched libraries, fails the map/reduce job, and logs an error message. The parameter is enabled by default. You can disable the parameter on all the TaskTracker nodes and resubmit the job for the task to continue to run. To disable the parameter, you must set it to false in the core-site.xml file. |
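The parameter above is set in the core-site.xml file; an illustrative fragment (the value shown disables the mismatch check, as described, since it is enabled by default) might look like this:

```xml
<!-- core-site.xml: disable the library-mismatch bailout so tasks continue
     to run with older MapR classes (the check is enabled by default). -->
<property>
  <name>fs.mapr.bailout.on.library.mismatch</name>
  <value>false</value>
</property>
```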
MapR-FS | 24352 | Mirror synchronization is not optimized. | In this patch, mirror synchronization has been optimized for changes in a small percentage of the inodes. During a mirror resync operation, the destination will send the recent version number from the last mirror resync operation. While scanning inodes to identify the inodes that have changed since the last resync operation, MFS will now compare the version number sent by the destination with the allocation group, which keeps track of all the inodes. If the allocation group version is: |
MapR-FS | 24585 | Excessive logging in CLDB audit caused the cldbaudit.log file to grow to large sizes. | With this fix, to reduce the size of the cldbaudit.log file, queries to CLDB for the ZK string will no longer be logged for auditing. |
MapR-FS | 24618 | Remote mirror volumes could not be created on secure clusters using MCS even when the appropriate tickets were present. | With this fix, remote mirror volumes can now be created on secure clusters using MCS. |
MapR-FS | 24630 | Under some conditions, using the 'ls' command with --full-time option produced incorrect results that showed as a negative number. | With this fix, the correct timestamp is supplied. |
MapR-FS | 24660 | MFS crashed because the maximum number of slots for backgrounded delete operations was not adequate. The incoming client operations reserving these slots were hanging and causing MFS to crash. | With this fix, MFS will not crash as the number of slots for background operations has been increased. |
MapR-FS | 24712 | During container resynchronization, the same scratch space was being reused by internal parallel operations resulting in corruption. | With this fix, internal parallel operations will use separate scratch spaces. |
MapR-FS | 24846 | If the topology of a node changed, after a CLDB failover, the list of nodes under a topology could not be determined as the new non-leaf topologies were not being updated. | With this fix, the inner nodes of topology graph will be updated correctly and the list of nodes under an inner (non-leaf) topology will be determined correctly. |
MapR-FS | 24915 | Running the expandaudit utility on volumes can result in very large (more than 1GB) audit log files due to incorrect GETATTR (get attributes) cache handling. | With this fix, the expandaudit utility has been updated so that it will not perform subsequent GETATTR calls if the original call to the same file identifier failed. |
MapR-FS | 24965 | On large clusters, sometimes the bind failed with the message indicating unavailability of port when running MR jobs, specifically reducer tasks. | With this fix, the new fs.mapr.bind.retries configuration parameter in core-site.xml file, if set to true, will retry to bind during client initialization for 5 minutes before failing. By default, the fs.mapr.bind.retries configuration parameter is set to false. |
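A minimal core-site.xml fragment enabling the retry behavior described above might look like the following (it is disabled by default):

```xml
<!-- core-site.xml: retry the client bind for up to 5 minutes before
     failing, instead of failing immediately when no port is available. -->
<property>
  <name>fs.mapr.bind.retries</name>
  <value>true</value>
</property>
```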
MapR-FS | 24971 | When the mirroring operation started after a CLDB failover, sometimes it was sending requests to a slave CLDB where data was stale, resulting in the mirroring operation hanging. If the CLDB failover happened again during this time, the new CLDB master was discarding data resynchronized by the old mirroring operation, but marking the mirroring operation as successful. This resulted in a data mismatch between source and destination. | With this fix, mirroring requests will be sent to the master CLDB node only. |
MapR-FS | 25041 | Whenever a newly added node was made the master of the name container, MFS crashed while deleting files in the background. | With this fix, MFS will not crash when a newly added node is made the master of the name container. |
MapR-FS | 25184 | If limit spread was enabled and the nodes were more than 85% full, CLDB did not allocate containers for IOs on local volumes. | With this fix, CLDB will now allocate new containers to ensure that the IO does not fail. |
MapR-FS | 25290 | In secure environment, while writes were in progress, num_groups got corrupted and caused the FUSE process to crash. | With this fix, the FUSE process will not crash while writes are in progress. |
MapR-FS | 25308 | MFS crashed when mirroring a mirror volume that was promoted to a read/write volume and edited, and then reverted to a mirror volume. | With this fix, MFS will not crash when resynchronizing a mirror volume that was promoted to a read/write volume and edited, and then reverted to a mirror volume. |
MapR-FS | 25337 | When too many files were open, writes through FUSE were failing with EAGAIN messages. | With this fix: |
MapR-FS | 25426 | The server was rejecting encrypted writes as the expected length was not matching the RPC data length and this caused the server to crash. | With this fix, the server will no longer crash as the expected length will always match the RPC data length for encrypted writes. |
MapR-FS | 25590 | Sometimes the SP to Fileserver map became inconsistent across different kvstore tables due to a race condition, which caused the container lookup from slave CLDB to fail. | With this fix, kvstore tables will be made consistent if they are inconsistent. |
MapR-FS | 25775 | While uncaching was in progress, MFS writes were taking a long time. | With this fix, because of better uncaching algorithm (which utilizes CPU efficiently), there will be an improvement in the overall speed of MFS (including writes) while uncaching is in progress. |
MapR-FS | 25829 | The libMapRClient library required a JVM to be installed on the client machine, which is not required by C and C++ programs. | With this fix, the libMapRClient library will no longer need a JVM to be installed on the client machine for C and C++ programs. |
MapR-FS | 25848 | After a rolling upgrade to 5.2 of namespace container nodes, ACE information was getting set on certain operations incorrectly causing operations to fail. | With this patch, ACE information will be discarded after a rolling upgrade. |
MapR-FS | 25856 | In the event of a CLDB failover, a table on the unreachable node is deleted and re-created by the CLDB master. Sometimes, multiple container lookup threads from slave CLDBs trying to open or access that table during the failover caused a CLDB exception. | With this fix, multiple threads can safely access the unreachable-node table. |
MapR-FS | 26025 | A corrupt encrypted write results in a data decryption failure. As a result of the decryption failure, MFS returns an EINVAL. The master node for the write crashes when it receives an EINVAL from the replicas. In this case, the decryption failure should have resulted in an EBADMSG instead of an EINVAL. | With this fix, an EBADMSG is returned in case of a decryption failure of data. Upon encountering an EBADMSG, MFS sends an ErrServerRetry to the client. The Client revalidates the CRC, tries decrypting the encrypted buffers, and then retries the write operation, making the client more resilient to memory and network corruptions. |
MapR-FS | 26054 | Sometimes, the container was getting stuck in resync state because the resync operation was hanging. | With this fix, the resync operation will no longer hang. |
MapR-FS | 26062 | After installing patch 41809 on v5.2, the FUSE-based POSIX client failed to start. | With this fix, the FUSE-based POSIX client will now start when the command to start the service is run. |
MapR-FS | 26093 | Sometimes, MFS crashed after promoting destination mirror volume to read-write volume. | With this fix, MFS will not crash after promoting destination mirror volume to read-write volume. |
MapR-FS | 26094 | Sometimes MFS crashed because there were many SP cleaner threads between low and high threshold. | With this fix, MFS will not crash because the cleaner is disabled if it is below the high threshold. |
MapR-FS | 26288 | During rolling upgrade, if slave CLDB is upgraded before master CLDB, the slave CLDB may crash when accessing new KvStore tables. | With this fix, slave CLDB will not crash on reading new tables, even if they are non-existent. |
MapR-FS | 26336 | MFS was crashing during the truncate operation because of the following: | With this fix, MFS will no longer crash during the truncate operation. |
MapR-FS | 26351 | During disksetup, even if the mfs.ssd.trim.enabled configuration parameter was set to false, the device was getting trim calls. | With this fix, MFS will not attempt to trim if the configuration parameter is set to false. |
Hive, Tez | 20965 | When working with multiple clusters, synchronization issues were causing MapRFileSystem to return a NullPointerException. | With this fix, MapRFileSystem has been improved to better support working with multiple clusters, and it contains fixes for the synchronization issues. |
Hoststats | 11349 | Hoststats did not work on POSIX edge node. | With this fix, hoststats can work on POSIX client edge nodes as well to display the statistics on MCS. |
JobTracker | 24700 | The Job Tracker user interface failed with a NullPointerException when a user submitted a Hive job with a null value in a method. | With this fix, the Job Tracker interface does not fail when a Hive job is run with a null value in a method. |
MapReduce | 24505 | A job failed when the JvmManager went into an inconsistent state. | With this fix, jobs no longer fail as a result of the JvmManager entering an inconsistent state. |
MapReduce | 25599 | Race condition in jobtracker-start script could cause warden to start multiple jobtrackers. | With this fix the start script loops and waits for a successful start of jobtracker before exiting, thus closing the window of the race condition. |
MapReduce | 25695 | It was not possible to restrict the web access port range, so the YARN Mapreduce application master could open a web port anywhere in the ephemeral port range of the node where it was running. | With this change, the YARN Mapreduce application master will only open its web port within the range specified by the mapred parameter: yarn.app.mapreduce.am.job.client.port-range |
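An illustrative mapred-site.xml fragment restricting the AM web port as described above; the range value shown is an assumed example, not a recommendation:

```xml
<!-- mapred-site.xml: confine the MapReduce AM web port to a fixed range
     instead of the node's full ephemeral port range. Example range only. -->
<property>
  <name>yarn.app.mapreduce.am.job.client.port-range</name>
  <value>41000-41100</value>
</property>
```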
MCS | 23257 | In MCS, new NFS VIPs were visible in the NFS HA > VIP Assignments tab, but not in the NFS HA > NFS Setup tab. | With this fix, the NFS VIPs will be available in both the NFS HA > VIP Assignments tab and the NFS HA > NFS Setup tab. |
NFS | 24315 | If you used the NFS client and ran the dd command with iflag=direct, an incorrect amount of data may have been read. | With this fix, the dd command will read exactly the expected amount of data when iflag=direct is set. |
NFS | 24446 | Due to incorrect attribute cache handling in NFS server, the getattr call sometimes returned stale mtime because the attribute cache was not getting updated properly at the time of setattr. | With this fix, the attributes are now properly cached. |
NFS | 24658 | CLDB returned “no master” and an empty list for container lookup, which NFS server could not handle, because when multiple servers are down, there can be no master for a container. | With this fix, NFS server will handle empty node list for container lookup. |
NFS:Loopback | 23652 | The POSIX loopbacknfs client did not automatically refresh renewed service tickets. | With this fix, the POSIX loopbacknfs client will: |
Pkg/deployment | 24309 | Symlinks that existed in a MapR 5.1 installation were not re-created during an upgrade to MapR 5.2. This problem resulted when the mapr-hadoop-core package was updated on a cluster with the incorrect version of the mapr-core-internal package. This problem can occur during an upgrade from any older MapR version to a newer MapR version. | With this fix, the mapr-hadoop-core package has a new dependency for a specific version of mapr-core-internal. If the correct version of mapr-core-internal is not present, an error message is generated, and the mapr-hadoop-core package cannot be installed. Note that this fix is effective for MapR 5.2.1 or later installations. |
RPC | 24610 | In a secure cluster, when there are intermittent connection drops (between MFS-MFS or client-MFS), the client and/or server could crash during authentication. | With this fix, the client and/or server will not crash during authentication if there are intermittent connection drops. |
Streams | 23563 | High CPU utilization occurs when the default buffering time for MapR Streams is set to 0. | With this fix, CPU utilization and latency is reduced by having TimeBasedFlusher active only when there is work to do. |
UI:CLI | 24280 | Running the maprcli dashboard info command occasionally threw a TimeoutException error. | With this fix, the internal command timeout was increased to provide more allowance for command processing. |
Warden | 24119 | Warden adjusts the FileServer (MFS) and Node Manager (NM) memory incorrectly when NM and TaskTracker (TT) are on the same node. This can result in too much memory being allocated to MFS. | With this fix, Warden does not adjust MFS memory when NM and TT are on the same node. Memory adjustment is implemented only when TT and MapR-FS (but no NM) are on the same node. |
Warden | 24562 | CLDB (container location database) performance suffered because Warden gave the CLDB service a lower CPU priority. | With this fix, Warden uses a new algorithm to set the correct CPU priority for the CLDB service. |
Yarn | 24477 | Jobs failed if a local volume was not available and directories for mapreduce could not be initialized. | With this fix, jobs no longer fail, and local volume recovery is enhanced. |
YARN | 25387 | A null pointer exception (NPE) was generated when the capacity scheduler was enabled. Adding a node that does not contain a label can result in an NPE. | With this fix, errors are no longer generated when the capacity scheduler is enabled. |
YARN | 25412 | MapReduce jobs failed if the Application Master (AM) was restarted for any reason -- for example, because of a node failure -- during a job commit, leaving a control file that prevented subsequent commit attempts. | With this fix, MAPREDUCE-5485 is backported to MapR 5.1. MAPREDUCE-5485 adds a clean-up of commit-stage files. If the first commit attempt fails, temporary files are removed, allowing the next repeatable commit attempt to write them again without throwing an exception. To benefit from this fix, the user must set the mapreduce.fileoutputcommitter.algorithm.version parameter to "2" in the mapred-site.xml file. |
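A minimal mapred-site.xml fragment enabling the v2 committer algorithm required to benefit from the fix above:

```xml
<!-- mapred-site.xml: use the v2 file output committer algorithm
     (MAPREDUCE-5485) so failed commit attempts can be retried cleanly. -->
<property>
  <name>mapreduce.fileoutputcommitter.algorithm.version</name>
  <value>2</value>
</property>
```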
YARN | 25654 | At startup, while processing application-recovery data, the ResourceManager (RM) failed with a null pointer exception. | With this fix, the ResourceManager starts correctly when processing application-recovery data. |
Yarn/security | 25448 | A user's temporary log files for running jobs were not readable by another user from the same group in the RM UI. An exception with the message, "Exception reading log file. User 'mapr' doesn't own requested log file" was generated. | With this fix, users in the same primary group can access user logs of other users in the group. |
Yarn/Warden | 25695 | It was not possible to restrict the web access port range, so the YARN Mapreduce application master could open a web port anywhere in the ephemeral port range of the node where it was running. | With this change, the YARN Mapreduce application master will only open its web port within the range specified by the mapred parameter: yarn.app.mapreduce.am.job.client.port-range |