Enabling YARN Log Aggregation
To enable YARN log aggregation, add or edit the following properties in yarn-site.xml:
- Set the value of yarn.log-aggregation-enable to true.
- Configure the yarn.log.server.url property to contain the URL of the YARN HistoryServer, which should look like the following:
  secure cluster: https://<historyserver-host>:19890/jobhistory/logs
  non-secure cluster: http://<historyserver-host>:19888/jobhistory/logs
- Optional: Set the yarn.nodemanager.remote-app-log-dir value to a location in the MapR Data Platform file system. By default, the location is maprfs:///tmp/logs.
- Optional: Set the yarn.nodemanager.remote-app-log-dir-suffix value to the name of the folder that should contain the logs for each user. By default, the folder name is logs.
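As an illustration, the properties above might look like the following in yarn-site.xml on a non-secure cluster. This is a sketch rather than a complete file: the host name historyserver.example.com is a placeholder, the two optional properties are shown with their default values, and the entries are assumed to sit inside the file's existing <configuration> element.

```xml
<!-- Illustrative yarn-site.xml fragment; place inside the existing <configuration> element. -->
<!-- historyserver.example.com is a placeholder, not a real host. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <!-- Non-secure form; on a secure cluster use https and port 19890 instead. -->
  <name>yarn.log.server.url</name>
  <value>http://historyserver.example.com:19888/jobhistory/logs</value>
</property>
<property>
  <!-- Optional; shown here with its default value. -->
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>maprfs:///tmp/logs</value>
</property>
<property>
  <!-- Optional; shown here with its default value. -->
  <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
  <value>logs</value>
</property>
```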
On a non-secure cluster, you must also add the following property to /opt/mapr/hadoop/hadoop-2.x/etc/hadoop/yarn-env.sh on the Node Manager nodes:
export MAPR_IMPERSONATION_ENABLED=1
Afterwards, restart Node Manager services. This setting enables impersonation for Node Manager processes so that log files can be created with the correct user ownership.
Aggregated logs are owned by the user who runs the job. For example, if user admin runs a job, the logs are stored to maprfs:///tmp/logs/admin. If user analyst runs a job, the logs are stored to maprfs:///tmp/logs/analyst. If these two users do not share the same UNIX group, they will be unable to see each other's logs.