Logging Options on YARN
- For MapReduce v2 applications, the default logging option is to log files on the local file system. However, centralized logging and YARN log aggregation are also available.
- For non-MapReduce applications, the default logging option is to log files on the local file system. However, YARN log aggregation is also available.
- Centralized Logging for MapReduce v2
- Centralized logging provides an application-centric view of all the log files
generated by NodeManager nodes throughout the cluster. It enables users to gain a
complete picture of application execution by having all the logs available in a single
directory, without having to navigate from node to node.
The MapReduce program generates three types of log output:
- Standard output stream: captured in the stdout file
- Standard error stream: captured in the stderr file
- Log4j logs: captured in the syslog file
Centralized logs are available cluster-wide as they are written to the following local volume on the MapR-FS: /var/mapr/local/<NodeManager node>/logs/yarn/userlogs
Since the log files are stored in a local volume directory that is associated with each NodeManager node, you run the maprcli job linklogs command to create symbolic links for all the logs in a single directory. You can then use tools such as grep and awk to analyze them from an NFS mount point. You can also view the entire set of logs for a particular application using the HistoryServer UI.
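As a rough illustration of the grep-style analysis described above, the following Python sketch walks a directory of linked logs and counts the lines that match a pattern in each stdout, stderr, and syslog file. The function name count_matches and the directory path in the usage note are hypothetical; the actual path depends on where the NFS mount and the link directory live on your system.

```python
import os
import re

# The three per-task log files described above.
LOG_FILE_NAMES = {"stdout", "stderr", "syslog"}


def count_matches(log_dir, pattern):
    """Return {file path: number of matching lines} for every
    stdout, stderr, or syslog file found under log_dir."""
    regex = re.compile(pattern)
    counts = {}
    # followlinks=True so that symbolic links to log directories are traversed.
    for root, _dirs, files in os.walk(log_dir, followlinks=True):
        for name in files:
            if name not in LOG_FILE_NAMES:
                continue
            path = os.path.join(root, name)
            # errors="replace" keeps the scan going if a log holds odd bytes.
            with open(path, errors="replace") as handle:
                counts[path] = sum(1 for line in handle if regex.search(line))
    return counts
```

For example, count_matches("/mapr/my.cluster.com/joblogs", r"ERROR") would report how many ERROR lines each log file contains, assuming that hypothetical directory holds the linked logs.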
- YARN Log Aggregation
- The YARN log aggregation option collects the log files for completed applications and moves them from the local file system to the MapR-FS. This allows users to view the entire set of logs for a particular application using the HistoryServer UI or by running the yarn logs command.
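
For scripted access to aggregated logs, a minimal sketch along the following lines could wrap the yarn logs command. It assumes that the yarn executable is on the PATH and that log aggregation is enabled for the cluster; the helper names and the application ID in the usage note are hypothetical.

```python
import subprocess


def yarn_logs_command(application_id):
    """Build the argument list for: yarn logs -applicationId <id>."""
    return ["yarn", "logs", "-applicationId", application_id]


def fetch_aggregated_logs(application_id):
    """Run yarn logs and return its standard output as text.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    completed = subprocess.run(
        yarn_logs_command(application_id),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

A call such as fetch_aggregated_logs("application_1400000000000_0001") would return the combined log text for that (made-up) application ID, which can then be searched or saved like any other string.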