
From the command line, it’s easy to see the current state of any running applications in your YARN cluster by issuing the yarn top command.
The output is a screen in your terminal that refreshes continuously (about once every 3 seconds) and shows the status of your applications, their memory and core usage, and each application’s overall completion percentage.
Something like this shows up when you enter the command:
    YARN top - 08:40:46, up 7d, 16:11, 0 active users, queue(s): root
    NodeManager(s): 10 total, 10 active, 0 unhealthy, 0 decommissioned, 0 lost, 0 rebooted
    Queue(s) Applications: 4 running, 13874 submitted, 0 pending, 13854 completed, 6 killed, 10 failed
    Queue(s) Mem(GB): 194 available, 796 allocated, 0 pending, 0 reserved
    Queue(s) VCores: 85 available, 35 allocated, 0 pending, 0 reserved

    APPLICATIONID                    USER  TYPE   QUEUE    #CONT  #RCONT  VCORES  RVCORES   MEM  RMEM  VCORESECS  MEMSECS  %PROGR      TIME  NAME
    application_1498162987065_13874  user  spark  default     11       0      11        0  260G    0G        164     4077   10.00  00:00:00  test_app
    application_1498162987065_13870  user  spark  default     11       0      11        0  260G    0G        983    23393   10.00  00:00:01  test_app
    application_1498162987065_13869  user  spark  default     11       0      11        0  260G    0G       1212    28919   10.00  00:00:01  test_app
    application_1498162987065_13873  user  tez    default      2       0       2        0   16G    0G         58      499    0.00  00:00:00  test_query
I’m running three test applications and one test query on this particular cluster, all in the default queue. The header lines also give you NodeManager status and cluster-wide totals for applications, memory, and cores.
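If you want a single number out of those header lines, here is a minimal Python sketch that parses the Mem(GB) and VCores lines from the sample output above and works out how much of the cluster is allocated. The helper names (parse_summary, utilization) are made up for illustration, the text is assumed to have been captured from the terminal already, and treating "available + allocated" as the total is my own assumption rather than anything yarn top reports.

```python
import re

# Header lines copied from the sample yarn top output shown above.
SAMPLE = """\
Queue(s) Mem(GB): 194 available, 796 allocated, 0 pending, 0 reserved
Queue(s) VCores: 85 available, 35 allocated, 0 pending, 0 reserved
"""


def parse_summary(text):
    """Return {resource: {label: count}} for the Mem(GB) and VCores lines."""
    summary = {}
    for line in text.splitlines():
        match = re.match(r"Queue\(s\) (Mem\(GB\)|VCores): (.*)", line.strip())
        if not match:
            continue  # skip anything that is not one of the two resource lines
        resource, rest = match.groups()
        summary[resource] = {
            label: int(count) for count, label in re.findall(r"(\d+) (\w+)", rest)
        }
    return summary


def utilization(counts):
    """Allocated share of the total, assuming total = available + allocated."""
    total = counts["available"] + counts["allocated"]
    return counts["allocated"] / total if total else 0.0


summary = parse_summary(SAMPLE)
for resource, counts in summary.items():
    print(f"{resource}: {utilization(counts):.1%} allocated")
# Mem(GB): 80.4% allocated
# VCores: 29.2% allocated
```

On this cluster memory is the tighter resource: roughly four fifths of it is allocated while most of the cores sit idle.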
Of course, you can get all of this information from the ResourceManager’s homepage on port 8088, but that page:
- Isn’t a live view into the status of your applications,
- Isn’t as quick to access,
- Isn’t as simple as a straightforward CLI view,
- Looks like a webpage fresh out of the early 1990s; to complete the look, add some <marquee> tags.
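If you would rather script against port 8088 than look at the page, the ResourceManager also serves a REST API on the same port. The sketch below is only an illustration: the host in RM_URL is a placeholder, the function names are hypothetical, and the sample record is hand-built from the output above instead of being fetched from a real cluster.

```python
import json
from urllib.request import urlopen

# Assumption: placeholder address; point this at your own ResourceManager.
RM_URL = "http://localhost:8088"


def fetch_running_apps(rm_url=RM_URL):
    """Fetch applications in the RUNNING state from the ResourceManager REST API."""
    url = f"{rm_url}/ws/v1/cluster/apps?states=RUNNING"
    with urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    # "apps" can be null when no application matches the filter.
    apps = payload.get("apps") or {}
    return apps.get("app", [])


def format_app(app):
    """Render one application record as a single yarn-top-like line."""
    return "{id} {user} {applicationType} {queue} {progress:.2f}% {name}".format(**app)


# Hand-built record mirroring the first row of the sample output (not a live response).
sample_app = {
    "id": "application_1498162987065_13874",
    "user": "user",
    "applicationType": "spark",
    "queue": "default",
    "progress": 10.0,
    "name": "test_app",
}
print(format_app(sample_app))
```

Against a live cluster you would loop over fetch_running_apps() and print format_app(app) for each entry. That gets you a one-shot listing, not the self-refreshing view that yarn top gives you.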
The yarn top command bears a striking resemblance to the normal Linux top command, and for an obvious reason: both are about knowing what is running in your environment. Applications in YARN are a little different from processes on a single Linux server, so the two commands differ in minor ways and offer different options.
Here’s the original JIRA of the command: https://issues.apache.org/jira/browse/YARN-3348