Monitoring System Resources On Linux

August 26, 2022

Introduction

Monitoring your resources is good practice and essential to keeping your server healthy and preventing downtime. This guide briefly covers a few tools that belong in every administrator’s toolkit. In particular, we will look at tools that analyze running processes, disk usage, and memory usage.

Process Management

To view the processes running on your server, use the top command. It shows a live view that is sorted by CPU usage by default.

$ top

top - 02:13:35 up 55 days, 14:50,  1 user,  load average: 0.12, 0.11, 0.12
Tasks: 197 total,   1 running, 195 sleeping,   1 stopped,   0 zombie
%Cpu(s):  2.9 us,  5.9 sy,  0.0 ni, 91.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   7957.7 total,   4228.9 free,   1757.0 used,   1971.8 buff/cache
MiB Swap:    124.0 total,    124.0 free,      0.0 used.   5896.0 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     11 root      20   0       0      0      0 I   6.2   0.0 214:36.75 rcu_sched
3848281 root      20   0    9432   3968   3296 R   6.2   0.0   0:00.02 top
      1 root      20   0  169476  12828   8476 S   0.0   0.2  14:35.96 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   0:01.71 kthreadd
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp

The first line, prefixed with top, displays the current time, how long your system has been running, the number of logged-in users, and the system’s load average over the last 1, 5, and 15 minutes. The second line displays the total number of tasks and how many are running, sleeping, stopped, or zombie. The third line breaks down CPU time as percentages (for example, us is user space, sy is kernel, id is idle, and wa is waiting on I/O). Finally, the fourth and fifth lines show physical memory and swap usage: total, free, used, and buff/cache.
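A load average is easier to judge relative to the number of CPU cores, and a common rule of thumb is to divide one by the other. The sketch below does that; load_per_core is a hypothetical helper name, and the script assumes a Linux /proc filesystem and the nproc utility from GNU coreutils.

```shell
#!/bin/sh
# load_per_core is a hypothetical helper, not part of any standard tool.
# $1 = load average, $2 = number of CPU cores
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# /proc/loadavg starts with the same three load averages that top prints.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one_min rest < /proc/loadavg
    echo "1-minute load per core: $(load_per_core "$one_min" "$(nproc)")"
fi
```

A result that stays well above 1.00 suggests more runnable tasks than cores, which is usually worth investigating.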

The columns of the process table below the summary are:

  • PID: The task’s unique process ID.
  • USER: The owner of the task.
  • PR: The task’s scheduling priority. The lower the number, the higher the priority.
  • NI: The nice value of the task. A negative nice value means higher priority, and a positive nice value means lower priority.
  • VIRT: Total virtual memory used by the task, in KiB by default.
  • RES: How much physical RAM the task is using, in KiB by default.
  • SHR: The shared memory size used by the task, in KiB by default.
  • S: The current state of the process, for example R (running), S (sleeping), I (idle), T (stopped), or Z (zombie).
  • %CPU: The task’s share of CPU time since the last screen update.
  • %MEM: The task’s share of physical memory (RES as a percentage of total RAM).
  • TIME+: Total CPU time the task has used, the same as ‘TIME’ but with granularity down to hundredths of a second.
  • COMMAND: The name of the command that started the process.
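For scripting, top can also print a single non-interactive snapshot with the -b (batch) and -n 1 (one iteration) flags. As a minimal sketch, the hypothetical busy_procs helper below filters such a snapshot by the %CPU column. It assumes the default column layout shown above, where %CPU is the ninth field and COMMAND the twelfth.

```shell
#!/bin/sh
# busy_procs is a hypothetical helper: it reads top batch output on stdin and
# prints PID, %CPU and COMMAND for every task at or above the given %CPU.
# Assumes the default top column order (field 9 = %CPU, field 12 = COMMAND).
busy_procs() {
    awk -v min="$1" '
        $1 == "PID" { in_table = 1; next }   # the process table starts after the header row
        in_table && $9 + 0 >= min + 0 { print $1, $9, $12 }
    '
}

# Usage: top -b -n 1 | busy_procs 5
```

If you have customized the columns top displays, the field numbers in the awk program would need to be adjusted to match.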

This information is useful in many situations. For example, suppose you have a stuck or unresponsive process. You can stop it by passing its process ID (PID) to the kill command. By default, kill sends SIGTERM, which lets the process shut down cleanly. The -9 flag sends SIGKILL, which cannot be caught or ignored, so reserve it for processes that do not respond to SIGTERM. If the PID is 1234, the following command forcibly kills the process:

$ kill -9 1234
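A gentler approach is to ask the process to exit first and force it only if it does not comply. The following is a minimal sketch of that pattern; stop_process is a hypothetical helper name, and the 5-second default grace period is an arbitrary choice.

```shell
#!/bin/sh
# stop_process is a hypothetical helper: send SIGTERM, wait up to $2 seconds
# (default 5) for the process to exit, and only then fall back to SIGKILL.
stop_process() {
    pid=$1
    remaining=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0    # already gone, or not ours to signal
    while [ "$remaining" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # signal 0 only checks that the PID exists
        sleep 1
        remaining=$((remaining - 1))
    done
    kill -KILL "$pid" 2>/dev/null                # last resort: cannot be caught or ignored
}

# Usage: stop_process 1234 10
```

Giving the process a chance to handle SIGTERM lets it flush files and release locks, which SIGKILL never allows.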

A great alternative to top is htop. It presents the same information in a layout that is much easier on the eyes, and it allows for more straightforward filtering. More information can be found in the htop man page (man htop).

Disk Usage

The df command provides you with an overview of how much disk space is used and available on each mounted filesystem.

$ df

Filesystem                        1K-blocks    Used Available Use% Mounted on
udev                                4042404       0   4042404   0% /dev
tmpfs                                814868    1932    812936   1% /run
/dev/mapper/JetrailsVol1-lv_slash  40792616 7016144  33776472  18% /
tmpfs                               4074324       0   4074324   0% /dev/shm
tmpfs                                  5120       0      5120   0% /run/lock
tmpfs                               4074324       0   4074324   0% /sys/fs/cgroup
/dev/vda1                            996004  174576    821428  18% /boot
tmpfs                                814864       0    814864   0% /run/user/0

If you want the sizes in human-readable units (K, M, G) instead of 1K blocks, you can run the same command with the -h flag:

$ df -h

Filesystem                         Size  Used Avail Use% Mounted on
udev                               3.9G     0  3.9G   0% /dev
tmpfs                              796M  1.9M  794M   1% /run
/dev/mapper/JetrailsVol1-lv_slash   39G  6.7G   33G  18% /
tmpfs                              3.9G     0  3.9G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                              3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/vda1                          973M  171M  803M  18% /boot
tmpfs                              796M     0  796M   0% /run/user/0
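The Use% column lends itself to a simple automated check. As a sketch, the hypothetical over_threshold helper below reads df output and lists filesystems at or above a given usage percentage. It relies on the POSIX -P flag, which keeps each filesystem on a single line, and assumes mount points contain no spaces. The 90 percent limit is only an example value.

```shell
#!/bin/sh
# over_threshold is a hypothetical helper: it reads `df -P` output on stdin and
# prints the mount point and usage of every filesystem at or above LIMIT percent.
# Assumes mount points contain no spaces (field 6 = mount point, field 5 = Use%).
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                       # skip the header row
            use = $5
            sub(/%/, "", use)          # "18%" -> "18"
            if (use + 0 >= limit + 0) print $6, $5
        }
    '
}

# Print every filesystem that is at least 90% full (no output means none are).
df -P | over_threshold 90
```

Run from cron or a similar scheduler, a check like this can warn you before a disk fills up completely.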

The du command reports the disk space used by a directory and each of its sub-directories. With no path argument it starts from the current directory. Similar to the above example, we will run du with the -h flag to get a more human-readable output.

$ du -h

8.0K	./.ssh
12K	./.cache/pip/http/5/2/9/4/6
12K	./.cache/pip/http/5/2/9/4
12K	./.cache/pip/http/5/2/9
12K	./.cache/pip/http/5/2
12K	./.cache/pip/http/5
88K	./.cache/pip/http/f/0/7/9/d
88K	./.cache/pip/http/f/0/7/9
88K	./.cache/pip/http/f/0/7
88K	./.cache/pip/http/f/0
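On a large tree the full listing gets long quickly, so it often helps to limit the depth and sort by size. The sketch below wraps that in a hypothetical largest_dirs helper. It uses du -k (sizes in KiB) with -d 1 (one level deep), both of which GNU du supports, and sorts numerically so the biggest entries come first.

```shell
#!/bin/sh
# largest_dirs is a hypothetical helper: list the N largest entries (default 5)
# one level below directory $1, sizes in KiB, biggest first.
# The directory's own total is included in the listing.
largest_dirs() {
    du -k -d 1 "$1" | sort -rn | head -n "${2:-5}"
}

# Usage: largest_dirs /var 10
```

Sizes are printed in KiB rather than with -h so that a plain numeric sort orders them correctly.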

Memory Usage

The free command allows you to check your system’s current memory usage.

$ free

              total        used        free      shared  buff/cache   available
Mem:        8148652     1796412     4332960        4888     2019280     6040196
Swap:        126972           0      126972

By default, free reports values in kibibytes. If you would like to see the output converted to mebibytes, then you can run the same command with the -m flag:

$ free -m

              total        used        free      shared  buff/cache   available
Mem:           7957        1754        4230           4        1971        5898
Swap:           123           0         123
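The available column is usually the most useful figure, since it estimates how much memory new applications can use without swapping. As a rough sketch, the hypothetical mem_available_pct helper below turns it into a percentage of total memory. It assumes the column layout shown above, where available is the seventh field of the Mem: row; older versions of free may not print that column.

```shell
#!/bin/sh
# mem_available_pct is a hypothetical helper: it reads `free` output on stdin
# and prints the "available" column as a percentage of total memory.
# Assumes the layout shown above (field 2 = total, field 7 = available).
mem_available_pct() {
    awk '$1 == "Mem:" { printf "%.1f\n", $7 * 100 / $2 }'
}

# Usage: LC_ALL=C free | mem_available_pct
```

Setting LC_ALL=C in the usage line is a precaution so that the row label is not translated on systems with a non-English locale.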

Conclusion

The commands described in this article are a solid starting point when you begin debugging issues on a server. Keep in mind that these command-line tools only scratch the surface of what’s available for Linux. I’d advise you to do some searching, as there may be other tools that better suit your needs.