Top Ten Essential Commands for Linux System Monitoring
Learn ten Linux monitoring commands for checking CPU, memory, disk, network sockets, load, and historical system activity.
When a Linux server feels slow, you need commands that tell you whether the pressure is CPU, memory, disk, network, or load. These Linux monitoring commands help you move from "the server is slow" to a specific next step.
The ten tools below give you quick snapshots, interactive views, and historical data. Use them together instead of trusting one number in isolation.
1. top - Real-time Process Activity
The top command provides a dynamic, real-time view of a running Linux system. It displays a summary of system information and a list of processes or threads currently managed by the Linux kernel. It's often the first tool administrators turn to for a quick overview of system activity.
Key Metrics:
- CPU usage: us (user), sy (system), ni (nice), id (idle), wa (I/O wait), hi (hardware IRQ), si (software IRQ), st (steal time).
- Memory usage: Total, free, used, buffers/cache.
- Swap usage: Total, free, used.
- Process list: PID, User, PR (priority), NI (nice value), VIRT (virtual memory), RES (resident memory), SHR (shared memory), S (status), %CPU, %MEM, TIME+, COMMAND.
Basic Usage:
top
Practical Examples:
- Sort by CPU usage: While in top, press P.
- Sort by memory usage: While in top, press M.
- Show specific user processes: While in top, press u, then type the username.
- Kill a process: While in top, press k and enter the PID.
Tips:
- Press 1 to toggle the display of individual CPU cores.
- Press q to quit top.
- Use top -bn1 to get a single snapshot (useful for scripting).
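As a rough sketch of the scripting tip above, the helper below pulls the idle percentage out of a top -bn1 snapshot. The function name cpu_idle_from_top is made up for this example, and it assumes the procps-ng summary line format ("%Cpu(s): ... id, ...") with "." as the decimal separator, so other top implementations may need a different pattern.

```shell
# Print the CPU idle percentage from `top -bn1` output read on stdin.
# Assumption: procps-ng style summary line and a "." decimal separator
# (run top with LC_ALL=C to get that).
cpu_idle_from_top() {
  awk '/^%?Cpu\(s\)/ {
    gsub(/,/, " ")                 # "ni,100.0 id" becomes "ni 100.0 id"
    for (i = 2; i <= NF; i++)
      if ($i == "id") { print $(i - 1); exit }
  }'
}
```

A typical call would be LC_ALL=C top -bn1 | cpu_idle_from_top. Reading from stdin keeps the parsing separate from the capture, so a saved snapshot can be checked the same way.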
2. htop - Interactive Process Viewer
htop is an enhanced, interactive, and user-friendly process viewer that offers many advantages over the traditional top command. It presents a more visually appealing and navigable interface, making it easier to monitor and manage processes.
Key Advantages:
- Visual meters: CPU, memory, and swap usage are displayed graphically.
- Scrollable list: You can scroll vertically and horizontally to see all processes and their full command lines.
- Easy process management: Kill, renice, and other actions can be performed directly using function keys without entering PIDs.
- Tree view: Processes can be displayed in a tree format to show parent-child relationships.
Basic Usage:
# May require installation:
# sudo apt install htop (Debian/Ubuntu)
# sudo yum install htop (RHEL/CentOS)
htop
Practical Examples:
- Filter processes: Press F4.
- Kill a process: Select the process, then press F9.
- Sort by various columns: Use F6.
Tips:
- htop is generally preferred for interactive monitoring because it is easier to navigate than top.
- Customize htop's display options (F2) to suit your workflow.
3. vmstat - Virtual Memory Statistics
The vmstat command reports information about processes, memory, paging, block IO, traps, and CPU activity. It's an excellent tool for identifying memory bottlenecks or high disk I/O.
Key Metrics:
- r: Number of runnable processes (running or waiting for run time).
- b: Number of processes in uninterruptible sleep (typically waiting on I/O).
- swpd: Amount of swap space used.
- free: Amount of idle memory.
- si / so: Amount of memory swapped in from disk / swapped out to disk per second.
- bi / bo: Blocks received from a block device / blocks sent to a block device per second.
- wa: Percentage of CPU time spent waiting for I/O completion.
Basic Usage:
vmstat 1 5 # Report every 1 second, 5 times
Practical Examples:
- Show active/inactive memory: vmstat -a
- Display slabinfo: vmstat -m
- Show disk statistics: vmstat -d
Tips:
- High si/so values often indicate memory pressure and excessive swapping, which can severely degrade performance.
- A consistently high wa percentage suggests an I/O bottleneck.
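To make the si/so tip concrete, here is a minimal sketch that averages the swap-in and swap-out columns from vmstat output. The name swap_rate_from_vmstat is hypothetical. It finds columns by their header names rather than fixed positions, and it skips the first data row because vmstat reports averages since boot there.

```shell
# Average the si/so columns of `vmstat INTERVAL COUNT` output read on stdin.
# Assumption: the standard header row that starts with "r  b".
swap_rate_from_vmstat() {
  awk '
    $1 == "r" && $2 == "b" {            # header row: map column names to positions
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    ("si" in col) && $1 ~ /^[0-9]+$/ {  # data row
      if (++seen == 1) next             # first row is the average since boot
      si += $(col["si"]); so += $(col["so"]); rows++
    }
    END {
      if (rows == 0) { print "no samples"; exit 1 }
      printf "si_avg=%.1f so_avg=%.1f\n", si / rows, so / rows
    }'
}
```

For example, vmstat 1 5 | swap_rate_from_vmstat summarizes four live samples. Values that stay at zero suggest swapping is not the cause of the slowdown.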
4. iostat - I/O Statistics
iostat is part of the sysstat package and reports CPU utilization and I/O statistics for devices, partitions, and network file systems. It's crucial for understanding disk performance issues.
Key Metrics:
- %user, %system, %iowait, %idle: CPU utilization breakdowns.
- r/s, w/s: Reads and writes per second.
- rkB/s, wkB/s: Kilobytes read and written per second.
- await: Average time (in milliseconds) for I/O requests issued to the device to be served. Recent sysstat versions report this separately as r_await and w_await.
- %util: Percentage of elapsed time during which the device had I/O requests in progress.
Basic Usage:
# May require installation:
# sudo apt install sysstat (Debian/Ubuntu)
# sudo yum install sysstat (RHEL/CentOS)
iostat -xz 1 5 # Extended stats, every 1 second, 5 times
Practical Examples:
- Specific device monitoring: iostat -xz /dev/sda 1
- Display only CPU utilization: iostat -c
- Display only device utilization: iostat -d
Tips:
- A high %util combined with a high await time often points to an I/O bottleneck on that device. On modern SSDs and virtualized storage, confirm with application latency before assuming the disk is saturated.
- Compare rkB/s and wkB/s with r/s and w/s to understand average I/O size.
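The average I/O size comparison can be sketched as a small filter over iostat -x output. The function name avg_io_size_from_iostat is invented for this example, and it assumes the device table has a header row starting with "Device" and containing the r/s, rkB/s, w/s, and wkB/s columns. Newer sysstat releases already print a similar figure in their rareq-sz and wareq-sz columns, so treat this as an illustration of the arithmetic.

```shell
# Print the average read and write size (kB per request) for each device,
# from `iostat -x` output read on stdin.
# Assumption: device tables are separated by blank lines and start with a
# header row whose first field is "Device" or "Device:".
avg_io_size_from_iostat() {
  awk '
    NF == 0 { in_dev = 0; next }        # blank line ends a device table
    $1 ~ /^Device:?$/ {                 # header row: map column names to positions
      split("", col)
      for (i = 1; i <= NF; i++) col[$i] = i
      in_dev = 1
      next
    }
    in_dev && ("r/s" in col) && ("rkB/s" in col) && ("w/s" in col) && ("wkB/s" in col) {
      r = $(col["r/s"]); rk = $(col["rkB/s"])
      w = $(col["w/s"]); wk = $(col["wkB/s"])
      printf "%s read_kB=%.1f write_kB=%.1f\n", $1, (r > 0 ? rk / r : 0), (w > 0 ? wk / w : 0)
    }'
}
```

Run it as iostat -xz 1 5 | avg_io_size_from_iostat. Small averages (a few kB) point to random I/O, while large ones suggest sequential transfers.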
5. free - Memory Usage
The free command displays the total amount of free and used physical memory and swap space in the system, as well as the buffers and caches used by the kernel.
Key Metrics:
- total: Total installed memory.
- used: Memory in use. Recent procps versions calculate this so that reclaimable buff/cache is not counted; older versions included it.
- free: Unused memory.
- shared: Memory used mostly by tmpfs (shared memory segments).
- buff/cache: Memory used by kernel buffers and page cache.
- available: An estimate of how much memory is available for starting new applications, without swapping.
Basic Usage:
free -h # Human-readable output
Practical Examples:
- Display memory in megabytes: free -m
- Continuously update every 5 seconds: watch -n 5 free -h
Tips:
- The available column is the most important metric for understanding how much memory is genuinely free for new processes.
- Linux aggressively uses otherwise idle memory for disk caching, so a low free value is normal and often desirable.
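Since the available column is the one to watch, a minimal sketch like the following turns it into a percentage of total memory. The function name mem_available_pct is hypothetical, and it assumes the English column headers that free prints under LC_ALL=C, with numeric units (free -k or free -b rather than free -h).

```shell
# Print available memory as a percentage of total, from `free -k` output on stdin.
# Assumption: a header row naming the columns, including "available".
mem_available_pct() {
  awk '
    $1 == "total" {                     # header has no label for the first column,
      for (i = 1; i <= NF; i++) col[$i] = i + 1   # so data fields are shifted by one
      next
    }
    $1 == "Mem:" && ("available" in col) {
      printf "%.1f\n", $(col["available"]) / $2 * 100
    }'
}
```

Usage would look like LC_ALL=C free -k | mem_available_pct. What counts as too low depends on the workload, so any alert threshold built on this is a judgment call.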
6. df - Disk Space Usage
The df command reports the amount of disk space used and available on file systems. It's essential for monitoring storage capacity and preventing disk-full scenarios.
Key Metrics:
- Filesystem: The name of the file system.
- Size: Total size of the file system.
- Used: Amount of disk space used.
- Avail: Amount of disk space available.
- Use%: Percentage of disk space used.
- Mounted on: The mount point of the file system.
Basic Usage:
df -h # Human-readable output
Practical Examples:
- Show inode usage: df -i (inodes are metadata structures; running out of them can prevent file creation even with free space).
- Show specific filesystem type: df -hT -t ext4
Tips:
- Regularly check Use% to prevent file systems from filling up, which can cause application failures and system instability.
- High inode usage can be an issue with many small files.
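One way to check Use% regularly is a small filter over df -P output, sketched below. The function name fs_over_threshold and the 90 percent figure in the usage note are illustrative choices, not recommendations from the tool itself. The sketch assumes the POSIX output format (one line per file system) and mount points without spaces.

```shell
# List mount points whose usage is at or above a percentage threshold,
# from `df -P` output read on stdin.
# Assumption: usage is in column 5 and the mount point in column 6.
fs_over_threshold() {
  awk -v limit="$1" '
    NR > 1 {                            # skip the header row
      use = $5
      sub(/%$/, "", use)                # "95%" becomes "95"
      if (use + 0 >= limit + 0) print $6, use "%"
    }'
}
```

For example, df -P | fs_over_threshold 90 prints nothing when every file system is below 90 percent, which makes it easy to drop into a cron job that only sends mail on output.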
7. du - Disk Usage of Files and Directories
The du command estimates file space usage. While df checks total filesystem usage, du is used to find out the size of specific files or directories, which is critical for identifying what is consuming disk space.
Key Metrics:
- Total size of specified files or directories.
Basic Usage:
du -sh /var/log # Summary, human-readable for /var/log directory
Practical Examples:
- Show sizes of all subdirectories (one level deep): du -h --max-depth=1 /home/user
- Find the largest files/directories: du -ah /path/to/check | sort -rh | head -n 10
Tips:
- Combine du with sort and head to quickly pinpoint disk space hogs.
- Be mindful when running du on large directories, as it can be resource-intensive.
8. sar - System Activity Reporter
sar is a powerful tool from the sysstat package that collects, reports, or saves system activity information. Unlike top or vmstat which show real-time snapshots, sar excels at providing historical data, making it invaluable for long-term performance analysis and capacity planning.
Key Features:
- CPU statistics: %user, %nice, %system, %iowait, %steal, %idle.
- Memory statistics: kbmemfree, kbmemused, kbbuffers, kbcached.
- Disk I/O: tps, rd_sec/s, wr_sec/s (newer sysstat versions report rkB/s and wkB/s instead).
- Network statistics: rxpck/s, txpck/s, rxbyt/s, txbyt/s (newer sysstat versions report rxkB/s and txkB/s instead).
- Load average, swap activity, kernel activity, and more.
Basic Usage:
# Report CPU utilization every 1 second, 5 times:
sar -u 1 5
# Report disk activity:
sar -d
# Report memory utilization:
sar -r
# Report network statistics:
sar -n DEV
Practical Examples:
- View a saved CPU activity file: sar -u -f /var/log/sysstat/saDD on many Debian-based systems, or /var/log/sa/saDD on many RHEL-based systems. Replace DD with the day of month.
- Display all collected data for today: sar -A
Tips:
- Ensure the sysstat package is installed and configured to collect data regularly for historical analysis.
- sar can be overwhelming; focus on specific flags (-u, -r, -d, -n) relevant to your investigation.
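Because the saved data lives in different directories across distributions, a small lookup helper can save some guessing. This is only a sketch: the function name sa_file_for_day is made up, and it assumes the saDD naming and the two directories mentioned above. Systems configured for a different naming scheme or location would need other paths passed in.

```shell
# Print the path of the sar data file for a given day of month.
# Usage: sa_file_for_day DAY [DIR...]
# Assumption: files are named saDD; the default directories are the common
# Debian and RHEL locations and may not match every system.
sa_file_for_day() {
  local day dir
  day=$(printf '%02d' "${1#0}") || return 1   # accept "7" or "07"
  shift
  [ "$#" -gt 0 ] || set -- /var/log/sysstat /var/log/sa
  for dir in "$@"; do
    if [ -r "$dir/sa$day" ]; then
      printf '%s\n' "$dir/sa$day"
      return 0
    fi
  done
  return 1                                    # nothing readable was found
}
```

It could be combined with sar as sar -u -f "$(sa_file_for_day "$(date +%d)")" to read today's CPU history, provided collection is enabled.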
9. ss (Socket Statistics) - Network Connections
ss is a utility to investigate sockets. It's a faster and more efficient replacement for the older netstat command, providing more detailed information about TCP, UDP, and other socket types, including their state, local/remote addresses, and process IDs.
Key Metrics:
- State: ESTAB, LISTEN, TIME-WAIT, CLOSE-WAIT, etc.
- Recv-Q / Send-Q: The receive and send queue sizes.
- Local Address:Port / Peer Address:Port: The local and remote endpoints.
- Process Name: The process associated with the socket (shown with -p; seeing other users' processes usually requires root).
Basic Usage:
ss -tuln # TCP, UDP, listening, numeric ports
Practical Examples:
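- List established TCP connections: ss -t (add -a to include listening sockets)
- List UDP sockets: ss -u (add -a to include unconnected ones)
- Show processes listening on a specific port: ss -tulnp 'sport = :80' (a plain grep 80 also matches 8080, PIDs, and addresses)
- Summarize socket statistics: ss -s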
Tips:
- A high number of TIME-WAIT sockets is not automatically bad; it can be normal on busy TCP services. Pair it with port exhaustion, failed connections, or queue growth before treating it as a problem.
- Monitor Recv-Q and Send-Q for signs of network buffering issues or slow application processing.
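To judge whether a TIME-WAIT count is unusual, it helps to see it next to the other states. The sketch below counts TCP sockets per state from ss -tan output. The function name socket_states is hypothetical, and it assumes the state is the first column, which holds for ss -tan but not when -u is added (a Netid column then comes first).

```shell
# Count TCP sockets per state from `ss -tan` output read on stdin.
# Assumption: one header row, then one socket per line with the state in column 1.
socket_states() {
  awk 'NR > 1 { count[$1]++ }
       END { for (s in count) print s, count[s] }' | LC_ALL=C sort
}
```

Running ss -tan | socket_states a few times a minute apart shows whether a state is growing or merely large.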
10. uptime - System Uptime and Load Average
The uptime command shows how long the system has been running, the current time, how many users are logged in, and the system load averages for the past 1, 5, and 15 minutes.
Key Metrics:
- Current time: Self-explanatory.
- Uptime: How long the system has been running.
- Users: Number of users currently logged in.
- Load average: The average number of processes that are either in a runnable or uninterruptible state. This includes processes that are running on the CPU, waiting for CPU, or waiting for disk I/O. Three values are reported:
  - 1-minute load average
  - 5-minute load average
  - 15-minute load average
Basic Usage:
uptime
Practical Examples:
- Often used as a quick health check for a server's general busyness.
Tips:
- Compare the load average to the number of CPU cores on your system. A load average consistently higher than the number of CPU cores often indicates a CPU or I/O bottleneck.
- An increasing load average over time (e.g., 1-minute > 5-minute > 15-minute) suggests the system is getting busier.
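The comparison against core count is simple division, sketched here as a helper. The name load_per_core is invented for this example, and the usage note assumes /proc/loadavg and nproc are available, as they are on typical Linux systems.

```shell
# Print the load average divided by the number of CPU cores.
# Usage: load_per_core LOAD CORES
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN {
    if (cores + 0 <= 0) { print "cores must be positive" > "/dev/stderr"; exit 1 }
    printf "%.2f\n", load / cores
  }'
}
```

A call such as load_per_core "$(cut -d ' ' -f1 /proc/loadavg)" "$(nproc)" gives the 1-minute load per core. A result that stays above 1.00 is the situation the first tip describes.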
A Simple Troubleshooting Flow
For a slow server, start with uptime to check load, then use top or htop to find busy processes. Check free -h and vmstat 1 5 for memory pressure, iostat -xz 1 5 for disk latency, and ss -tulnp for listening services or backed-up sockets. If the issue happened earlier, use sar to compare the bad window with a normal one.
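The flow above could be captured in a small script along these lines. It is a sketch rather than a finished tool: the function names run_if_present and triage are made up, and interactive steps (top, htop) and the historical sar comparison are left out because they need a person to drive them.

```shell
# Run a command if it is installed, with a heading; otherwise note the skip.
run_if_present() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '== %s ==\n' "$*"
    "$@"
  else
    printf '== %s == skipped (not installed)\n' "$1"
  fi
}

# Collect the non-interactive snapshots from the troubleshooting flow.
triage() {
  run_if_present uptime
  run_if_present free -h
  run_if_present vmstat 1 5
  run_if_present iostat -xz 1 5
  run_if_present ss -tulnp
}
```

Calling triage > triage-$(date +%H%M).txt during an incident keeps a record to compare against a normal period later.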
The takeaway is simple: each command answers one part of the story. Your job is to line up CPU, memory, disk, and network evidence before you restart services or resize the machine.