Unix Health Check
Note: Many of these procedures can be scripted.
1. Check for problem conditions
system logs error/warning
HPUX:
grep -i -e "EMS" -e "error" -e "warning" -e "excessive" /var/adm/syslog/syslog.log | more
tail /etc/shutdownlog
hardware problems
HPUX:
ioscan -fnC disk | more
ioscan -fnC fc | more
ioscan -fnC processor | more
ioscan -fnC ba | more
ioscan -fnC memory
filesystems full
HPUX:
bdf | more
AIX/Sun/Linux:
df | more
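The filesystem check above can be scripted. The following is a minimal sketch, assuming a POSIX shell and awk; the function names full_filesystems and list_filesystems and the 90% default threshold are illustrative choices, not part of the original procedure. It assumes the usage column is the first field ending in "%" and that the mount point is the last field, which holds for typical bdf and df -k output.

```shell
#!/bin/sh
# Print "mountpoint usage%" for every filesystem at or above a threshold.
# Reads bdf/df-style output on stdin. The threshold defaults to 90 (an
# assumed value; pick one that suits the site).
full_filesystems() {
    threshold=${1:-90}
    awk -v limit="$threshold" '
        NR > 1 {
            for (i = 1; i <= NF; i++) {
                # First field of the form NN% is taken as the usage column.
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit) print $NF, pct "%"
                    break
                }
            }
        }'
}

# Choose the listing command by platform: bdf on HP-UX, df -k elsewhere.
list_filesystems() {
    case "$(uname -s)" in
        HP-UX) bdf ;;
        *)     df -k ;;
    esac
}
```

Typical use would be `list_filesystems | full_filesystems 90`. Stopping at the first percentage field avoids double-reporting on layouts that also print an inode percentage.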
2. Overview of the system
top (topas on AIX)
glance (HPUX only)
Processor Utilization
Disk Utilization
Memory Utilization
Paging/Swapping
Busiest Processes
3. Memory
RAM
Swapping/Paging
swapinfo
vmstat (avm - active virtual memory pages, pi - pages paged in, po - pages paged out)
sar -g (on Sun: page-out and page-freeing activity)
sar -w (swpin/s, swpot/s - swap-ins and swap-outs per second)
ipcs
Are the swap disks > 50% busy?
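As one way to script the paging check, the sketch below reports the largest page-out value seen in a vmstat run. It assumes a vmstat header that names a "po" column (as on HP-UX, Solaris and AIX); Linux vmstat uses si/so instead, in which case this prints 0. The function name max_pageouts is hypothetical.

```shell
#!/bin/sh
# Report the highest "po" (page-out) value in vmstat output read from stdin.
max_pageouts() {
    awk '
        # Until the po column is found, scan each line for the header name.
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "po") col = i
            next
        }
        # Data lines start with a number; repeated headers are skipped.
        $1 ~ /^[0-9]+$/ && $col + 0 > max { max = $col + 0 }
        END { print max + 0 }'
}
```

For example, `vmstat 5 12 | max_pageouts`. Note that the first data line of vmstat is usually an average since boot, so a nonzero result from a very short run may not reflect current activity.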
4. Processor Bottleneck
% utilization
% system time vs % user time
uptime
sar -q (runq-sz - run queue, swpq-sz - swap queue)
sar -u (%usr, %sys, %wio - wait i/o)
On Sun
iostat -xc
Don't collect more than 100 samples
Strip out idle disks
Set an appropriate interval
For 24 hours use 15 mins
For 10 mins use 10 secs
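The interval guidance above is simple arithmetic: samples = duration / interval, kept at or under 100. A small sketch, assuming a POSIX shell with $(( )) arithmetic; the function names sample_count and min_interval are illustrative only.

```shell
#!/bin/sh
# Number of samples collected: duration / interval (both in seconds).
sample_count() {
    echo $(( $1 / $2 ))
}

# Smallest whole-second interval that keeps a run at or under the
# maximum number of samples (default 100, per the rule above).
min_interval() {
    duration=$1
    max=${2:-100}
    echo $(( (duration + max - 1) / max ))
}
```

With these, 24 hours at 15 minutes is `sample_count 86400 900`, which gives 96 samples, and 10 minutes at 10 seconds is `sample_count 600 10`, which gives 60. Both recommended pairings stay under the 100-sample limit.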
5. Disk Bottleneck
In General
Check for adequate free space
Check for high utilization
sar -d (%busy and avque)
iostat (bps - KB per second)
On Sun
iostat -xn 30
Look for disks > 5% busy and average response times > 30 ms (svc_t)
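The Sun rule of thumb above can be applied with a filter over extended iostat output. This is a sketch under assumptions: it reads the header line to find the %b, svc_t (or asvc_t, as printed with -n) and device columns, and it treats the rule as requiring both conditions. The function name busy_disks is hypothetical, and output combined with CPU columns (-c) is not handled.

```shell
#!/bin/sh
# Flag disks that are both busier than a %b limit and slower than a
# service-time limit. Defaults follow the rule above: 5 (%) and 30 (ms).
busy_disks() {
    awk -v busy="${1:-5}" -v slow="${2:-30}" '
        # Header line: record column positions by name.
        /%b/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "%b") b = i
                if ($i == "svc_t" || $i == "asvc_t") s = i
                if ($i == "device") d = i
            }
            next
        }
        # Data line: numeric %b, both thresholds exceeded.
        b && s && d && NF >= b && $b ~ /^[0-9.]+$/ &&
        $b + 0 > busy && $s + 0 > slow {
            print $d, "busy=" $b "%", "svc_t=" $s
        }'
}
```

For example, `iostat -xn 30 20 | busy_disks`. Since idle disks never pass the %b test, this also covers the advice to strip out idle disks.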
6. Processes
Busiest Processes: ps -ef | sed 1d | sort -rn -k 7 | head (sorts on the TIME column; older sort versions use +6)
Zombies: ps -ef | grep -i defunct | grep -v grep
sar -w (swpin/s, swpot/s)
vmstat (procs - b => blocked)
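For the zombie check, a script usually wants the parent PID as well, since a zombie only goes away once its parent reaps it. A minimal sketch, assuming ps -ef marks zombies with the literal string "<defunct>" and prints PID and PPID as the second and third fields; the function name zombie_parents is hypothetical.

```shell
#!/bin/sh
# Print "PID PPID" for each defunct (zombie) entry in ps -ef output on stdin.
# Matching the full "<defunct>" marker avoids catching a grep for "defun".
zombie_parents() {
    awk '/<defunct>/ { print $2, $3 }'
}
```

For example, `ps -ef | zombie_parents`.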
7. Network
Utilization
Packet rate on each interface
Errors
netstat -s | more
nettune (HPUX 10.x) / ndd (HPUX 11.x, Sun)
nfsstat
netstat -i
nfsstat
netstat -s
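Interface error counts from netstat -i are easier to judge as a percentage of packets. The sketch below assumes a header naming Ipkts, Ierrs, Opkts and Oerrs columns (the Solaris/BSD-style layout); Linux netstat -i uses different column names and is not handled. Rows whose field count differs from the header (for example, rows with a blank address) are skipped rather than guessed at. The function name iface_error_rates is hypothetical.

```shell
#!/bin/sh
# Print input/output error rates per interface from netstat -i on stdin.
iface_error_rates() {
    awk '
        # Header line: record column positions and the expected field count.
        /Ipkts/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "Ipkts") ip = i
                if ($i == "Ierrs") ie = i
                if ($i == "Opkts") op = i
                if ($i == "Oerrs") oe = i
            }
            hn = NF
            next
        }
        # Data line: same shape as the header, nonzero packet counts.
        ip && ie && op && oe && NF == hn && $ip + 0 > 0 && $op + 0 > 0 {
            printf "%s in=%.2f%% out=%.2f%%\n", $1,
                100 * $ie / $ip, 100 * $oe / $op
        }'
}
```

For example, `netstat -i | iface_error_rates`.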