The CLOSE_WAIT problem is finally solved.
First, many thanks to tonykorn97. His blog at
http://tonykorn97.itpub.net/index.php has an article that helped me a great deal (I quote it below).
Here is how I fixed it:
[oracle9i@RHEL3 oracle9i]$ /usr/sbin/lsof -i | grep 6800
oracle 22725 oracle9i 3u IPv4 18621468 TCP RHEL3:6800 (LISTEN)
oracle 22725 oracle9i 4u IPv4 18621469 TCP RHEL3:6800->RHEL3:2174 (CLOSE_WAIT)
oracle 22725 oracle9i 8u IPv4 18621568 TCP RHEL3:6800->RHEL3:2175 (CLOSE_WAIT)
oracle 22725 oracle9i 9u IPv4 18621578 TCP RHEL3:6800->RHEL3:2176 (CLOSE_WAIT)
oracle 22726 oracle9i 3u IPv4 18621468 TCP RHEL3:6800 (LISTEN)
oracle 22726 oracle9i 4u IPv4 18621469 TCP RHEL3:6800->RHEL3:2174 (CLOSE_WAIT)
oracle 22726 oracle9i 8u IPv4 18621568 TCP RHEL3:6800->RHEL3:2175 (CLOSE_WAIT)
oracle 22726 oracle9i 9u IPv4 18621578 TCP RHEL3:6800->RHEL3:2176 (CLOSE_WAIT)
[oracle9i@RHEL3 oracle9i]$ kill -9 22725
# 22725 and 22726 are the process IDs (PIDs) using port 6800.
[oracle9i@RHEL3 oracle9i]$ /usr/sbin/lsof -i | grep 6800
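# Now nothing is listed, which is great. The problem had been sitting on this server for more than three days without clearing, and port 6800 could not be used as a result.
The cause is explained all over the web: the client end of the socket hit an error and went away without a clean close, and the server process never closed its own end, so the connections stayed in CLOSE_WAIT.
lsof really is a great tool!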
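
To see at a glance which processes are holding CLOSE_WAIT sockets, the lsof -i output can be tallied per PID. The following is only a minimal sketch: the function name count_close_wait is made up for illustration, and it assumes the column layout shown in the listing above (PID in the second field, TCP state in the last field).

```shell
#!/bin/sh
# Count CLOSE_WAIT sockets per PID from `lsof -i` output read on stdin.
# Assumption: field 2 is the PID and the last field is the TCP state,
# as in the listing above.
count_close_wait() {
    awk '$NF == "(CLOSE_WAIT)" { n[$2]++ }
         END { for (pid in n) print pid, n[pid] }' | sort -n
}

# Sample rows copied from the listing above. On a live system one would
# instead run:  /usr/sbin/lsof -i | count_close_wait
sample='oracle 22725 oracle9i 3u IPv4 18621468 TCP RHEL3:6800 (LISTEN)
oracle 22725 oracle9i 4u IPv4 18621469 TCP RHEL3:6800->RHEL3:2174 (CLOSE_WAIT)
oracle 22725 oracle9i 8u IPv4 18621568 TCP RHEL3:6800->RHEL3:2175 (CLOSE_WAIT)
oracle 22725 oracle9i 9u IPv4 18621578 TCP RHEL3:6800->RHEL3:2176 (CLOSE_WAIT)
oracle 22726 oracle9i 3u IPv4 18621468 TCP RHEL3:6800 (LISTEN)
oracle 22726 oracle9i 4u IPv4 18621469 TCP RHEL3:6800->RHEL3:2174 (CLOSE_WAIT)'

printf '%s\n' "$sample" | count_close_wait
# prints:
# 22725 3
# 22726 1
```

Each output line is a PID followed by the number of CLOSE_WAIT sockets it holds, so the process that needs attention stands out without reading the whole listing.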
----------------------------------------------
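lsof can do a lot. In particular, the four options -c, -g, -p and -u are the most useful. For details, see man lsof.
1. Finding what is blocking a file system
When an administrator needs to unmount a file system and runs umount /mountpoint, the command often reports: umount: /mountpoint: device is busy.
This happens because files on that file system are still open. We then need to know which files, programs and users are still using it, so that the users can be asked to log out.
lsof can identify the processes that have files open on a particular file system:
/usr/sbin/lsof /mountpoint
Here mountpoint is the mount point. For example:
# /usr/sbin/lsof /home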
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
bash 12134 meng cwd DIR 8,5 4096 32705 /home/meng
telnet 12176 meng cwd DIR 8,5 4096 32705 /home/meng
bash 19809 meng cwd DIR 8,5 4096 32705 /home/meng
bash 20276 meng cwd DIR 8,5 4096 32705 /home/meng
su 20315 root cwd DIR 8,5 4096 32705 /home/meng
bash 20316 root cwd DIR 8,5 4096 32705 /home/meng
csh 20374 root cwd DIR 8,5 4096 32705 /home/meng
lsof 20396 root cwd DIR 8,5 4096 32705 /home/meng
lsof 20397 root cwd DIR 8,5 4096 32705 /home/meng
Clearly, every process using these open files has to be terminated before the file system can be unmounted. As root, the administrator kills the processes occupying the file system, which unblocks it.
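
Before killing anything, it helps to boil the listing down to who is holding the file system. The sketch below is an illustration only: the function name holders is hypothetical, and it assumes the header line and the COMMAND/PID/USER column order shown above.

```shell
#!/bin/sh
# Reduce `lsof <mountpoint>` output (read on stdin) to unique
# "USER PID COMMAND" lines, so users can be notified before a kill.
# Assumption: line 1 is the header; fields 1-3 are COMMAND, PID, USER.
holders() {
    awk 'NR > 1 { print $3, $2, $1 }' | LC_ALL=C sort -u
}

# A few rows from the /home listing above. On a live system one would
# run:  /usr/sbin/lsof /home | holders
# (lsof -t /home prints just the PIDs, with no header.)
sample_umount='COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
bash 12134 meng cwd DIR 8,5 4096 32705 /home/meng
telnet 12176 meng cwd DIR 8,5 4096 32705 /home/meng
su 20315 root cwd DIR 8,5 4096 32705 /home/meng
bash 20316 root cwd DIR 8,5 4096 32705 /home/meng'

printf '%s\n' "$sample_umount" | holders
# prints:
# meng 12134 bash
# meng 12176 telnet
# root 20315 su
# root 20316 bash
```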
2. Searching open network connections
To find all network connections to the remote host with IP address 10.65.64.23, run:
/usr/sbin/lsof -i@10.65.64.23
This lists every socket on the system that is open to that remote host.
# lsof -i@10.65.64.23
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
telnetd 6605 root 0u inet 0x14813f00 0t0 TCP xpp3:telnet->linuxone:33143 (ESTABLISHED)
telnetd 6605 root 1u inet 0x14813f00 0t0 TCP xpp3:telnet->linuxone:33143 (ESTABLISHED)
telnetd 6605 root 2u inet 0x14813f00 0t0 TCP xpp3:telnet->linuxone:33143 (ESTABLISHED)
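3. Finding open files that have been unlinked locally
Users often run into this: while a process is writing to a file, the file may be removed or its directory moved. This causes a serious problem. For example, a user may find that data is being written to /data, yet no file can be seen growing. lsof can find this kind of error, for example:
/usr/sbin/lsof +L1 typically shows something like this:
# lsof +L1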
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
svrMgt_mi 458 root 4r VREG 8,0 0 0 3418 / (/dev/rz0a)
yes 677 root 1w VREG 8,0 186523648 0 92888 / (/dev/rz0a)
# lsof +L1
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
svrMgt_mi 458 root 4r VREG 8,0 0 0 3418 / (/dev/rz0a)
yes 677 root 1w VREG 8,0 273588224 0 92888 / (/dev/rz0a)
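We can use kill -9 PID to end the process shown in the PID column, which clears the error and frees the space.
We can also use the -a option to restrict lsof's link-count report to a single file system. For example, to limit the output to /data, enter: /usr/sbin/lsof -a +L1 /data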
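
To judge how much disk space unlinked-but-open files are still pinning, the SIZE/OFF column can be summed over the rows whose NLINK is 0. This is a rough sketch under stated assumptions: the function name unlinked_bytes is invented for illustration, every row is assumed to have all ten columns populated as in a +L1 listing like the one above, and SIZE/OFF is assumed to hold a plain byte count rather than a 0t offset.

```shell
#!/bin/sh
# Sum SIZE/OFF (field 7) over rows whose NLINK (field 8) is 0, reading
# `lsof +L1` output on stdin. Assumes a header on line 1 and that no
# column is blank; real lsof output can leave columns empty.
unlinked_bytes() {
    awk 'NR > 1 && $8 == 0 { sum += $7 }
         END { printf "%.0f\n", sum }'
}

# Sample rows in the shape of the +L1 listing above. On a live system:
#   /usr/sbin/lsof +L1 | unlinked_bytes
sample_l1='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
svrMgt_mi 458 root 4r VREG 8,0 0 0 3418 / (/dev/rz0a)
yes 677 root 1w VREG 8,0 186523648 0 92888 / (/dev/rz0a)'

printf '%s\n' "$sample_l1" | unlinked_bytes
# prints:
# 186523648
```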
4. Listing all files opened by a program, and the processes associated with an open file
To see which files the sendmail process with PID 637 has open, run lsof -p 637. The output looks like this:
# lsof -p 637
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sendmail 637 root cwd VDIR 8,6 512 470400 /usr/var/spool/mqueue
sendmail 637 root txt VREG 8,6 466944 9650 /usr (/dev/rz0g)
sendmail 637 root txt VREG 8,0 139264 16016 /sbin/loader
sendmail 637 root txt VREG 8,0 1663104 38402 /shlib/libc.so
sendmail 637 root 0r VCHR 2,2 0t0 9607 /dev/null
sendmail 637 root 1w VCHR 2,2 0t0 9607 /dev/null
sendmail 637 root 2w VCHR 2,2 0t0 9607 /dev/null
sendmail 637 root 3u unix 0x0c2fc280 0t0 ->0x1ead2b40
sendmail 637 root 4u inet 0x0c34c200 0t0 TCP *:smtp (LISTEN)
The output above shows all the files, devices, libraries and sockets the program currently has open.
The following command shows which processes are using a particular file. As shown below, only the system logging daemon syslogd has the messages file open.
# lsof /var/adm/messages
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
syslogd 147 root 16w VREG 8,6 2653365 22501 /usr/var/adm/messages
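5. Other useful options (for details see man lsof; this part summarizes some material I consulted)
With no options at all, lsof lists every file opened by every process.
Options that take no value can be combined, e.g. -a -b -C is the same as -abC
-? -h These two options mean the same thing: display lsof's usage help.
-a Treats the selection options as ANDed (note: once -a is given, it applies to all of the selection options.)
-c c Lists files currently opened by processes whose command name begins with the string c
Example: list the files currently opened by the init process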
# lsof -c init
COMMAND PID USER FD TYPE DEVICE SIZE/OFF INODE NAME
init 1 root cwd VDIR 4095,365376 8192 2 /
init 1 root txt VREG 4095,365376 286720 463 /sbin/init
+d s Searches by directory s; this option does not descend into its subdirectories
Example: list the files under /usr/users/tongxl that processes currently have open (shown below)
# lsof +d /usr/users/tongxl
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ksh 26946 root cwd VDIR 8,6 512 51281 /usr/users/tongxl/c
a.out 26953 root cwd VDIR 8,6 512 51281 /usr/users/tongxl/c
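+D D Same as above, but also searches the directories beneath it, which takes longer. (Note: with this option lsof needs more dynamic memory, especially on large directories, so use it with care.)
Example: list the files under /usr/users/tongxl, including its subdirectories, that processes currently have open (below); the difference between the two options is plain to see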
# lsof +D /usr/users/tongxl
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ksh 26946 root cwd VDIR 8,6 512 51281 /usr/users/tongxl/c
a.out 26953 root cwd VDIR 8,6 512 51281 /usr/users/tongxl/c
a.out 26953 root txt VREG 8,6 24576 51311 /usr/users/tongxl/c/a.out
-d s Selects output by file descriptor (FD). A range such as 1-3 or 3-10 may be used, but the first number must be smaller than the last.
Example: select FD 4
# lsof -d 4
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
syslogd 147 root 4u inet 0x1fe0b980 0t0 UDP *:syslog
binlogd 151 root 4u inet 0x1fe0bd40 0t0 UDP *:*
portmap 319 root 4u inet 0x1fe0b740 0t0 UDP *:111
mountd 321 root 4u VREG 8,6 253 22516 /usr (/dev/rz0g)
nfsd 323 root 4u inet 0x0c349e00 0t0 TCP *:2049 (LISTEN)
rpc.statd 330 root 4u inet 0x1ab42000 0t0 TCP xpp3:1024 (LISTEN)
rpc.lockd 332 root 4u inet 0x1fe0bbc0 0t0 UDP xpp3:1028
snmpd 449 root 4u unix 0x1aaf6500 0t0 /var/esnmp/esnmpd
svrMgt_mi 457 root 4r VREG 8,0 0 3424 / (/dev/rz0a)
os_mibs 458 root 4u inet 0x1ab475c0 0t0 UDP *:*
cpq_mibs 460 root 4u unix 0x1aaf77c0 0t0 /var/esnmp/esnmp_sub460
advfsd 472 root 4u inet 0x0c320000 0t0 TCP *:AdvFS (LISTEN)
insightd 475 root 4r VDIR 8,6 512 25610 /usr (/dev/rz0g)
inetd 506 root 4u inet 0x1ab26700 0t0 TCP *:ftp (LISTEN)
lpd 567 root 4wW VREG 8,6 4 451219 /usr (/dev/rz0g)
dtlogin 605 root 4w VREG 8,6 4 344028 /usr (/dev/rz0g)
Xdec 616 root 4w VREG 8,6 4 344028 /usr (/dev/rz0g)
sendmail 702 root 4u inet 0x0c321900 0t0 TCP *:smtp (LISTEN)
dtlogin 891 root 4w VREG 8,6 4 344028 /usr (/dev/rz0g)
dxconsole 907 root 4w VREG 8,6 4 344028 /usr (/dev/rz0g)
dtgreet 908 root 4w VREG 8,6 4 344028 /usr (/dev/rz0g)
-g [s] Selects by the process group ID (PGID) of the process. The PGIDs are given as a comma-separated set (for example 3,5); if none is specified, all are shown.
Example: select PGID 3
# lsof -g 3
COMMAND PID PGID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kloadsrv 3 3 root cwd VDIR 8,0 2560 2 /
kloadsrv 3 3 root txt VREG 8,0 221184 16041 /sbin/kloadsrv
kloadsrv 3 3 root 0r VCHR 0,0 0t0 9608 /dev/console
kloadsrv 3 3 root 1w VCHR 0,0 0t0 9608 /dev/console
kloadsrv 3 3 root 2w VCHR 0,0 0t0 9608 /dev/console
-i [i] Lists the files whose Internet address matches i. If no address is specified, all Internet and network files are listed.
語(yǔ)法: lsof -i[46] [protocol][@hostname|hostaddr][:service|port]
46 --> IPv4 or IPv6
protocol --> TCP or UDP
hostname --> Internet host name
hostaddr --> IPv4 address
service --> a service name from /etc/services (more than one allowed)
port --> port number (more than one allowed)
# lsof -i tcp@xp001
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
telnetd 26862 root 0u inet 0x0c349000 0t0 TCP xpp3:telnet->xp001:3807 (ESTABLISHED)
telnetd 26862 root 1u inet 0x0c349000 0t0 TCP xpp3:telnet->xp001:3807 (ESTABLISHED)
telnetd 26862 root 2u inet 0x0c349000 0t0 TCP xpp3:telnet->xp001:3807 (ESTABLISHED)
telnetd 26986 root 0u inet 0x1ab27100 0t0 TCP xpp3:telnet->xp001:3988 (ESTABLISHED)
telnetd 26986 root 1u inet 0x1ab27100 0t0 TCP xpp3:telnet->xp001:3988 (ESTABLISHED)
telnetd 26986 root 2u inet 0x1ab27100 0t0 TCP xpp3:telnet->xp001:3988 (ESTABLISHED)
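
The -i address syntax above is easy to get wrong when only some parts are needed. As a small illustration, the helper below assembles the [protocol][@host][:port] string from optional pieces. The function name lsof_i_addr is hypothetical and it does not call lsof itself; it only builds the argument.

```shell
#!/bin/sh
# Build the address argument for `lsof -i` following the syntax
# [protocol][@hostname|hostaddr][:service|port].
# Usage: lsof_i_addr PROTOCOL HOST PORT   (pass "" to leave a part out)
lsof_i_addr() {
    addr=$1
    if [ -n "$2" ]; then addr="$addr@$2"; fi
    if [ -n "$3" ]; then addr="$addr:$3"; fi
    printf '%s\n' "$addr"
}

lsof_i_addr tcp xp001 telnet     # prints: tcp@xp001:telnet
lsof_i_addr "" 10.65.64.23 ""    # prints: @10.65.64.23
lsof_i_addr tcp "" 6800          # prints: tcp:6800

# On a live system the result would be passed on, for example:
#   /usr/sbin/lsof -i "$(lsof_i_addr tcp "" 6800)"
```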
-l Disables conversion of user IDs to login names. (Login names are shown by default.)
# lsof -l|more
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kernel 0 0 cwd VDIR 8,0 2560 2 /
init 1 0 cwd VDIR 8,0 2560 2 /
init 1 0 txt VREG 8,0 286720 16015 / (/dev/rz0a)
kloadsrv 3 0 cwd VDIR 8,0 2560 2 /
kloadsrv 3 0 txt VREG 8,0 221184 16041 /sbin/kloadsrv
kloadsrv 3 0 0r VCHR 0,0 0t0 9608 /dev/console
kloadsrv 3 0 1w VCHR 0,0 0t0 9608 /dev/console
kloadsrv 3 0 2w VCHR 0,0 0t0 9608 /dev/console
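+|-L [l] + enables and - disables the listing of file link counts. A bare +L with no number lists the link counts of all files. If a number follows, only files whose link count is less than that number are listed.
-n Does not convert IP addresses to host names. By default -n is off.
Example: lsof -i tcp@xp001 -n
(Compare this with the same command run without -n: the original host names turn into IP addresses.)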
# lsof -i tcp@xp001 -n
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
telnetd 26862 root 0u inet 0x0c349000 0t0 TCP 10.65.69.147:telnet->10.65.69.131:3807 (ESTABLISHED)
telnetd 26862 root 1u inet 0x0c349000 0t0 TCP 10.65.69.147:telnet->10.65.69.131:3807 (ESTABLISHED)
telnetd 26862 root 2u inet 0x0c349000 0t0 TCP 10.65.69.147:telnet->10.65.69.131:3807 (ESTABLISHED)
telnetd 26986 root 0u inet 0x1ab27100 0t0 TCP 10.65.69.147:telnet->10.65.69.131:3988 (ESTABLISHED)
telnetd 26986 root 1u inet 0x1ab27100 0t0 TCP 10.65.69.147:telnet->10.65.69.131:3988 (ESTABLISHED)
telnetd 26986 root 2u inet 0x1ab27100 0t0 TCP 10.65.69.147:telnet->10.65.69.131:3988 (ESTABLISHED)
# lsof -i tcp@xp001
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
telnetd 26862 root 0u inet 0x0c349000 0t0 TCP xpp3:telnet->xp001:3807 (ESTABLISHED)
telnetd 26862 root 1u inet 0x0c349000 0t0 TCP xpp3:telnet->xp001:3807 (ESTABLISHED)
telnetd 26862 root 2u inet 0x0c349000 0t0 TCP xpp3:telnet->xp001:3807 (ESTABLISHED)
telnetd 26986 root 0u inet 0x1ab27100 0t0 TCP xpp3:telnet->xp001:3988 (ESTABLISHED)
telnetd 26986 root 1u inet 0x1ab27100 0t0 TCP xpp3:telnet->xp001:3988 (ESTABLISHED)
telnetd 26986 root 2u inet 0x1ab27100 0t0 TCP xpp3:telnet->xp001:3988 (ESTABLISHED)
-s Lists the file size; if the file has no size, the column is left blank.
# lsof -s
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
kernel 0 root cwd VDIR 8,0 2560 2 /
init 1 root cwd VDIR 8,0 2560 2 /
init 1 root txt VREG 8,0 286720 16015 / (/dev/rz0a)
kloadsrv 3 root cwd VDIR 8,0 2560 2 /
kloadsrv 3 root txt VREG 8,0 221184 16041 /sbin/kloadsrv
kloadsrv 3 root 0r VCHR 0,0 9608 /dev/console
kloadsrv 3 root 1w VCHR 0,0 9608 /dev/console
kloadsrv 3 root 2w VCHR 0,0 9608 /dev/console
-u s Lists the files opened by the user whose login name or UID is s.
# lsof -u tongxl
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
csh 26939 tongxl cwd VDIR 8,6 1024 243236 /usr -- tongxl
csh 26939 tongxl txt VREG 8,6 253952 12856 /usr (/dev/rz0g)
csh 26939 tongxl txt VREG 8,0 139264 16016 /sbin/loader
csh 26939 tongxl txt VREG 8,0 1663104 38402 /shlib/libc.so
csh 26939 tongxl 0r VCHR 1,0 0t0 9612 /dev/tty
csh 26939 tongxl 15u VCHR 6,2 0t328 9618 /dev/pts/2
csh 26939 tongxl 16u VCHR 6,2 0t328 9618 /dev/pts/2
csh 26939 tongxl 17u VCHR 6,2 0t328 9618 /dev/pts/2
csh 26939 tongxl 18u VCHR 6,2 0t328 9618 /dev/pts/2
csh 26939 tongxl 19u VCHR 6,2 0t328 9618 /dev/pts/2
csh 26990 tongxl cwd VDIR 8,6 1024 243236 /usr -- tongxl
csh 26990 tongxl txt VREG 8,6 253952 12856 /usr (/dev/rz0g)
csh 26990 tongxl txt VREG 8,0 139264 16016 /sbin/loader
csh 26990 tongxl txt VREG 8,0 1663104 38402 /shlib/libc.so
csh 26990 tongxl 0r VCHR 1,0 0t0 9612 /dev/tty
csh 26990 tongxl 15u VCHR 6,1 0t147797 9616 /dev/pts/1
csh 26990 tongxl 16u VCHR 6,1 0t147797 9616 /dev/pts/1
csh 26990 tongxl 17u VCHR 6,1 0t147797 9616 /dev/pts/1
csh 26990 tongxl 18u VCHR 6,1 0t147797 9616 /dev/pts/1
csh 26990 tongxl 19u VCHR 6,1 0t147797 9616 /dev/pts/1
----------------------------------------------------------------