swap_pager: indefinite wait buffer
Miroslav Lachman
000.fbsd at quip.cz
Sat Aug 5 20:05:45 CEST 2006
Dnes se mi v logu na jednom serveru objevilo nekolik zprav, jako je tato:
swap_pager: indefinite wait buffer: bufobj: 0, blkno: 23, size: 4096
Zhruba ve stejnou dobu "umrel" i proces, ktery testoval diskovy system
(kopirovani souboru a overovani jejich md5 checksumu), od teto chvile je
server neuveritelne zpomaleny. Jakakoliv operace v prikazovem radku pres
SSH trva velice dlouho (napriklad i dokoncovani nazvu souboru v shellu).
System je pritom nevytizen. Sice na nem je spusteni Apache, MySQL,
Postfix, Amavisd atd., ale zatim neni nasazen do provozu. Jedine, co
vykonava nejakou cinnost je generovani grafu MRTG.
Vyse ovedena hlaska se v prubehu dne jeste nekolikrat zopakovala.
last pid: 51602; load averages: 0.00, 0.00, 0.00 up
2+20:23:23 19:45:22
94 processes: 1 running, 93 sleeping
CPU states: 0.4% user, 0.0% nice, 0.0% system, 0.0% interrupt, 99.6%
idle
Mem: 257M Active, 100M Inact, 102M Wired, 26M Cache, 60M Buf, 9316K Free
Swap: 3072M Total, 152K Used, 3072M Free
PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
1077 root 1 5 0 2016K 1180K ttyin 3:26 0.00% systat
599 mysql 10 102 0 493M 70820K ucond 0:43 0.00% mysqld
1087 root 1 -8 0 2520K 1828K piperd 0:25 0.00% ccze
544 root 1 96 0 26628K 25092K select 0:24 0.00% perl5.8.8
491 root 1 96 0 3180K 1444K select 0:07 0.00% ntpd
542 root 1 8 0 23524K 11116K nanslp 0:07 0.00% httpd
24567 root 1 -16 0 1948K 1220K vnread 0:05 0.00% cp
Napriklad spusteni prikazu gstat trva vic nez minutu.
systat -vmstat
1 users Load 0.04 0.01 0.00 Aug 5 19:49
Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 131024 8804 906652 26924 32056 count
All 498000 75216246234032 210428 pages
Interrupts
Proc:r p d s w Csw Trp Sys Int Sof Flt cow 2008 total
1 12 80 356 4 69 158 2 104572 wire
14: ata
262212 act 5
16: bge
0.3%Sys 0.0%Intr 0.0%User 0.0%Nice 99.7%Idl 106616 inact 2
19: uhc
| | | | | | | | | | 26488 cache 2001
cpu0: time
5568 free
daefr
Namei Name-cache Dir-cache prcfr
Calls hits % hits % react
1 1 100 pdwake
zfod pdpgs
Disks ad4 ad5 ozfod intrn
KB/t 13.67 13.67 %slo-z 61456 buf
tps 1 1 1 tfree 18 dirtybuf
MB/s 0.02 0.02 35659 desiredvnodes
% busy 111 111 29876 numvnodes
8913 freevnodes
Vsimnete si, ze disky maji 111% busy (pri jednotlivych obnovenich
obrazovky se hodnoty pohybuji od 90 do 115), presto tps pouze 1 a datovy
tok temer nulovy.
Ve vypisu gstat je videt problikavani velmi vysokych hodnot
procentualniho zatizeni (cisla od 150 az do 3000), ale jsou to jen velmi
kratke okamziky, rozhodne to nevypada, jako kdyz se na disk opravdu neco
systematicky zapisuje / cte. Zaroven jsou tam silene dlouhe casy pro
zapisy (ms/w).
dT: 0.501s w: 0.500s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 0 0 0 0.0 0 0 0.0 0.0| acd0
4 2 0 0 0.0 2 4 5110.3 164.6| ad5
5 2 0 0 0.0 2 4 5002.8 167.9| ad4
6 2 0 0 0.0 2 4 5002.9 167.9| mirror/gm0
0 0 0 0 0.0 0 0 0.0 0.0| ad4s1
0 0 0 0 0.0 0 0 0.0 0.0| ad4s2
0 0 0 0 0.0 0 0 0.0 0.0| ad5s1
0 0 0 0 0.0 0 0 0.0 0.0| ad5s2
1 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1
5 2 0 0 0.0 2 4 5003.0 167.9| mirror/gm0s2
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1a
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1b
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1c
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1d
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1e
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1f
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1g
1 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s1h
0 0 0 0 0.0 0 0 0.0 0.0| mirror/gm0s2c
5 2 0 0 0.0 2 4 5003.0 167.9| mirror/gm0s2d
Nejsem bohuzel nijak schopen zjistit, cim to muze byt.
Zmineny stav trva uz vice jak 12 hodin, neni mi znamo, ze by neco
opravdu pracovalo s diskem. Dokazal by mi nekdo poradit, cim a jak
zjistovat, kde muze byt problem?
Miroslav Lachman
PS: stale se jedna o jiz nekolikrat zminovane stroje ASUS RS120-E3 s
250GB SATA disky Seagate s pouzitim Gmirroru.
FreeBSD 6.1-STABLE #0: Wed Aug 2 22:07:55 CEST 2006
/usr/obj/usr/src/sys/ASUS_RS120 i386
More information about the Users-l
mailing list