Date: Thu, 27 Feb 2003 11:08:52 -0400 (AST)
From: "Marc G. Fournier" <scrappy@hub.org>
To: David Schultz <das@FreeBSD.ORG>
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: 4.8-PRERELEASE 'hangs' nightly like clockwork ...
Message-ID: <20030227110726.J17399@hub.org>
In-Reply-To: <20030226060854.GA6637@HAL9000.homeunix.com>
References: <20030225125414.P90059@hub.org> <20030226060854.GA6637@HAL9000.homeunix.com>

On Tue, 25 Feb 2003, David Schultz wrote:

> Thus spake Marc G. Fournier <scrappy@hub.org>:
> > For the past few nights, since I "fixed" the KVA_PAGES issue, the server
> > seems to be hanging almost like clockwork ... plus or minus a bit, but is
> > around 23hrs or so since the last hang (or, around 9pm CST, not sure which
> > one is the 'trigger') ...
> >
> > top, from last night's, shows:
> >
> > last pid: 44187; load averages: 0.29, 11.36, 19.195 up 1+00:11:55 22:04:00
> > 3173 processes:1 running, 3150 sleeping, 22 zombie
> > CPU states: 0.0% user, 0.0% nice, 8.6% system, 0.6% interrupt, 90.8% idle
> > Mem: 2335M Active, 426M Inact, 595M Wired, 205M Cache, 199M Buf, 5860K Free
> > Swap: 2048M Total, 495M Used, 1553M Free, 24% Inuse
> >
> > now, I got the folks down at Rackspace to do a ctl-alt-esc and 'panic',
> > and it dumps core, if that helps any ... a gdb on the core file just tells
> > me that a panic was issued from the keyboard ... the top session above
> > continued to run up until they issued the ctl-alt-esc, as does a ping to
> > the server, so it looks like those processes resident in memory do continue
> > to run ...
>
> It sounds like processes are blocking forever on I/O. Once you
> have a crash dump, you can run ps(1) on the image to see what
> state processes were in when the dump was taken. I think you want
> something like
>     ps -alxww -M/path/to/core -N/path/to/kernel
> If you notice a bunch of them stuck in a suspicious state, load
> the dump into kgdb and type

'K, first question is ... what would I consider a "suspicious state":

jupiter# awk '{print $9}' ps.1 | sort | uniq -c
 978 -
   1 FFS
   1 WCHAN
 239 accept
 324 ffsvgt
 382 inode
 558 lockf
   4 nfsd
  26 pause
 236 piperd
   1 pipewr
   3 poll
   1 ppwait
   1 psleep
  97 sbwait
  32 select
  14 ttyin
 283 wait
jupiter# wc -l ps.1
3181 ps.1
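
For reference, the tally above can be made a little easier to scan by skipping the ps header line (so WCHAN itself is not counted) and sorting by count, so the most common wait channels come first. This is only a sketch: the function name wchan_tally is made up for illustration, and it assumes the saved output is in the `ps -alxww` layout, with WCHAN in field 9 as in the awk command above.

```sh
#!/bin/sh
# Tally the wait-channel column of saved ps output, most common first.
# Assumption: the file has the `ps -alxww` layout, so field 9 is WCHAN
# (the same field the original awk one-liner prints).
wchan_tally() {
    # NR > 1 skips the header line, so the literal "WCHAN" is not counted.
    awk 'NR > 1 { print $9 }' "$1" | sort | uniq -c | sort -rn
}

# Run against the saved listing only if it is present.
if [ -r ps.1 ]; then
    wchan_tally ps.1
fi
```

The counts are the same as in the unsorted listing; only the order changes, which puts the large groups (lockf, inode, ffsvgt and so on) at the top.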