Date: Thu, 27 Feb 2003 11:08:52 -0400 (AST)
From: "Marc G. Fournier" <scrappy@hub.org>
To: David Schultz <das@FreeBSD.ORG>
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: 4.8-PRERELEASE 'hangs' nightly like clockwork ...
Message-ID: <20030227110726.J17399@hub.org>
In-Reply-To: <20030226060854.GA6637@HAL9000.homeunix.com>
References: <20030225125414.P90059@hub.org> <20030226060854.GA6637@HAL9000.homeunix.com>
On Tue, 25 Feb 2003, David Schultz wrote:
> Thus spake Marc G. Fournier <scrappy@hub.org>:
> > For the past few nights, since I "fixed" the KVA_PAGES issue, the server
> > seems to be hanging almost like clockwork ... plus or minus a bit, but is
> > around 23hrs or so since the last hang (or, around 9pm CST, not sure which
> > one is the 'trigger') ...
> >
> > top, from last night's hang, shows:
> >
> > last pid: 44187; load averages: 0.29, 11.36, 19.195 up 1+00:11:55 22:04:00
> > 3173 processes:1 running, 3150 sleeping, 22 zombie
> > CPU states: 0.0% user, 0.0% nice, 8.6% system, 0.6% interrupt, 90.8% idle
> > Mem: 2335M Active, 426M Inact, 595M Wired, 205M Cache, 199M Buf, 5860K Free
> > Swap: 2048M Total, 495M Used, 1553M Free, 24% Inuse
> >
> > now, I got the folks down at Rackspace to do a ctl-alt-esc and 'panic',
> > and it dumps core, if that helps any ... a gdb on the core file just tells
> > me that a panic was issued from the keyboard ... the top session above
> > continued to run up until they issued the ctl-alt-esc, as does a ping to
> > the server, so it looks like those processes resident in memory do continue
> > to run ...
>
> It sounds like processes are blocking forever on I/O. Once you
> have a crash dump, you can run ps(1) on the image to see what
> state processes were in when the dump was taken. I think you want
> something like
> ps -alxww -M/path/to/core -N/path/to/kernel
> If you notice a bunch of them stuck in a suspicious state, load
> the dump into kgdb and type
'K, first question is ... what would I consider a "suspicious state":
jupiter# awk '{print $9}' ps.1 | sort | uniq -c
978 -
1 FFS
1 WCHAN
239 accept
324 ffsvgt
382 inode
558 lockf
4 nfsd
26 pause
236 piperd
1 pipewr
3 poll
1 ppwait
1 psleep
97 sbwait
32 select
14 ttyin
283 wait
jupiter# wc -l ps.1
3181 ps.1
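
For anyone repeating the tally above, here is a rough variant that drops the header row (the lone "WCHAN" entry) and sorts by count so the most common wait channels come first. It is only a sketch: it assumes the wait channel is the 9th whitespace-separated field, as in the ps.1 listing above, and wchan_summary is a made-up helper name, not part of any FreeBSD tool.

```shell
#!/bin/sh
# Tally wait channels from a saved `ps -alxww` listing, most common first.
# Assumption: WCHAN is field 9, matching the awk '{print $9}' run above.
# wchan_summary is a hypothetical helper name used only for illustration.
wchan_summary() {
    # NR > 1 skips the header row; with no file argument awk reads stdin.
    awk 'NR > 1 { count[$9]++ }
         END { for (w in count) printf "%d %s\n", count[w], w }' "$@" |
        sort -k1,1nr -k2,2   # highest count first, ties ordered by name
}
```

Usage would be something like `wchan_summary ps.1 | head`, which lists the ten most common wait channels.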
