Date:      Mon, 19 Jul 2010 13:16:59 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Mike Tancsa <mike@sentex.net>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: deadlock or bad disk ?  RELENG_8
Message-ID:  <20100719201659.GA21088@icarus.home.lan>
In-Reply-To: <201007191241.o6JCfcq5049355@lava.sentex.ca>
References:  <201007182108.o6IL88eG043887@lava.sentex.ca> <20100718211415.GA84127@icarus.home.lan> <201007182142.o6ILgDQW044046@lava.sentex.ca> <20100719023419.GA91006@icarus.home.lan> <201007190301.o6J31Hs1045607@lava.sentex.ca> <20100719033424.GA92607@icarus.home.lan> <20100719035844.GA93487@icarus.home.lan> <201007191241.o6JCfcq5049355@lava.sentex.ca>

On Mon, Jul 19, 2010 at 08:41:40AM -0400, Mike Tancsa wrote:
> At 11:58 PM 7/18/2010, Jeremy Chadwick wrote:
> 
> >So I believe this indicates the message only gets printed during swapin,
> >not swapout.  Meaning it's happening during an I/O read from da0.
> 
> Yes, and from my existing ssh sessions, it would _seem_ no disk I/O
> was completing.  I.e., I tried a killall -9 watchdogd, which would need
> to load killall from the disk and read whatever it's linked against.
> However, after hitting enter it just blocked trying to read.
> So I would describe it as if the entire system was waiting for that
> "swapper Indefinite wait" to finish, or I could not read anything
> from drives associated with that controller.

Hmm, okay, so it sounds like the controller wedged or arcmsr(4) started
acting oddly.  I would open up a case with Areca on the problem,
*especially* if it happens again.

> >So what's hz?  Well, I want to assume it's kern.hz, which defaults to
> >1000.  1000*20 = 20000, so the timeout would be 20000/1000 = 20 seconds.
> >That's a pretty long time to be waiting for an I/O read to return.
> 
> I think the messages were printing to the serial console faster than
> that, but I could be wrong. If it happens again, I will time it.

Come to think of it, I'm betting you'd get large batches of these
messages if/when it happens.  That VM code isn't something I'm familiar
with (nor is msleep(9)); I just dug around and found what I could.
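
For what it's worth, the tick arithmetic quoted above can be sketched
as below.  This is only an illustration, not the kernel code: it
assumes the timeout is passed as hz*20 ticks and that hz matches
kern.hz, as guessed earlier in the thread.  The function name
timeout_seconds is made up for the example.

```python
def timeout_seconds(timo_ticks: int, hz: int) -> float:
    """Convert a tick-based sleep timeout into seconds.

    hz is the number of clock ticks per second (assumed here to be
    the value of kern.hz); timo_ticks is the timeout in ticks.
    """
    if hz <= 0:
        raise ValueError("hz must be positive")
    return timo_ticks / hz


if __name__ == "__main__":
    # Assumption from the thread: the timeout is expressed as hz * 20.
    for hz in (100, 1000):
        timo = hz * 20
        print(f"hz={hz}: {timo} ticks = {timeout_seconds(timo, hz)} seconds")
```

If that assumption holds, the hz factor cancels out, so the wait
should be 20 seconds per message whatever kern.hz is set to.  Messages
arriving faster than that would hint at several waiters rather than a
shorter timeout.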

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
