From owner-freebsd-stable@FreeBSD.ORG Mon Jul 19 12:41:39 2010
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 5C66C106564A;
	Mon, 19 Jul 2010 12:41:39 +0000 (UTC)
	(envelope-from mike@sentex.net)
Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18])
	by mx1.freebsd.org (Postfix) with ESMTP id 1D7CD8FC22;
	Mon, 19 Jul 2010 12:41:38 +0000 (UTC)
Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27])
	by lava.sentex.ca (8.14.4/8.14.3) with ESMTP id o6JCfcq5049355;
	Mon, 19 Jul 2010 08:41:38 -0400 (EDT)
	(envelope-from mike@sentex.net)
Message-Id: <201007191241.o6JCfcq5049355@lava.sentex.ca>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Mon, 19 Jul 2010 08:41:40 -0400
To: Jeremy Chadwick
From: Mike Tancsa
In-Reply-To: <20100719035844.GA93487@icarus.home.lan>
References: <201007182108.o6IL88eG043887@lava.sentex.ca>
	<20100718211415.GA84127@icarus.home.lan>
	<201007182142.o6ILgDQW044046@lava.sentex.ca>
	<20100719023419.GA91006@icarus.home.lan>
	<201007190301.o6J31Hs1045607@lava.sentex.ca>
	<20100719033424.GA92607@icarus.home.lan>
	<20100719035844.GA93487@icarus.home.lan>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
Cc: freebsd-stable@freebsd.org
Subject: Re: deadlock or bad disk ? RELENG_8
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Mon, 19 Jul 2010 12:41:39 -0000

At 11:58 PM 7/18/2010, Jeremy Chadwick wrote:

>So I believe this indicates the message only gets printed during swapin,
>not swapout.  Meaning it's happening during an I/O read from da0.

Yes, and from my existing ssh sessions, it would _seem_ no disk IO was completing.
i.e., I tried a killall -9 watchdogd, which would need to load killall from the disk and read whatever it's linked against. However, after hitting enter, it just blocked trying to read. So I would describe it as if the entire system was waiting for that "swapper Indefinite wait" to finish, or I could not read anything from drives associated with that controller.

>So what's hz?  Well, I want to assume it's kern.hz, which defaults to
>1000.  1000*20 = 20000, so the timeout would be 20000/1000 = 20 seconds.
>That's a pretty long time to be waiting for an I/O read to return.

I think the messages were printing to the serial console faster than that, but I could be wrong. If it happens again, I will time it.

         ---Mike

--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike@sentex.net
Providing Internet since 1994                     www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
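
For reference, the timeout arithmetic in the quoted text can be sketched as below. This is only an illustration, not the kernel code: the function names are hypothetical, and it assumes the timeout is hz*20 ticks, as guessed in the quote. The second helper shows one way the gaps between console messages could be compared against that figure once timestamps have been collected.

```python
def timeout_seconds(hz: int, multiplier: int = 20) -> float:
    """Convert a timeout of hz * multiplier ticks into seconds.

    Assumption: the timeout is expressed as hz * 20 ticks, as in the
    quoted text. hz is the tick rate (kern.hz).
    """
    ticks = hz * multiplier
    return ticks / hz


def message_gaps(timestamps):
    """Return the seconds elapsed between consecutive messages.

    timestamps: message arrival times in seconds, in ascending order
    (hypothetical input, e.g. noted by hand from the serial console).
    """
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


if __name__ == "__main__":
    for hz in (100, 1000):
        print(f"kern.hz={hz}: {hz * 20} ticks -> {timeout_seconds(hz):.0f} s")
```

Under that assumption the result is 20 seconds whatever kern.hz is set to, since the tick rate cancels out. Gaps from message_gaps() that come out well under 20 seconds would suggest the messages are printing faster than this timeout alone explains.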