From: Kostik Belousov
To: Vivek Khera
Date: Thu, 5 Oct 2006 11:30:27 +0300
Message-ID: <20061005083027.GK89654@deviant.kiev.zoral.com.ua>
References: <917B087C-5E13-4D7F-94FA-95CB0E5C1884@khera.org>
 <20060922190328.GA64849@xor.obsecurity.org>
 <555B84D2-520F-44D6-84D6-CF9CE7EE47C7@khera.org>
 <20060922203654.GA65693@xor.obsecurity.org>
 <847DD3A5-D5DD-4D3E-B755-64B13D1DA506@khera.org>
 <20061003084315.GA89654@deviant.kiev.zoral.com.ua>
 <40CE3CF0-49D2-4335-A0B8-34B5251E9E19@khera.org>
In-Reply-To: <40CE3CF0-49D2-4335-A0B8-34B5251E9E19@khera.org>
Cc: stable@freebsd.org
Subject: Re: ffs snapshot lockup

On Wed, Oct 04, 2006 at 05:16:53PM -0400, Vivek Khera wrote:
>
> On Oct 3, 2006, at 4:43 AM, Kostik Belousov wrote:
>
> >>Details are posted at http://vivek.khera.org/scratch/crashlogs/
> >>
> >>I have the crashdumps available to a kernel hacker upon request (i'd
> >>rather not make them generally available to the public...)
> >>
> >It seems that you have snapshotted fs exported by nfsd ? At least, 18a is
> >definitely the case. I have the patch (for current) that shall fix the issue.
> >In fact, you need two patches:
>
> As per advice of Kris Kenneway, I turned off the software watchdog to
> rule out that as my problem. Then I ran a level 3 dump. Dump of root
> fs went fine, then it proceeded to do /usr. After a few minutes it
> locked up. Typescript 20 at the above URL shows the debugging info
> from the break into debugger of the locked up system. Since /usr was
> locked, nobody could log in at all.
>
> The network load was minimal at the time. I had everyone log out and
> close mail etc.
>

What were the symptoms of the locked system?
Could you log in on the console, or do anything at the shell prompt on
the console? Also, did the system respond to pings? Fs-related deadlocks
(as well as stalled disk io) usually do not prevent the lowest levels of
the isr/network stack from working.

Again, I do not see the fs deadlock per se in the supplied script. Dump
is doing disk io, and it seems that nfsd is trying to serve some request.
Sshd looks to be ready to accept connections.

If the console is available but ping responses do not arrive, this is
definitely a network card problem.
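
The two questions above (is the console usable, do pings get answered) can be read as a small decision table. The following is only a rough sketch of that reasoning, not a diagnostic tool: the function name `triage_lockup` and its return strings are hypothetical, and the two combinations the message does not discuss are deliberately left as "undetermined".

```python
def triage_lockup(console_usable: bool, ping_replies: bool) -> str:
    """Map the two observations asked about in the message to a likely cause.

    Hypothetical helper for illustration only; the labels are assumptions,
    not output of any FreeBSD tool.
    """
    # Console works but pings go unanswered: the message calls this
    # "definitely a network card problem".
    if console_usable and not ping_replies:
        return "network card problem"

    # Pings are answered but nobody can get a usable shell: fs-related
    # deadlocks and stalled disk io usually leave the lowest levels of
    # the isr/network stack working, so this pattern is consistent with them.
    if ping_replies and not console_usable:
        return "consistent with fs deadlock or stalled disk io"

    # The message does not say what the remaining combinations mean.
    return "undetermined"


if __name__ == "__main__":
    for console in (True, False):
        for ping in (True, False):
            print(f"console={console} ping={ping}: {triage_lockup(console, ping)}")
```

The point of the sketch is only that the two observations are independent and both are needed before blaming the snapshot code.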