Date: Fri, 6 Oct 2006 21:20:20 +0300
From: Kostik Belousov
To: Vivek Khera
Cc: stable@freebsd.org, Kris Kennaway
Subject: Re: ffs snapshot lockup
Message-ID: <20061006182020.GH26993@deviant.kiev.zoral.com.ua>
List-Id: Production branch of FreeBSD source code

On Fri, Oct 06, 2006 at 02:11:05PM -0400, Vivek Khera wrote:
>
> On Oct 6, 2006, at 1:57 PM, Kris Kennaway wrote:
>
> >>This is very strange. Your 3 instances of getty were just reading the
> >>tty input, and all suspect processes (like sshd) are waiting on net
> >>events. No processes are blocked on the fs. One nfsd is serving the
> >>request, and dump is active.
> >
> >To repeat something I said earlier: when creating a snapshot
> >(e.g. which dump -L does), the entire system may become unresponsive
> >until the snapshot completes, which can take many minutes.
>
> I know snapshot takes a while -- we're used to that.
>
> >How long are you waiting before pronouncing the system deadlocked?
>
> 10's of minutes.
>
> >What does ^T on the console (e.g. when trying to log in), show you?
>

There was no active snapshotting in progress. The snapshot had already
been made, and dump was happily processing at the moment captured in
the script.

> nothing. the console is non-responsive.
> the remote shells are non-responsive to any input.
>
> I'm now convinced it was all stemming from some bug in the bge driver
> (at least for my specific chipset). Last night I put in an old spare
> 3c905 NIC and turned off the motherboard bge via BIOS.
>
> I can't make the machine lock up at all, even with the watchdog
> running, and doing level 0 dumps.
>
> Also, even though this NIC is only 10/100 and the prior was running
> at GigE speed, the system is *way* more responsive to network
> operations. For example, when I logged in this morning my IMAP mail
> client took barely a second or so to open my inbox, whereas before
> it would take upwards of 10 seconds.
>
> This machine was always this way since it was first set up running
> 5.3. I can't believe I lived with it for so long... I'd like to
> find a nice stable GigE NIC for it, since I know that the onboard bge
> is definitely sub-optimal with FreeBSD. Dell's diagnostics don't
> find any hardware fault, for what that's worth.
>
> Curiously, I have a handful of other Dell servers at the office which
> all have bge and run just great at GigE speed to the same switch.
>
> If it does lock up again, I'll be sure to let you know!
>

Was this system patched with the stuff I submitted to you?