From: Vivek Khera
Date: Fri, 6 Oct 2006 14:11:05 -0400
To: Kris Kennaway
Cc: Kostik Belousov, stable@freebsd.org
Subject: Re: ffs snapshot lockup

On Oct 6, 2006, at 1:57 PM, Kris Kennaway wrote:

>> This is very strange. You 3 instances of getty where just reading the
>> tty input, and all suspectible processes (like sshd) are waiting on net
>> events. No processes are blocked on the fs. One nfsd is serving the
>> request, and dump is active.
>
> To repeat something I said earlier: when creating a snapshot
> (e.g. which dump -L does), the entire system may become unresponsive
> until the snapshot completes, which can take many minutes.

I know snapshot takes a while -- we're used to that.

> How long are you waiting before pronouncing the system deadlocked?

Tens of minutes.

> What does ^T on the console (e.g. when trying to log in), show you?

Nothing. The console is non-responsive, and the remote shells do not
respond to any input.

I'm now convinced it was all stemming from some bug in the bge driver
(at least for my specific chipset). Last night I put in an old spare
3c905 NIC and turned off the motherboard bge via the BIOS. I can't make
the machine lock up at all, even with the watchdog running and while
doing level 0 dumps.

Also, even though this NIC is only 10/100 and the prior one was running
at GigE speed, the system is *way* more responsive to network
operations. For example, when I logged in this morning my IMAP mail
client took barely a second or so to open my inbox, whereas before it
would take upwards of 10 seconds. This machine has been this way ever
since it was first set up running 5.3. I can't believe I lived with it
for so long...

I'd like to find a nice stable GigE NIC for it, since I know that the
onboard bge is definitely sub-optimal with FreeBSD. Dell's diagnostics
don't find any hardware fault, for what that's worth. Curiously, I have
a handful of other Dell servers at the office which all have bge and
run just great at GigE speed to the same switch.

If it does lock up again, I'll be sure to let you know!