Date: Thu, 23 Nov 2006 05:25:21 +0000
From: Chris
To: "Kris Kennaway"
Cc: FreeBSD Stable
Subject: Re: sshfs/nfs cause server lockup

On 22/11/06, Kris Kennaway wrote:
> On Wed, Nov 22, 2006 at 05:49:12AM +0000, Chris wrote:
> > On a few occasions all different remote servers I have had nfs cause
> > servers to stop responding so I stopped using it all the servers were
> > either 6.0 release 6.1 release or 6-stable.
> >
> > We recently discovered sshfs which supports cross platform mounting
> > server is linux and I mounted on a freebsd 6.1 release using security
> > branch up to date.
> >
> > it was working fine for around 5 to 6 days with some problems with
> > sshfs not updating files that are updated but wasn't compromising the
> > stability of the freebsd server I just remounted to keep up to date.
> > Then today the linux server had network problems so the sshfs timed
> > out and there is 2 dirs I mount, the first mounted fine a bit slow but
> > connected but when I ran the command to mount the 2nd dir the server
> > stopped responding.
> >
> > My 2nd ssh terminal was alive I tried to run top to see if sshfs was
> > hanging or something but when I hit enter top didn't run and the 2nd
> > terminal was froze, note both terminals didn't timeout and a ircd
> > running on the server also did not timeout but the box wasn't listening
> > to any new requests, it was responding to pings fine.
> >
> > I have a remote reboot facility on the box but no local access and no
> > kvm/serial console facility available this is the case for all of my
> > servers. I initially tried a soft reboot which uses ctrl-alt-delete
> > but the pings kept replying so I could see the reboot wasn't initiated
> > indicating some kind of console lockup as well, I then did a hard
> > reboot which brought the server back.
> >
> > All logs stopped when the first lockup occurred so no errors etc.
> > recorded bear in mind I have no local access to this machine. It does
> > appear that 6.x has some kind of serious remote mounting bug because I
> > never had these nfs problems in freebsd 5.x.
> >
> > I would be interested in any thoughts as to what could help me I have
> > rebooted the server now with network mpsafe disabled to see if this
> > will help it is using a generic kernel with the following changes.
>
> Sounds like your "sshfs" is causing the kernel to deadlock in that
> error situation. You can confirm by enabling DEBUG_LOCKS and
> DEBUG_VFS_LOCKS, then breaking to DDB and running 'show lockedvnods'
> when the deadlock occurs.
>
> If you're still having problems with NFS on 6.2, we'd much rather you
> reported those so that we can investigate and try to fix them.
>
> Kris
>
>

OK, thanks. I will make sure this box is updated to 6.2 when it is
released. If I enable those options in the kernel, will I need local
access to use DDB?

Chris
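
For reference, the debugging setup Kris describes might look roughly like the kernel configuration fragment below. This is only a sketch: DEBUG_LOCKS and DEBUG_VFS_LOCKS are the options named in his message, while the KDB and DDB lines are an assumption about what is needed to break into the debugger at all, and the idea of adding them to a copy of GENERIC is likewise assumed rather than taken from the thread.

```
# Added to a custom kernel config (assumed to be a copy of GENERIC).
options 	KDB              # kernel debugger framework (assumed prerequisite for DDB)
options 	DDB              # interactive in-kernel debugger
options 	DEBUG_LOCKS      # named in Kris's message
options 	DEBUG_VFS_LOCKS  # named in Kris's message

# Once the machine hangs and the debugger prompt is reached:
#   db> show lockedvnods
```

The 'show lockedvnods' output lists the vnodes that are locked at the time of the hang, which is the information Kris asks for to confirm the deadlock.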