From owner-freebsd-current Wed Jul 7 10:31:33 1999
Delivered-To: freebsd-current@freebsd.org
Received: from cygnus.rush.net (cygnus.rush.net [209.45.245.133])
        by hub.freebsd.org (Postfix) with ESMTP id AB43D14CDE
        for ; Wed, 7 Jul 1999 10:31:29 -0700 (PDT)
        (envelope-from bright@rush.net)
Received: from localhost (bright@localhost)
        by cygnus.rush.net (8.9.3/8.9.3) with SMTP id NAA12545;
        Wed, 7 Jul 1999 13:37:42 -0400 (EDT)
Date: Wed, 7 Jul 1999 12:37:40 -0500 (EST)
From: Alfred Perlstein
To: Peter Wemm
Cc: current@FreeBSD.ORG
Subject: Re: nfs ick in -current
In-Reply-To: <19990707160736.E151878@overcee.netplex.com.au>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Thu, 8 Jul 1999, Peter Wemm wrote:

> Alfred Perlstein wrote:
> > On Wed, 7 Jul 1999, Peter Wemm wrote:
> >
> > > > attempting to compile xscreensaver has triggered it twice in a row
> > > > /usr/ports is mounted off "server" (a freebsd -current box) and
> > > > doing the make will kill the machine.
> > > >
> > > > Once I figure out where the heck I have the console redirected
> > > > I'll have something more substantial.
> > >
> > > You have a block size of 32K, I'll bet it's 'Bad nfs svc reply' in
> > > nfs_syscalls.c. This is triggered when the READDIRPLUS op generates
> > > an oversized reply and it triggers the sanity check.
> >
> > odd, shouldn't it know not to violate its own sanity checks? Just
> > throttle down the requests, or spit out an error?
>
> Well, I still am having trouble getting my head around what's going on.
> In a nutshell, the code goes to a fair bit of trouble to not generate a
> packet that violates the protocols, and yet it still does, by a long shot.
> I'm setting up to try and reproduce this so I can catch it live rather than
> long afterwards.

odd.

> > > Change it to 16K and I think it'll work. Otherwise change the panic to a
> > > printf(), but that is sweeping the problem under the carpet and might just
> > > give the client indigestion. It does stop the server crashing though.
> >
> > I'll try that, thanks, set my -r and -w to 16k. Um something odd
> > though, from the code:
> >
> >         if (siz <= 0 || siz > NFS_MAXPACKET) {
> >                 printf("mbuf siz=%d\n",siz);
> >                 panic("Bad nfs svc reply");
> >         }
> >
> > then earlier:
> > nfsproto.h:#define NFS_MAXPKTHDR 404
> > nfsproto.h:#define NFS_MAXPACKET (NFS_MAXPKTHDR + NFS_MAXDATA)
> > nfsproto.h:#define NFS_MAXDATA 32768
> >
> > isn't 32k within the safe limits, or sometimes when building the RPC
> > reply it can just get too big?
>
> It's running over by at least 250 bytes or so. ie: 32K + 680, which is
> more than 32K + 404. I have not got the protocol maps handy so I'm not yet
> sure if the limits are wrong, or the request is wrong, or the readdirplus
> code is wrong. In any case, *something* is wrong. :-)

oy vey. :)

> > Oh, one more thing, I'm getting "device busy" even when using -f
> > during my unmount of wedged nfs mounts, all I needed to do was
> > kill -9 a shell sitting in that dir, but shouldn't that be
> > unnecessary?
>
> Umm.. what were you unmounting? server:/mount or the /localmount? There
> are some well known problems with trying to unmount /localmount for a
> wedged mount - the damn code stat's it and hangs itself. Trying to fix
> this and yet accommodate all the other tweaks for handling symlinks
> (ie: /dev/cdrom -> /dev/cd0a) etc style hacks got me rather confused last
> time.
>
> Also, the VM has changed a bit over the last few months, deadfs may not be
> finding all the hooks into the VFS that might be still there.

unmounting a hung nfs mount:

client% mount
/dev/da0s1a on / (local, soft-updates, writes: sync 10 async 9392)
...
server:/vol/extra/ports on /usr/ports
server:/vol/extra/ncvs on /home/ncvs
client% umount -f /home/ncvs
Device busy.
client%

-Alfred
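
For reference, the size arithmetic discussed in the thread can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the kernel code: the three constants are the ones quoted from nfsproto.h above, while reply_size_ok(), reply_overrun() and check_reply() are hypothetical helper names made up for this sketch. check_reply() shows roughly what the "change the panic to a printf()" workaround would amount to, namely logging and rejecting the oversized reply instead of taking the machine down.

```c
#include <stdio.h>

/* Constants as quoted from nfsproto.h in the message above. */
#define NFS_MAXPKTHDR	404
#define NFS_MAXDATA	32768
#define NFS_MAXPACKET	(NFS_MAXPKTHDR + NFS_MAXDATA)

/* Hypothetical helper: the same condition as the quoted sanity check. */
static int
reply_size_ok(int siz)
{
	return (siz > 0 && siz <= NFS_MAXPACKET);
}

/* Hypothetical helper: bytes by which a reply exceeds the limit, 0 if it fits. */
static int
reply_overrun(int siz)
{
	return (siz > NFS_MAXPACKET ? siz - NFS_MAXPACKET : 0);
}

/*
 * Hypothetical helper: log-and-reject variant of the check, using
 * printf() where the quoted code calls panic().  Returns 0 when the
 * reply size is acceptable and -1 when it is not.
 */
static int
check_reply(int siz)
{
	if (!reply_size_ok(siz)) {
		printf("mbuf siz=%d (limit %d, over by %d)\n",
		    siz, NFS_MAXPACKET, reply_overrun(siz));
		return (-1);
	}
	return (0);
}
```

With these definitions NFS_MAXPACKET works out to 33172 bytes, so the 32K + 680 reply mentioned above (33448 bytes) is 276 bytes over the limit, which matches the "at least 250 bytes or so" estimate.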