From: Terry Lambert
Message-Id: <199608011802.LAA04239@phaeton.artisoft.com>
Subject: Re: NFS Diskless Dispare...
To: dfr@render.com (Doug Rabson)
Date: Thu, 1 Aug 1996 11:02:12 -0700 (MST)
Cc: jkh@time.cdrom.com, tony@fit.qut.edu.au, freebsd-current@FreeBSD.ORG
In-Reply-To: from "Doug Rabson" at Aug 1, 96 11:54:03 am

> > > 1. The inability to mount file systems. The clients start barfing with
> > >    something like "RPC mount timeout". This problem goes away after a
> > >    while as the clients retry. I think it's the mountd getting too many
> > >    requests at once. Each client mounts 9 file systems.

Mountd is far from being concurrent enough. At one time, back in the
1.1.5.1 days, I had it hacked up sufficiently to allow NFS access by
20 or so X terminals, all at the same time. I think this is kludgable
by hacking the timeout up for now. Mountd wants a bit of a rewrite
once real threading is generally available.

> > I think that this is a more generic NFS bug in -current. I can
> > reproduce this, even causing mountd to silently exit (no core, no
> > syslog msg) with just one client and some fierce AMD-assisted pounding
> > on a 2.2-current NFS server.

I think the exit is a separate problem.
I'd be curious about what you could find out from a trace of the
process, started before it dies.

> > > 2. File permissions are read incorrectly. Files that should be able to
> > >    be executed are giving "permission denied" messages. Sometimes even
> > >    the kernel can't be loaded by netboot.com but if you persist by
> > >    typing "autoboot" it will magically start to work. Machines fail to
> > >    boot correctly as programs called in /etc/rc don't start
> > >    (permission denied).
> >
> > Probably more NFS bogosity.

[ ... ]

> I think for diskless root filesystems, you must export the fs with
> -root=0, otherwise lots of stuff will break.

[ this is true, but it's not the cause ]

> > > 3. Paging in of binaries causes the system to panic. Vnode_pager does
> > >    not seem to like it when it can't page in executables, even when the
> >
> > See #2. :-)
>
> Probably paging from a file which root can't access (see above).

Actually, I think it's the problem in vop_bmap for nfs that David noted
the other day.

> > 2.1.5? Its NFS is still unstable, but I don't believe anywhere near
> > the state it's in with -current.
>
> I think some of the stability problems with NFS are due to its lack of
> vnode locking primitives. This might be addressed by the lite2 fs work
> but if not, I will try to get something in after that work is merged.

The NFS, procfs, and several other non-boot-critical FS's didn't have
the new primitives in the patch sets we've seen so far.

I don't think they will have much positive effect on this problem, but
there are three or four other problems that will clear up (mostly
two-client race conditions).

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.