From owner-freebsd-current Fri Aug 2 09:17:45 1996
Return-Path: owner-current
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3)
	id JAA14871 for current-outgoing; Fri, 2 Aug 1996 09:17:45 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
	by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id JAA14866
	for ; Fri, 2 Aug 1996 09:17:43 -0700 (PDT)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9)
	id JAA05802; Fri, 2 Aug 1996 09:15:22 -0700
From: Terry Lambert
Message-Id: <199608021615.JAA05802@phaeton.artisoft.com>
Subject: Re: NFS Diskless Dispare...
To: dfr@render.com (Doug Rabson)
Date: Fri, 2 Aug 1996 09:15:21 -0700 (MST)
Cc: terry@lambert.org, jkh@time.cdrom.com, tony@fit.qut.edu.au,
        freebsd-current@FreeBSD.ORG
In-Reply-To: from "Doug Rabson" at Aug 2, 96 10:33:24 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-current@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> > Mountd is far from being concurrent enough. At one time, back in
> > the 1.1.5.1 days, I had it hacked up sufficiently to allow NFS
> > access by 20 or so X terminals, all at the same time. I think
> > this is kludgable by hacking the timeout up for now. Mountd wants
> > a bit of a rewrite once real threading is generally available.
>
> Wouldn't that need to wait until a threaded rpc library was available...

Not really; it could be done with non-blocking I/O and an automaton
with a context record per connection (there is a rough sketch of what
I mean at the end of this message).

> > Actually, I think it's the problem in vop_bmap for nfs that David noted
> > the other day.
>
> Which problem is this? I have always been slightly worried about the
> hacky nature of the nfs_bmap code (basically just multiplies b_lblkno by
> 16 or so, depending on the fs blocksize). The higher level fs code seems
> to try to figure out whether to call VOP_BMAP by comparing b_blkno to
> b_lblkno and mapping if they are equal. For NFS, they will always be
> equal for the first block of the file. I didn't think it would be a
> problem since it would just call nfs_bmap a bit more often for that block.

I think the bmap code can fail in some cases because of the recent vm
changes; I can't remember for sure. It was an answer to one of the
questions about NFS in -current... it was in the remote boot discussion
about paging from an NFS-mounted init. Sorry, I didn't save it; I
thought the suggested fix was going to go in immediately because of who
posted it. (For reference, I've also appended a paraphrase of the bmap
arithmetic you describe, after the mountd sketch.)

> > > I think some of the stability problems with NFS are due to its lack of
> > > vnode locking primitives. This might be addressed by the lite2 fs work
> > > but if not, I will try to get something in after that work is merged.
> >
> > The NFS, procfs, and several other non-boot-critical FS's didn't have
> > the new primitives in the patch sets we've seen so far. I don't think
> > they will have much positive effect on this problem, but there are three
> > or four other problems that will clear up (mostly two-client race
> > conditions).
>
> I think the worst races would be between VOP_READ or VOP_WRITE and vclean.
> I think that you could cause real damage with one of those :-).

I had a vclean patch to deal with the "free vnode isn't" error a long
time ago... it was a serious kludge, IMO, since it fixed the symptom
instead of the problem (which is that vclean is a bad idea). I think
the patch would save us from the race conditions as well. Are you
maybe thinking of client-side nfsnodes?
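
Here is roughly the shape of the mountd automaton I have in mind. This
is a hypothetical sketch, not the real mountd code: request_complete()
and build_reply() are made-up names standing in for the RPC/XDR
handling, and the accept() path is elided. The point is the structure:
one select() loop, sockets set non-blocking, and a context record per
connection saying where that connection's automaton should resume.

	#include <sys/types.h>
	#include <unistd.h>

	enum conn_state { ST_READ_REQ, ST_CHECK_EXPORT, ST_SEND_REPLY, ST_DONE };

	struct conn_ctx {			/* context record, one per connection */
		int		fd;
		enum conn_state	state;		/* where the automaton resumes */
		char		buf[8192];
		size_t		off;		/* bytes read/written so far */
		size_t		len;		/* reply length */
	};

	/* Hypothetical helpers; the real work is RPC decode/encode. */
	static int	request_complete(struct conn_ctx *);
	static void	build_reply(struct conn_ctx *);

	/*
	 * Advance one connection as far as it can go without blocking;
	 * the select() loop calls this whenever the fd is ready.
	 */
	static void
	conn_step(struct conn_ctx *cp)
	{
		ssize_t n;

		switch (cp->state) {
		case ST_READ_REQ:
			n = read(cp->fd, cp->buf + cp->off,
			    sizeof(cp->buf) - cp->off);
			if (n > 0) {
				cp->off += n;
				if (request_complete(cp))
					cp->state = ST_CHECK_EXPORT;
			} else if (n == 0)
				cp->state = ST_DONE;
			/* n < 0 with EWOULDBLOCK: stay put; select() recalls us */
			break;
		case ST_CHECK_EXPORT:
			build_reply(cp);	/* check exports, fill buf/len */
			cp->off = 0;
			cp->state = ST_SEND_REPLY;
			break;
		case ST_SEND_REPLY:
			n = write(cp->fd, cp->buf + cp->off,
			    cp->len - cp->off);
			if (n > 0 && (cp->off += n) == cp->len)
				cp->state = ST_DONE;
			break;
		case ST_DONE:
			close(cp->fd);
			break;
		}
	}

Twenty X terminals mounting at once would just mean twenty context
records; no single request can stall the others behind a blocking read.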
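
And the bmap arithmetic, paraphrased from your description above (a
sketch, not the 4.4BSD source; struct buf_sketch stands in for the
real struct buf):

	#include <sys/types.h>		/* daddr_t */

	#define	DEV_BSIZE	512	/* device block size, in bytes */

	/*
	 * nfs_bmap effectively just scales the logical block number
	 * into DEV_BSIZE units; with the usual 8k NFS blocksize the
	 * factor is 8192 / 512 == 16 -- the "multiplies by 16 or so"
	 * above.
	 */
	static daddr_t
	nfs_bmap_sketch(daddr_t lblkno, long fs_bsize)
	{
		return (lblkno * (fs_bsize / DEV_BSIZE));
	}

	struct buf_sketch {		/* stand-in for struct buf */
		daddr_t	b_lblkno;	/* logical block in the file */
		daddr_t	b_blkno;	/* underlying device block */
	};

	/*
	 * The upper-layer heuristic: a buffer still looks "unmapped"
	 * while b_blkno == b_lblkno. Block 0 of an NFS file maps
	 * 0 -> 0, so it always trips the test and gets remapped --
	 * harmless, as you say, just a redundant call.
	 */
	static void
	maybe_map(struct buf_sketch *bp, long fs_bsize)
	{
		if (bp->b_blkno == bp->b_lblkno)
			bp->b_blkno = nfs_bmap_sketch(bp->b_lblkno, fs_bsize);
	}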
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.