Date:      Thu, 26 Jan 2012 10:42:43 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Bruce Evans <brde@optusnet.com.au>
Cc:        svn-src-head@FreeBSD.org, Rick Macklem <rmacklem@FreeBSD.org>, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r230516 - in head/sys: fs/nfsclient nfsclient
Message-ID:  <1513888354.186214.1327592563003.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <20120127000316.U1055@besplex.bde.org>

Bruce Evans wrote:
> On Wed, 25 Jan 2012, Rick Macklem wrote:
> 
> > Bruce Evans wrote:
> >> On Tue, 24 Jan 2012, Rick Macklem wrote:
> >>
> >>> Bruce Evans wrote:
> >>>> On Wed, 25 Jan 2012, Rick Macklem wrote:
> >>>>
> >>>>> Log:
> >>>>>  If a mount -u is done to either NFS client that switches it
> >>>>>  from TCP to UDP and the rsize/wsize/readdirsize is greater
> >>>>>  than NFS_MAXDGRAMDATA, it is possible for a thread doing an
> >>>>>  I/O RPC to get stuck repeatedly doing retries. This happens
> >>>>>  ...
> >>
> >>>> Could it wait for the old i/o to complete (and not start any new
> >>>> i/o?). This is little different from having to wait when changing
> >>>> from rw to ro. The latter is not easy, and at least the old nfs
> >>>> client seems to not even dream of it. ffs has always called a
> >>>> ...
> >>
> >>> As you said above "not easy ... uses complicated suspension of i/o".
> >>> I have not tried to code this, but I think it would be non-trivial.
> >>> The code would need to block new I/O before RPCs are issued and wait
> >>> for all in-progress I/Os to complete. At this time, the kernel RPC
> >>> handles the in-progress RPCs and NFS doesn't "know" what is
> >>> outstanding. Of course, code could be added to keep track of
> >>> in-progress I/O RPCs, but that would have to be written, as well.
> >>
> >> Hmm, this means that even when the i/o sizes are small, the mode
> >> switch from tcp to udp may be unsafe since there may still be i/o's
> >> with larger sizes outstanding. So to switch from tcp to udp, the user
> >> should first reduce the sizes, then wait a while before switching to
> >> udp. And what happens with retries after changing sizes up or down?
> >> Does it retry with the old sizes?
> >>
> >> Bruce
> > Good point. I think (assuming a TCP mount with large rsize):
> > # mount -u -o rsize=16384 /mnt
> > # mount -u -o udp /mnt
> > - could still result in a wedged thread trying to do a read that
> >  is too large for UDP.
> >
> > I'll revert r230516, since it doesn't really fix the problem, it
> > just reduced its likelihood.
> 
> That seems a regression.
> 
> > I'll ask on freebsd-fs@ if anyone finds switching from TCP->UDP via
> > a "mount -u" is useful to them. If no one thinks it's necessary, the
> > patch could just disallow the switch, no matter what the old
> > rsize/wsize/readdirsize is.
> 
> I use it a lot for performance testing. Of course it is unnecessary,
> since at least for performance testing it is possible to do a full
> unmount and re-mount, but mount -u is more convenient.
> 
> > Otherwise, the fix is somewhat involved and difficult for a scenario
> > like this, where the NFS server is network partitioned or crashed:
> > - sysadmin notices NFS mount is "hung" and does
> >  # mount -u -o udp /path
> >  to try and fix it, but it doesn't help
> > - sysadmin tries "umount -f /path" to get rid of the "hung" mount.
> 
> Now I wonder what makes a full unmount (without -f) and
> re-mount work.
> 
If the server is unresponsive (network partitioned or crashed, which
was the above scenario), the full unmount won't work if there are RPCs
in progress. It will wait indefinitely for those RPCs to complete.
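
As a rough illustration of why the non-forced unmount hangs, here is a
minimal user-space sketch of a mount that counts in-progress RPCs. The
names (drain_mount, drain_rpc_start, drain_rpc_reply,
drain_unmount_can_proceed) are hypothetical and are not the FreeBSD NFS
client or kernel RPC code. It only models the rule that a plain unmount
cannot proceed until the in-progress count reaches zero, which never
happens while the server is unreachable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; not the FreeBSD NFS client data structures. */
struct drain_mount {
    int  inflight;   /* RPCs sent but not yet answered */
    bool server_up;  /* false models a crashed or partitioned server */
};

/* An I/O thread issues an RPC. */
static void
drain_rpc_start(struct drain_mount *m)
{
    assert(m->inflight >= 0);
    m->inflight++;
}

/*
 * A reply arrives. This can only happen while the server is reachable.
 * Returns true if one in-progress RPC was completed.
 */
static bool
drain_rpc_reply(struct drain_mount *m)
{
    if (!m->server_up || m->inflight == 0)
        return (false);
    m->inflight--;
    return (true);
}

/*
 * A non-forced unmount may only proceed once nothing is in progress.
 * Returns false when it would have to keep waiting.
 */
static bool
drain_unmount_can_proceed(const struct drain_mount *m)
{
    return (m->inflight == 0);
}
```

With server_up set to false and one RPC started, drain_rpc_reply()
keeps returning false, so drain_unmount_can_proceed() stays false for
as long as the server remains unreachable.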

> > If "mount -u -o udp /path" is waiting for I/O ops to complete,
> > (which is what the somewhat involved patch would need to do) the
> > "umount -f /path" will get stuck waiting for the "mount -u"
> > which will be waiting for I/O RPCs to complete. This could
> 
> I often misremember -f for umount as meaning don't wait. It actually
> means to forcibly close files before proceeding.
> 
> > be partially fixed by making sure that the "mount -u -o udp /path"
> > is interruptible (via <ctrl>C), but I still don't like the idea that
> > "umount -f /path" won't work if "mount -u -o udp /path" is sitting
> > in the kernel waiting for RPCs to complete, which would need to be
> > done to make a TCP->UDP switch work.
> 
> Doesn't umount -f have to wait for i/o anyway? When it closes files,
> it must wait for all in-progress i/o for the files, and for all new
> i/o's that result from closing.
> 
umount -f kills off RPCs in progress. There was an email discussion
about this, where it seemed there was no advantage in defining a
separate "hard forced dismount" for this case.

So long as "umount -f" makes it as far as nfs_unmount(), I believe
this works ok, at least for the new NFS client. If there is already
a full "umount" stuck waiting for RPCs to complete, the "umount -f"
never makes it as far as nfs_unmount(), so it is stuck, as well.
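
The ordering problem can be sketched with a small user-space model.
Everything here is an assumption made for illustration: force_mount,
toy_umount, toy_umount_force and the unmount_in_progress flag are
hypothetical names, and the flag is only a stand-in for whatever
actually keeps a second umount from reaching nfs_unmount(). The sketch
shows that the forced path clears in-progress RPCs only if it gets that
far.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; names are not FreeBSD kernel symbols. */
struct force_mount {
    int  inflight;             /* RPCs still awaiting a reply */
    bool unmount_in_progress;  /* stand-in for what blocks a second umount */
};

enum toy_result { TOY_DONE, TOY_WOULD_BLOCK };

/* Plain umount: claims the mount, then must wait for RPCs to drain. */
static enum toy_result
toy_umount(struct force_mount *m)
{
    assert(m->inflight >= 0);
    if (m->unmount_in_progress)
        return (TOY_WOULD_BLOCK);
    m->unmount_in_progress = true;
    if (m->inflight > 0)
        return (TOY_WOULD_BLOCK);   /* stuck while holding the claim */
    return (TOY_DONE);
}

/* Forced umount: fails RPCs in progress, but only if it gets this far. */
static enum toy_result
toy_umount_force(struct force_mount *m)
{
    if (m->unmount_in_progress)
        return (TOY_WOULD_BLOCK);   /* never reaches the kill step */
    m->unmount_in_progress = true;
    m->inflight = 0;                /* models killing off RPCs in progress */
    return (TOY_DONE);
}
```

In this model, calling toy_umount_force() first succeeds even with
RPCs outstanding, while calling it after a stuck toy_umount() leaves
both blocked and the RPC count unchanged.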

rick


