Date:      Sat, 13 Sep 1997 00:03:16 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        durian@plutotech.com (Mike Durian)
Cc:        hackers@FreeBSD.ORG
Subject:   Re: VFS/NFS client wedging problem
Message-ID:  <199709130003.RAA21620@usr04.primenet.com>
In-Reply-To: <199709122231.QAA00374@pluto.plutotech.com> from "Mike Durian" at Sep 12, 97 04:31:43 pm

>   I've got a VFS problem I'm hoping someone out there can give
> me some ideas on.  I've written a VFS based filesystem that is
> an interface to our RAID system.  The RAID system stores video
> frames and the filesystem allows access to the data and automatically
> translates the data to a variety of file formats (TIFF, Targa, YUV,
> etc.).  The frame number and conversion type are defined by the
> path name.  Eg /pfs/frames/tiff/0.tiff or
> /pfs/HMSF/tga/hour00/minute01/second10//00.01.10.29.tga.
>   The filesystem is implemented partially in the kernel and partially
> as a user application.  The two parts communicate via a socket.

How do you serialize upcalls through the socket, such that you
don't have overlapped requests?  Or do you have separate request
contexts so that this doesn't cause problems?

If you don't have separate contexts, eventually you'll make a request
before the previous one completes.

The NFS export stuff is a bit problematic.  I don't know what to
say about it, except that it should be in the common mount code
instead of being duplicated per FS.

If you can give more architectural data about your FS, and you can
give the FS you used as a model of how a VFS should be written, I
might be able to give you more detailed help.

This is probably something that should be taken off the general
-hackers list, and onto fs@freebsd.org


>   The filesystem works well for normal accesses, but I'm having
> a strange problem with NFS.  I've supplied the fhtovp and vptofh
> hooks and things basically work, but I can get the client side
> wedged under heavy accesses.
>   If I run four simultaneous processes copying data to my filesystem,
> after a while I'll see one of the nfsiod go to sleep on "vfsfsy"
> and not return.  Eventually, the other nfsiods will go to sleep on
> "nfsrcv" and that's that.

Where is the NFS server sleeping, that's the question.


>   In both cases, it looks like the clients aren't getting acks
> from the server.  Strangely, none of the nfsd processes on the
> server are sleeping and the user mount_pfs process isn't sleeping
> either.

That's not strange.  It's a request context that's wedged.  When a
request context blocks, it isn't the nfsd on the server that sleeps;
the context is.  The nfsd then provides an execution context for a
different request.  Try nfsstat instead, and/or iostat, on the
server.

>   I'm not sure where the problem lies.  Is it an NFS issue or
> a (more likely) bug in my filesystem?  Does anybody have any ideas
> on why an NFS server might drop an ACK and wedge the client?

Because the request context is blocked.  The server hasn't dropped
the ACK; it just hasn't sent one yet.


>   I did get the same results with both NFSv3 and NFSv2.

This tells us that it isn't async requests over the wire that are
hosing you.  That the server is NFSv3-capable argues that the v2
protocol is implemented by a v3 engine, which would explain the
blockages.

Have you tried both TCP- and UDP-based mounts?
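For comparison, something like the following in the client's /etc/fstab
would exercise each transport in turn (the server name and paths here are
placeholders, not taken from your setup; "tcp" is the mount option, UDP
being the traditional default):

```
# Hypothetical fstab sketch -- mount the same export both ways and
# see whether the wedge follows the transport:
server:/pfs   /mnt/pfs   nfs   rw,tcp   0   0	# TCP-based mount
server:/pfs   /mnt/pfs   nfs   rw       0   0	# UDP-based mount (default)
```

If the client only wedges over UDP, that points at retransmit/ack handling;
if it wedges both ways, the blocked request context on the server is the
more likely culprit.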


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


