Date:      Fri, 18 Oct 1996 12:04:06 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        michaelh@cet.co.jp
Cc:        karl@Mcs.Net, freebsd-hackers@FreeBSD.org
Subject:   Re: NFS node: disappearing directory
Message-ID:  <199610181904.MAA01751@phaeton.artisoft.com>
In-Reply-To: <Pine.SV4.3.93.961018145633.1217B-100000@parkplace.cet.co.jp> from "Michael Hancock" at Oct 18, 96 03:07:13 pm

Ah, another question requiring more than a 10 line answer.  8-(.


> > > But 3) says it does get reloaded.
> > 
> > Sometimes.  But if go to the "up-level" when it happens and do a "ls", you
> > get a VERY short list (~10% of what's really there - right about 200
> > entries)
> 
> Umm.  Is John around?  What kind of memory does the result of readdir go
> into?

Depends on the FS.  For NFS, "bogus cookie handling memory which was
allocated for fear the user buffer would be too small to return the
data".

The problem is cookie related.  The fix is to get rid of the cookie
code.

The cookie code was introduced because the on disk directory structure
and the exported directory structure for NFS are no longer the same
(the NFS standard did not get changed to accommodate BSD).

In other words, struct direct != struct dirent.

In FFS, the exported directory structure is the same one that is
returned via the system call interface; in other words, it matches the
on disk structure of the default FS.

Because the NFS structure is a different size, there is no way to
know if the interface will have a buffer large enough to deal with the
data returned.

In general, we can consider the NFS server a consumer of the VFS
interface.

Similarly, system calls are a consumer of the VFS interface.

The generic solution is therefore to pass back a directory block
reference from the FS, and then *use an FS specific VOP_ call to
translate the buffer contents on demand into the consumer buffer
format*.

For UFS, this would be a null op on the buffer.

For NFS, this would be a page allocate and a data copy, and since
memory access is significantly faster than network access, this would
not impose too much overhead (the cookie crap adds copy overhead anyway).
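
As a rough user-space sketch of that split (not kernel code, and not an
existing VOP_): every sketch_* name, both structure layouts, and the
tofree convention below are invented for illustration.  The FS hands
back a block reference, and a per-FS translate routine produces the
consumer format; the UFS-like case returns the block untouched, the
NFS-like case allocates and copies.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NAMELEN 23	/* arbitrary small limit for the sketch */

/* Consumer-side format (what the system call layer would hand back). */
struct sketch_dirent {
	uint32_t fileno;
	uint8_t  namlen;
	char     name[SKETCH_NAMELEN + 1];
};

/* Hypothetical NFS-side entry: same information, different layout. */
struct sketch_nfs_entry {
	uint32_t fileid;
	uint32_t cookie;
	char     name[SKETCH_NAMELEN + 1];	/* must be NUL terminated */
};

/* A reference to a directory block owned by the FS. */
struct sketch_dirblk {
	const void *data;
	size_t      nentries;
};

/*
 * Stand-in for the FS specific translate call.  Returns entries in
 * consumer format; *tofree is set to memory the caller must free(),
 * or NULL if the block itself was returned.
 */
typedef const struct sketch_dirent *
    (*sketch_translate_t)(const struct sketch_dirblk *, void **);

/* UFS-like case: block is already in consumer format, so a null op. */
static const struct sketch_dirent *
sketch_ufs_translate(const struct sketch_dirblk *blk, void **tofree)
{
	*tofree = NULL;
	return blk->data;
}

/* NFS-like case: allocate and copy, converting each entry. */
static const struct sketch_dirent *
sketch_nfs_translate(const struct sketch_dirblk *blk, void **tofree)
{
	const struct sketch_nfs_entry *in = blk->data;
	struct sketch_dirent *out = calloc(blk->nentries, sizeof(*out));
	size_t i;

	*tofree = out;
	if (out == NULL)
		return NULL;
	for (i = 0; i < blk->nentries; i++) {
		out[i].fileno = in[i].fileid;
		out[i].namlen = (uint8_t)strlen(in[i].name);
		memcpy(out[i].name, in[i].name, sizeof(out[i].name));
	}
	return out;
}
```

A consumer would pick the routine through a sketch_translate_t pointer
in whatever ops table the FS exports, and would not need to know which
of the two cases it got beyond checking tofree.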


For the generic "restart", the entry offset is passed in, and the block
is traversed by the underlying FS.  If it gets to an entry whose offset
matches that passed in, then it is just returned; if it goes past it,
then an FS dependent action is taken:

o	for pre-compacting FS's, the entry is assumed to be compacted,
	and the entry prior to the entry following the restart point
	is returned, on the assumption that entries were moved down in
	the block.

o	for post-compacting sparse directory block FS's (like ffs),
	the entry is assumed to have been deleted, and is returned.
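
The restart rule above can be sketched roughly as follows.  This is one
reading of the two cases, not existing code: the sketch_* names and the
flat array of (offset, fileno) pairs are assumptions, and for the
post-compacting case the sketch takes "is returned" to mean the first
entry found past the restart offset.

```c
#include <stdint.h>

/* Hypothetical directory entry; only the offset matters here. */
struct sketch_entry {
	uint32_t offset;	/* byte offset of the entry in the block */
	uint32_t fileno;
};

enum sketch_policy {
	SKETCH_PRE_COMPACTING,	/* entries get moved down in the block */
	SKETCH_POST_COMPACTING	/* sparse blocks, ffs-like */
};

/*
 * Walk the block looking for the restart offset.  Returns the index of
 * the entry to resume at, or -1 if the block ends before the offset is
 * reached.  Entries are assumed to be sorted by offset.
 */
static int
sketch_restart(const struct sketch_entry *blk, int n, uint32_t want,
    enum sketch_policy policy)
{
	int i;

	for (i = 0; i < n; i++) {
		if (blk[i].offset == want)
			return i;		/* exact match */
		if (blk[i].offset > want) {
			/* Went past it: FS dependent action. */
			if (policy == SKETCH_PRE_COMPACTING && i > 0)
				return i - 1;	/* entry prior to this one */
			return i;		/* assumed deleted */
		}
	}
	return -1;
}
```

With entries at offsets 0, 16, 48 and 64 and a restart offset of 32,
the pre-compacting case resumes at the entry at 16 and the
post-compacting case at the entry at 48.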

This is legal because the getdents() call is assumed to work on a
"snapshot" of the directory, not the actual directory structure.


For what it's worth, the cookie code presumes a copy, which no longer
takes place in the unified cache case, so that it can reference it out
of the VM instead of out of the (potentially volatile) buffer cache.


This implementation was discussed in great detail by myself and Doug
Rabson about 18 months ago on the -current list.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


