Date:      Mon, 16 Sep 1996 22:12:10 -0700
From:      David Greenman <dg@root.com>
To:        Terry Lambert <terry@lambert.org>
Cc:        bde@zeta.org.au, proff@suburbia.net, freebsd-hackers@FreeBSD.org
Subject:   Re: attribute/inode caching 
Message-ID:  <199609170512.WAA08889@root.com>
In-Reply-To: Your message of "Mon, 16 Sep 1996 11:05:55 PDT." <199609161805.LAA02580@phaeton.artisoft.com> 

>> >> It could be hung off the vnode for the mounted device.  I'm not sure if
>> >> it isn't already.  This problem is secondary.  Repeated tree traversals
>> >> aren't all that common, and you don't really want them to eat the buffer
>> >> cache (you probably want to buffer precisely the inodes and directories
>> >> that will be hit again a long time later in the same search, e.g.,
>> >> intermediate directories for a depth-first search).
>> >
>> >It is not hung off the vnode for the device.  It probably should not be,
>> >in any case (there is no "device" for NFS, for instance).
>> 
>>    Inode blocks are hung off the device vnode.
>
>I find this hard to believe.  This would imply a limitation of the device
>size to the file size, since the addressable extent for a vnode is smaller
>than the addressable extent for a device.

   Huh? In FreeBSD, the device is referred to via the device vnode. How do
you think FFS does the I/O for the inode block? It uses the block
device-special vnode. As for any implied size limitation, vnodes don't have
any "size" associated with them. Anything (except the VM system) that deals
with file offsets deals in 64-bit quad_t's, and it doesn't matter if it's a file
or a device or whatever. Depending on which version of the merged VM/buffer
cache we're talking about, metadata may or may not be stored in VM pages. In
all versions, however, it is cached in buffers (buffers can point to either
malloced memory or VM pages).

>Which value are you claiming is in error?  It seems to me that if inode
>blocks are hung off the device vnode (so why have the ihash?!?), then it
>is an error to not limit the device size to the max file size.

   I think you're really starting to confuse things. The maximum file size
is not a function of vnodes. We do have a problem with representing file
offsets in the VM system beyond 31 bits' worth of pages (43 bits total == 8TB),
but this is hardly a concern. John may correct me on this, but I believe that
in -current we do cache inode blocks in VM pages. In 2.1.5, we couldn't
because of the vm_page offset limitation, so for 2.1.5 we only cache inode
blocks in malloced memory that is attached to buffers. Offsets in buffers are
40 bits wide (31 bits for a signed long to hold the block number, which is in
units of 512 bytes (9 bits)). This effectively limits all operations that
involve struct buf's to 1TB, so neither a device nor a file may be larger
than this. We've had no compelling reason to fix this, as it is more
difficult than just changing the size of a daddr_t, and no one that I know of
is using a 1TB filesystem.
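
As a rough check of the two limits quoted above, the arithmetic can be
sketched as follows. This is only an illustration, not kernel code: the
constant names and the limit_bytes() helper are hypothetical, and the 4KB
page size (12 bits) is an assumption inferred from 43 - 31 = 12.

```python
# Figures taken from the message above, except PAGE_SHIFT (an assumption).
BLOCK_NUMBER_BITS = 31   # positive range of a signed 32-bit block number
DEV_BLOCK_SHIFT = 9      # 512-byte device blocks
PAGE_INDEX_BITS = 31     # page-index bits quoted for the VM system
PAGE_SHIFT = 12          # assumption: 4KB pages, implied by 43 - 31

TB = 1 << 40


def limit_bytes(index_bits, unit_shift):
    """Return the first byte offset that can no longer be represented."""
    return 1 << (index_bits + unit_shift)


# Limit for anything that goes through a struct buf block number.
buf_limit = limit_bytes(BLOCK_NUMBER_BITS, DEV_BLOCK_SHIFT)

# Limit for file offsets expressed as a VM page index.
vm_limit = limit_bytes(PAGE_INDEX_BITS, PAGE_SHIFT)

print("struct buf limit:", buf_limit // TB, "TB")
print("VM offset limit: ", vm_limit // TB, "TB")
```

Both limits come from the width of an index multiplied by the unit it
counts, which is why widening the block number type alone is the obvious
but incomplete fix.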

>The fact that the device size was allowed to be larger than the max file
>size was one of the justifications John Dyson gave for not using caching
>based on device/extent instead of (in addition to) vnode/extent in order
>to keep the buffer cache unification of the vnode/extent mapping, but
>resolve a lot of other issues.  For instance, if the device vnode is
>in fact a device/extent cache, then there is no need for the ihash, since
>the inodes are deterministically laid out and thus indexable by fault.  In
>addition, the ability to address device blocks by fault on the device vnode
>means that vclean is totally unnecessary.

   I can't parse this.

-DG

David Greenman
Core-team/Principal Architect, The FreeBSD Project



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199609170512.WAA08889>