Date: Thu, 12 Aug 1999 21:02:32 -0400 (EDT)
From: Zhihui Zhang <zzhang@cs.binghamton.edu>
To: Terry Lambert <tlambert@primenet.com>
Cc: Poul-Henning Kamp <phk@critter.freebsd.dk>, roberto@keltia.freenix.fr, freebsd-fs@FreeBSD.ORG
Subject: Re: Help with understand file system performance
Message-ID: <Pine.GSO.3.96.990812202049.1878A-100000@sol.cs.binghamton.edu>
In-Reply-To: <199908122314.QAA23506@usr04.primenet.com>

On Thu, 12 Aug 1999, Terry Lambert wrote:

> The filesystem block allocation table in directories is unique, in
> that it is generally used as a convenience for locating physical
> blocks, rather than using the standard filesystem block access
> mechanisms, when reading or writing directories.

Directory files have the same on-disk structure as regular files.
However, they can never have holes, and they can only grow at the end
of the file, in device-block-sized chunks. No directory entry may cross
a device block boundary, which guarantees that each update is atomic.

However, I do not know why you say the block map (direct and indirect
blocks) of a directory is only used as a convenience. There is still a
need to call VOP_BMAP() on a directory file: the routine ffs_blkatoff()
calls bread(), which in turn calls VOP_BMAP(). The in-core inode does
have several fields to facilitate the insertion of new directory
entries, but we still need the block map (block allocation table).

Directory files are also special in that we cannot write into them with
the write() system call as we can with normal files. They grow through
a special routine, ufs_direnter(). By the way, we can use the read()
system call to read directory files, just as we do with normal files.

> There are a number of performance penalties for this, especially
> on large directories, where it is not possible to trigger sequential
> readahead through use of the getdents() system call sequentially
> accessing sequential 512b/physical_block_size extents.

I do not understand this. The read-ahead mechanism should work on any
file. I thought the inefficiency was the reorganization of directory
entries within a directory block when an entry is deleted. Does this
issue have anything to do with the VMIO directory issue discussed
earlier this year?

> The frag size can be tuned down below this (i.e. 1/4, 1/2, 1).
>
> The only case where 1024 bytes of physical disk would be used is at
> a filesystem block size of 8192 (or greater), which, divided by 8,
> gives 1024b (or greater).

I did not realize this before. The maximum ratio of block size to
fragment size is 8. So if the filesystem block size is 8192, the
allocation unit (fragment size) cannot be 512, because 8192/512 = 16,
which is greater than 8.

> This is called an encapsulated two stage commit, in database terms.
>
> For inodes, indirect blocks, and directory entry blocks, there is
> no two stage commit, because there is no indirection of their data
> contents.

I guess you mean that their data are not managed by any higher-level
metadata that must be updated together with them.

Thanks for your help.

-Zhihui

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message
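
The fragment-size limit discussed in this message can be checked with a few lines of arithmetic. The following is only a minimal sketch, not code from FFS: the function name valid_fragment_sizes and both constants are hypothetical names chosen for illustration, the 512-byte sector size is an assumption, and the block size is assumed to be a power of two. It lists the fragment sizes that keep the block-to-fragment ratio at 8 or below.

```python
# Hypothetical illustration of the block/fragment ratio rule from this thread.
MAX_FRAGS_PER_BLOCK = 8   # maximum block-to-fragment ratio stated in the thread
SECTOR_SIZE = 512         # assumed device sector size, in bytes


def valid_fragment_sizes(block_size, sector_size=SECTOR_SIZE):
    """Return the fragment sizes allowed for block_size, smallest first.

    Assumes block_size is a power of two. A fragment may be 1/8, 1/4,
    1/2, or all of a block, but (by assumption) never smaller than one
    device sector.
    """
    sizes = []
    ratio = MAX_FRAGS_PER_BLOCK
    while ratio >= 1:
        frag = block_size // ratio
        if frag >= sector_size:
            sizes.append(frag)
        ratio //= 2  # 8 -> 4 -> 2 -> 1 -> 0 (loop ends)
    return sizes


print(valid_fragment_sizes(8192))  # [1024, 2048, 4096, 8192]
print(valid_fragment_sizes(4096))  # [512, 1024, 2048, 4096]
```

With an 8192-byte block the smallest entry is 1024, which matches the observation that a 512-byte fragment would need a ratio of 16. With a 4096-byte block, 512 is allowed because 4096/512 = 8.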