Date: Mon, 24 Jan 2005 22:49:07 -0600
From: John <john@starfire.mn.org>
To: Mervin McDougall <mcd_advisory@yahoo.com>
Cc: freebsd questions <freebsd-questions@freebsd.org>
Subject: Re: fragmentation
Message-ID: <20050124224907.D8180@starfire.mn.org>
In-Reply-To: <20050125020305.70545.qmail@web30908.mail.mud.yahoo.com>; from mcd_advisory@yahoo.com on Mon, Jan 24, 2005 at 06:03:04PM -0800
References: <20050125020305.70545.qmail@web30908.mail.mud.yahoo.com>
On Mon, Jan 24, 2005 at 06:03:04PM -0800, Mervin McDougall wrote:
> hi
> I wanted to know whether it is unusal or is a
> problem if when my system starts it indicates that
> there is some fragmentation of the files but the file
> system is clean and thus it is skipping the fsck. Is
> this a bad thing? Is this unusual?

Ah hah! I think this is simply a misunderstanding of what is meant by fragmentation - because we use the word in several different ways.

One way we use the term "fragment" is for a "subatomic" unit of storage. To keep the internal bookkeeping reasonable, FFS and many other filesystems don't allocate storage in units of sectors - they use a larger aggregating factor. The larger the factor, the lower the storage and computational overhead in maintaining the filesystem.

The problem with this approach is that it can be wasteful. If you have 8k or 16k allocation units, then even small files have to use that amount of storage - if you have hundreds or thousands of small files, you can end up wasting a lot of space. The same thing happens with larger files. If you have a file of 128k plus one byte, that last byte would end up all by itself in an allocation unit.

You can consider these allocation units the atomic storage size. Some filesystems, such as the ones used by BSD (UFS1 and UFS2), also support "subatomic" parts for these left-overs and small files. We happen to call those units "fragments."

When the system boots and reports the number of fragments, that is what it is talking about. It is nothing to be concerned about: there's nothing wrong, and running fsck wouldn't "fix" it.

According to the manual page for newfs, the default block size (allocation unit or atomic storage unit) is 16k and the default fragment size is 2k.

Another way we use the term fragmentation is for data which are logically contiguous but end up discontiguous on the disk. This happens when you have a lot of file creation, modification, and deletion going on.
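Going back to the first meaning for a moment: to put rough numbers on the 16k block and 2k fragment defaults mentioned above, here is a small sketch in Python. It is only an illustration under simplifying assumptions, not how the filesystem code actually works: the function name allocated_bytes is made up for this example, and it ignores inodes, indirect blocks, and the allocator's own rules about when a file's tail may live in fragments.

```python
BLOCK_SIZE = 16 * 1024  # default block size cited above from the newfs manual page
FRAG_SIZE = 2 * 1024    # default fragment size cited above

def allocated_bytes(file_size, block_size=BLOCK_SIZE, frag_size=FRAG_SIZE):
    """Hypothetical helper: space used by one file's data, in bytes.

    Returns (whole_blocks_only, with_fragments).
    """
    full_blocks, tail = divmod(file_size, block_size)
    # Without fragments, any left-over tail occupies an entire block.
    blocks_only = (full_blocks + (1 if tail else 0)) * block_size
    # With fragments, the tail is rounded up to a whole number of fragments.
    tail_frags = -(-tail // frag_size)  # ceiling division
    with_frags = full_blocks * block_size + tail_frags * frag_size
    return blocks_only, with_frags

if __name__ == "__main__":
    for size in (1, 3000, 128 * 1024 + 1):
        blocks_only, with_frags = allocated_bytes(size)
        print(f"{size:>7} bytes: {blocks_only:>7} without fragments, "
              f"{with_frags:>7} with fragments")
```

Under these assumptions, the "128k plus one byte" file takes nine full blocks if only whole blocks are available, but eight blocks plus a single 2k fragment when fragments are used.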
Let's say that you have a file which consists of 3 allocation units. When you start with a fresh filesystem, that file would be created on three consecutive allocation units on the disk. Other files are then created, which may use the allocation units immediately following the three given to the first file. If the first file now grows, the next block, which is logically sequential, will be physically separated from the others on the disk. As some of the in-between files are modified or deleted, there may even be free blocks between the first three allocation units and the new one.

This is also called fragmentation. It is another thing that is not necessarily "wrong" with your filesystem (though it can hurt very high-performance applications through increased seek activity on the disk, and by breaking up sequential read or prefetch sequences on intelligent storage subsystems), and it is nothing that fsck would fix, either.

Now, on top of that, the UFS family of filesystems was built to intentionally do a certain type of "fragmentation" to deal with slower disk drive electronics and controllers. This is sometimes called interleave. It causes the system to intentionally skip over blocks between what are logically contiguous blocks, to allow the CPU and drive electronics some "breathing room" before having to be ready for (or to have asked for) the next block.

Imagine a spinning disk. The data are passing beneath the head: block 1, block 2, block 3, etc. Some program in your system reads a block from a file, and let's say it is block 2. The disk and controller electronics watch the sectors going by, sort of like watching for your baggage at the airport - when the right one comes along, they read it. Before your program can ask for the next block, the disk has already rotated so that we are somewhere in the midst of that block. The controller has to wait for the disk to come all the way around again before it can be read.
By allocating the next logical block of data from the file to be in block 4 instead of block 3, we may be able to avoid having to wait for that disk to spin all the way around again.

A lot of work has gone into designing the UFS filesystems to avoid unwanted fragmentation, and to support extent-based growth for large files. That's probably the reason that there is no standard "defrag" utility like there is for most Windows filesystems.

You may be thinking of "fragmented" as in broken, or fractured. That is not what is meant here. It doesn't mean that anything is wrong with your filesystem; it is just some statistics on how much those "subatomic" allocation units are being used.

Your filesystem is clean, is using some fragments, and may or may not be fragmented (logically contiguous data scattered across the disk).

-- 
John Lind
john@starfire.MN.ORG
