Date: Wed, 21 Mar 2001 08:55:48 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: don@coleman.org
Cc: tlambert@primenet.com (Terry Lambert), fschapachnik@vianetworks.com.ar, freebsd-fs@FreeBSD.ORG
Subject: Re: growfs
Message-ID: <200103210856.BAA24089@usr05.primenet.com>
In-Reply-To: <200103210806.AAA09667@eozoon.coleman.org> from "Don Coleman" at Mar 21, 2001 12:06:02 AM

> I don't think the picture is quite as bad as you paint it.
>
> The clustering code of FFS will defragment files automatically as
> they grow. While it is true that a highly fragmented filesystem
> will not be magically fixed by growing it, any new files will
> be written out as large files as they get large.

I think that rather misses the most probable time for someone to
actually run the command: when they need more disk space, because
their disks are full.

Also realize that the free reserve has been eroded over time as
disks get larger; ideally, it would be 15% (1G on a 6G drive);
the current tradeoff for "more space" vs. "better efficiency" is
8% (1G on a 12.5G drive).

I don't think if someone has a 37G drive today (say it's dedicated
to a Vinum plex, so it can be made bigger if we want, or say it's
part of a larger RAID array), that they will think of running
"growfs" on the thing when it has "only 3G free". With some of the
75G disks out there today, that becomes "only 6G free".

People are used to thinking of the free reserve as "a hell of a lot
of wasted space"; mostly because they aren't computer scientists,
and simply don't understand hash fill algorithms or the reason for
the free reserve.

Even if they understand it intellectually, there are many computer
scientists who grew up on systems where main memory was 4k, or even
with the first PC, where that free reserve is equivalent to 600
times the size of the largest available hard drive for the original
IBM PC XT. Intellectually, they may know the math, but their gut
still tells them "that's a hell of a lot of wasted space".

My gut reaction, which I have to fight, is to tweak the free
reserve down to 6%, and get another 600M/1.2G of disk space. I
know that if I did this, I'd be able to rationalize it as a
temporary stopgap that I will fix correctly by deleting and/or
compressing junk later (yeah, right), or by doing The Right Thing
and adding more disk, and then using backup/restore to defragment
things.

I know that if I did this, I would be trying to pull one over on
myself. So I don't do it.

Finally, say you are right: assume that we are talking about files
which grow over time, instead of just talking about the normal
disks you'd see at any ISP or commercial or educational
environment, where the only things that grow over time are log
files, directories, and email folders (if they happen to be stored
in mail spool, rather than half a dozen other formats).

Even with that, we take an 80% full disk, and we "growfs" it to
twice its previous size. There is a 50/50 chance that a new
allocation will be on the new region vs. the old. This means that
for a disk size K with a new aggregate size 2*K, it takes
.05 * (2*K) more data to hit the 85% hash limit, and .12 * (2*K)
more data to hit the 92% fill limit of the current eroded newfs
free reserve.

In other words, the disk is 80% full, and you double the space so
that it's conceptually 40% full, but then you only get to add 24%
(of the original) or 12% (of the new) more data before the original
disk is 92% full.

24% - 7% = 18%... in other words, serious fragmentation based
thrashing becomes a problem when there is 80%+18% = 98% of the
original disk worth of data, as opposed to 92% of the original
disk worth of data (8% of the original disk free being the "worst
acceptable allowable tradeoff" for newfs as it sits today).

A backup and restore (or intentional -- non-side effect --
defragmentation) will drop both disks in the combined plex to 42%,
with another 50% of the available space (i.e. "one whole disk")
until it becomes a problem.

And if *I* can come close to being able to rationalize doing this
anyway to allow me to procrastinate...

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message
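
The doubling arithmetic in the message can be checked with a few lines of
Python. This is only a sketch of the model the message describes: an 80% full
disk of size K is grown to 2*K, and each new allocation is assumed to land on
the old or the new region with equal probability. The function name
`extra_data_until` and its parameters are hypothetical, chosen for
illustration; nothing here reflects how growfs or the FFS allocator actually
places blocks.

```python
def extra_data_until(old_fill, limit, growth_factor=2.0):
    """Data (as a fraction of the ORIGINAL disk size K) that can be added
    before the original region reaches `limit`.

    Assumption (from the message's 50/50 model): new allocations spread
    evenly over the grown filesystem, so only 1/growth_factor of each
    new byte lands on the original region.
    """
    share_on_old = 1.0 / growth_factor
    return (limit - old_fill) / share_on_old


if __name__ == "__main__":
    old_fill = 0.80  # original disk is 80% full before growfs
    for limit in (0.85, 0.92):
        extra = extra_data_until(old_fill, limit)
        print(f"limit {limit:.0%}: add {extra:.0%} of original "
              f"({extra / 2:.0%} of the doubled size)")
```

Under that assumption, the original region reaches the 92% fill limit after
roughly 24% of K in new data, even though the grown filesystem as a whole is
still only a little over half full.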