From: Terry Lambert
Subject: Re: New timeout capability (was Re: cvs commit:....)
To: dyson@FreeBSD.ORG
Cc: syssgm@dtir.qld.gov.au, freebsd-current@FreeBSD.ORG
Date: Wed, 24 Sep 1997 06:17:23 +0000 (GMT)
In-Reply-To: <199709230920.EAA00190@dyson.iquest.net> from "John S. Dyson" at Sep 23, 97 04:20:00 am

> I could possibly imagine a reasonable use for a 16K basic allocation size.

8k is where I typically stop, mostly because of frag size. 1k frags
are about my limit. 8-).

> I think that 4K performs pretty darned well anyway though. In the
> real world, I wouldn't think that one would see much of a performance
> difference between 4K and 16K.

For 8k, there used to be about a 40% improvement over 4k for iozone; I
haven't really tried this for about 5 months now, though.

I expect a bit of a drop for 16k because of the 2k frags, actually. I'd
think that 32k would go back up -- perhaps way up -- because 4k
page-aligned frags are good for you.

It really matters how sequentially you are accessing your files. For
random writes smaller than, or not aligned to, 4k, there is a
requirement of read-before-write.

Technically, you could take this down to 512b, since the VM has the
bitmap for it. If so, block sizes over 4k (with frags larger than a
disk block) would get relatively more expensive *fast*, as long as you
were doing I/O on block boundaries.

I'm not sure whether I/O on a block boundary for a page causes a read
before write or not. It probably does; this is technically not needed,
so there's a tiny optimization there for better iozone numbers. 8-).

If the read-before-write could be done on a block basis, using a block
bitmap to indicate which 512b chunks had been read and which hadn't,
and you were guaranteed read-before-write, and if you wrote a whole
block, you'd mark it read without actually reading, and you respected
this bitmap when responding to the dirty bit, well... that'd be a lot
of work. 8-).

It would also give a more uniform win for block-aligned accesses in
block increments (ndbm?), and certainly make IOZONE happier, as well
as making the MSDOS FS happier.

So to recap: a 512b-aligned write of block 3 in a new 4k page would
result in b00001000 in the bitmap, and the dirty bit set on the page.
A 43b write in block 5, not crossing a block boundary, would result in
b00100000 in the bitmap, a 512b read of that block from disk, and a
43b write somewhere in the block, with the dirty bit set on the page.
(A rough sketch of this bookkeeping follows at the end of this
message.)

This is probably a useful optimization for fixed-size record based
random record I/O for records of 2k or smaller (so page locality is
less of an issue, and so that you shouldn't just read the whole page
anyway).

I don't know what the impact would be on the pager in the general
case; probably not pretty at all, actually. Maybe John could comment
(probably to say I'm insane ;-)).
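To make the recap concrete, here is a minimal userland sketch of the
per-page block bitmap bookkeeping described above. It is purely
illustrative: struct pgmap, pg_write(), and the printf stand-ins for
disk reads are hypothetical names, not actual FreeBSD VM interfaces.

/*
 * Minimal userland sketch of the per-page block bitmap described
 * above. Hypothetical: struct pgmap and pg_write() are illustrative
 * names only, not actual FreeBSD VM interfaces.
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE	4096
#define BLK_SIZE	512
#define BLKS_PER_PAGE	(PAGE_SIZE / BLK_SIZE)	/* 8 blocks -> 8-bit map */

struct pgmap {
	uint8_t	valid;	/* bit N set: 512b block N read or fully written */
	int	dirty;	/* page-level dirty bit, as in the recap */
};

/* Apply a write of len bytes at byte offset off within one page. */
static void
pg_write(struct pgmap *pg, unsigned off, unsigned len)
{
	unsigned first = off / BLK_SIZE;
	unsigned last = (off + len - 1) / BLK_SIZE;
	unsigned b;

	for (b = first; b <= last; b++) {
		unsigned bstart = b * BLK_SIZE;
		int whole = off <= bstart && off + len >= bstart + BLK_SIZE;

		if (!whole && !(pg->valid & (1u << b)))
			/* Partial write into an unread block: read just
			   that 512b block, not the whole page. */
			printf("read-before-write of block %u\n", b);
		pg->valid |= 1u << b;	/* block is now valid either way */
	}
	pg->dirty = 1;
}

int
main(void)
{
	struct pgmap pg = { 0, 0 };

	/* 512b-aligned write of block 3: no read, bitmap -> b00001000. */
	pg_write(&pg, 3 * BLK_SIZE, BLK_SIZE);
	printf("valid = 0x%02x, dirty = %d\n", (unsigned)pg.valid, pg.dirty);

	/* 43b write within block 5: one 512b read, bit 5 set as well. */
	pg_write(&pg, 5 * BLK_SIZE + 100, 43);
	printf("valid = 0x%02x, dirty = %d\n", (unsigned)pg.valid, pg.dirty);
	return (0);
}

Run against the two writes from the recap, this prints 0x08
(b00001000) and then 0x28 (the page accumulates bits 3 and 5), with a
single 512b read for the partial write, which matches the bookkeeping
above.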
Regards,
				Terry Lambert
				terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.