From owner-freebsd-fs Wed Nov 13 2: 6:13 2002
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6C05637B401
	for ; Wed, 13 Nov 2002 02:06:11 -0800 (PST)
Received: from scaup.mail.pas.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 24F0E43E97
	for ; Wed, 13 Nov 2002 02:06:08 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0207.cvx21-bradley.dialup.earthlink.net ([209.179.192.207] helo=mindspring.com)
	by scaup.mail.pas.earthlink.net with esmtp (Exim 3.33 #1)
	id 18BuPN-0005JU-00; Wed, 13 Nov 2002 02:06:05 -0800
Message-ID: <3DD22326.74544EAF@mindspring.com>
Date: Wed, 13 Nov 2002 02:02:14 -0800
From: Terry Lambert
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Tomas Pluskal
Cc: freebsd-fs@freebsd.org
Subject: Re: seeking help to rewrite the msdos filesystem
References: <20021113094824.N1339-100000@localhost.localdomain>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-fs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Tomas Pluskal wrote:
> > This has more to do with sequential access.  Technically, you can
> > read a FAT cluster at a time instead of an FS block at a time, and
> > you will achieve some multiplier on sequential access, but you will
> > find that under load, that the fault rate for blocks will go up.
>
> When I read from my ZIP drive, according to iostat the request size is
> 2KB. When I run dd with 2KB request size:
>
> # dd if=/dev/afd0 of=/dev/null bs=2048 count=100
> 100+0 records in
> 100+0 records out
> 204800 bytes transferred in 2.127448 secs (96266 bytes/sec)
>
> If I understand this right, I can never get faster then 96KB/s with
> sequential access, when using 2KB requests ?
> It is quite slow :)

Uh... the way you are using it here doesn't involve MSDOSFS at all.

What happens if you say:

	# dd if=/dev/afd0 of=/dev/null bs=2048 count=100
	# dd if=/dev/afd0 of=/dev/null bs=2048 count=100

Does the second one complete out of cache, and therefore faster?

	# dd if=/dev/afd0 of=/dev/null bs=204800 count=1
	# dd if=/dev/afd0 of=/dev/null bs=204800 count=1

Are the requests still 2K according to iostat?  If so, is it because
it's a device driver limitation, or a hardware limitation of the ZIP
disks themselves?

Does the second one complete out of cache, and therefore faster?

- I'll assume that the answers to the above questions are, in order,
  "no, no, N/A", unless you want to contradict.

Is this a SCSI ZIP disk?  The fastest possible read time you can
possibly get out of any disk is to read SCSI mode page 2, and read
a track at a time, so that you avoid track-to-track seeks in the
middle of reads, using tagged commands to interleave requests to
amortize a single seek latency across all requests combined.

- I'll assume that the answer is "no"; basically, you will have to
  learn to live with a 1.5 times seek latency per virtual "fixed
  size track" read.

Assuming all that...

Most likely, the 2K is because that's the underlying FS block size.

There is no optimization for sequential reads in MSDOSFS because
the block offset requires metadata access, which is going to cause
a seek, and there's no sequential optimization (see msdosfs_bmap()
in /sys/fs/msdosfs/msdosfs_vnops.c), for lack of available metadata
(without another seek and read of the FAT table... unless it's
cached; see pcbmap() in msdosfs_fat.c).

You probably want an is_sequential() to avoid really, really
pessimizing random I/O.
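
A minimal sketch of what such an is_sequential() check could look like. This is a userland illustration and not existing MSDOSFS code: struct seq_state, seq_init(), SEQ_THRESHOLD and the is_sequential() signature are all hypothetical names chosen for the example. The idea is only to remember which cluster would come next and to count how many requests in a row matched, so that read-ahead is attempted only once the pattern has held for a while.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunable: matching requests needed before read-ahead. */
#define SEQ_THRESHOLD	2

/* Hypothetical per-open-file state; not a field of the real denode. */
struct seq_state {
	uint32_t next_cn;	/* cluster expected if access is sequential */
	unsigned run;		/* consecutive requests that matched next_cn */
};

static void
seq_init(struct seq_state *s)
{
	s->next_cn = 0;
	s->run = 0;
}

/*
 * Record a read of file-relative cluster cn and report whether the
 * recent history looks sequential enough to justify reading ahead.
 */
static bool
is_sequential(struct seq_state *s, uint32_t cn)
{
	if (cn == s->next_cn)
		s->run++;
	else
		s->run = 0;	/* random access: drop the history */
	s->next_cn = cn + 1;
	return (s->run >= SEQ_THRESHOLD);
}
```

With this threshold, a stream of reads for clusters 0, 1, 2 starts reporting "sequential" at the second read, and a jump to an unrelated cluster resets the count, so random I/O keeps doing single-cluster reads. A real version would have to decide where this state lives and how it interacts with multiple opens of the same file.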

You should also look at the large block comment above the manifest
constants defined for fc_setcache() in denode.h; you can see that
it tries to implement the "one behind" entry I was talking about
previously, but that this doesn't really help you, so there are
either multiple opens, directory traversals, or other things going
on because of the application you are running.

Finally, note that a cache of the entire FAT (or as much of the
FAT, and its locality, as you can afford, LRU'ed) is not mapped,
per the MACH MSDOSFS paper reference.

-- Terry