Date: Sun, 10 Feb 2002 21:09:43 -0800 From: Kirk McKusick <mckusick@mckusick.com> To: Alfred Perlstein <bright@mu.org> Cc: Ian Dowse <iedowse@maths.tcd.ie>, fs@FreeBSD.ORG Subject: Re: fsck and predictive readahead? Message-ID: <200202110509.g1B59hi06207@beastie.mckusick.com> In-Reply-To: Your message of "Sat, 22 Dec 2001 15:08:14 CST." <20011222150814.Z48837@elvis.mu.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Date: Sat, 22 Dec 2001 15:08:14 -0600 From: Alfred Perlstein <bright@mu.org> To: Ian Dowse <iedowse@maths.tcd.ie> Cc: mckusick@freebsd.org, fs@freebsd.org Subject: Re: fsck and predictive readahead? * Ian Dowse <iedowse@maths.tcd.ie> [011222 07:53] wrote: > In message <20011222053306.Y48837@elvis.mu.org>, Alfred Perlstein writes: > >I'm wondering if fsck uses any sort of tricks to do read-ahead > >to prefect data for pass1 and pass2. > > > >If not does anyone thing it might speed things up? > > I've wondered about this also. Since fsck spends virtually all of > its time waiting for disk reads, doing most kinds of speculative > disk reads would only slow things down. However, there is some > potential for re-ordering the reads to reduce seeking and to allow > data to be read in larger chunks. > > Pass 1 involves quite a lot of disk seeking because it goes off and > retrieves all indirection blocks (blocks of block numbers) for any > inodes that have them. Otherwise pass 1 would be a simple linear > scan through all inodes. It would be possible to defer the reading > of indirection blocks and then read them in order (having 2nd- and > 3rd-level indirection blocks complicates this). I think I tried a > simple form of this a few years ago, but the speedup was only > marginal. I believe I also tried changing fsck's bread() to read > larger blocks when contiguous reads were detected, again with no > significant improvements. > > For pass 2, the directories are sorted by the block number of their > first block, so there is very little seeking. Some speed improvement > might be possible by doing a larger read when a few directory blocks > are close together on the disk. > > An interesting exercise would be to modify fsck to print out a list > of the offset and length for every disk read it performs. Then sort > that list, coalesce contiguous reads, and see how long it takes the > disk to read the new list as compared to the original. Such perfect > sorting is obviously not feasable in practice, but it would give > some idea of the potential for improvements. The problem you didn't address with all these changes was stalls due to disk IO. /usr/src/sbin/fsck_ffs # time ./fsck_ffs -d -n /vol/spare ** /dev/ad0s1g (NO WRITE) ** Last Mounted on /vol/spare ** Phase 1 - Check Blocks and Sizes ** Phase 2 - Check Pathnames ** Phase 3 - Check Connectivity ** Phase 4 - Check Reference Counts ** Phase 5 - Check Cyl groups 303580 files, 12865338 used, 9183427 free (51827 frags, 1141450 blocks, 0.2% fragmentation) ./fsck_ffs -d -n /vol/spare 24.50s user 4.72s system 19% cpu 2:30.73 total No matter how you order the IO, fsck is going to have to wait for read(2) to return. If we can offload that waiting to a child process we may be able to fix this. Is there any detailed commenting on the sources available, they are quite readable, but still very terse. A more in depth explanation of each function would really help. Do you know of a paper, manpage or do you have the time to sprinkle some commentary into the code? -- -Alfred Perlstein [alfred@freebsd.org] 'Instead of asking why a piece of software is using "1970s technology," start asking why software is ignoring 30 years of accumulated wisdom.' http://www.morons.org/rants/gpl-harmful.php3 There was apaper in the Usenix general conference in the early 1990's which talked about several ways of speeding up fsck. I incorporated most of those ideas into fsck at that time. The two most significant were reading the inodes in big chunks, and collecting all the directory block numbers and sorting them as Ian describes above. More recently with the advent of soft updates, a further optimization to fsck is to read only those inodes that are marked as allocated in the bit maps. Kirk McKusick To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-fs" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200202110509.g1B59hi06207>