From: Bakul Shah
To: Julian Elischer
Cc: FreeBSD current users, fs@FreeBSD.ORG
Subject: Re: Anyone working on fsck?
In-reply-to: Your message of "Mon, 17 Mar 2003 12:22:33 PST."
Date: Mon, 17 Mar 2003 13:26:56 -0800
Message-ID: <200303172126.QAA23205@thunderer.cnchost.com>

UFS is the real problem here, not fsck.  Its tradeoffs for improving
normal access latencies may have been right in the past, but they are
not right for modern big disks.  Seek time and RPM have not improved
very much in the past 20 years, while disk capacity has increased by a
factor of about 20,000 (and GB/$ even more).

IMHO there is not much you can do at the fsck level -- you still have
to visit all the cyl groups and what not.  Even a factor of 10
improvement in fsck means 36 minutes, which is far too long.  Keeping
track of 67M to 132M blocks and (assuming an avg file size of 8k to
16k) something like 60M to 80M files takes quite a bit of time when
you are also seeking all over the disk.

A few ideas:

- When you have about 67M (2^26) files, ideally you want to *avoid*
  checking as many as you can.
  Given access times, you are only going to be able to do a few
  hundred disk accesses at most in a minute.  So you are going to
  have only a few files/dirs that may be inconsistent in case of a
  crash.  Why not keep track of them somehow?

- If you need about 1GB of space to store the state of a TB file
  system that needs to be checked, maybe it _should_ be *stored* in
  a contiguous area on the FS itself.  1GB is about 0.1% of the
  space.

- Typically only a few cyl groups may be inconsistent in case of a
  crash.  Maybe some info about which cyl groups need to be checked
  can be stored so that brute-force checking of all groups can be
  avoided.

- Typically a file will be stored in one or a small number of cyl
  groups.  If that info is stored somewhere, it can speed things up.

- Extent-based allocation will reduce the number of indirect blocks.
  But maybe this is not such a big issue if most of your files fit
  in a few blocks.

Anyway, support for all of these has to be done in the filesystem
first, before fsck can benefit.  If instead you spend time
"optimizing" just fsck, you will likely make it far more complex (and
potentially harder to get right).
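
To make the "which cyl groups need to be checked" idea concrete, here
is a minimal sketch in Python.  It is only an illustration under
stated assumptions, not how UFS or fsck actually work: the class
DirtyGroupMap and its methods are hypothetical names, the group count
of 8192 is made up, and the bitmap lives in memory here, whereas a
real filesystem would have to keep it in a fixed on-disk area and
flush it before touching a group's metadata.

```python
class DirtyGroupMap:
    """Hypothetical bitmap with one bit per cylinder group."""

    def __init__(self, num_groups):
        self.num_groups = num_groups
        # Assumption: in a real filesystem this would be a small
        # contiguous on-disk area; a bytearray stands in for it here.
        self.bits = bytearray((num_groups + 7) // 8)

    def mark_dirty(self, group):
        # Set the bit before any metadata update to this group starts.
        self.bits[group // 8] |= 1 << (group % 8)

    def mark_clean(self, group):
        # Clear the bit once the group's metadata is consistent again.
        self.bits[group // 8] &= ~(1 << (group % 8)) & 0xFF

    def groups_to_check(self):
        # After a crash, only groups whose bit is still set need a check.
        return [g for g in range(self.num_groups)
                if self.bits[g // 8] & (1 << (g % 8))]


# Simulated crash: updates to groups 3 and 4711 were still in flight,
# while the update to group 17 had already completed.
dirty = DirtyGroupMap(num_groups=8192)
dirty.mark_dirty(3)
dirty.mark_dirty(17)
dirty.mark_clean(17)
dirty.mark_dirty(4711)

to_check = dirty.groups_to_check()
print(to_check)
```

With 8192 groups the map is only 1024 bytes, so the cost is in the
extra write ordering, not in the space.  The checker would then visit
the groups in to_check instead of all of them.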