From owner-freebsd-arch@FreeBSD.ORG Mon Mar 31 22:21:57 2008
To: Matthew Dillon
In-reply-to: Your message of "Mon, 31 Mar 2008 13:06:10 PDT." <200803312006.m2VK6Aom028133@apollo.backplane.com>
Date: Mon, 31 Mar 2008 15:21:54 -0700
From: Bakul Shah
Message-Id: <20080331222154.C976C5B50@mail.bitblocks.com>
Cc: Christopher Arnold, arch@freebsd.org, qpadla@gmail.com, freebsd-arch@freebsd.org, Martin Fouts
Subject: Re: Flash disks and FFS layout heuristics
List-Id: Discussion related to FreeBSD architecture

On Mon, 31 Mar 2008 13:06:10 PDT Matthew Dillon wrote:
> But how do you index that information? You can't simply append the
> information to the NAND unless you also have a way to access it. So
> does the filesystem have to scan the NAND (or significant portions of it)
> in order to build an index of the filesystem topology in system memory?

One possible way: I'd design the system so that each update ends with
the write of a root block[1]. I'd also write root blocks at fixed
locations so that they can be found easily without having to scan the
whole disk. Given this, on reboot use binary search to locate the
latest root block at a fixed location.
There may be further updates after it, so scan forward until you locate
the most up-to-date root block. Once you have that, you are home free!
Everything before that root block will be consistent with it. Even if
the system crashes in the middle of a compacting GC, the design should
be able to recover all data.

What I am not sure about is whether one can do incremental GC. A
stop-and-copy GC is always possible but I don't like the idea of long
pauses.

[1] The root block contains the block # of the earliest valid block, a
sequence number (that will not roll over in the device's lifetime), and
block #s for various structures such as the root of inodes, superblock,
freelist if any, etc.
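
The recovery procedure described above could be sketched roughly as follows. This is only an illustration under simplifying assumptions that are not in the message: the device is modelled as a Python list, the log is written front to back without wrapping (so the written fixed slots form a prefix), an erased block is None, and the names STRIDE, RootBlock, latest_fixed_root and recover are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

STRIDE = 4  # hypothetical: every STRIDE-th block is a fixed root-block slot


@dataclass
class RootBlock:
    seq: int             # sequence number, assumed never to roll over
    earliest_valid: int  # block # of the earliest valid block
    inode_root: int      # block # of the root of the inode structure


def latest_fixed_root(dev: List[Optional[object]]) -> int:
    """Binary-search the fixed slots for the last one holding a root block.

    Assumes no wrap-around, so a written fixed slot is never preceded by
    an unwritten one. Returns the block # of that slot, or -1 if the
    device holds no root block at all.
    """
    if not dev or not isinstance(dev[0], RootBlock):
        return -1
    nslots = (len(dev) + STRIDE - 1) // STRIDE
    lo, hi = 0, nslots - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if isinstance(dev[mid * STRIDE], RootBlock):
            lo = mid      # slot is written: the answer is here or later
        else:
            hi = mid - 1  # slot is erased: the answer is earlier
    return lo * STRIDE


def recover(dev: List[Optional[object]]) -> Optional[RootBlock]:
    """Find the most up-to-date root block after a reboot."""
    start = latest_fixed_root(dev)
    if start < 0:
        return None
    best = dev[start]
    # Scan forward past the fixed slot for later updates.
    for blk in dev[start + 1:]:
        if blk is None:  # erased block: end of the written log
            break
        if isinstance(blk, RootBlock) and blk.seq > best.seq:
            best = blk
    return best


# A 12-block device: three committed updates, then a crash after one
# data block of a fourth update (block 7 has no root block after it).
example = [
    RootBlock(1, 0, 0), "data", RootBlock(2, 0, 1), "data",
    RootBlock(2, 0, 1), "data", RootBlock(3, 0, 5), "data",
    None, None, None, None,
]
latest = recover(example)
```

Here the binary search lands on the fixed slot at block 4, the forward scan picks up the root block with sequence number 3 at block 6, and the trailing data block from the interrupted update is ignored because no root block follows it. A real design would also have to handle a log that wraps around the device, which this sketch does not attempt.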