From owner-freebsd-arch@FreeBSD.ORG  Mon Mar 31 22:21:57 2008
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D2B29106566B
	for <arch@freebsd.org>; Mon, 31 Mar 2008 22:21:57 +0000 (UTC)
	(envelope-from bakul@bitblocks.com)
Received: from mail.bitblocks.com (ns1.bitblocks.com [64.142.15.60])
	by mx1.freebsd.org (Postfix) with ESMTP id C3C888FC19
	for <arch@freebsd.org>; Mon, 31 Mar 2008 22:21:56 +0000 (UTC)
	(envelope-from bakul@bitblocks.com)
Received: from bitblocks.com (localhost.bitblocks.com [127.0.0.1])
	by mail.bitblocks.com (Postfix) with ESMTP id C976C5B50;
	Mon, 31 Mar 2008 15:21:54 -0700 (PDT)
To: Matthew Dillon <dillon@apollo.backplane.com>
In-reply-to: Your message of "Mon, 31 Mar 2008 13:06:10 PDT."
	<200803312006.m2VK6Aom028133@apollo.backplane.com> 
Date: Mon, 31 Mar 2008 15:21:54 -0700
From: Bakul Shah <bakul@bitblocks.com>
Message-Id: <20080331222154.C976C5B50@mail.bitblocks.com>
Cc: Christopher Arnold <chris@arnold.se>, arch@freebsd.org, qpadla@gmail.com,
	freebsd-arch@freebsd.org, Martin Fouts <mfouts@danger.com>
Subject: Re: Flash disks and FFS layout heuristics 
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture <freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 31 Mar 2008 22:21:57 -0000

On Mon, 31 Mar 2008 13:06:10 PDT Matthew Dillon <dillon@apollo.backplane.com>  wrote:
>     But how do you index that information?  You can't simply append the
>     information to the NAND unless you also have a way to access it.  So
>     does the filesystem have to scan the NAND (or significant portions of it)
>     in order to build an index of the filesystem topology in system memory?

One possible way:

I'd design the system so that each update ends with the write
of a root block[1]. I'd also write root blocks at fixed
locations so they can be found without having to scan the
whole disk. Given this, on reboot, binary search the fixed
locations (comparing sequence numbers) to find the latest
root block stored at one of them. There may be further
updates after it, so scan forward until you locate the most
up-to-date root block. Once you have that, you are home
free!  Everything before that root block will be consistent
with it.
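
To make the recovery steps concrete, here is a rough sketch in
Python. It is only an illustration under my own assumptions, not
a worked-out design: the device is modelled as a list of blocks,
fixed root slots sit at every `stride` blocks, the log is
written circularly, and slot 0 is normally not erased (if it is,
the sketch falls back to a linear scan of the fixed slots). The
names find_latest_root, root_seq and stride are made up for the
example.

```python
from typing import List, Optional, Tuple

# Hypothetical block model: None = erased, ("data", payload), or ("root", seq).
Block = Optional[Tuple[str, int]]


def root_seq(block: Block) -> int:
    """Sequence number of a root block, or -1 for erased/non-root blocks."""
    if block is not None and block[0] == "root":
        return block[1]
    return -1


def find_latest_root(device: List[Block], stride: int) -> Optional[Tuple[int, int]]:
    """Return (block_index, seq) of the newest root block, or None if none."""
    n_slots = (len(device) + stride - 1) // stride  # slots at 0, stride, ...
    if n_slots == 0:
        return None

    first = root_seq(device[0])
    if first < 0:
        # Assumption violated (slot 0 unusable): linear scan of fixed slots.
        slot = max(range(n_slots), key=lambda s: root_seq(device[s * stride]))
        if root_seq(device[slot * stride]) < 0:
            return None
    else:
        # In a circular log the slots written in the current lap have
        # seq >= seq(slot 0); older or erased slots have a smaller value.
        # Binary search for the last slot that still satisfies this.
        lo, hi = 0, n_slots - 1
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if root_seq(device[mid * stride]) >= first:
                lo = mid
            else:
                hi = mid - 1
        slot = lo

    pos = slot * stride
    best = (pos, root_seq(device[pos]))

    # Scan forward, up to the next fixed slot, for root blocks written
    # after the one found at the fixed location.
    for i in range(pos + 1, min(pos + stride, len(device))):
        block = device[i]
        if block is None:
            break  # erased: nothing was written past this point
        seq = root_seq(block)
        if seq >= 0:
            if seq < best[1]:
                break  # root block left over from an earlier lap
            best = (i, seq)
    return best
```

With a 12-block device, stride 4, and root blocks 9..12 written
over an older lap that ended at 8, the search lands on the fixed
slot at block 4 and the forward scan then picks up the newer
root block at block 6. Data written after that last root block
is ignored, which matches the idea that only what precedes the
root block is guaranteed consistent.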

Even if the system crashes in the middle of a compacting GC,
the design should be able to recover all data.

What I am not sure about is whether one can do incremental
GC. A stop-and-copy GC is always possible but I don't like
the idea of long pauses.

[1]
The root block contains the block # of the earliest valid
block, a sequence number (that will not roll over in the
device's lifetime), and block #s for various structures such
as the root of inodes, superblock, freelist if any, etc.
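
As a rough illustration of such a root block, here is a Python
sketch of one possible on-media layout. The field names, the
64-bit widths, the NO_BLOCK marker and the trailing CRC32 are
all my assumptions for the example; the CRC in particular is an
addition that lets recovery reject a half-written root block. A
64-bit sequence number incremented once per microsecond would
take over 500,000 years to wrap.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import Optional

_FMT = "<QQQQQ"            # five little-endian unsigned 64-bit fields
_SIZE = struct.calcsize(_FMT)
NO_BLOCK = 2**64 - 1       # hypothetical marker meaning "no such structure"


@dataclass(frozen=True)
class RootBlock:
    seq: int                    # sequence number, never reused
    earliest_valid: int         # block # of the earliest valid block
    inode_root: int             # block # of the root of the inodes
    superblock: int             # block # of the superblock
    freelist: int = NO_BLOCK    # block # of the freelist, if any

    def pack(self) -> bytes:
        """Serialize the fields and append a CRC32 of them."""
        body = struct.pack(_FMT, self.seq, self.earliest_valid,
                           self.inode_root, self.superblock, self.freelist)
        return body + struct.pack("<I", zlib.crc32(body))

    @classmethod
    def unpack(cls, raw: bytes) -> Optional["RootBlock"]:
        """Parse a root block; return None if it is short or fails the CRC."""
        if len(raw) < _SIZE + 4:
            return None
        body = raw[:_SIZE]
        (crc,) = struct.unpack_from("<I", raw, _SIZE)
        if zlib.crc32(body) != crc:
            return None
        return cls(*struct.unpack(_FMT, body))
```

A recovery scan would call RootBlock.unpack on each candidate
block and treat a None result the same way as an erased block.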