Date: Mon, 31 Mar 2008 16:06:28 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200803312306.m2VN6SRa029758@apollo.backplane.com>
To: Bakul Shah
References: <20080331223846.CFD975BAE@mail.bitblocks.com>
Cc: Christopher Arnold, Martin Fouts, qpadla@gmail.com, arch@freebsd.org,
    Poul-Henning Kamp, freebsd-arch@freebsd.org
Subject: Re: Flash disks and FFS layout heuristics
List-Id: Discussion related to FreeBSD architecture

:[Poul, use positive encouragement and you'd inspire a lot more
:people!]
:
:Note that in effect this is exactly what zfs does.  Update of
:any block implies finding a new place for the updated copy,
:which means the block pointing to it must be also updated,
:which means a new place for it etc. etc.
:
:But hey, I spent just a few minutes sketching out the idea so
:it is possible I missed a whole bunch of things!  If I was
:actually implementing this (which I am tempted to...)
:I'd certainly want to know what others did.
:
:One thing I forgot to add: I'd let the lower level handle bad
:block forwarding and wear levelling (like on the m-tron
:device).

    This is my understanding of what ZFS does too, and I considered it
    when I was designing HAMMER.  I ultimately decided not to go that
    route because I was worried it would destroy seek locality of
    reference on-disk (i.e. read/access performance).  Seek locality of
    reference is of course very important for a disk-based filesystem
    but not so important for a flash-based filesystem.

    The one hard part I have left to do in HAMMER is the UNDO meta-data
    log.  Or, more precisely, the recover-on-mount code for the UNDO
    meta-data log.  Everything else is done and working.  I knew it
    would be the hardest part of the filesystem when I ultimately
    decided not to go ZFS's route.

    The UNDO log costs basically one seek-write per fsync, or per
    filesystem flush (every 30 seconds on BSDs)... not too bad,
    particularly because it stores only meta-data changes and not data
    changes.  Ultimately I think I can make it worthwhile by including
    data elements for small seek/write/fsync sequences in the UNDO
    record and just syncing it, which would be awesome for database
    applications.  I have no immediate plans to do that, though.

					-Matt
					Matthew Dillon
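
    To make the UNDO idea concrete, here is a minimal sketch of generic
    undo logging in Python.  It is not HAMMER's code or on-disk format.
    The class name ToyUndoFS, its methods, the crash_after parameter and
    the three-step flush ordering are all assumptions made for
    illustration; the message above only states that the log holds
    meta-data changes, is written once per fsync/flush, and is replayed
    by recover-on-mount code.

```python
# Toy model of an UNDO meta-data log. Hypothetical names throughout;
# this illustrates generic undo logging, not HAMMER's implementation.

_MISSING = object()  # marks "this key did not exist before the change"


class ToyUndoFS:
    def __init__(self):
        self.disk = {}       # meta-data as it exists on the medium
        self.undo = []       # on-medium UNDO records: (key, old value)
        self.dirty = {}      # in-memory meta-data changes, not yet flushed
        self.log_writes = 0  # how many times the UNDO log was written

    def set_meta(self, key, value):
        """Change meta-data in memory only."""
        self.dirty[key] = value

    def flush(self, crash_after=None):
        """Flush dirty meta-data; optionally 'crash' after N in-place writes."""
        if not self.dirty:
            return
        # 1. One log write holding the old image of everything about to change.
        self.undo = [(k, self.disk.get(k, _MISSING)) for k in self.dirty]
        self.log_writes += 1
        # 2. Overwrite the meta-data in place.
        for n, (k, v) in enumerate(self.dirty.items()):
            if crash_after is not None and n >= crash_after:
                self.dirty = {}  # memory contents are lost in a crash
                return           # UNDO records stay behind on the medium
            self.disk[k] = v
        # 3. Every in-place write landed, so the UNDO records can be dropped.
        self.dirty = {}
        self.undo = []

    def recover(self):
        """Mount-time recovery: roll back any half-finished flush."""
        for k, old in reversed(self.undo):
            if old is _MISSING:
                self.disk.pop(k, None)
            else:
                self.disk[k] = old
        self.undo = []


if __name__ == "__main__":
    fs = ToyUndoFS()
    fs.set_meta("inode 7 size", 100)
    fs.set_meta("inode 7 mtime", 1)
    fs.flush()                     # clean flush
    fs.set_meta("inode 7 size", 200)
    fs.set_meta("inode 7 mtime", 2)
    fs.flush(crash_after=1)        # only the first in-place write lands
    fs.recover()                   # roll back to the last consistent state
    print(fs.disk, fs.log_writes)
```

    In this toy model the log is written once per flush no matter how
    many meta-data items changed, and file data never enters the log.
    Carrying small data writes in the same record, as suggested above,
    would amount to storing extra payload in the single log write.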