From owner-freebsd-fs Fri Jan 31 16:11:43 2003
Date: Fri, 31 Jan 2003 16:11:29 -0800 (PST)
From: Bill Studenmund
To: Steve Byan
Subject: Re: DEV_B_SIZE
Sender: owner-freebsd-fs@FreeBSD.ORG

On Fri, 31 Jan 2003, Steve Byan wrote:

> I keep getting a response that reads like "we'll detect the larger
> block size and run with it". I'm concerned that I'm not being clear
> that IDEMA is thinking of proposing a backward-compatibility mode with
> the presumption that it will work fine (albeit slowly) with existing
> binaries, i.e. code that hasn't been modified to be aware of the larger
> block size.
>
> If you think there are no functional problems with this
> backwards-compatibility scenario, including during recovery (fsck or
> journal roll-forward), I'd be happy to hear a clear "no problem".

I think Stephan Uphof hit on the main issues. I think there are functional problems with this, but it may be useful in some situations. It just needs a BIG warning.

Note I am assuming that if there's an error writing a 512-byte sector, the full 4k sector will have issues. If that is avoided (say only the 512-byte area actually has an issue) then things are fine.

I think the main place that problems will arise is that methods to reduce error exposure won't necessarily work.
Methods that try to resist single-sector errors, say by making multiple copies of data, will need to know that the single-sector error size (how much data goes away) is 4k, not 512 bytes. Exactly how many programs use these methods is not something I know, so I can't tell you exactly what the exposure is.

The fact that errors from a failed 4k rewrite are not unheard of isn't the issue. phk is right that that just looks like multiple sectors dying. The problem is that we would have multiple-sector death happening with single-sector failure dynamics.

If you want this to not be an issue 100%, then just put a battery-backed cache on the device. Note I'm not saying back up the write cache, just have a cache of the last area(s) being written. We're talking maybe 8k of cache plus checksumming plus the logical block addresses. It shouldn't be hard (read: it should be cheap in mass quantities) to battery-back something that small. Use a rechargeable battery, and just say that if you lose power while writing, you should restore power within say a month or a few months to let said cache drain. With well-tuned CMOS, you might even be able to get away with just static charge or a capacitor for power storage.

Take care,

Bill
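
As a rough illustration of the point about redundant copies, here is a minimal Python sketch. It assumes 512-byte logical sectors packed into aligned 4k physical sectors, with LBA 0 starting on a physical boundary. The function names are hypothetical and do not come from any real driver or filesystem. It shows how two copies that look like they sit in separate sectors can still be lost together.

```python
LOGICAL_SECTOR = 512    # bytes per sector as seen by unmodified software
PHYSICAL_SECTOR = 4096  # bytes assumed to be lost together on a failed write


def physical_sector(lba, logical=LOGICAL_SECTOR, physical=PHYSICAL_SECTOR):
    """Index of the physical sector holding logical block `lba`.

    Assumes LBA 0 starts on a physical-sector boundary.
    """
    return (lba * logical) // physical


def copies_share_fate(lba_a, lba_b, physical=PHYSICAL_SECTOR):
    """True if one failed physical write could destroy both copies."""
    return (physical_sector(lba_a, physical=physical)
            == physical_sector(lba_b, physical=physical))


def safe_replica_lba(primary_lba, logical=LOGICAL_SECTOR,
                     physical=PHYSICAL_SECTOR):
    """First LBA after `primary_lba` that lies in the next physical sector."""
    per_physical = physical // logical
    return (primary_lba // per_physical + 1) * per_physical


if __name__ == "__main__":
    # A program that keeps its backup copy in the "next sector" (LBA 1)
    # is safe on a true 512-byte device but not on a 4k device.
    print(copies_share_fate(0, 1, physical=512))
    print(copies_share_fate(0, 1, physical=4096))
    print(safe_replica_lba(3))
```

A program that is not aware of the 4k size has no way to make the `safe_replica_lba` choice, which is the exposure described above.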