Date: Tue, 16 Nov 1999 20:41:01 -0500
From: Greg Lehey <grog@mojave.sitaranetworks.com>
To: Bernd Walter <ticso@cicely.de>
Cc: Mattias Pantzare <pantzer@ludd.luth.se>, freebsd-fs@FreeBSD.ORG
Subject: Re: RAID-5 and failure
Message-ID: <19991116204101.12932@mojave.sitaranetworks.com>
In-Reply-To: <19991115210607.A6252@cicely7.cicely.de>; from Bernd Walter on Mon, Nov 15, 1999 at 09:06:08PM +0100
References: <ticso@cicely.de> <199911061716.SAA20783@zed.ludd.luth.se> <19991106183316.A9420@cicely7.cicely.de> <19991113213325.57908@mojave.sitaranetworks.com> <19991115203828.B5417@cicely7.cicely.de> <19991115145200.09633@mojave.sitaranetworks.com> <19991115210607.A6252@cicely7.cicely.de>

On Monday, 15 November 1999 at 21:06:08 +0100, Bernd Walter wrote:
> On Mon, Nov 15, 1999 at 02:52:00PM -0500, Greg Lehey wrote:
>> On Monday, 15 November 1999 at 20:38:28 +0100, Bernd Walter wrote:
>>> On Sat, Nov 13, 1999 at 09:33:25PM -0500, Greg Lehey wrote:
>>>>
>>>> 4.  The system crashes after writing the first data block for a RAID-5
>>>>     stripe and before writing the last data block.
>>>>
>>>>     When the system comes up, both data and parity are inconsistent.
>>>>
>>>> 5.  The system crashes after writing the last data block for a RAID-5
>>>>     stripe and before writing the last parity block.
>>>>
>>>>     When the system comes up, data is consistent, and parity is
>>>>     inconsistent.
>>>>
>>>> There are a number of ways of dealing with situations 4 and 5.  The
>>>> real problem is that they only occur when the system crashes, so
>>>> whatever recovery information is required must be stored in
>>>> non-volatile storage.  Some systems do include a NOVRAM for this kind
>>>> of information, but in general purpose systems the only possibility is
>>>> to write the information to disk, which would make the inherently slow
>>>> RAID-5 write even slower.  My attitude here is that RAID-5 writes are
>>>> comparatively infrequent, and so are crashes.  In the case of (5), you
>>>> could rebuild parity after a crash.  In the case of (4), I have no
>>>> good answer.  Suggestions welcome.
>>>
>>> Case 4 is not that different from case 5 as any differences should be
>>> handled by the FS using the volume.
>>
>> The problem is that in case 4 you don't have anything to go by.  You
>> don't know which data are inconsistent unless you keep a log.  The FS
>> using the volume has followed the kernel into the eternal bit bucket.
>
> Of course - but that may happen with R0 too and even it may be possible with
> a single disk.

Sure.  It's not specific to RAID-5.

> The FS should really be able to handle this case as it knows that
> there is an outstanding write operation.

How does it know?  That's the question.  All state information has
gone to /dev/null.  The only alternative is to write this state
information to some non-volatile location, which usually means disk
and associated severe loss of performance.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message
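
To make cases 4 and 5 from the quoted text concrete, here is a minimal
sketch.  It assumes parity is the byte-wise XOR of the data blocks in a
stripe (the usual RAID-5 arrangement).  The block contents and the names
parity() and stripe_is_consistent() are made up for illustration and are
not taken from any real volume manager.

```python
from functools import reduce


def parity(blocks):
    """Return the byte-wise XOR of equal-length data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity_block):
    """True if the stored parity matches the parity of the stored data."""
    return parity(data_blocks) == parity_block


# Hypothetical stripe: three data blocks plus one parity block.
old_data = [b"AAAA", b"BBBB", b"CCCC"]
new_data = [b"wxyz", b"data", b"more"]
old_parity = parity(old_data)

# Case 5: every data block was written, the parity block was not.
case5_data = list(new_data)
case5_ok = stripe_is_consistent(case5_data, old_parity)
# The data is complete, so recomputing parity repairs the stripe.
case5_rebuilt = parity(case5_data)

# Case 4: only the first data block was written before the crash.
case4_data = [new_data[0], old_data[1], old_data[2]]
case4_ok = stripe_is_consistent(case4_data, old_parity)
# Recomputing parity makes the stripe self-consistent again, but the
# data is a mix of old and new blocks, and nothing on disk says which
# blocks are stale.
case4_rebuilt = parity(case4_data)

print("case 5 consistent after crash:", case5_ok)
print("case 4 consistent after crash:", case4_ok)
print("case 4 data after parity rebuild:", case4_data)
```

In both cases the parity check only reports that the stripe is
inconsistent.  In case 5 a parity rebuild is a full repair.  In case 4
the rebuild merely hides the half-finished write, which is why some
record of the outstanding write would have to survive the crash.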