Date: Wed, 3 Jan 2001 10:36:21 +1030
From: Greg Lehey <grog@lemis.com>
To: Josef Karthauser <joe@tao.org.uk>, Matraquilla@cs.com, Roman Shterenzon <roman@harmonic.co.il>, freebsd-stable@FreeBSD.ORG
Subject: RAID-5 reliability (was: vinum malfunction!)
Message-ID: <20010103103621.G40453@wantadilla.lemis.com>
In-Reply-To: <20010102140616.B1391@tao.org.uk>; from joe@tao.org.uk on Tue, Jan 02, 2001 at 02:06:16PM +0000
References: <d6.7493e7.278332c6@cs.com> <20010102140616.B1391@tao.org.uk>
On Tuesday, 2 January 2001 at 14:06:16 +0000, Josef Karthauser wrote:
>
> The problem with vinum RAID5 in -stable is that in my experience
> there are some nasty bugs in it,
Well, one nasty bug.
> and I don't believe that Greg has managed to reproduce these himself
> and so is a bit confused as to what is causing the trouble.
Well, I don't know if I'd use the word "confused". More "uninformed".
> I'm worried that he may be inclined to believe that it is only a
> small minority of people who are having problems with this. I would
> challenge this view because I don't know _anyone_ who is
> successfully using RAID5 under vinum. They've all migrated away to
> using dedicated hardware and thus solved their vinum problems that
> way.
Yes, I do believe it's a small minority of people. I have had exactly
four reports of this problem, yours included. In none of the cases
have I been able to get enough information to reproduce it. On the
other hand, I know plenty of people who are successfully using RAID-5.
> I don't know what it's going to take for Greg to get enough
> information to fix the problem. I spent a week trying to extract a
> set of debug information for him that was useful enough that he
> could work from it, but it seems that my week was just wasted
> because it looks like I didn't capture the bug that he was expecting
> :(.
Well, it's not wasted, but it didn't help enough. As I said, it's
elusive.

If anybody else out there has experienced the following problem,
please contact me:

  The system runs fine most of the time, but under heavy load it dies
  with a trap 12 (page fault in kernel mode). The dump shows the
  system is trying to call the specific iodone routine from biodone.
  More careful analysis shows that the buffer header in question has
  had some fields zeroed out.
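
The failure mode described above can be modelled in user space roughly
as follows. This is a simplified sketch, not the FreeBSD kernel code:
struct buf_model, MODEL_B_CALL, biodone_model() and
raid5_iodone_model() are hypothetical names, and it assumes that the
zeroed field is the completion-routine pointer while the flag
requesting the callback is still set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a buffer header. */
struct buf_model {
    int flags;                           /* MODEL_B_CALL: call iodone on completion */
    void (*iodone)(struct buf_model *);  /* driver-specific completion routine */
    int completed;                       /* set once completion has been handled */
};

#define MODEL_B_CALL 0x1

/* Hypothetical driver-specific completion routine. */
static void raid5_iodone_model(struct buf_model *bp)
{
    bp->completed = 1;
}

/*
 * Model of the completion path: if a callback was requested, call it.
 * Returns 0 on success and -1 in the case where the pointer has been
 * zeroed, which is where real kernel code would take a page fault.
 */
static int biodone_model(struct buf_model *bp)
{
    if (bp->flags & MODEL_B_CALL) {
        if (bp->iodone == NULL)
            return -1;                   /* zeroed pointer: would fault here */
        bp->flags &= ~MODEL_B_CALL;
        bp->iodone(bp);
    } else {
        bp->completed = 1;               /* no callback requested */
    }
    return 0;
}

/* Exercise both the intact and the damaged header. */
static void check_model(void)
{
    struct buf_model good = { MODEL_B_CALL, raid5_iodone_model, 0 };
    struct buf_model bad = { MODEL_B_CALL, raid5_iodone_model, 0 };

    bad.iodone = NULL;                   /* assumed damage: pointer zeroed */
    assert(biodone_model(&good) == 0 && good.completed == 1);
    assert(biodone_model(&bad) == -1 && bad.completed == 0);
}
```

Where the model returns -1, kernel code without such a check would
call through the null pointer instead, which is consistent with a page
fault in kernel mode.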
> To summarise. The idea of software raid 5 is great, but it's got to
> work otherwise it is dangerous.
Like anything else. Note that a large number of these comments could
apply equally well to soft updates: it worked fine for most people,
but a small minority of people have had trouble with it.
> I note that the BUGS section of the manual page doesn't exist.
> There should be a warning to potential users that there are known
> problems that can cause the data to become corrupted in some
> configurations.
It looks as if you have missed it:
BUGS
     1.   vinum is a new product. Bugs can be expected. The configuration
          mechanism is not yet fully functional. If you have difficulties,
          please look at the section DEBUGGING PROBLEMS WITH VINUM before
          reporting problems.

     2.   Kernels with the vinum pseudo-device appear to work, but are not
          supported. If you have trouble with this configuration, please
          first replace the kernel with a non-Vinum kernel and test with the
          kld module.

     3.   Detection of differences between the version of the kernel and the
          kld is not yet implemented.

     4.   The RAID-5 functionality is new in FreeBSD 3.3. Some problems have
          been reported with vinum in combination with soft updates, but these
          are not reproducible on all systems. If you are planning to use
          vinum in a production environment, please test carefully.

Looking at this, I suppose it should be updated. But the section
definitely exists.
Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers
