Date:      Sat, 16 Jun 2007 16:52:55 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Joao Barros <joao.barros@gmail.com>
Cc:        freebsd-fs@freebsd.org, Lapo Luchini <lapo@lapo.it>
Subject:   Re: ZFS panic on mdX-based raidz
Message-ID:  <20070616145255.GD39202@garage.freebsd.pl>
In-Reply-To: <70e8236f0706151300l72e48e03r40d09f09c6d0ff9d@mail.gmail.com>
References:  <f4tjii$bus$1@sea.gmane.org> <20070615145224.GA39202@garage.freebsd.pl> <4672A8CD.5060009@lapo.it> <20070615154009.GB39202@garage.freebsd.pl> <70e8236f0706151300l72e48e03r40d09f09c6d0ff9d@mail.gmail.com>


On Fri, Jun 15, 2007 at 09:00:07PM +0100, Joao Barros wrote:
> On 6/15/07, Pawel Jakub Dawidek <pjd@freebsd.org> wrote:
> >On Fri, Jun 15, 2007 at 04:57:17PM +0200, Lapo Luchini wrote:
> >> Pawel Jakub Dawidek wrote:
> >> >> Follows the status with two invalid disks (I wonder why :P) and a scrub
> >> >> in progress; but the host will panic before the scrub ends.
> >> >>
> >> >
> >> > You corrupted two components in a configuration which accepts only one
> >> > disk failure.
> >> Yes, I'm well aware of that (it was intentional, in fact), and the
> >> "invalid" state of the pool was fully expected... while the kernel panic
> >> shortly thereafter was a bit less so ;-)
> >> Not a very urgent or pressing issue, I do agree; I reported it mainly
> >> for completeness' sake.
> >
> >But this is exactly the expected behaviour of ZFS.
>
> Let's suppose I have 2 raidz pools: one volume using disks on one
> enclosure and another volume using disks on a second enclosure. One of
> the enclosures gets disconnected; it doesn't matter why. Is ZFS going to
> panic the machine, thus rendering the other volume unavailable? From
> what I've seen, the volume is marked as failed, and that's what's
> supposed to happen.
> Or when you say "intentional", do you mean just for md-backed devices?

When ZFS cannot write data and there is not enough redundancy, it will
panic. Sun is working on this, AFAIK.
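
For reference, a rough way to reproduce this on FreeBSD with md-backed
vdevs, similar to the original report (device names, file paths and sizes
below are only an example, not the exact setup from the report):

  # three file-backed md devices and a raidz1 pool on top of them
  truncate -s 128m /tmp/vdev0 /tmp/vdev1 /tmp/vdev2
  mdconfig -a -t vnode -f /tmp/vdev0    # -> md0
  mdconfig -a -t vnode -f /tmp/vdev1    # -> md1
  mdconfig -a -t vnode -f /tmp/vdev2    # -> md2
  zpool create test raidz md0 md1 md2

  # raidz1 tolerates only one failed component, so corrupt two of them
  dd if=/dev/random of=/tmp/vdev1 bs=1m count=128 conv=notrunc
  dd if=/dev/random of=/tmp/vdev2 bs=1m count=128 conv=notrunc

  # the pool now shows two invalid components; start a scrub, and the
  # machine panics before it completes, as described in the report
  zpool scrub test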

Write failure is not easy to handle, because most writes are delayed, so
we can't return an error back to the application. Still, it shouldn't
panic the entire system; it should eventually unmount the file system
forcibly instead.

ZFS can survive the write failure of one component in a two-way mirror or
raidz1 configuration, or of two components in a raidz2 configuration. It
can also survive a write failure in a non-redundant configuration when
the copies property is greater than 1 and at least one copy can be
written successfully.
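
For example (pool and disk names below are only illustrative):

  # two-way mirror: survives a write failure on one component
  zpool create m0 mirror da0 da1

  # raidz2: survives write failures on two components
  zpool create r2 raidz2 da2 da3 da4 da5

  # non-redundant pool, but with copies > 1: a write survives as long
  # as at least one copy can still be written
  zpool create single da6
  zfs set copies=2 single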

Two cases, on the other hand, should be handled better (currently they
panic the system):

1. A write fails because more components are missing than the
configuration can tolerate. This should most likely result in a forced
unmount of the file system (file systems?).

2. A write fails because of a bad block. ZFS should simply try to write
to another location, but it doesn't do that now. This should be quite
easy to implement thanks to the COW model, but it's not there yet.
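
For contrast, the single-component case that ZFS already handles fine,
continuing the illustrative md-backed pool from above: damage one
component's data area (leaving the vdev labels alone) and a scrub repairs
it from parity instead of panicking:

  # overwrite part of one backing file, skipping the front vdev labels
  dd if=/dev/random of=/tmp/vdev1 bs=1m seek=8 count=64 conv=notrunc
  zpool scrub test
  zpool status -v test    # checksum errors found and repaired, no panic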

Anyway, be aware of those limitations when you decide to abuse it too
much next time :)

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



