Date:      Tue, 1 Jul 2014 20:47:49 +0200
From:      Edward Tomasz Napierała <trasz@FreeBSD.org>
To:        Dmitry Sivachenko <trtrmitya@gmail.com>
Cc:        freebsd-stable@freebsd.org, Ronald Klop <ronald-lists@klop.ws>
Subject:   Re: 10/stable panic: softdep_deallocate_dependencies: dangling deps
Message-ID:  <20140701184749.GA8617@brick.home>
In-Reply-To: <F84286C1-EB0F-4049-A567-1BB0E0FD19AE@gmail.com>
References:  <021AFCAD-7B0B-47FB-AAFF-8F7085C7E1A6@gmail.com> <op.xia61kp0kndu52@ronaldradial.radialsg.local> <F84286C1-EB0F-4049-A567-1BB0E0FD19AE@gmail.com>

On 0701T1457, Dmitry Sivachenko wrote:
> 
> On 1 July 2014, at 11:57, Ronald Klop <ronald-lists@klop.ws> wrote:
> 
> > On Mon, 30 Jun 2014 14:22:02 +0200, Dmitry Sivachenko <trtrmitya@gmail.com> wrote:
> > 
> >> Hello!
> >> 
> >> I have several machines with rather fresh FreeBSD-10/stable.
> >> 
> >> They all have 4 SATA drives; I have a small gmirrored root+var, and the rest of the drive space is mounted as /disk1, /disk2, etc. (UFS2+SU).
> >> When a single disk fails, the system panics with a "softdep_deallocate_dependencies: dangling deps" message:
> >> http://people.freebsd.org/~demon/softdep.png
> >> 
> >> Since all vital data (root+var) are mirrored, I expect the OS to stay alive.
> > 
> > Hi,
> > 
> > So /disk1, /disk2 are not (g)mirrored? In that case the system cannot handle a write failure. Because writes are not synchronous (for speed), there is no way to return an error to the application.
> 
> 
> No, they are not (g)mirrored.
> I expect read/write errors, but not a kernel panic.  I have encountered disk I/O errors since 2.2.5, and this is the first time I have faced a kernel panic.

Soft updates cannot gracefully handle I/O errors.  The system _will_
panic.  You can either prevent errors from happening by using
redundancy (i.e. mirroring), or disable soft updates.  That's how it
works, sorry.
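
For the second option, a minimal sketch of turning soft updates off
with tunefs(8) might look like the following.  The device name
/dev/ada1p1 is a hypothetical placeholder, not taken from this thread;
/disk1 is the mount point mentioned above and is assumed to have an
/etc/fstab entry.  The file system has to be unmounted (or mounted
read-only) while the flag is changed.

```sh
# Assumption: /dev/ada1p1 is a placeholder for the provider backing /disk1.
umount /disk1
tunefs -n disable /dev/ada1p1   # turn soft updates off
tunefs -p /dev/ada1p1           # print current settings to confirm
mount /disk1                    # remount using the /etc/fstab entry
```

Metadata-heavy workloads will likely get slower afterwards, since
metadata updates are then written synchronously by default.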

In theory it would be possible to prevent this from happening;
the panic here is actually meant to terminate the system before it
corrupts data, and in situations like this one, where the disk is no
longer accessible, it's not possible to corrupt anything.  IIRC I
actually added a workaround for that a while ago, but, as you can
see, it's not enough, and I don't understand soft updates well
enough to fix it properly.
