From owner-freebsd-stable@freebsd.org  Mon Jul 11 17:39:52 2016
Return-Path: <owner-freebsd-stable@freebsd.org>
Delivered-To: freebsd-stable@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id CD584B925FD
 for <freebsd-stable@mailman.ysv.freebsd.org>;
 Mon, 11 Jul 2016 17:39:52 +0000 (UTC) (envelope-from ian@freebsd.org)
Received: from pmta2.delivery6.ore.mailhop.org
 (pmta2.delivery6.ore.mailhop.org [54.200.129.228])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id B0ACC15AD
 for <freebsd-stable@freebsd.org>; Mon, 11 Jul 2016 17:39:52 +0000 (UTC)
 (envelope-from ian@freebsd.org)
X-MHO-User: 96b3331b-478e-11e6-8929-8ded99d5e9d7
X-Report-Abuse-To: https://support.duocircle.com/support/solutions/articles/5000540958-duocircle-standard-smtp-abuse-information
X-Originating-IP: 73.34.117.227
X-Mail-Handler: DuoCircle Outbound SMTP
Received: from ilsoft.org (unknown [73.34.117.227])
 by outbound2.ore.mailhop.org (Halon Mail Gateway) with ESMTPSA;
 Mon, 11 Jul 2016 17:40:41 +0000 (UTC)
Received: from rev (rev [172.22.42.240])
 by ilsoft.org (8.15.2/8.14.9) with ESMTP id u6BHdnKC003908;
 Mon, 11 Jul 2016 11:39:49 -0600 (MDT) (envelope-from ian@freebsd.org)
Message-ID: <1468258789.72182.122.camel@freebsd.org>
Subject: Re: Not-so stable if you take a CAM error....
From: Ian Lepore <ian@freebsd.org>
To: Karl Denninger <karl@denninger.net>, freebsd-stable@freebsd.org
Date: Mon, 11 Jul 2016 11:39:49 -0600
In-Reply-To: <f22ab40d-42f0-78a3-d3a7-945387259109@denninger.net>
References: <2b0c454b-c1a0-4b5b-e778-bf0939e90ae1@denninger.net>
 <op.ykfe1fvbkndu52@ronaldradial.radialsg.local>
 <6e9c07e1-12a6-a7cd-f775-6b0fe5a706bc@denninger.net>
 <1468243977.72182.118.camel@freebsd.org>
 <877f5e8e-c1e7-6fb0-6ceb-031ce3e68582@denninger.net>
 <CAKFCL4WrRS1ic1CZqcmbCEnsrD2pkh4VHPBFyB+-3NaNJZ+Jkw@mail.gmail.com>
 <1468254746.72182.121.camel@freebsd.org>
 <f22ab40d-42f0-78a3-d3a7-945387259109@denninger.net>
Content-Type: text/plain; charset="us-ascii"
X-Mailer: Evolution 3.16.5 FreeBSD GNOME Team Port 
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-stable>, 
 <mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable/>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 11 Jul 2016 17:39:52 -0000

On Mon, 2016-07-11 at 12:30 -0500, Karl Denninger wrote:
> On 7/11/2016 11:32, Ian Lepore wrote:
> > On Mon, 2016-07-11 at 09:50 -0400, Brandon Allbery wrote:
> > > On Mon, Jul 11, 2016 at 9:46 AM, Karl Denninger <karl@denninger.net>
> > > wrote:
> > > 
> > > > Here's the backtrace ... sounds like expected behavior, which is
> > > > not-so good all-in for a situation like this.  I guess the strategy
> > > > is to turn off softupdates before attempting such an update so as
> > > > not to crash the host machine if there's a problem with the card.
> > > > 
> > > I would tend to assume that removable media should not have
> > > softupdates enabled. Even with properly working media, it's
> > > practically begging for corruption.
> > > 
> > Writing to an sdcard without softupdates enabled will be an exercise
> > in patience.  Like, come back next week and maybe it'll be done.
> > 
> > The only thing that comes to mind with this is maybe some sort of
> > mount flag to say you're willing to live with any amount of
> > filesystem corruption in lieu of panicking.  I'm not sure how
> > easy/practical that would be to implement, though.
> > 
> > -- Ian
> Why not force-detach the volume that takes the error instead of a
> panic()?
> 

Patches welcome.

-- Ian

> That would lead to a panic if the detached volume was the system volume
> (obviously) but for a data volume it would simply result in it being
> forcibly unmounted (and dirty, so if it's corrupt it will get caught
> when reattached.)
> 
> It seems that the current paradigm of saying "screw you, panic the
> machine" violates the principle of least astonishment and is overly
> punitive vis-a-vis necessity.  Refusing further I/O because the volume
> may now have a corrupt filesystem appears to be facially reasonable,
> but that doesn't necessarily wind up being fatal to the system itself
> -- it is if that's the system volume and is not covered by some sort of
> redundancy, obviously, but it's not in all cases.
> 
> (Note that you can't just unmount the filesystem involved in the error;
> it has to be the volume that gets forcibly detached and whatever flows
> through from that you have to live with.  The reason is that on any
> sort of solid-state media the OS has zero control over zoning and write
> amplification means far more than the data you were actually modifying
> may have been lost -- it's entirely possible that *several megabytes*
> of data just got trashed by the write error, and it's even possible
> that the block(s) involved cross a filesystem boundary!)
>