Date: Thu, 20 Apr 2017 21:02:14 +0100
From: Edward Tomasz Napierala
To: Bruce Evans
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r316941 - head/sys/kern
Message-ID: <20170420200214.GA1717@brick>
References: <201704142015.v3EKFYWA017623@repo.freebsd.org> <20170415064658.L4428@besplex.bde.org>
In-Reply-To: <20170415064658.L4428@besplex.bde.org>

On 0415T0736, Bruce Evans wrote:
> On Fri, 14 Apr 2017, Edward Tomasz Napierala wrote:
>
> > Log:
> >   Don't try to write out bufs that have already failed with ENXIO.
> >   This fixes some panics after disconnecting mounted disks.
> >
> >   Submitted by:   imp (slightly different version, which I've then lost)
> >   Reviewed by:    kib, imp, mckusick
> >   MFC after:      2 weeks
> >   Differential Revision:  https://reviews.freebsd.org/D9674
> >
> > Modified:
> >   head/sys/kern/vfs_bio.c
> >
> > Modified: head/sys/kern/vfs_bio.c
> > ==============================================================================
> > --- head/sys/kern/vfs_bio.c  Fri Apr 14 20:15:17 2017  (r316940)
> > +++ head/sys/kern/vfs_bio.c  Fri Apr 14 20:15:34 2017  (r316941)
> > @@ -2290,18 +2290,28 @@ brelse(struct buf *bp)
> >              bdirty(bp);
> >      }
> >      if (bp->b_iocmd == BIO_WRITE && (bp->b_ioflags & BIO_ERROR) &&
> > +        (bp->b_error != ENXIO || !LIST_EMPTY(&bp->b_dep)) &&
> >          !(bp->b_flags & B_INVAL)) {
> >          /*
> > -         * Failed write, redirty. Must clear BIO_ERROR to prevent
> > -         * pages from being scrapped.
> > +         * Failed write, redirty. All errors except ENXIO (which
> > +         * means the device is gone) are expected to be potentially
> > +         * transient - underlying media might work if tried again
> > +         * after EIO, and memory might be available after an ENOMEM.
> > +         *
> > +         * Do this also for buffers that failed with ENXIO, but have
> > +         * non-empty dependencies - the soft updates code might need
> > +         * to access the buffer to untangle them.
> > +         *
> > +         * Must clear BIO_ERROR to prevent pages from being scrapped.
> >           */
>
> This is hard to fix, but I have used a version that only retries after
> EIO for 15-20 years. I didn't think of ENOMEM.
>
> The media is unlikely to come back after EIO too. For removable media,
> you might be able to get the write done to new media, but a panic reading
> from the new media is just as likely. Geom "tasting" might prevent the
> new media being used.

I think media that actually disappeared will eventually result in ENXIO.
That's what GEOMs return when they "wither".

> ENXIO is actually the one error that can often be recovered from. I
> wrote a form of "tasting" in a toy OS 30-35 years ago.
> It handled
> removal of "mounted" disks with pending writes too well, in a way that
> made recovery from non-transient I/O errors almost impossible without
> turning off the system. ENXIO was treated as a transient I/O error.
> It was recovered from perfectly if the user could find the original
> media and unremove it. The "tasting" usually worked to detect different
> media and disallow writing cached data to a different disk. Media
> errors were common, and when one occurred for writing the method of
> replacing the disk by a garbage one didn't work since it was a different
> disk. The most common one was writing to a write protected disk, and
> that was recoverable by removing the write protection. But often you
> really didn't want to write to that disk, but wanted to write somewhere.
> The only way to continue was to reboot to discard the write.

Hah. I actually wrote something similar for FreeBSD: gmountver(8).
It's a GEOM class that simply passes BIOs to the lower layer, except
when the lower layer returns EIO or ENXIO. When that happens it queues
the BIO, closes the provider, and once the provider comes back it
reattaches and resubmits the queued BIOs.

It might actually be useful again, given the not always reliable SD
cards one might use for rootfs on a Raspberry Pi, for example.