Date: Fri, 24 Nov 2017 15:34:36 +0200
From: Andriy Gapon <avg@FreeBSD.org>
To: Warner Losh <imp@bsdimp.com>
Cc: FreeBSD FS <freebsd-fs@freebsd.org>, freebsd-geom@freebsd.org, Scott Long <scottl@samsco.org>
Subject: Re: add BIO_NORETRY flag, implement support in ata_da, use in ZFS vdev_geom
Message-ID: <64f37301-a3d8-5ac4-a25f-4f6e4254ffe9@FreeBSD.org>
In-Reply-To: <CANCZdfoE5UWMC6v4bbov6zizvcEMCbrSdGeJ019axCUfS_T_6w@mail.gmail.com>
References: <391f2cc7-0036-06ec-b6c9-e56681114eeb@FreeBSD.org> <CANCZdfoE5UWMC6v4bbov6zizvcEMCbrSdGeJ019axCUfS_T_6w@mail.gmail.com>

On 24/11/2017 15:08, Warner Losh wrote:
> On Fri, Nov 24, 2017 at 3:30 AM, Andriy Gapon <avg@freebsd.org> wrote:
>
>> https://reviews.freebsd.org/D13224
>>
>> Anyone interested is welcome to join the review.
>
> I think it's a really bad idea. It introduces a 'one-size-fits-all' notion of
> QoS that seems misguided. It conflates a shorter timeout with don't retry. And
> why is retrying bad? It seems more a notion of 'fail fast' or some other concept.
> There are so many other ways you'd want to use it. And it uses the same return
> code (EIO) to mean something new. It conflates the general meaning, 'The lower
> layers have retried this, and it failed, do not submit it again as it will not
> succeed', with 'I gave it a half-assed attempt, and that failed, but
> resubmission might work'.
> This breaks a number of assumptions in the BUF/BIO layer as well as parts of CAM
> even more than they are broken now.
>
> So let's step back a bit: what problem is it trying to solve?

A simple example. I have a mirror, and I issue a read to one of its members.
Let's assume there is some trouble with that particular block on that
particular disk. The disk may spend a lot of time trying to read it and still
fail. With the current defaults I would wait 5x that time to finally get the
error back. Then I go to another mirror member and get my data from there.

IMO, this is not optimal. I'd rather pass BIO_NORETRY to the first read, get
the error back sooner and try the other disk sooner. Only if I know that there
are no other copies to try would I use the normal read with all the retrying.

-- 
Andriy Gapon
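
To put rough numbers on the mirror example above, here is a minimal user-space
simulation. It is only a sketch, not kernel code and not what D13224 does.
Everything in it is a made-up assumption for illustration: the Member class,
mirror_read(), the device names, the 7000 ms cost of one failed attempt, and
the 5 attempts per normal read (chosen only to match the "5x" figure). The
no-retry read is used only while other copies remain; the last member gets
the normal read with all the retrying.

```python
from dataclasses import dataclass, field

MAX_ATTEMPTS = 5   # assumption: chosen to match the "5x" figure above
GOOD_READ_MS = 10  # assumption: cost of a read that succeeds at once


@dataclass
class Member:
    """Hypothetical mirror member; not a real GEOM or ZFS structure."""
    name: str
    attempt_ms: int                      # time one failed attempt takes
    bad_blocks: set = field(default_factory=set)

    def read(self, block, noretry):
        """Return (ok, elapsed_ms) for one logical read of a block."""
        if block not in self.bad_blocks:
            return True, GOOD_READ_MS
        # A bad block fails on every attempt; noretry stops after the first.
        attempts = 1 if noretry else MAX_ATTEMPTS
        return False, attempts * self.attempt_ms


def mirror_read(members, block, fail_fast):
    """Try members in order; return (name of member that served, total ms)."""
    elapsed = 0
    for i, member in enumerate(members):
        has_other_copies = i < len(members) - 1
        # Skip the retries only while another copy is still available.
        ok, cost = member.read(block, noretry=fail_fast and has_other_copies)
        elapsed += cost
        if ok:
            return member.name, elapsed
    return None, elapsed


if __name__ == "__main__":
    disks = [Member("ada0", 7000, {42}), Member("ada1", 7000)]
    print("normal:   ", mirror_read(disks, 42, fail_fast=False))
    print("fail fast:", mirror_read(disks, 42, fail_fast=True))
```

With these assumed numbers the normal policy spends 35010 ms before the data
comes back from the second disk, while the fail-fast policy spends 7010 ms.
If every copy is bad, the last member still gets the full set of retries, so
no recovery effort is lost, only reordered.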