From: Andriy Gapon
To: Warner Losh
Cc: FreeBSD FS, freebsd-geom@freebsd.org, Scott Long
Subject: Re: add BIO_NORETRY flag, implement support in ata_da, use in ZFS vdev_geom
Date: Fri, 24 Nov 2017 15:34:36 +0200
Message-ID: <64f37301-a3d8-5ac4-a25f-4f6e4254ffe9@FreeBSD.org>
References: <391f2cc7-0036-06ec-b6c9-e56681114eeb@FreeBSD.org>
List-Id: GEOM-specific discussions and implementations

On 24/11/2017 15:08, Warner Losh wrote:
> On Fri, Nov 24, 2017 at 3:30 AM, Andriy Gapon wrote:
>
>     https://reviews.freebsd.org/D13224
>
>     Anyone interested is welcome to join the review.
>
> I think it's a really bad idea. It introduces a 'one-size-fits-all' notion of
> QoS that seems misguided. It conflates a shorter timeout with don't retry. And
> why is retrying bad? It seems more a notion of 'fail fast' or so other concept.
> There's so many other ways you'd want to use it. And it uses the same return
> code (EIO) to mean something new. It's generally meant 'The lower layers have
> retried this, and it failed, do not submit it again as it will not succeed' with
> 'I gave it a half-assed attempt, and that failed, but resubmission might work'.
> This breaks a number of assumptions in the BUF/BIO layer as well as parts of CAM
> even more than they are broken now.
>
> So let's step back a bit: what problem is it trying to solve?

A simple example.
I have a mirror and I issue a read to one of its members. Let's assume there is
some trouble with that particular block on that particular disk. The disk may
spend a lot of time trying to read it and would still fail. With the current
defaults I would wait 5x that time to finally get the error back. Then I go to
another mirror member and get my data from there.

IMO, this is not optimal. I'd rather pass BIO_NORETRY to the first read, get the
error back sooner and try the other disk sooner. Only if I know that there are
no other copies to try would I use the normal read with all the retrying.

-- 
Andriy Gapon
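
The mirror scenario above can be sketched as a small userspace simulation. This
is only an illustration under stated assumptions, not the code from D13224: the
BIO_NORETRY value, struct disk, disk_read() and mirror_read() are all
hypothetical names made up here, and the attempt count of 5 is taken from the
"5x" figure in the message rather than from any driver source.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the proposed flag; not the kernel definition. */
#define BIO_NORETRY 0x1

/* Assumed number of attempts without the flag, from the "5x" in the message. */
#define MAX_ATTEMPTS 5

/* Hypothetical model of one mirror member. */
struct disk {
    const char *name;
    int bad_block;      /* nonzero: every attempt on the block fails */
    int attempt_cost;   /* time units consumed by one attempt */
};

/*
 * Simulate one read request. Each attempt adds attempt_cost to *elapsed.
 * With BIO_NORETRY only a single attempt is made.
 * Returns 0 on success, EIO once all attempts have failed.
 */
static int disk_read(const struct disk *d, int flags, int *elapsed)
{
    int attempts = (flags & BIO_NORETRY) ? 1 : MAX_ATTEMPTS;

    for (int i = 0; i < attempts; i++) {
        *elapsed += d->attempt_cost;
        if (!d->bad_block)
            return 0;
    }
    return EIO;
}

/*
 * Read from a mirror with n members. If use_noretry is set, first try every
 * member once with BIO_NORETRY; only when no copy could be read that way,
 * fall back to normal reads with all the retrying.
 */
static int mirror_read(const struct disk *m, int n, int use_noretry,
    int *elapsed)
{
    assert(n > 0);

    if (use_noretry) {
        for (int i = 0; i < n; i++)
            if (disk_read(&m[i], BIO_NORETRY, elapsed) == 0)
                return 0;
    }
    for (int i = 0; i < n; i++)
        if (disk_read(&m[i], 0, elapsed) == 0)
            return 0;
    return EIO;
}
```

For a two-member mirror where the first member has a bad block costing 10 time
units per attempt and the second is healthy at 1 unit, mirror_read() without
the flag accumulates 51 units before returning the data, and with the flag it
accumulates 11. The sketch does not model the return-code question raised in
the quoted reply, since both paths report a plain EIO.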