From: Warner Losh
To: Steven Hartland
Cc: freebsd-hackers@freebsd.org
Subject: Re: DELETE support in the VOP_STRATEGY(9)?
Date: Tue, 8 Dec 2015 12:03:11 -0700
Message-Id: <0DB97CBA-4DC3-4D52-AE9D-54546292D66F@bsdimp.com>
In-Reply-To: <566726ED.2010709@multiplay.co.uk>
References: <201512052002.tB5K2ZEA026540@chez.mckusick.com>
 <86poyhqsdh.fsf@desk.des.no> <86fuzdqjwn.fsf@desk.des.no>
 <864mfssxgt.fsf@desk.des.no> <86wpsord9l.fsf@desk.des.no>
 <566726ED.2010709@multiplay.co.uk>
List-Id: Technical Discussions relating to FreeBSD

> On Dec 8, 2015, at 11:52 AM, Steven Hartland wrote:
>
> On 08/12/2015 18:44, Dag-Erling Smørgrav wrote:
>> Warner Losh writes:
>>> Dag-Erling Smørgrav writes:
>>>> But the filesystem does not know whether the underlying storage is
>>>> electromechanical or solid-state, nor does it know whether the user
>>>> cares much about seek times (unless we introduce the heuristic
>>>> "avoid creating holes unless the file already has them, in which
>>>> case the userland probably does not care").
>>> Actually, the filesystem does know. Or has some knowledge of what
>>> is supported and what isn't. BIO_DELETE support is a strong indicator
>>> of a flash or other log-type system.
>> The filesystem can ask the layer below if BIO_DELETE is supported, but
>> should not assume anything about what it means. For instance, I could
>> write a gnop-like module that translates BIO_DELETE into an all-zeroes
>> BIO_WRITE and passes everything else unmodified. It would provide a
>> stronger guarantee than, say, SATA TRIM but would also have a completely
>> different performance profile (even on SSDs, since it would do its work
>> synchronously whereas TRIM works asynchronously).

That ship has sailed. UFS, at least, assumes that if TRIM is supported
then relocating files to be contiguous is bad.

But writing a gnop module that translated BIO_DELETE that way would be
bogus: BIO_DELETE does not mean that blocks will read back as zeros.
So, sure, you could invent a stupid thing that breaks the rules, and
thus the assumptions of the other code, but why would you want to do
that?

The SATA trims are actually synchronous (in the absence of power
failures). Once you TRIM the data, it is gone. And depending on what
bits are set in the identify response, you can count on different
things. But to say they happen asynchronously because of implementation
details about when the data is actually erased is missing the point.
Also, your BIO_DELETE example wouldn’t guarantee the data is erased
either. Writes to log append devices (like SSDs) are like a TRIM
followed by a write: the old LBA mapping is discarded and a new one
replaces it.
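
To make that last point concrete, here is a toy userland model of a log-append device. It is only a sketch: ToyLogDevice and its methods are hypothetical names invented for illustration, not FreeBSD or drive-firmware code, and returning None for an unmapped LBA is an assumption standing in for "contents unspecified".

```python
class ToyLogDevice:
    """Toy model (hypothetical): LBAs map to pages in an append-only log."""

    def __init__(self):
        self.log = []      # physical pages, in the order they were written
        self.mapping = {}  # LBA -> index of the live page in self.log

    def write(self, lba, data):
        # Never overwrite in place: append a new page and remap the LBA.
        # The page previously mapped to this LBA (if any) becomes stale.
        self.log.append(data)
        self.mapping[lba] = len(self.log) - 1

    def trim(self, lba):
        # Only the mapping is dropped; no new page is consumed.
        self.mapping.pop(lba, None)

    def read(self, lba):
        # Assumption: None models "unspecified", not "zeros".
        idx = self.mapping.get(lba)
        return None if idx is None else self.log[idx]

    def live_pages(self):
        # Number of pages the device still has to preserve.
        return len(self.mapping)


# Same starting point, two ways of "deleting" LBA 7.
trimmed = ToyLogDevice()
trimmed.write(7, b"old")
trimmed.trim(7)

zeroed = ToyLogDevice()
zeroed.write(7, b"old")
zeroed.write(7, b"\x00" * 3)

print(trimmed.live_pages(), zeroed.live_pages())
```

In this model the trimmed device has nothing left to preserve for LBA 7, while the zero-written one still carries a live page of zeros and has used an extra log entry. In both cases the stale b"old" page stays in the log until something reclaims it, which is why neither operation by itself guarantees the old data is erased.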
>> Anyway, my point is that Maxim needs to revise his assumptions.
> Just to clarify, most consumer devices process TRIM synchronously, not
> asynchronously.

It also depends on what you mean by ‘process’ here.

> Your example isn't actually just an example: CAM scsi_da has a number
> of different ways it can process BIO_DELETE:
> * ATA TRIM
> * SCSI UNMAP
> * Write Same 16
> * Write Same 10
> * Zero
>
> So your example actually exists in practice in the FreeBSD code base ;-)

All these are effectively TRIM operations. The devices that implement
them use them as hints to optimize storage. DES’ BIO_DELETE -> WRITE
zero example doesn’t optimize storage at all, nor does it give the
lower layers any clue about how to optimize the storage. All the SCSI
delete types do give that hint.

Warner