Date:      Tue, 8 Dec 2015 12:03:11 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Steven Hartland <killing@multiplay.co.uk>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: DELETE support in the VOP_STRATEGY(9)?
Message-ID:  <0DB97CBA-4DC3-4D52-AE9D-54546292D66F@bsdimp.com>
In-Reply-To: <566726ED.2010709@multiplay.co.uk>
References:  <CAH7qZftSVAYPmxNCQy=VVRj79AW7z9ade-0iogv2COfo2x%2Ba2Q@mail.gmail.com> <201512052002.tB5K2ZEA026540@chez.mckusick.com> <CAH7qZfs6ksE%2BQTMFFLYxY0PNE4hzn=D5skzQ91=gGK2xvndkfw@mail.gmail.com> <86poyhqsdh.fsf@desk.des.no> <CAH7qZftVj9m_yob=AbAQA7fh8yG-VLgM7H0skW3eX_S%2Bv75E-g@mail.gmail.com> <86fuzdqjwn.fsf@desk.des.no> <CANCZdfo=NfKy51%2B64-F_v%2BDh2wkrFYP4gXe=X9RWSSao49gO9g@mail.gmail.com> <CANCZdfqHoduhdCss0b6=UsBPAxfRZv4hF8vyuUVLBdP5gYUduQ@mail.gmail.com> <864mfssxgt.fsf@desk.des.no> <CANCZdfoXdcD%2B9jeVR1Np16gafBf0_4B2wombwxze8DvJwf7cMg@mail.gmail.com> <86wpsord9l.fsf@desk.des.no> <566726ED.2010709@multiplay.co.uk>



> On Dec 8, 2015, at 11:52 AM, Steven Hartland <killing@multiplay.co.uk> wrote:
>
> On 08/12/2015 18:44, Dag-Erling Smørgrav wrote:
>> Warner Losh <imp@bsdimp.com> writes:
>>> Dag-Erling Smørgrav <des@des.no> writes:
>>>> But the filesystem does not know whether the underlying storage is
>>>> electromechanical or solid-state, nor does it know whether the user
>>>> cares much about seek times (unless we introduce the heuristic
>>>> "avoid creating holes unless the file already has them, in which
>>>> case the userland probably does not care").
>>> Actually, the filesystem does know. Or has some knowledge of what
>>> is supported and what isn't. BIO_DELETE support is a strong indicator
>>> of a flash or other log-type system.
>> The filesystem can ask the layer below if BIO_DELETE is supported, but
>> should not assume anything about what it means.  For instance, I could
>> write a gnop-like module that translates BIO_DELETE into an all-zeroes
>> BIO_WRITE and passes everything else unmodified.  It would provide a
>> stronger guarantee than, say, SATA TRIM but would also have a completely
>> different performance profile (even on SSDs, since it would do its work
>> synchronously whereas TRIM works asynchronously).

That ship has sailed. UFS, at least, assumes that if TRIM is supported
then relocating files to be contiguous is bad.

But writing a gnop module that turned BIO_DELETE into a write of zeros
would be bogus. Such a module would make the blocks read back as zeros,
but that’s not what BIO_DELETE means: BIO_DELETE does not mean that
blocks will read back as zeros. So, sure, you could invent a stupid
thing that breaks the rules, and thus the assumptions of the other code,
but why would you want to do that?

The SATA TRIMs are actually synchronous (in the absence of power
failures). Once you TRIM the data, it is gone. And depending on what
bits are set in the identify response, you can count on different
things. But to say they happen asynchronously because of implementation
details about when the data is actually erased is missing the point.
Also, your BIO_DELETE example wouldn’t guarantee the data is erased
either. Writes to log append devices (like SSDs) are like a TRIM
followed by a write: the old LBA mapping is discarded and a new one
replaces it.
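
To make the "write is like a TRIM followed by a write" point concrete,
here is a minimal sketch of a toy log-append translation layer. It is
not how any real SSD firmware or FreeBSD driver is written; the names
(toy_ftl, ftl_trim, ftl_write) and the sizes are hypothetical and there
is no garbage collection. It only shows that both operations discard the
old LBA mapping, and that neither erases the old page on the spot.

```c
#include <assert.h>
#include <stdint.h>

#define NLBA     8     /* logical blocks exposed to the host (hypothetical) */
#define NPAGE    16    /* physical pages in the append log (hypothetical) */
#define UNMAPPED (-1)

struct toy_ftl {
    int     map[NLBA];    /* LBA -> physical page, or UNMAPPED */
    uint8_t live[NPAGE];  /* 1 while some LBA still points at the page */
    int     next_page;    /* append point of the log */
};

void ftl_init(struct toy_ftl *f)
{
    for (int i = 0; i < NLBA; i++)
        f->map[i] = UNMAPPED;
    for (int i = 0; i < NPAGE; i++)
        f->live[i] = 0;
    f->next_page = 0;
}

/* TRIM: drop the mapping. The old page becomes garbage, not erased. */
void ftl_trim(struct toy_ftl *f, int lba)
{
    assert(lba >= 0 && lba < NLBA);
    if (f->map[lba] != UNMAPPED) {
        f->live[f->map[lba]] = 0;
        f->map[lba] = UNMAPPED;
    }
}

/*
 * WRITE: discard the old mapping exactly as TRIM does, then append a
 * new page and map the LBA to it. Returns the page used, or -1 when
 * the log is full (this sketch has no garbage collection).
 */
int ftl_write(struct toy_ftl *f, int lba)
{
    assert(lba >= 0 && lba < NLBA);
    if (f->next_page >= NPAGE)
        return -1;
    ftl_trim(f, lba);
    f->map[lba] = f->next_page;
    f->live[f->next_page] = 1;
    return f->next_page++;
}

/* Number of physical pages still referenced by some LBA. */
int ftl_live_pages(const struct toy_ftl *f)
{
    int n = 0;
    for (int i = 0; i < NPAGE; i++)
        n += f->live[i];
    return n;
}
```

Writing the same LBA twice in this model leaves one live page and one
garbage page; a later ftl_trim() leaves none live. In both cases the
stale page still physically holds its old contents until something
erases it.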

>> Anyway, my point is that Maxim needs to revise his assumptions.
> Just to clarify, most consumer devices process TRIM synchronously, not
> asynchronously.

It also depends on what you mean by ‘process’ here.

> Your example isn't actually just an example: CAM scsi_da has a number
> of different ways it can process BIO_DELETE:
> * ATA TRIM
> * SCSI UNMAP
> * Write Same 16
> * Write Same 10
> * Zero
>
> So your example actually exists in practice in the FreeBSD code base ;-)

All these are effectively TRIM operations. The devices that implement
them use them as hints to optimize storage. DES’ BIO_DELETE -> WRITE
zero example doesn’t optimize storage at all, nor does it give the lower
layers any clue about how to optimize the storage. All the SCSI delete
types do give that hint.
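
A rough sketch of the difference, using a toy thin-provisioned store
rather than any real GEOM or CAM code. All names here (toy_store,
delete_as_hint, delete_as_zero_write) are hypothetical. The hint version
lets the lower layer release the backing block; the zero-write version
keeps the block allocated and merely changes its contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLK  4    /* blocks in the toy store (hypothetical) */
#define BLKSZ 16   /* bytes per block (hypothetical) */

struct toy_store {
    bool          allocated[NBLK];    /* backing space in use? */
    unsigned char data[NBLK][BLKSZ];
};

void store_init(struct toy_store *s)
{
    memset(s, 0, sizeof(*s));
}

/* An ordinary write always needs backing space for the block. */
void store_write(struct toy_store *s, int blk, const unsigned char *buf)
{
    assert(blk >= 0 && blk < NBLK);
    s->allocated[blk] = true;
    memcpy(s->data[blk], buf, BLKSZ);
}

/*
 * Delete as a hint: the store learns the block is unused and can
 * release it. Nothing is promised about what a later read returns.
 */
void delete_as_hint(struct toy_store *s, int blk)
{
    assert(blk >= 0 && blk < NBLK);
    s->allocated[blk] = false;
}

/*
 * Delete translated into a write of zeros: reads back as zeros, but
 * the store still sees a block in use and cannot reclaim anything.
 */
void delete_as_zero_write(struct toy_store *s, int blk)
{
    static const unsigned char zeros[BLKSZ];
    store_write(s, blk, zeros);
}

/* Number of blocks that currently consume backing space. */
int store_allocated(const struct toy_store *s)
{
    int n = 0;
    for (int i = 0; i < NBLK; i++)
        n += s->allocated[i] ? 1 : 0;
    return n;
}
```

With two blocks written, delete_as_hint() on one drops the allocated
count to one, while delete_as_zero_write() on the other leaves the count
unchanged.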

Warner




