Date:      Tue, 8 Dec 2015 12:58:49 -0700
From:      Warner Losh <imp@bsdimp.com>
To:        Dag-Erling Smørgrav <des@des.no>
Cc:        "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Steven Hartland <killing@multiplay.co.uk>
Subject:   Re: DELETE support in the VOP_STRATEGY(9)?
Message-ID:  <E7F0EECA-724C-4595-AB4A-96E29EF6871B@bsdimp.com>
In-Reply-To: <868u54radx.fsf@desk.des.no>
References:  <CAH7qZftSVAYPmxNCQy=VVRj79AW7z9ade-0iogv2COfo2x%2Ba2Q@mail.gmail.com> <201512052002.tB5K2ZEA026540@chez.mckusick.com> <CAH7qZfs6ksE%2BQTMFFLYxY0PNE4hzn=D5skzQ91=gGK2xvndkfw@mail.gmail.com> <86poyhqsdh.fsf@desk.des.no> <CAH7qZftVj9m_yob=AbAQA7fh8yG-VLgM7H0skW3eX_S%2Bv75E-g@mail.gmail.com> <86fuzdqjwn.fsf@desk.des.no> <CANCZdfo=NfKy51%2B64-F_v%2BDh2wkrFYP4gXe=X9RWSSao49gO9g@mail.gmail.com> <CANCZdfqHoduhdCss0b6=UsBPAxfRZv4hF8vyuUVLBdP5gYUduQ@mail.gmail.com> <864mfssxgt.fsf@desk.des.no> <CANCZdfoXdcD%2B9jeVR1Np16gafBf0_4B2wombwxze8DvJwf7cMg@mail.gmail.com> <86wpsord9l.fsf@desk.des.no> <566726ED.2010709@multiplay.co.uk> <0DB97CBA-4DC3-4D52-AE9D-54546292D66F@bsdimp.com> <86d1ugrb7j.fsf@desk.des.no> <CANCZdfrgkA-znp8jL%2BfDgkXwaTSBeNJVTXj6mDKQxdYtht3uzA@mail.gmail.com> <868u54radx.fsf@desk.des.no>


> On Dec 8, 2015, at 12:46 PM, Dag-Erling Smørgrav <des@des.no> wrote:
>
> Warner Losh <imp@bsdimp.com> writes:
>> Dag-Erling Smørgrav <des@des.no> writes:
>>> My point is that it's wrong to infer anything else from
>>> GEOM::candelete than the fact that BIO_DELETE requests will be
>>> accepted and may or may not do something, somewhere, at some point.
>>> We can easily create a different GEOM attribute which indicates that
>>> seeks are essentially free, and FFS could use that instead of
>>> GEOM::candelete to disable relocation.
>> When this was implemented, we thought about that. But we couldn't come
>> up with any cases where you'd have one set and not the other.  And the
>> actual thing you'd want isn't that seeks are free, though that's a
>> good clue. The actual thing you want is to know if there's a
>> performance benefit to keeping files contiguous, and the extent size
>> where that stops making sense.
>
> I'm having a hard time understanding how the fact that seeks are
> essentially free is *not* a good indication that there is no benefit to
> keeping files contiguous, since keeping files contiguous is something we
> do to avoid the cost of seeking.  Support for deletion, on the other
> hand, is *completely* orthogonal.  And my example was not taken entirely
> out of the blue: I'm sure there would be a huge market for storage
> devices, whether electromechanical or solid-state, which implemented
> this in hardware, along with guarantees that reallocated sectors are
> truly non-recoverable.

I’d say it’s not nuanced enough. Seeks may be essentially free for SSDs,
but there’s still some benefit to clustering writes. Only the SSD’s firmware
can know what units it would prefer to write things in, so that it can spread
the sectors across however many banks / chips make up one internal unit
that can be written in parallel.

While not perfect, GEOM::candelete gives an indication that the device does
storage management. In the vast majority of cases in actual hardware,
this means that the actual physical media is obscured by at least one
layer of indirection. Since there’s that layer of indirection, assumptions
about contiguity are out the window. While there may be a tiny fraction where
drives try to shoe-horn ‘reliable erasure’ into data set management trim
operations, or similar, I’d imagine that to get the assurances they’d want
from the OS and filesystems, they’d implement a new primitive or attribute
which would allow them to use a more robust command set to ensure that, when
the feature is engaged, it’s working as advertised. So using GEOM::candelete
is good enough until actual problems can be demonstrated. We can do the
work then to solve the problem found with it rather than guess about what
the problems might be and design to that.

And to be fair, having an additional property of ‘seeks are nearly free’ would
also be a good way to tell. I’m not convinced it is worth the effort to add it to
all the storage devices in the tree when GEOM::candelete is a good proxy.
I guess if we’re going to the effort, I’d like there to be a richer set of data
provided than just ‘seeks are nearly free’.
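
To make the comparison concrete, here is a minimal C sketch of the two
policies discussed above: using GEOM::candelete alone as a proxy, versus
consulting a richer set of hints. Everything in it is hypothetical and for
illustration only: struct storage_hints, its fields, and both functions are
made-up names, not existing GEOM attributes or FFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-device hints. Only can_delete corresponds to something
 * that exists today (the answer to GEOM::candelete); the other fields are
 * assumptions about what a "richer set of data" might contain.
 */
struct storage_hints {
    bool     can_delete;        /* device accepts BIO_DELETE */
    bool     seeks_nearly_free; /* hypothetical attribute */
    uint64_t contig_limit;      /* hypothetical: extent size in bytes beyond
                                   which contiguity stops helping; 0 = unknown */
};

/*
 * Proxy policy: delete support is taken to imply a layer of indirection,
 * so relocating blocks for contiguity is assumed to be pointless.
 */
bool
want_reloc_proxy(const struct storage_hints *h)
{
    assert(h != NULL);
    return (!h->can_delete);
}

/*
 * Richer policy: if the device states the extent size up to which
 * contiguity pays off, relocate only extents smaller than that.
 * Otherwise fall back to the coarse signals.
 */
bool
want_reloc_rich(const struct storage_hints *h, uint64_t extent_bytes)
{
    assert(h != NULL);
    if (h->contig_limit != 0)
        return (extent_bytes < h->contig_limit);
    return (!h->seeks_nearly_free && !h->can_delete);
}
```

With these assumptions, a device that reports can_delete and a 128 KiB
contig_limit gets no relocation at all under the proxy policy, but under the
richer policy extents shorter than 128 KiB would still be gathered up. That
is the "benefit to clustering writes" case the proxy cannot express.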

Warner



