Date: Tue, 8 Dec 2015 12:58:49 -0700
From: Warner Losh <imp@bsdimp.com>
To: Dag-Erling Smørgrav <des@des.no>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Steven Hartland <killing@multiplay.co.uk>
Subject: Re: DELETE support in the VOP_STRATEGY(9)?
Message-ID: <E7F0EECA-724C-4595-AB4A-96E29EF6871B@bsdimp.com>
In-Reply-To: <868u54radx.fsf@desk.des.no>
References: <CAH7qZftSVAYPmxNCQy=VVRj79AW7z9ade-0iogv2COfo2x%2Ba2Q@mail.gmail.com> <201512052002.tB5K2ZEA026540@chez.mckusick.com> <CAH7qZfs6ksE%2BQTMFFLYxY0PNE4hzn=D5skzQ91=gGK2xvndkfw@mail.gmail.com> <86poyhqsdh.fsf@desk.des.no> <CAH7qZftVj9m_yob=AbAQA7fh8yG-VLgM7H0skW3eX_S%2Bv75E-g@mail.gmail.com> <86fuzdqjwn.fsf@desk.des.no> <CANCZdfo=NfKy51%2B64-F_v%2BDh2wkrFYP4gXe=X9RWSSao49gO9g@mail.gmail.com> <CANCZdfqHoduhdCss0b6=UsBPAxfRZv4hF8vyuUVLBdP5gYUduQ@mail.gmail.com> <864mfssxgt.fsf@desk.des.no> <CANCZdfoXdcD%2B9jeVR1Np16gafBf0_4B2wombwxze8DvJwf7cMg@mail.gmail.com> <86wpsord9l.fsf@desk.des.no> <566726ED.2010709@multiplay.co.uk> <0DB97CBA-4DC3-4D52-AE9D-54546292D66F@bsdimp.com> <86d1ugrb7j.fsf@desk.des.no> <CANCZdfrgkA-znp8jL%2BfDgkXwaTSBeNJVTXj6mDKQxdYtht3uzA@mail.gmail.com> <868u54radx.fsf@desk.des.no>
> On Dec 8, 2015, at 12:46 PM, Dag-Erling Smørgrav <des@des.no> wrote:
>
> Warner Losh <imp@bsdimp.com> writes:
>> Dag-Erling Smørgrav <des@des.no> writes:
>>> My point is that it's wrong to infer anything else from
>>> GEOM::candelete than the fact that BIO_DELETE requests will be
>>> accepted and may or may not do something, somewhere, at some point.
>>> We can easily create a different GEOM attribute which indicates that
>>> seeks are essentially free, and FFS could use that instead of
>>> GEOM::candelete to disable relocation.
>> When this was implemented, we thought about that. But we couldn't come
>> up with any cases where you'd have one set and not the other. And the
>> actual thing you'd want isn't that seeks are free, though that's a
>> good clue. The actual thing you want is to know if there's a
>> performance benefit to keeping files contiguous, and the extent size
>> where that stops making sense.
>
> I'm having a hard time understanding how the fact that seeks are
> essentially free is *not* a good indication that there is no benefit to
> keeping files contiguous, since keeping files contiguous is something we
> do to avoid the cost of seeking. Support for deletion, on the other
> hand, is *completely* orthogonal. And my example was not taken entirely
> out of the blue: I'm sure there would be a huge market for storage
> devices, whether electromechanical or solid-state, which implemented
> this in hardware, along with guarantees that reallocated sectors are
> truly non-recoverable.

I’d say it’s not nuanced enough. Seeks may be essentially free for SSDs,
but there’s still some benefit to clustering writes.
Only the SSD’s firmware can know what units it would prefer to write
things in so that it spreads the sectors across however many banks /
chips make up one internal unit that can be written in parallel.

While not perfect, GEOM::candelete gives an indication that the device
does storage management. In the vast majority of cases in actual
hardware, this means that the physical media is obscured by at least
one layer of indirection. Since there’s a layer of indirection,
assumptions about contiguity are out the window. While there may be a
tiny fraction of drives that try to shoe-horn ‘reliable erasure’ into
data set management trim operations, or similar, I’d imagine that to get
the assurances they’d want from the OS and filesystems, they’d implement
a new primitive or attribute which would allow them to use a more robust
command set to ensure that when the feature is engaged it’s working as
advertised. So using GEOM::candelete is good enough until actual
problems can be demonstrated. We can do the work then to solve the
problem found with it rather than guess about what the problems might be
and design to that.

And to be fair, having an additional property of ‘seeks are nearly free’
would also be a good way to tell. I’m not convinced it is worth the
effort to add it to all the storage devices in the tree when
GEOM::candelete is a good proxy. I guess if we’re going to the effort,
I’d like there to be a richer set of data provided than just ‘seeks are
nearly free’.
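
As a rough illustration of the trade-off discussed here (a hypothetical
Python model, not code from the FreeBSD tree): `candelete` stands in for
GEOM::candelete, while `seeks_nearly_free` and `preferred_write_unit` are
invented names for the extra attribute proposed in this thread and for
the "richer set of data" mentioned above. The sketch assumes a filesystem
would prefer an explicit hint when one is reported and fall back to the
candelete proxy otherwise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProviderHints:
    """Hypothetical hints a storage provider might report to a filesystem."""
    candelete: bool                             # models GEOM::candelete
    seeks_nearly_free: Optional[bool] = None    # invented attribute; None = not reported
    preferred_write_unit: Optional[int] = None  # invented attribute, in bytes; None = unknown


def contiguity_matters(hints: ProviderHints) -> bool:
    """Guess whether keeping files contiguous is worth the effort."""
    if hints.seeks_nearly_free is not None:
        # An explicit hint, when present, overrides the proxy.
        return not hints.seeks_nearly_free
    # Proxy: a device that accepts deletes probably manages its own storage
    # behind a layer of indirection, so on-disk contiguity means little.
    return not hints.candelete


def write_cluster_size(hints: ProviderHints, default: int) -> int:
    """Clustering writes can still pay off even when seeks are free."""
    if hints.preferred_write_unit is not None:
        return hints.preferred_write_unit
    return default


# A device that accepts deletes but reports that seeks are expensive,
# which is the case the proxy alone would get wrong.
erasing_disk = ProviderHints(candelete=True, seeks_nearly_free=False)
print(contiguity_matters(erasing_disk))
```

With only `candelete` reported, the two properties cannot be told apart;
the optional fields show where a device with deletion support but
expensive seeks would be distinguished.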

Warner