Date: Fri, 3 May 2024 09:20:26 -0600
From: Warner Losh <imp@bsdimp.com>
To: Matthew Grooms <mgrooms@shrew.net>
Cc: stable@freebsd.org
Subject: Re: how to tell if TRIM is working
Message-ID: <CANCZdfp4vGW=6FrHfLwkUeck5c3TSbVSRwcxS4jqnZmfzNyaUA@mail.gmail.com>
In-Reply-To: <67721332-fa1d-4b3c-aa57-64594ad5d77a@shrew.net>
References: <5e1b5097-c1c0-4740-a491-63c709d01c25@sentex.net> <67721332-fa1d-4b3c-aa57-64594ad5d77a@shrew.net>

Hey Matthew,

On Wed, May 1, 2024 at 2:25 PM Matthew Grooms <mgrooms@shrew.net> wrote:

> On 5/1/24 14:38, mike tancsa wrote:
> > Kind of struggling to check if TRIM is actually working or not with my
> > SSDs on RELENG_14 in ZFS.
> >
> > On a pool that has almost no files on it (capacity at 0% out of 3TB),
> > should not
> >
> > zpool -w trim <pool> be almost instant after a couple of runs ?
> > Instead it seems to always take about 10min to complete.
> >
> > Looking at the stats,
> >
> > kstat.zfs.tortank1.misc.iostats.trim_bytes_failed: 0
> > kstat.zfs.tortank1.misc.iostats.trim_extents_failed: 0
> > kstat.zfs.tortank1.misc.iostats.trim_bytes_skipped: 2743435264
> > kstat.zfs.tortank1.misc.iostats.trim_extents_skipped: 253898
> > kstat.zfs.tortank1.misc.iostats.trim_bytes_written: 14835526799360
> > kstat.zfs.tortank1.misc.iostats.trim_extents_written: 1169158
> >
> > what and why are bytes being skipped ?
> >
> > One of the drives for example
> >
> > sysctl -a kern.cam.ada.0
> > kern.cam.ada.0.trim_ticks: 0
> > kern.cam.ada.0.trim_goal: 0
> > kern.cam.ada.0.sort_io_queue: 0
> > kern.cam.ada.0.rotating: 0
> > kern.cam.ada.0.unmapped_io: 1
> > kern.cam.ada.0.flags: 0x1be3bde<CAN_48BIT,CAN_FLUSHCACHE,CAN_NCQ,CAN_DMA,WAS_OTAG,CAN_TRIM,OPEN,SCTX_INIT,CAN_POWERMGT,CAN_DMA48,CAN_LOG,CAN_WCACHE,CAN_RAHEAD,PROBED,ANNOUNCED,DIRTY,PIM_ATA_EXT,UNMAPPEDIO>
> > kern.cam.ada.0.max_seq_zones: 0
> > kern.cam.ada.0.optimal_nonseq_zones: 0
> > kern.cam.ada.0.optimal_seq_zones: 0
> > kern.cam.ada.0.zone_support: None
> > kern.cam.ada.0.zone_mode: Not Zoned
> > kern.cam.ada.0.write_cache: -1
> > kern.cam.ada.0.read_ahead: -1
> > kern.cam.ada.0.trim_lbas: 7771432624
> > kern.cam.ada.0.trim_ranges: 371381
> > kern.cam.ada.0.trim_count: 310842
> > kern.cam.ada.0.delete_method: DSM_TRIM
> >
> > If I take one of the disks out of the pool and replace it with a
> > spare, and do a manual trim it seems to work
> >
> I had a hard time seeing evidence of this at the disk level while
> fiddling with TRIM recently. It appeared that at least some counters are
> driver and operation specific. For example, the da driver appears to
> update counters in some paths but not others. I assume that ada is
> different. There is a bug report for da, but haven't seen any feedback ...
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277673
>
> You could try to run gstat with the -d flag during the time period when
> the delete operations are expected to occur. That should give you an
> idea of what's happening at the disk level in real time but may not
> offer more info than you're already seeing.
>

These changes looked good. My apologies for not noticing this sooner
(I think that our CC to the mailing list in bugzilla has stopped working,
so I didn't see the redirect). I've committed the changes, and queued them
to my stable-14 and stable-13 branches.

Thank you so much for your submission.

Warner
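
[Editorial note, not part of the original message.] The ZFS TRIM counters quoted in the thread can be compared numerically rather than by eye. The following is only a minimal sketch: it assumes the `name: value` layout of the sysctl output quoted in the message, and it assumes that written plus skipped bytes is a sensible denominator. The helper names `parse_counters` and `skipped_fraction` are hypothetical and do not belong to any FreeBSD or OpenZFS tool.

```python
# Sketch: parse sysctl-style "name: value" lines and compare TRIM counters.
# The sample text is copied from the quoted message; the helper names are
# illustrative assumptions, not an existing API.

SAMPLE = """\
kstat.zfs.tortank1.misc.iostats.trim_bytes_failed: 0
kstat.zfs.tortank1.misc.iostats.trim_extents_failed: 0
kstat.zfs.tortank1.misc.iostats.trim_bytes_skipped: 2743435264
kstat.zfs.tortank1.misc.iostats.trim_extents_skipped: 253898
kstat.zfs.tortank1.misc.iostats.trim_bytes_written: 14835526799360
kstat.zfs.tortank1.misc.iostats.trim_extents_written: 1169158
"""


def parse_counters(text):
    """Return {short_name: int} for every 'name: integer' line in text."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if not sep:
            continue  # not a 'name: value' line
        try:
            # Key on the last dotted component, e.g. 'trim_bytes_skipped'.
            counters[name.rsplit(".", 1)[-1]] = int(value)
        except ValueError:
            continue  # non-numeric value (flags, strings); ignore it
    return counters


def skipped_fraction(counters):
    """Skipped bytes divided by (written + skipped) bytes; 0.0 if no data."""
    written = counters.get("trim_bytes_written", 0)
    skipped = counters.get("trim_bytes_skipped", 0)
    total = written + skipped
    return skipped / total if total else 0.0


if __name__ == "__main__":
    counters = parse_counters(SAMPLE)
    print(f"failed bytes:     {counters['trim_bytes_failed']}")
    print(f"skipped fraction: {skipped_fraction(counters):.6%}")
```

On a live system the same functions could be fed the text of `sysctl kstat.zfs.<pool>.misc.iostats` captured before and after a trim run, so that the change in each counter can be inspected instead of the cumulative totals.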
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfp4vGW=6FrHfLwkUeck5c3TSbVSRwcxS4jqnZmfzNyaUA>