Date: Thu, 2 May 2024 08:29:56 -0400
From: mike tancsa <mike@sentex.net>
To: Matthew Grooms <mgrooms@shrew.net>, stable@freebsd.org
Subject: Re: how to tell if TRIM is working
Message-ID: <77e203b3-c555-408b-9634-c452cb3a57ac@sentex.net>
In-Reply-To: <67721332-fa1d-4b3c-aa57-64594ad5d77a@shrew.net>
References: <5e1b5097-c1c0-4740-a491-63c709d01c25@sentex.net> <67721332-fa1d-4b3c-aa57-64594ad5d77a@shrew.net>
On 5/1/2024 4:24 PM, Matthew Grooms wrote:
> On 5/1/24 14:38, mike tancsa wrote:
>> Kind of struggling to check whether TRIM is actually working with my
>> SSDs on RELENG_14 in ZFS.
>>
>> On a pool that has almost no files on it (capacity at 0% out of 3TB),
>> shouldn't
>>
>> zpool trim -w <pool>
>>
>> be almost instant after a couple of runs? Instead it seems to always
>> take about 10 minutes to complete.
>>
>> Looking at the stats:
>>
>> kstat.zfs.tortank1.misc.iostats.trim_bytes_failed: 0
>> kstat.zfs.tortank1.misc.iostats.trim_extents_failed: 0
>> kstat.zfs.tortank1.misc.iostats.trim_bytes_skipped: 2743435264
>> kstat.zfs.tortank1.misc.iostats.trim_extents_skipped: 253898
>> kstat.zfs.tortank1.misc.iostats.trim_bytes_written: 14835526799360
>> kstat.zfs.tortank1.misc.iostats.trim_extents_written: 1169158
>>
>> What and why are bytes being skipped?
>>
>> One of the drives for example
>
> I had a hard time seeing evidence of this at the disk level while
> fiddling with TRIM recently. It appeared that at least some counters
> are driver and operation specific. For example, the da driver appears
> to update counters in some paths but not others. I assume that ada is
> different. There is a bug report for da, but haven't seen any
> feedback ...
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277673
>
> You could try to run gstat with the -d flag during the time period
> when the delete operations are expected to occur. That should give you
> an idea of what's happening at the disk level in real time, but it may
> not offer more info than you're already seeing.

It *seems* to be doing something. What I don't understand is why, if I
run it once, do nothing (no writes / snapshots, etc.), and then run trim
again, gstat still shows activity, even though there should not be
anything left to mark as trimmed:
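As a sanity check on those counters: the kstat values are exported via
sysctl(8), so the skipped share can be sized with a quick awk
calculation. A minimal sketch (pool name tortank1 and the hard-coded
numbers are taken from the kstat output quoted above):

```shell
# Read the per-pool TRIM counters (pool name "tortank1" as above):
#   sysctl kstat.zfs.tortank1.misc.iostats | grep trim
# Using the figures quoted above, the skipped bytes are a tiny fraction
# of everything TRIMmed over the pool's lifetime:
awk 'BEGIN {
    skipped = 2743435264       # trim_bytes_skipped
    written = 14835526799360   # trim_bytes_written
    printf "skipped share: %.3f%%\n", 100 * skipped / (skipped + written)
}'
```

That works out to roughly 0.02%, so the skipped extents alone would not
explain a 10-minute trim pass.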
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d   %busy Name
    0   1254      0      0    0.0    986   5202    2.0    244  8362733    4.5    55.6 ada0
   12   1242      0      0    0.0   1012   5218    1.9    206  4972041    6.0    63.3 ada2
   12   1242      0      0    0.0   1012   5218    1.9    206  4972041    6.0    63.3 ada2p1
    0   4313      0      0    0.0   1024   5190    0.8   3266  6463815    0.4    62.8 ada3
    0   1254      0      0    0.0    986   5202    2.0    244  8362733    4.5    55.6 ada0p1
    0   4238      0      0    0.0    960   4874    0.7   3254  6280362    0.4    59.8 ada5
    0   4313      0      0    0.0   1024   5190    0.8   3266  6463815    0.4    62.8 ada3p1
    0   4238      0      0    0.0    960   4874    0.7   3254  6280362    0.4    59.8 ada5p1

dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d   %busy Name
    2   2381      0      0    0.0   1580   9946    0.9    767  5990286    1.8    70.0 ada0
    2   2801      0      0    0.0   1540   9782    0.9   1227 11936510    1.0    65.2 ada2
    2   2801      0      0    0.0   1540   9782    0.9   1227 11936510    1.0    65.2 ada2p1
    0   2072      0      0    0.0   1529   9566    0.8    509 12549587    2.1    57.0 ada3
    2   2381      0      0    0.0   1580   9946    0.9    767  5990286    1.8    70.0 ada0p1
    0   2042      0      0    0.0   1517   9427    0.6    491 12549535    1.9    52.4 ada5
    0   2072      0      0    0.0   1529   9566    0.8    509 12549587    2.1    57.0 ada3p1
    0   2042      0      0    0.0   1517   9427    0.6    491 12549535    1.9    52.4 ada5p1

dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d   %busy Name
    2   1949      0      0    0.0   1094   5926    1.2    827 11267200    1.8    78.8 ada0
    0   2083      0      0    0.0   1115   6034    0.7    939 16537981    1.4    67.2 ada2
    0   2083      0      0    0.0   1115   6034    0.7    939 16537981    1.4    67.2 ada2p1
    2   2525      0      0    0.0   1098   5914    0.8   1399 16021615    1.1    79.3 ada3
    2   1949      0      0    0.0   1094   5926    1.2    827 11267200    1.8    78.8 ada0p1
   12   2471      0      0    0.0   1018   5399    1.0   1425 15395566    1.1    80.5 ada5
    2   2525      0      0    0.0   1098   5914    0.8   1399 16021615    1.1    79.3 ada3p1
   12   2471      0      0    0.0   1018   5399    1.0   1425 15395566    1.1    80.5 ada5p1

The ultimate problem is that after a while with a lot of writes, disk
performance is toast until I do a manual trim -f of the disk :( This is
most noticeable on consumer WD SSDs. I haven't done any extensive tests
with Samsung SSDs to see whether there is a performance penalty or not;
it might be that they are just better at masking the problem.
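Eyeballing those interleaved samples is painful; a captured snapshot can
be post-processed to keep only the delete columns per whole disk. A
hedged sketch, assuming the samples were saved from `gstat -bd` (batch
output plus delete statistics) and using rows from the first interval
above; the anchored name match drops the duplicate adaXp1 partition
rows:

```shell
# Pull just the delete rate (d/s) and delete throughput (kBps, column 10)
# per whole disk out of a saved `gstat -bd` snapshot; partition rows such
# as ada0p1 fail the anchored /^ada[0-9]+$/ match and are skipped.
awk 'NR > 1 && $NF ~ /^ada[0-9]+$/ {
    printf "%-6s d/s=%-5s del_kBps=%s\n", $NF, $9, $10
}' <<'EOF'
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s     kBps   ms/d   %busy Name
    0   1254      0      0    0.0    986   5202    2.0    244  8362733    4.5    55.6 ada0
   12   1242      0      0    0.0   1012   5218    1.9    206  4972041    6.0    63.3 ada2
   12   1242      0      0    0.0   1012   5218    1.9    206  4972041    6.0    63.3 ada2p1
EOF
```

Alternatively, gstat's own -p flag restricts the display to physical
providers, which avoids the partition duplicates at capture time.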
I don't see the same issue with ZFS on Linux on the same disks /
hardware. I have an open PR,

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277992

which I think might actually cover 2 separate problems.

---Mike

>> e.g. here was one disk in the pool that was taking a long time for
>> each zpool trim
>>
>> # time trim -f /dev/ada1
>> trim /dev/ada1 offset 0 length 1000204886016
>> 0.000u 0.057s 1:29.33 0.0% 5+184k 0+0io 0pf+0w
>>
>> and then if I re-run it:
>>
>> # time trim -f /dev/ada1
>> trim /dev/ada1 offset 0 length 1000204886016
>> 0.000u 0.052s 0:04.15 1.2% 1+52k 0+0io 0pf+0w
>>
>> 90 seconds and then 4 seconds after that.
>
> -Matthew
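A back-of-the-envelope check on the quoted trim(8) timings (1:29.33 vs.
0:04.15 for the same 1000204886016-byte device) shows how different the
two passes are: the implied device-side rates differ by more than an
order of magnitude, which would be consistent with the firmware merely
acknowledging already-trimmed ranges on the re-run. A sketch with the
numbers hard-coded from the quote:

```shell
# Whole-device TRIM of 1000204886016 bytes took 89.33 s the first time
# and 4.15 s on the immediate re-run; convert both to implied GB/s.
awk 'BEGIN {
    bytes = 1000204886016
    printf "first run: %.1f GB/s\n", bytes / 89.33 / 1e9
    printf "second run: %.1f GB/s\n", bytes / 4.15 / 1e9
}'
```

Roughly 11 GB/s vs. 240 GB/s; the second figure is far beyond any SATA
link rate, so almost nothing can actually be happening on the media.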