Date: Thu, 2 May 2024 08:29:56 -0400
From: mike tancsa <mike@sentex.net>
To: Matthew Grooms <mgrooms@shrew.net>, stable@freebsd.org
Subject: Re: how to tell if TRIM is working
Message-ID: <77e203b3-c555-408b-9634-c452cb3a57ac@sentex.net>
In-Reply-To: <67721332-fa1d-4b3c-aa57-64594ad5d77a@shrew.net>
References: <5e1b5097-c1c0-4740-a491-63c709d01c25@sentex.net> <67721332-fa1d-4b3c-aa57-64594ad5d77a@shrew.net>
On 5/1/2024 4:24 PM, Matthew Grooms wrote:
> On 5/1/24 14:38, mike tancsa wrote:
>> Kind of struggling to check if TRIM is actually working or not with
>> my SSDs on RELENG_14 in ZFS.
>>
>> On a pool that has almost no files on it (capacity at 0% out of 3TB),
>> shouldn't
>>
>> zpool trim -w <pool>
>>
>> be almost instant after a couple of runs? Instead it seems to always
>> take about 10 minutes to complete.
>>
>> Looking at the stats:
>>
>> kstat.zfs.tortank1.misc.iostats.trim_bytes_failed: 0
>> kstat.zfs.tortank1.misc.iostats.trim_extents_failed: 0
>> kstat.zfs.tortank1.misc.iostats.trim_bytes_skipped: 2743435264
>> kstat.zfs.tortank1.misc.iostats.trim_extents_skipped: 253898
>> kstat.zfs.tortank1.misc.iostats.trim_bytes_written: 14835526799360
>> kstat.zfs.tortank1.misc.iostats.trim_extents_written: 1169158
>>
>> What, and why, are bytes being skipped?
>>
>> One of the drives for example
>
> I had a hard time seeing evidence of this at the disk level while
> fiddling with TRIM recently. It appeared that at least some counters
> are driver and operation specific. For example, the da driver appears
> to update counters in some paths but not others. I assume that ada is
> different. There is a bug report for da, but haven't seen any
> feedback ...
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277673
>
> You could try to run gstat with the -d flag during the time period
> when the delete operations are expected to occur. That should give you
> an idea of what's happening at the disk level in real time but may not
> offer more info than you're already seeing.

It *seems* to be doing something. What I don't understand is: if I run
it once, do nothing (no writes / snapshots etc.), and then run trim
again, why does gstat still show delete activity, even though there
should not be anything left to mark as trimmed?
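[Editorial aside, not from the original thread: one plausible explanation for the skipped counters is that OpenZFS declines to issue TRIMs for extents smaller than a tunable minimum (the vfs.zfs.trim.extent_bytes_min sysctl on FreeBSD, 32 KiB by default, if memory serves), counting them as skipped instead. Plugging the kstat values quoted above into a quick awk calculation shows the skipped bytes are a tiny fraction of what was actually trimmed:]

```shell
# Values copied from the kstat output quoted above; on a live system
# you would read them with sysctl(8) instead of hard-coding them.
skipped=2743435264        # ...iostats.trim_bytes_skipped
written=14835526799360    # ...iostats.trim_bytes_written
awk -v s="$skipped" -v w="$written" 'BEGIN {
    printf "skipped %.2f GiB, %.4f%% of all TRIMmed bytes\n",
        s / 2^30, 100 * s / (s + w)
}'
# prints: skipped 2.56 GiB, 0.0185% of all TRIMmed bytes
```

[So the skipping itself is unlikely to hurt; the more interesting question is why extents get re-trimmed on back-to-back runs.]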
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s      kBps   ms/d   %busy  Name
    0   1254      0      0    0.0    986   5202    2.0    244   8362733    4.5    55.6  ada0
   12   1242      0      0    0.0   1012   5218    1.9    206   4972041    6.0    63.3  ada2
   12   1242      0      0    0.0   1012   5218    1.9    206   4972041    6.0    63.3  ada2p1
    0   4313      0      0    0.0   1024   5190    0.8   3266   6463815    0.4    62.8  ada3
    0   1254      0      0    0.0    986   5202    2.0    244   8362733    4.5    55.6  ada0p1
    0   4238      0      0    0.0    960   4874    0.7   3254   6280362    0.4    59.8  ada5
    0   4313      0      0    0.0   1024   5190    0.8   3266   6463815    0.4    62.8  ada3p1
    0   4238      0      0    0.0    960   4874    0.7   3254   6280362    0.4    59.8  ada5p1

dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s      kBps   ms/d   %busy  Name
    2   2381      0      0    0.0   1580   9946    0.9    767   5990286    1.8    70.0  ada0
    2   2801      0      0    0.0   1540   9782    0.9   1227  11936510    1.0    65.2  ada2
    2   2801      0      0    0.0   1540   9782    0.9   1227  11936510    1.0    65.2  ada2p1
    0   2072      0      0    0.0   1529   9566    0.8    509  12549587    2.1    57.0  ada3
    2   2381      0      0    0.0   1580   9946    0.9    767   5990286    1.8    70.0  ada0p1
    0   2042      0      0    0.0   1517   9427    0.6    491  12549535    1.9    52.4  ada5
    0   2072      0      0    0.0   1529   9566    0.8    509  12549587    2.1    57.0  ada3p1
    0   2042      0      0    0.0   1517   9427    0.6    491  12549535    1.9    52.4  ada5p1

dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s      kBps   ms/d   %busy  Name
    2   1949      0      0    0.0   1094   5926    1.2    827  11267200    1.8    78.8  ada0
    0   2083      0      0    0.0   1115   6034    0.7    939  16537981    1.4    67.2  ada2
    0   2083      0      0    0.0   1115   6034    0.7    939  16537981    1.4    67.2  ada2p1
    2   2525      0      0    0.0   1098   5914    0.8   1399  16021615    1.1    79.3  ada3
    2   1949      0      0    0.0   1094   5926    1.2    827  11267200    1.8    78.8  ada0p1
   12   2471      0      0    0.0   1018   5399    1.0   1425  15395566    1.1    80.5  ada5
    2   2525      0      0    0.0   1098   5914    0.8   1399  16021615    1.1    79.3  ada3p1
   12   2471      0      0    0.0   1018   5399    1.0   1425  15395566    1.1    80.5  ada5p1

The ultimate problem is that after a while with a lot of writes, the
disk performance will be toast until I do a manual trim -f of the
disk :(  This is most notable on consumer WD SSDs. I haven't done any
extensive tests with Samsung SSDs to see if there are performance
penalties or not; it might be that they are just better at masking the
problem. I don't see the same issue with ZFS on Linux on the same
disks / hardware.

I have an open PR at https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277992
that I think might actually cover 2 separate problems.

    ---Mike

>> e.g. here was one disk in the pool that was taking a long time for
>> each zpool trim
>>
>> # time trim -f /dev/ada1
>> trim /dev/ada1 offset 0 length 1000204886016
>> 0.000u 0.057s 1:29.33 0.0%     5+184k 0+0io 0pf+0w
>>
>> and then if I re-run it:
>>
>> # time trim -f /dev/ada1
>> trim /dev/ada1 offset 0 length 1000204886016
>> 0.000u 0.052s 0:04.15 1.2%     1+52k 0+0io 0pf+0w
>>
>> 90 seconds, and then 4 seconds after that.
>
> -Matthew
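[Editorial aside, not from the original thread: the 1:29.33 vs 0:04.15 comparison at the bottom can be made concrete with a small conversion. The elapsed times below are copied from the csh-style time(1) output quoted above (mm:ss.ss format):]

```shell
# Convert the two elapsed times from the `time trim -f` runs above
# (mm:ss.ss) to seconds and compare them.
awk 'BEGIN {
    first  = 1 * 60 + 29.33   # 1:29.33 -> 89.33 s
    second = 0 * 60 +  4.15   # 0:04.15 ->  4.15 s
    printf "%.2f s vs %.2f s: second run is %.1fx faster\n",
        first, second, first / second
}'
# prints: 89.33 s vs 4.15 s: second run is 21.5x faster
```

[A ~21x gap is consistent with the first pass doing nearly all the real work and the repeat being mostly a no-op at the device level, which is exactly what makes the repeated zpool trim activity in the gstat output above look odd.]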