Date:      Sun, 16 Oct 2022 10:07:48 -0600
From:      Alan Somers <asomers@freebsd.org>
To:        void <void@f-m.fm>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: zfs with operations like rm -rf takes a very long time recently
Message-ID:  <CAOtMX2hjGnpLgs2vD1pTELK-nF9nBORhCj57zyYmkEmHAGohnQ@mail.gmail.com>
In-Reply-To: <1625dc7b-81c4-4350-8f86-1b65f5a860d9@app.fastmail.com>
References:  <f25d069a-2ee7-491a-a5ef-a14b973c12e2@app.fastmail.com> <CAOtMX2jW1s8n_FGez4-sCMBnbd-GDUoj9Sx83gaTBTEuW58HQw@mail.gmail.com> <1625dc7b-81c4-4350-8f86-1b65f5a860d9@app.fastmail.com>

On Sun, Oct 16, 2022 at 9:13 AM void <void@f-m.fm> wrote:
>
> On Sun, 16 Oct 2022, at 14:15, Alan Somers wrote:
> > The usual reason why rm gets slow is because your pool is nearly full
> > and there's a snapshot.  A snapshot means that rm doesn't actually
> > free space; it just rewrites metadata, which requires even more space.
> > And when a zpool is nearly full, writes always slow way down.
> > But if that's not it, then you should also check gstat to see if the
> > disk itself is slow.
>
> Hi,
>
> The disk in question is one of these:
> https://skinflint.co.uk/toshiba-mobile-hdd-mq01-series-1tb-mq01abd100m-a1820027.html
>
> It's a CMR disk. Power-on hrs = 39253
>
> da0: 400.000MB/s transfers
> da0: 953869MB (1953525168 512 byte sectors)
> da0: quirks=0x2<NO_6_BYTE>
>
> Filesystem                          Size    Used   Avail Capacity  Mounted on
> zroot/ROOT/default                  863G    146G    717G    17%    /
>
> there are no snapshots in "zfs list -t snapshot"
>
> rm -rf /var/cache/ccache/* has been running for 53 mins so far (it's still running)
>
> # gstat -dopC
> timestamp,name,q-depth,total_ops/s,read/s,read-KiB/s,ms/read,write/s,write-KiB/s,ms/write,delete/s,delete-KiB/s,ms/delete,other/s,ms/other,%busy
> 2022-10-16 16:01:24.761122175,da0,5,0,0,0,0.0,0,0,0.0,0,0,0.0,0,0.0,0.0
> 2022-10-16 16:01:25.762821956,da0,4,40,34,136,107.8,4,16,57.1,0,0,0.0,2,293.9,111.1
> 2022-10-16 16:01:26.763826447,da0,3,52,50,200,37.6,1,60,34.1,0,0,0.0,1,152.0,95.0
> 2022-10-16 16:01:27.767079004,da0,3,108,40,159,33.6,67,2173,10.3,0,0,0.0,1,12.2,85.1
> 2022-10-16 16:01:28.813081219,da0,1,57,17,69,62.8,38,467,31.0,0,0,0.0,2,156.9,108.7
> 2022-10-16 16:01:29.818093791,da0,1,56,56,223,17.2,0,0,0.0,0,0,0.0,0,0.0,95.9
> 2022-10-16 16:01:30.825841923,da0,1,57,57,230,16.9,0,0,0.0,0,0,0.0,0,0.0,95.3
> 2022-10-16 16:01:31.828940957,da0,1,58,58,231,16.7,0,0,0.0,0,0,0.0,0,0.0,96.7
> 2022-10-16 16:01:32.830822873,da0,3,150,29,116,33.9,120,8460,9.4,0,0,0.0,1,125.4,99.7
> 2022-10-16 16:01:33.877434519,da0,3,115,6,23,128.4,109,8607,17.1,0,0,0.0,0,0.0,93.4
> ^C
>
> I'm not sure which of the data above would be considered "slow", or why it should be like this now.
>
> There's lots of things in sysctl -a concerning zfs. Can you suggest anything to
> look out for?
>
> thanks,

Gstat is showing that your disk is fully busy: %busy sits between
about 85% and 111% in every sample after the first.  It's also showing
read latency (the ms/read column) as high as 128 ms, which is extremely
slow.  I suspect a problem with your disk.

FYI, ZFS naturally has a 5-second rhythm (unless you changed
vfs.zfs.txg.timeout, which defaults to 5), so gstat's output is
sometimes more consistent if you use "-I 5s".  You can also omit "-d"
for magnetic HDDs, since they don't have anything like TRIM.

I suggest checking dmesg to see if there are any messages about errors
from da0.  It would also be worth running "smartctl -a /dev/da0", from
sysutils/smartmontools.

-Alan
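
For anyone who wants to check numbers like these without reading the
CSV by eye, here is a minimal sketch in Python.  It assumes the column
layout shown in the quoted "gstat -dopC" output; the names GSTAT_CSV and
summarize are made up for this illustration and are not part of gstat
or any FreeBSD tool.  It only reports the worst read latency and the
%busy range for the samples in which the disk was doing I/O.

```python
import csv
import io

# Rows copied verbatim from the quoted "gstat -dopC" output.
GSTAT_CSV = """\
timestamp,name,q-depth,total_ops/s,read/s,read-KiB/s,ms/read,write/s,write-KiB/s,ms/write,delete/s,delete-KiB/s,ms/delete,other/s,ms/other,%busy
2022-10-16 16:01:24.761122175,da0,5,0,0,0,0.0,0,0,0.0,0,0,0.0,0,0.0,0.0
2022-10-16 16:01:25.762821956,da0,4,40,34,136,107.8,4,16,57.1,0,0,0.0,2,293.9,111.1
2022-10-16 16:01:26.763826447,da0,3,52,50,200,37.6,1,60,34.1,0,0,0.0,1,152.0,95.0
2022-10-16 16:01:27.767079004,da0,3,108,40,159,33.6,67,2173,10.3,0,0,0.0,1,12.2,85.1
2022-10-16 16:01:28.813081219,da0,1,57,17,69,62.8,38,467,31.0,0,0,0.0,2,156.9,108.7
2022-10-16 16:01:29.818093791,da0,1,56,56,223,17.2,0,0,0.0,0,0,0.0,0,0.0,95.9
2022-10-16 16:01:30.825841923,da0,1,57,57,230,16.9,0,0,0.0,0,0,0.0,0,0.0,95.3
2022-10-16 16:01:31.828940957,da0,1,58,58,231,16.7,0,0,0.0,0,0,0.0,0,0.0,96.7
2022-10-16 16:01:32.830822873,da0,3,150,29,116,33.9,120,8460,9.4,0,0,0.0,1,125.4,99.7
2022-10-16 16:01:33.877434519,da0,3,115,6,23,128.4,109,8607,17.1,0,0,0.0,0,0.0,93.4
"""


def summarize(csv_text, device="da0"):
    """Summarize read latency and %busy for one device, skipping idle samples."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Keep only this device's samples that actually saw I/O; the first
    # sample in the quoted output is all zeros and would skew the averages.
    active = [
        r for r in rows
        if r["name"] == device and float(r["total_ops/s"]) > 0
    ]
    if not active:
        raise ValueError(f"no active samples for {device}")
    busy = [float(r["%busy"]) for r in active]
    return {
        "samples": len(active),
        "max_ms_read": max(float(r["ms/read"]) for r in active),
        "min_busy": min(busy),
        "mean_busy": sum(busy) / len(busy),
    }


summary = summarize(GSTAT_CSV)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The same function should work on a longer capture (for example one
taken with "-I 5s") as long as the header line is kept, but that is an
assumption about the CSV format staying the same, not something tested
here.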


