From: Ronald Klop
Date: Fri, 27 Oct 2023 15:34:35 +0200 (CEST)
To: void
Cc: freebsd-stable@freebsd.org
Subject: Re: periodic daily takes a very long time to run (14-stable)

From: void <void@f-m.fm>
Date: Friday, 27 October 2023 14:51
To: freebsd-stable@freebsd.org
Subject: Re: periodic daily takes a very long time to run (14-stable)

On Fri, Oct 27, 2023 at 01:45:24PM +0200, Ronald Klop wrote:

>Can you run "gstat" or "iostat -x -d 1" to see how busy your disk is? >And how much bandwidth it uses.

When periodic is *not* running, gstat numbers fluctuate between what's indicated below and all zeros:

dT: 1.011s  w: 1.000s
  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
     0     69      0      0    0.0     67    732    0.6   24.3| da0
     0      0      0      0    0.0      0      0    0.0    0.0| da0p1
     0      0      0      0    0.0      0      0    0.0    0.0| da0p2
     0     69      0      0    0.0     67    732    0.6   24.3| da0p3
     0      0      0      0    0.0      0      0    0.0    0.0| da0p2.eli
     0     69      0      0    0.0     67    732    0.8   24.4| da0p3.eli

# iostat -x -d 1
        extended device statistics  
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b  
da0           15      18    189.4    565.1    47     7   144    27    0  23
pass0          0       0      0.0      0.0    54     0   154    79    0   0
                         extended device statistics  
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b  
da0            3      70    172.6    698.0    26     1   129     5    0  33
pass0          0       0      0.0      0.0     0     0     0     0    0   0
                         extended device statistics  
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b  
da0            0       0      0.0      0.0     0     0     0     0    0   0
pass0          0       0      0.0      0.0     0     0     0     0    0   0
(then all 0s repeatedly)

When periodic *is* running:

dT: 1.047s  w: 1.000s
  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
     2     89     34    138   88.1     54    783   33.9   95.0| da0
     0      0      0      0    0.0      0      0    0.0    0.0| da0p1
     0      0      0      0    0.0      0      0    0.0    0.0| da0p2
     2     89     34    138   88.1     54    783   34.0   95.0| da0p3
     0      0      0      0    0.0      0      0    0.0    0.0| da0p2.eli
     2     90     35    141   86.3     54    783   34.2   96.9| da0p3.eli

The kBps number varies wildly between 100 and 1200. %busy is 92-102%.

# iostat -x -d 1
                         extended device statistics  
device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b  
da0           15      18    189.6    564.1    47     7   144    27   10  23
pass0          0       0      0.0      0.0    54     0   154    79    0   0
Numbers vary wildly here, too. I wonder if there's a switch for iostat that lets it
run for, say, 1000 seconds and then produce an average of the values gathered.

I'll look for one.
-- 
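
There is, more or less: iostat takes a wait interval (-w) and a report count (-c). The first report shows averages since boot, and each later report averages over the preceding interval, so something like this should produce a single 1000-second average in its second block of output:

$ iostat -x -d -w 1000 -c 2 da0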
 



What stands out to me is that you are doing quite a lot of writes to the disk. (I might be mistaken.)
The maximum number of IOPS for a consumer-grade hard disk is around 80. I think this also applies to USB-connected disks.
https://en.wikipedia.org/wiki/IOPS#Mechanical_hard_drives
From the stats you posted it looks like you are almost always doing 50+ writes/second already. That does not leave many IOPS for the find process.
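
As a back-of-the-envelope calculation (the 80 IOPS figure is a rough rule of thumb, and the inode count here is made up for illustration):

  ~80 IOPS total - ~50 writes/s          => ~30 IOPS left for reads
  100,000 inodes to stat at ~30 reads/s  => roughly 55 minutes

That alone could explain a periodic daily run taking hours.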

ZFS tries to bundle writes into a transaction group every 5 seconds to leave room for reads, unless it has too much data to write or a sync request comes in. See "sysctl vfs.zfs.txg.timeout".
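The default is 5 seconds:

$ sysctl vfs.zfs.txg.timeout
vfs.zfs.txg.timeout: 5
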
#device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b
$ iostat -x  -w 1 da1 | grep da1
da1            6      19     94.0    525.0    15     2   123     8    1  12
da1            0       4      0.0     15.2     0     1   122    41    0  23
da1            0       0      0.0      0.0     0     0     0     0    0   0
da1            0       2      0.0    738.7     0     4    61    23    0   7
da1            0       0      0.0      0.0     0     0     0     0    0   0
da1            0      98      0.0   2850.9     0     1     0     1    1   7
da1            0       4      0.0     16.0     0     1   139    47    0  28
da1            0       0      0.0      0.0     0     0     0     0    0   0
da1            0       0      0.0      0.0     0     0     0     0    0   0
da1            0       0      0.0      0.0     0     0     0     0    0   0
da1            0      80      0.0   2361.3     0     1     0     1    1   6
da1            0       4      0.0     15.3     0     1   125    42    0  24

Every 5th row is a spike in writes on my RPI4. "zpool iostat 1" should give a similar pattern.

So if my observation is right, it might be interesting to find out what is writing. I had similar issues after the number of jails on my RPI4 increased: each was doing a little bit of writing, which accumulated into quite a lot of writing. My solution was to add an SSD.
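
One way to find out is top in I/O mode, which shows per-process I/O counters:

$ top -m io -o total

That sorts processes by total I/O and should point at the heavy writers fairly quickly.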

Regards,
Ronald.