Date: Fri, 27 Oct 2023 17:23:49 +0200 (CEST)
From: Ronald Klop
To: void
Cc: freebsd-stable@freebsd.org
Message-ID: <1210534753.8409.1698420229888@localhost>
References: <1122335317.4913.1698407124469@localhost> <794932758.6659.1698413675475@localhost>
Subject: Re: periodic daily takes a very long time to run (14-stable)
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

From: void <void@f-m.fm>
Date: Friday, 27 October 2023 16:42
To: freebsd-stable@freebsd.org
Subject: Re: periodic daily takes a very long time to run (14-stable)

On Fri, Oct 27, 2023 at 03:34:35PM +0200, Ronald Klop wrote:
>
>What stands out to me is that you do quite a lot of writes on the disk. (I might be mistaken.)
>The max. number of IOPS for a HDD is around 80 for consumer-grade hard disks. I think this holds for USB-connected disks too.
>https://en.wikipedia.org/wiki/IOPS#Mechanical_hard_drives
>From the stats you posted it looks like you are almost always doing 50+ writes/second already. That does not leave much IOPS for the find process.
>
>ZFS tries to bundle the writes every 5 seconds to leave room for the reads. See "sysctl vfs.zfs.txg.timeout". Unless it has too much data to write or a sync request comes in.
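As a side note, a quick way to watch in real time how close the disk is to saturation is gstat(8); something like this (illustrative invocation, 1-second samples):

  # physical providers only, 1-second interval; %busy near 100 means the disk
  # has no headroom left for the extra reads that find generates
  gstat -p -I 1s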

% sysctl vfs.zfs.txg.timeout
vfs.zfs.txg.timeout: 5

do I need to tune this?

Here's equivalent output from my setup (I ran periodic daily again)

#device       r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t qlen  %b
da0            16      18    191.9    557.9    50     8   144    29   10  24
da0           107       0    699.7      0.0    52     0     0    52    1  99
da0           102       0    409.2      0.0    71     0     0    71    2  98
da0            65       6    259.6     49.4   101   143     0   105   12 101
da0            57      14    227.7    123.9   153   163     0   155   12 100
da0            40      19    158.8    285.8   205   103     0   172   12  98
da0            46      30    191.1    441.9   180    58     0   132   11  91
da0            63       4    261.6     16.1   162   250   239   170    6 112
da0            67      10    273.7     83.6    99    66     0    95   12  91
da0            32      21    129.4    177.9   223   102     0   175    5  97
da0            48      16    191.9    261.3   173   130     0   162    9 109
da0            38      19    152.2    191.3   168    61   292   139    8 104
da0            92       0    366.9      0.0   104     0     0   104    4 100
da0            73      10    291.7     87.9    76    99     0    79   12  97
da0            49      15    195.2    270.9   156   129     0   150   11 103
da0            53      15    212.3    248.3   139   128     0   137   12  92
da0            54      22    216.1    272.1   151    81    92   130    8 107
da0            80       4    320.9     16.0    74   201   125    80    3 100
da0            55      10    218.8     72.9    89    73     0    87   11  82
^C

% zpool iostat 1
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot       93.6G   818G     13     16   161K   506K
zroot       93.6G   818G     91      0   367K      0
zroot       93.6G   818G    113      0   454K      0
zroot       93.6G   818G    102      0   411K      0
zroot       93.6G   818G     98      0   422K      0
zroot       93.6G   818G     67     18   271K   171K
zroot       93.6G   818G     43     16   173K   252K
zroot       93.6G   818G     43     28   173K   376K
zroot       93.6G   818G     78      3   315K  15.9K
zroot       93.6G   818G     94      0   378K      0
zroot       93.6G   818G    103      0   414K      0
zroot       93.6G   818G    102      0   658K      0
zroot       93.6G   818G     98      0   396K      0
zroot       93.6G   818G    109      0   438K      0
zroot       93.6G   818G    101      0   404K      0
zroot       93.6G   818G     47     13   191K  91.4K
zroot       93.6G   818G     52     11   209K   126K
zroot       93.6G   818G     50     20   202K   301K
zroot       93.6G   818G     46     12   186K   128K
zroot       93.6G   818G     86      0   346K  3.93K
zroot       93.6G   818G     45     18   183K   172K
zroot       93.6G   818G     42     15   172K   343K
zroot       93.6G   818G     43     24   173K   211K
zroot       93.6G   818G     87      0   596K      0
^C

>So if my observation is right it might be interesting to find out what is writing.
would ktrace and/or truss be useful? something else? The truss -p output of the
find PID produces massive amounts of output, all like this:

fstatat(AT_FDCWD,"5e70d5f895ccc92af6a7d5226f818b-81464.o",{ mode=-rw-r--r-- ,inode=367004,size=10312,blksize=10752 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)

with the filename changing each time
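If it helps to quantify that, one rough approach (illustrative only, assuming there is a single running find process) is to log the truss output to a file and count which syscalls dominate:

  # attach to the running find, write its syscalls to a file, stop with ^C
  truss -o /tmp/find.truss -p $(pgrep -x find)
  # then count the syscall names; fstatat will almost certainly top the list
  awk -F'(' '{print $1}' /tmp/find.truss | sort | uniq -c | sort -rn | head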

(later...)

that file is in ccache!!!

locate 5e70d5f895ccc92af6a7d5226f818b-81464.o
/var/cache/ccache/f/5/5e70d5f895ccc92af6a7d5226f818b-81464.o

maybe if I can exclude that dir (and /usr/obj) it'll lessen the periodic runtime.
But I don't know yet what's calling find(1) when periodic daily runs. If I can work that out, I might be able to tell it not to walk certain hierarchies.
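One way to find the caller (a sketch, assuming the stock script locations) is to grep the periodic scripts themselves for find invocations:

  # stock daily and security scripts; ports can install more under
  # /usr/local/etc/periodic
  grep -l 'find ' /etc/periodic/daily/* /etc/periodic/security/* 2>/dev/null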

>I had similar issues after the number of jails on my RPI4 increased and they all
>were doing a little bit of writing which accumulated in quite a lot of writing.

I'm at a loss as to what's doing the writing. The system runs the following:

poudriere-devel # for aarch64 and armv7
apcupsd         # for ups monitoring
vnstat          # bandwidth use, writes to its db in /var/db/vnstat
sshd
exim (local)
pflogd          # right now it's behind a firewall, on NAT so it's not doing much
pf              # same
ntpd
powerd
nginx           # this serves the poudriere web frontend, and that's it (http-only)
syslogd
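To see which of those is actually responsible for the writes, per-process I/O counters from top(1) might help; an illustrative invocation:

  # I/O display mode, sorted by total I/O, refreshing every 2 seconds
  top -m io -o total -s 2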

>My solution was to add an SSD.

I have an nfs alternative. The LAN is 1GB. But I think the fix will be to tell find
to not search some paths. Just need to work out how to do it.

What would the effect be of increasing or decreasing the txg delta with system
performance?
-- 





Well. You could remove daily_clean_disks_enable="YES" from /etc/periodic.conf. That saves you the "find" run. I have never used it before; the default is "off".

$ grep clean_disks /etc/defaults/periodic.conf
daily_clean_disks_enable="NO"                           # Delete files daily
daily_clean_disks_files="[#,]* .#* a.out *.core *.CKP .emacs_[0-9]*"
daily_clean_disks_days=3                                # If older than this
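If you would rather keep a line in /etc/periodic.conf than delete one, an explicit override has the same effect, since /etc/periodic.conf overrides the defaults file:

  # /etc/periodic.conf
  daily_clean_disks_enable="NO"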

The list of files it checks for doesn't look very useful to me in 2023. And it does a full find over *all* directories and files, *every day*.
If you have a lot of *.core files you are better off putting this in sysctl.conf: kern.corefile=/var/tmp/%U.%N.%I.%P.core. That way you know where to look to delete them.
Actually my RPI3 has this in cron: @daily find /var/tmp/ -name "*.core" -mtime +7 -ls -delete
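Spelled out, that would be something like this (the corefile pattern is the one above):

  # /etc/sysctl.conf -- applied at boot
  kern.corefile=/var/tmp/%U.%N.%I.%P.core

  # or set it immediately without a reboot
  sysctl kern.corefile=/var/tmp/%U.%N.%I.%P.core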

About vfs.zfs.txg.timeout: don't mess with it unless you have a solid reason. It used to be 30 seconds when ZFS was new, if I remember correctly, but that could give write bursts big enough to pause the system noticeably. This is all from memory, I might be totally wrong here. :-)

The stats of your system look pretty reasonable. Also the zpool iostat 1 indicates periods of no writes, so that is ok. You just have a lot of IOPS for 1 spinning disk when everything runs together. Poudriere & ccache indicate that you are using it for compiling pkgs or other stuff. That is pretty heavy for your setup if you manage to run things in parallel, as the RPI4 has 4 CPUs. It doesn't help to run daily_cleanup at the same time. ;-)
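If you ever want to cap how much the builds compete with everything else, the usual poudriere.conf knobs would be something like this (values are only an example for a 4-core board):

  # /usr/local/etc/poudriere.conf
  PARALLEL_JOBS=2        # build at most 2 ports at the same time
  ALLOW_MAKE_JOBS=no     # keep each port build at a single make job (the default)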

Regards,
Ronald.

