Date: Fri, 27 Oct 2023 17:23:49 +0200 (CEST)
From: Ronald Klop <ronald-lists@klop.ws>
To: void <void@f-m.fm>
Cc: freebsd-stable@freebsd.org
Subject: Re: periodic daily takes a very long time to run (14-stable)
Message-ID: <1210534753.8409.1698420229888@localhost>
In-Reply-To: <ZTvMODY-mcBImHZP@int21h>
References: <ZTuNvVMW_XG3mZKU@int21h> <1122335317.4913.1698407124469@localhost> <ZTuyXPjddEPqh-bi@int21h> <794932758.6659.1698413675475@localhost> <ZTvMODY-mcBImHZP@int21h>
From: void <void@f-m.fm>
Date: Friday, 27 October 2023 16:42
To: freebsd-stable@freebsd.org
Subject: Re: periodic daily takes a very long time to run (14-stable)

> On Fri, Oct 27, 2023 at 03:34:35PM +0200, Ronald Klop wrote:
> >
> >What stands out to me is that you do quite a lot of writes on the disk. (I might be mistaken.)
> >The max. number of IOPS for an HDD is around 80 for consumer-grade hard disks. I think this holds for USB-connected disks too.
> >https://en.wikipedia.org/wiki/IOPS#Mechanical_hard_drives
> >From the stats you posted it looks like you are almost always doing 50+ writes/second already. That does not leave much IOPS for the find process.
> >
> >ZFS tries to bundle the writes every 5 seconds to leave room for the reads. See "sysctl vfs.zfs.txg.timeout". Unless it has too much data to write or a sync request comes in.
>
> % sysctl vfs.zfs.txg.timeout
> vfs.zfs.txg.timeout: 5
>
> do I need to tune this?
>
> Here's equivalent output from my setup (I ran periodic daily again)
>
> #device     r/s     w/s     kr/s     kw/s  ms/r  ms/w  ms/o  ms/t  qlen   %b
> da0          16      18    191.9    557.9    50     8   144    29    10   24
> da0         107       0    699.7      0.0    52     0     0    52     1   99
> da0         102       0    409.2      0.0    71     0     0    71     2   98
> da0          65       6    259.6     49.4   101   143     0   105    12  101
> da0          57      14    227.7    123.9   153   163     0   155    12  100
> da0          40      19    158.8    285.8   205   103     0   172    12   98
> da0          46      30    191.1    441.9   180    58     0   132    11   91
> da0          63       4    261.6     16.1   162   250   239   170     6  112
> da0          67      10    273.7     83.6    99    66     0    95    12   91
> da0          32      21    129.4    177.9   223   102     0   175     5   97
> da0          48      16    191.9    261.3   173   130     0   162     9  109
> da0          38      19    152.2    191.3   168    61   292   139     8  104
> da0          92       0    366.9      0.0   104     0     0   104     4  100
> da0          73      10    291.7     87.9    76    99     0    79    12   97
> da0          49      15    195.2    270.9   156   129     0   150    11  103
> da0          53      15    212.3    248.3   139   128     0   137    12   92
> da0          54      22    216.1    272.1   151    81    92   130     8  107
> da0          80       4    320.9     16.0    74   201   125    80     3  100
> da0          55      10    218.8     72.9    89    73     0    87    11   82
> ^C
>
> % zpool iostat 1
>               capacity     operations     bandwidth
> pool        alloc   free   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> zroot       93.6G   818G     13     16   161K   506K
> zroot       93.6G   818G     91      0   367K      0
> zroot       93.6G   818G    113      0   454K      0
> zroot       93.6G   818G    102      0   411K      0
> zroot       93.6G   818G     98      0   422K      0
> zroot       93.6G   818G     67     18   271K   171K
> zroot       93.6G   818G     43     16   173K   252K
> zroot       93.6G   818G     43     28   173K   376K
> zroot       93.6G   818G     78      3   315K  15.9K
> zroot       93.6G   818G     94      0   378K      0
> zroot       93.6G   818G    103      0   414K      0
> zroot       93.6G   818G    102      0   658K      0
> zroot       93.6G   818G     98      0   396K      0
> zroot       93.6G   818G    109      0   438K      0
> zroot       93.6G   818G    101      0   404K      0
> zroot       93.6G   818G     47     13   191K  91.4K
> zroot       93.6G   818G     52     11   209K   126K
> zroot       93.6G   818G     50     20   202K   301K
> zroot       93.6G   818G     46     12   186K   128K
> zroot       93.6G   818G     86      0   346K  3.93K
> zroot       93.6G   818G     45     18   183K   172K
> zroot       93.6G   818G     42     15   172K   343K
> zroot       93.6G   818G     43     24   173K   211K
> zroot       93.6G   818G     87      0   596K      0
> ^C
>
> >So if my observation is right it might be interesting to find out what is writing.
>
> would ktrace and/or truss be useful? something else? The truss -p output of the
> find PID produces massive amounts of output, all like this:
>
> fstatat(AT_FDCWD,"5e70d5f895ccc92af6a7d5226f818b-81464.o",{ mode=-rw-r--r-- ,inode=367004,size=10312,blksize=10752 },AT_SYMLINK_NOFOLLOW) = 0 (0x0)
>
> with the filename changing each time
>
> (later...)
>
> that file is in ccache!!!
>
> locate 5e70d5f895ccc92af6a7d5226f818b-81464.o
> /var/cache/ccache/f/5/5e70d5f895ccc92af6a7d5226f818b-81464.o
>
> maybe if I can exclude that dir (and /usr/obj) it'll lessen the periodic runtime.
> But I don't know yet what's calling find(1) when periodic daily runs. If I can, I might be able to tell it not to walk certain hierarchies.
>
> >I had similar issues after the number of jails on my RPI4 increased and they all were doing a little bit of writing, which accumulated in quite a lot of writing.
>
> I'm at a loss as to what's doing the writing. The system runs the following:
>
> poudriere-devel  # for aarch64 and armv7
> apcupsd          # for ups monitoring
> vnstat           # bandwidth use, writes to its db in /var/db/vnstat
> sshd
> exim (local)
> pflogd           # right now it's behind a firewall, on NAT so it's not doing much
> pf               # same
> ntpd
> powerd
> nginx            # this serves the poudriere web frontend, and that's it (http-only)
> syslogd
>
> >My solution was to add an SSD.
>
> I have an nfs alternative. The LAN is 1GB. But I think the fix will be to tell find
> to not search some paths. just need to work out how to do it.
>
> What would the effect be of increasing or decreasing the txg delta with system
> performance?
> --
>

Well. You could remove daily_clean_disks_enable="YES" from /etc/periodic.conf. That saves you the "find". I have never used it before. The default is "off".

$ grep clean_disks /etc/defaults/periodic.conf
daily_clean_disks_enable="NO"                           # Delete files daily
daily_clean_disks_files="[#,]* .#* a.out *.core *.CKP .emacs_[0-9]*"
daily_clean_disks_days=3                                # If older than this

The list of files it checks for doesn't look very useful to me in 2023. This does do a full find over *all* directories and files. *Every day*???
If you have a lot of *.core files you are better off putting this in sysctl.conf: kern.corefile=/var/tmp/%U.%N.%I.%P.core . So you know where to look to delete them.
Actually my RPI3 has this in cron: @daily find /var/tmp/ -name "*.core" -mtime +7 -ls -delete .

About vfs.zfs.txg.timeout: don't mess with it if you don't have a solid reason. It used to be 30 seconds when ZFS was new, if I remember correctly, but that could give too-big write bursts which paused the system noticeably. This is all from memory. I might be totally wrong here. :-)

The stats of your system look pretty reasonable. Also the zpool iostat 1 indicates periods of no writes, so that is ok. You just have a lot of IOPS for 1 spinning disk when everything runs together. Poudriere & ccache indicate that you are using it for compiling pkgs or other stuff. That is pretty heavy for your setup if you manage to run things in parallel, as the RPI4 has 4 CPUs. It doesn't help to run daily_cleanup at the same time. ;-)

Regards,
Ronald.
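
PS: if you want to confirm which daily script is doing the full filesystem walk, something like this should point at it (just a quick sketch; script names and contents can differ a bit per release):

$ grep -l 'find ' /etc/periodic/daily/*

On a stock install I would expect /etc/periodic/daily/100.clean-disks to show up in that list; that is the script driven by daily_clean_disks_enable.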
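And if you end up writing your own cleanup job instead of using the stock one, find(1) can skip whole trees with -prune. A minimal sketch, with the pruned paths only as examples for your setup:

$ find / \( -path /var/cache/ccache -o -path /usr/obj \) -prune -o \
      -name '*.core' -mtime +3 -print

Once you trust what it prints, swap -print for -delete.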
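As for "what's doing the writing": top in I/O mode is usually the quickest way to see per-process write counts, e.g.

$ top -m io -o total

gstat or the iostat you already ran give the per-disk view; top -m io tells you which process the writes belong to.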