From owner-freebsd-hackers@freebsd.org Tue Jun 23 07:23:51 2020 Return-Path: Delivered-To: freebsd-hackers@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id E777534BF1C for ; Tue, 23 Jun 2020 07:23:51 +0000 (UTC) (envelope-from gbe@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [96.47.72.83]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49rd7C5vNrz47nG; Tue, 23 Jun 2020 07:23:51 +0000 (UTC) (envelope-from gbe@freebsd.org) Received: from localhost (p4fd3af72.dip0.t-ipconnect.de [79.211.175.114]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) (Authenticated sender: gbe) by smtp.freebsd.org (Postfix) with ESMTPSA id 84C9B13396; Tue, 23 Jun 2020 07:23:51 +0000 (UTC) (envelope-from gbe@freebsd.org) Date: Tue, 23 Jun 2020 09:23:48 +0200 From: Gordon Bergling To: Allan Jude Cc: freebsd-hackers@freebsd.org Subject: Re: Constant load of 1 on a recent 12-STABLE Message-ID: <20200623072332.GA54653@lion.0xfce3.net> References: <20200603101607.GA80381@lion.0xfce3.net> <20200603202929.GA65032@lion.0xfce3.net> <8b1498ea-e343-506e-79c7-c25b594808f0@freebsd.org> <20200604123720.GA63595@lion.0xfce3.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200604123720.GA63595@lion.0xfce3.net> X-Url: X-Operating-System: FreeBSD 12.1-STABLE amd64 X-Host-Uptime: 9:00AM up 16 mins, 3 users, load averages: 1.36, 1.21, 0.85 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 23 Jun 2020 07:23:52 -0000 Hi Allan, On Thu, Jun 04, 2020 at 02:37:20PM +0200, Gordon Bergling wrote: > Hi Allan, > > On Wed, Jun 03, 2020 at 05:33:37PM -0400, Allan Jude wrote: > > On 2020-06-03 16:29, Gordon Bergling wrote: > > > Hi Allan, > > > > > > On Wed, Jun 03, 2020 at 03:13:47PM -0400, Allan Jude wrote: > > >> On 2020-06-03 06:16, Gordon Bergling via freebsd-hackers wrote: > > >>> since a while I am seeing a constant load of 1.00 on 12-STABLE, > > >>> but all CPUs are shown as 100% idle in top. > > >>> > > >>> Has anyone an idea what could caused this? > > >>> > > >>> The load seems to be somewhat real, since the buildtimes on this > > >>> machine for -CURRENT increased from about 2 hours to 3 hours. > > >>> > > >>> This a virtualized system running on Hyper-V, if that matters. > > >>> > > >>> Any hints are more then appreciated. > > >>> > > >> Try running 'top -SP' and see if that shows a specific CPU being busy, > > >> or a specific process using CPU time > > > > > > Below is the output of 'top -SP'. The only relevant process / thread that is > > > relatively constant consumes CPU time seams to be 'zfskern'. > > > > > > ----------------------------------------------------------------------------- > > > last pid: 68549; load averages: 1.10, 1.19, 1.16 up 0+14:59:45 22:17:24 > > > 67 processes: 2 running, 64 sleeping, 1 waiting > > > CPU 0: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle > > > CPU 1: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle > > > CPU 2: 0.0% user, 0.0% nice, 0.4% system, 0.0% interrupt, 99.6% idle > > > CPU 3: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle > > > Mem: 108M Active, 4160M Inact, 33M Laundry, 3196M Wired, 444M Free > > > ARC: 1858M Total, 855M MFU, 138M MRU, 96K Anon, 24M Header, 840M Other > > > 461M Compressed, 1039M Uncompressed, 2.25:1 Ratio > > > Swap: 2048M Total, 2048M Free > > > > > > PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND > > > 11 root 4 155 ki31 0B 64K RUN 0 47.3H 386.10% idle > > > 8 root 65 -8 - 0B 1040K t->zth 0 115:39 12.61% zfskern > > > ------------------------------------------------------------------------------- > > > > > > The only key performance indicator that is relatively high IMHO, for a > > > non-busy system, are the context switches, that vmstat has reported. > > > > > > ------------------------------------------------------------------------------- > > > procs memory page disks faults cpu > > > r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id > > > 0 0 0 514G 444M 7877 2 7 0 9595 171 0 0 0 4347 43322 17 2 81 > > > 0 0 0 514G 444M 1 0 0 0 0 44 0 0 0 121 40876 0 0 100 > > > 0 0 0 514G 444M 0 0 0 0 0 40 0 0 0 133 42520 0 0 100 > > > 0 0 0 514G 444M 0 0 0 0 0 40 0 0 0 120 43830 0 0 100 > > > 0 0 0 514G 444M 0 0 0 0 0 40 0 0 0 132 42917 0 0 100 > > > -------------------------------------------------------------------------------- > > > > > > Any other ideas what could generate that load? > > > > I agree that load average looks out of place here when you look at the % > > cpu idle, but I wonder if it is caused by a lot of short lived processes > > or threads. > > > > How quickly is the 'last pid' number going up? > > > > You might also look at `zpool iostat 1` or `gstat -p` to see how busy > > your disks are > > In the IDLE state the last pid isn't changing within at least 60 seconds. > During 'buildworld' times it is off course much shorter, but a "-j 4" is > resulting in a load average from about 5.0, so that the underlying problem > still persists. 'zpool iostat 1' and 'gstat -p' doesn't show anything > suspicious. > > I had a private mail that made me aware of PR173541, where this problem is > documentated. I'll add my hardware information and performance measurements > to it when I find some time. > > I am currently thinking about how to measure the spawned threads/s. Did you > have an idea how to do it? What is maybe also related to the problem is the following top output, collected by 'top -HS'. The three zfskern threads generate little load, but that constant. The annoying part of the problem is, that the load of 1 is reported and that the hostsystem schedules the thread over all 4 physical cores and keeps the clockrate at the highest frequency. PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND 11 root 155 ki31 0B 64K CPU2 2 17:28 97.39% idle{idle: cpu2} 11 root 155 ki31 0B 64K CPU3 3 17:29 96.78% idle{idle: cpu3} 11 root 155 ki31 0B 64K CPU1 1 17:29 96.40% idle{idle: cpu1} 11 root 155 ki31 0B 64K RUN 0 17:25 96.13% idle{idle: cpu0} 8 root -8 - 0B 1040K mmp->m 2 0:44 4.32% zfskern{mmp_thread_enter} 8 root -8 - 0B 1040K mmp->m 1 0:44 4.28% zfskern{mmp_thread_enter} 8 root -8 - 0B 1040K mmp->m 3 0:44 4.25% zfskern{mmp_thread_enter} Does anyone has some insight where zfskern is spawned? I maybe can bisect then the resulting change which leads to this behaviour. --Gordon