From owner-freebsd-current@freebsd.org Sun Mar 25 18:36:32 2018 Return-Path: Delivered-To: freebsd-current@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id D3E8CF61D17 for ; Sun, 25 Mar 2018 18:36:31 +0000 (UTC) (envelope-from andy@neu.net) Received: from mail.neu.net (neu.net [104.225.8.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "freebsd-11-64", Issuer "freebsd-11-64" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 769E472FBB; Sun, 25 Mar 2018 18:36:31 +0000 (UTC) (envelope-from andy@neu.net) Received: from neu.net (neu.net [104.225.8.138]) by mail.neu.net (8.15.2/8.15.2) with ESMTPS id w2PIaObX065628 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sun, 25 Mar 2018 14:36:24 -0400 (EDT) (envelope-from andy@neu.net) Date: Sun, 25 Mar 2018 14:36:23 -0400 (EDT) From: AN To: John Baldwin cc: freebsd-current@freebsd.org, "markj@FreeBSD.org" , "" Subject: Re: problem with [intr{swi4: clock (0)}] In-Reply-To: <1901802.BMbLzLVd8F@ralph.baldwin.cx> Message-ID: References: <1901802.BMbLzLVd8F@ralph.baldwin.cx> User-Agent: Alpine 2.21 (BSF 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed X-Spam-Status: No, score=0.0 required=5.0 tests=RP_MATCHES_RCVD, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on mail.neu.net X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 25 Mar 2018 18:36:32 -0000 Hi: On Fri, 23 Mar 2018, John Baldwin wrote: > Date: Fri, 23 Mar 2018 12:11:03 -0700 > From: John Baldwin > To: freebsd-current@freebsd.org > Cc: AN , "markj@FreeBSD.org" , > "" > Subject: Re: problem with [intr{swi4: clock (0)}] > > On Wednesday, March 21, 2018 11:36:48 AM AN wrote: >> Hi: >> >> I would appreciate any help with this issue, this is a new machine built >> in the last week and if it is a hardware issue I want to return it. The >> problem seems to have started in the last 24 hours or so. I am seeing a >> really high cpu utilization for [intr{swi4: clock (0)}]. I have tried a >> couple things to troubleshoot: > > I would try using dtrace to figure out which functions are running in the > callout thread. I've cc'd a couple of folks in case they already have dtrace > scripts to do this. You would probably want a script that watched > callout_execute::callout-start and callout_execute::callout-end events. You > would want to save the start time in callout-start and then report a delta > along with the values of 'c->c_func' (the last argument to these probes is > 'c'). You might be able to just store the time delta in an aggregate that is > keyed on the function. Actually, I've gone ahead and written a little > script: > > ---- > callout_execute:::callout-start > { > self->start = timestamp; > self->func = args[0]->c_func; > @funcs[self->func] = count(); > } > > callout_execute:::callout-end > { > @functimes[self->func] = sum(timestamp - self->start); > } > > END > { > printf("\n\nCallout function counts:\n"); > printa("%@8u %a\n", @funcs); > printf("\nCallout function runtime:\n"); > printa("%@d %a\n", @functimes); > } > ---- > > Store this in a file named 'callout.d' and then run 'dtrace -s callout.d'. > Let it run for a second or two and then use Ctrl-C to stop it. > > The first table it will output is a histogram showing how many times > different functions were invoked. The second table will count how much > total time was spent in each function: > > CPU ID FUNCTION:NAME > 4 2 :END > > Callout function counts: > 2 kernel`kbdmux_kbd_intr_timo > 2 kernel`usb_power_wdog > 2 kernel`ipport_tick > 2 kernel`tcp_timer_delack > 2 kernel`nd6_timer > 2 kernel`key_timehandler > 2 dtrace.ko`dtrace_state_deadman > 4 kernel`newnfs_timer > 4 kernel`pfslowtimo > 10 kernel`logtimeout > 10 kernel`pffasttimo > 18 kernel`lim_cb > 32 kernel`iflib_timer > 84 kernel`sleepq_timeout > 224 dtrace.ko`dtrace_state_clean > > Callout function runtime: > 2080 kernel`logtimeout > 2198 kernel`kbdmux_kbd_intr_timo > 2890 kernel`ipport_tick > 3550 kernel`iflib_timer > 3672 kernel`lim_cb > 3936 kernel`pffasttimo > 4023 dtrace.ko`dtrace_state_clean > 4224 kernel`newnfs_timer > 4751 kernel`key_timehandler > 5286 kernel`nd6_timer > 6700 kernel`usb_power_wdog > 7341 kernel`pfslowtimo > 19607 kernel`tcp_timer_delack > 20273 dtrace.ko`dtrace_state_deadman > 32262 kernel`sleepq_timeout > > You can use this to figure out which timer events are using CPU in the > softclock thread/process. > > To John and others who responded thanks for your time. I have to apologize though for wasting your spare cpu cycles. It turns out the root cause was a malfunctioning USB keyboard with a stuck key. Removed and replaced, now everything is working normally. Thanks again and sorry for the noise. Best regards, Andy