From owner-freebsd-stable@FreeBSD.ORG  Sun Mar 29 13:25:03 2009
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 30F3F1065672;
	Sun, 29 Mar 2009 13:25:03 +0000 (UTC)
	(envelope-from kamikaze@bsdforen.de)
Received: from mail.bsdforen.de (bsdforen.de [212.204.60.79])
	by mx1.freebsd.org (Postfix) with ESMTP id AAC838FC15;
	Sun, 29 Mar 2009 13:25:02 +0000 (UTC)
	(envelope-from kamikaze@bsdforen.de)
Received: from mobileKamikaze.norad (unknown [88.130.212.151])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.bsdforen.de (Postfix) with ESMTP id 98B318A00D1;
	Sun, 29 Mar 2009 15:25:00 +0200 (CEST)
Message-ID: <49CF6899.2060002@bsdforen.de>
Date: Sun, 29 Mar 2009 14:24:57 +0200
From: Dominic Fandrey <kamikaze@bsdforen.de>
User-Agent: Thunderbird 2.0.0.21 (X11/20090321)
MIME-Version: 1.0
To: Alexander Motin <mav@FreeBSD.org>
References: <1238293386.00093672.1238281804@10.7.7.3>	<49CF0803.1070505@FreeBSD.org>	<49CF2F8D.6000905@bsdforen.de>	<49CF4EB9.60108@FreeBSD.org>	<49CF49F5.6010800@bsdforen.de>	<49CF615A.6050304@FreeBSD.org>	<49CF595A.30805@bsdforen.de>	<49CF6B28.2080400@FreeBSD.org>
	<49CF60AB.4040709@bsdforen.de>
In-Reply-To: <49CF60AB.4040709@bsdforen.de>
X-Enigmail-Version: 0.95.7
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-stable@freebsd.org
Subject: Re: powerd broken
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 29 Mar 2009 13:25:03 -0000

Dominic Fandrey wrote:
> Alexander Motin wrote:
>> Dominic Fandrey wrote:
>>> Alexander Motin wrote:
>>>> It means that one of your CPUs spent most of it's time in interrupt
>>>> processing and so far from idle. What does `top -P` shows you? Where
>>>> have you seen that ~6% CPU load?
>>> That is the load shown by the e17 CPU module. It's display has always
>>> been in sync with top in the past, no longer though, it appears.
>>>
>>> # top -PIS
>>> last pid: 68235;  load averages:  0.09,  0.16, 
>>> 0.17                                                                                                                               
>>> up 0+05:17:29  13:05:10
>>> 137 processes: 4 running, 117 sleeping, 16 waiting
>>> CPU 0:  1.1% user,  0.0% nice,  1.1% system, 61.4% interrupt, 36.3% idle
>>> CPU 1:  9.0% user,  0.0% nice,  3.4% system,  0.0% interrupt, 87.6% idle
>>> Mem: 419M Active, 415M Inact, 416M Wired, 3752K Cache, 183M Buf, 716M
>>> Free
>>> Swap: 4096M Total, 4096M Free
>>>
>>>   PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU
>>> COMMAND
>>>    11 root          1 171 ki31     0K    16K RUN    1 286:42 93.46%
>>> idle: cpu1
>>>    23 root          1 -80    -     0K    16K RUN    0 126:54 59.96%
>>> irq16: hdac0 uhci+
>>>    12 root          1 171 ki31     0K    16K RUN    0 179:27 40.09%
>>> idle: cpu0
>>>  1318 root          1  46    0   439M   312M select 1  12:55  1.76% Xorg
>>>  4361 musicpd       4  44    0 91164K 14412K ucond  1   1:57  0.78% mpd
>>>
>>> Some things strike me as odd. The difference between the load reported
>>> by powerd and top is still very significant and of course the high
>>> interrupt load.
>> powerd now reports/uses summary load of all CPUs (it can be bigger then
>> 100%), while top shows average.
>>
>>> I've got a mouse with a 1khz report rate (the only connected USB
>>> device), but unplugging it doesn't change the load. Neither does
>>> stopping moused (I'm running the system without HAL). There also
>>> is a fingerprint reader, but it is only detected by ugen.
>> I would start from identifying all devices sharing that IRQ and trying
>> to disable them (or unload their drivers) one by one. `systat -vm 1`
>> will show you how much interrupts actually happens there per second.
>>
>> On most of modern systems you can make hdac0 to not share that IRQ by
>> enabling MSI there with hint.hdac.0.msi=1 in loader.conf.
>>
> 
> I already did unplug all devices to no avail. Afterwards I tried
> to 'kldunload -f' all usb devices, but most refused. However during
> this I recognized that /boot/modules holds an old u3g.ko, back from
> the time when it was not in stable. Apparently modules in
> /boot/modules have preference over those in /boot/kernels. I deleted
> the old u3g.ko and rebooted (because I couldn't get the system to
> unload it). Now everything is fine, the interrupt load is gone.
> 
> I'll monitor this and come back here if it turns out this has just
> been coincidence.
> 
> Thank you for all that help.
> 
> Regards

This has turned out to be an early call. I just happened to start
skype_devel and the whole thing started over, so I think it's
pretty certain, now, that this is a hdac issue.
However after a reboot I tried to reproduce this and now skype
and mpd are both running without causing an irq race.

If the irq race reappears I will use the device hint you suggested
to rule out that this is an interrupt sharing issue.