Date:      Wed, 27 Sep 2006 17:28:24 +0200
From:      Oliver Brandmueller <ob@e-Gitt.NET>
To:        freebsd-stable@freebsd.org
Subject:   Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
Message-ID:  <20060927152824.GJ22229@e-Gitt.NET>
In-Reply-To: <451A4189.5020906@samsco.org>
References:  <451A1375.5080202@gneto.com> <20060927071538.GF22229@e-Gitt.NET> <451A4189.5020906@samsco.org>



Hi Scott,

On Wed, Sep 27, 2006 at 03:16:57AM -0600, Scott Long wrote:
> Well, the best I can say at the moment is, "Wow."  =-(  I guess the
> thing to do here is to figure out if the problem lies with the em
> interrupt handler not getting run, or the taskqueue not getting run.
> Since you've stated that it seems to be related to shared interrupts,
> the first possibility is more likely.  However, I'm not sure why the
> symptom would only be showing up now.  The Intel docs say that the
> 82547EI are a bit interesting, and I wonder if assumptions that we
> make about PCI ordering aren't true (or if there are bugs that make
> our assumptions invalid).
>
> Does this happen after there has been a lot of disk activity, like a
> large tar extraction?  Are you using the SMBus interface at all, or is
> it sitting completely idle?

Disk activity does not trigger the problem: I hammered the disk at
around 85 MB/s (dd) for about half an hour without seeing any effect. A
CPU-bound job like a buildworld does trigger it.

The SMBus interface is not used at all (it's not even really usable).
Anyway, as soon as I unload the ichsmb module I cannot trigger the
problem anymore. If I load it again, the problem can again be triggered
by a buildworld. Statistical relevance: I did 4 buildworlds, alternating
between ichsmb loaded and unloaded. Both times with ichsmb loaded I saw 3
watchdog timeouts while the buildworld was running; with ichsmb not
loaded I did not see a single watchdog timeout. The load on the
interface was about the same the whole time (constant NFS traffic
of around 1-2 Mbit/s).
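
To make a comparison like the one above repeatable, the timeouts can be
tallied per run from saved console or dmesg output. This is only a rough
sketch: the driver message is matched loosely on "watchdog timeout", and
the sample log lines, the run labels and the function name
count_watchdog_timeouts are made up for illustration, not taken from the
machines discussed here.

```python
import re

# Loose match on the driver's message; the exact wording is an assumption.
WATCHDOG_RE = re.compile(r"^(em\d+): watchdog timeout", re.MULTILINE)


def count_watchdog_timeouts(log_text):
    """Return a dict mapping interface name to number of watchdog timeouts."""
    counts = {}
    for match in WATCHDOG_RE.finditer(log_text):
        iface = match.group(1)
        counts[iface] = counts.get(iface, 0) + 1
    return counts


# Hypothetical log excerpts, one per buildworld run.
runs = {
    "ichsmb loaded": "em0: watchdog timeout -- resetting\n"
                     "em0: link state changed to UP\n"
                     "em0: watchdog timeout -- resetting\n",
    "ichsmb unloaded": "em0: link state changed to UP\n",
}

for label, text in runs.items():
    print(label, count_watchdog_timeouts(text))
```

Keeping one saved log per run and comparing the counts side by side
makes it easier to see whether loading the module really correlates with
the timeouts.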

Since we all seem to see this only on interfaces sharing interrupts
(as I read the other posters' mails) and the problem can be worked
around by using polling, it seems pretty clear that it has to do
with interrupt handling.
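
For anyone who wants to check whether their em interface shares an
interrupt, the output of "vmstat -i" can be scanned for IRQ lines that
name more than one device. The sketch below assumes a layout of
"irqN: dev1 dev2 ... total rate"; the sample output and the function
name shared_irqs are hypothetical, and the real format may differ
between releases.

```python
def shared_irqs(vmstat_output):
    """Return {irq: [devices]} for IRQ lines that list more than one device.

    Assumes each interrupt line looks like 'irqN: dev1 dev2 ... total rate'.
    """
    shared = {}
    for line in vmstat_output.splitlines():
        fields = line.split()
        # Need at least: 'irqN:', one device, total, rate.
        if len(fields) < 4:
            continue
        if not (fields[0].startswith("irq") and fields[0].endswith(":")):
            continue
        devices = fields[1:-2]  # everything between the IRQ label and the counters
        if len(devices) > 1:
            shared[fields[0].rstrip(":")] = devices
    return shared


# Hypothetical sample, not captured from a real machine.
sample = """\
interrupt                          total       rate
irq1: atkbd0                         523          0
irq14: ata0                        98765          9
irq18: em0 ichsmb0               1234567        120
Total                            1333855        129
"""

print(shared_irqs(sample))
```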

The UP/SMP idea seems to be of interest only because a UP machine is
more likely to share interrupts than an SMP machine; it has nothing to
do with UP or SMP as such.

- Oliver


-- 
| Oliver Brandmueller | Offenbacher Str. 1  | Germany       D-14197 Berlin |
| Fon +49-172-3130856 | Fax +49-172-3145027 | WWW:   http://the.addict.de/ |
|                I am the Internet. As truly as I help God.                |
|     Commercial use of any address contained herein is not permitted!     |



