Date:      Wed, 27 Sep 2006 09:15:39 +0200
From:      Oliver Brandmueller <ob@e-Gitt.NET>
To:        freebsd-stable@freebsd.org
Subject:   Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
Message-ID:  <20060927071538.GF22229@e-Gitt.NET>
In-Reply-To: <451A1375.5080202@gneto.com>
References:  <451A1375.5080202@gneto.com>

Hi,

On Wed, Sep 27, 2006 at 08:00:21AM +0200, Martin Nilsson wrote:
> I get tons of these:
> em0: watchdog timeout -- resetting
> em0: link state changed to DOWN
> em0: link state changed to UP
>
> mailbox# pciconf -lv
> em0@pci13:0:0:  class=0x020000 card=0x108c15d9 chip=0x108c8086 rev=0x03 hdr=0x00
>     vendor   = 'Intel Corporation'
>     device   = 'PRO/1000 PM'
>     class    = network
>     subclass = ethernet
> em1@pci14:0:0:  class=0x020000 card=0x109a15d9 chip=0x109a8086 rev=0x00 hdr=0x00
>     vendor   = 'Intel Corporation'
>     class    = network
>     subclass = ethernet
>
[...]
> I have only seen them on em0. Yesterday I tried sysutils/cpuburn on
> similar boxes that are netbooted with NFS mounted drives and every time I
> loaded the two CPU cores the network went down.

I see the same.

The problem is most pronounced on this machine, where I work around it by
using polling. It's a UP machine.

FreeBSD nessie 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #3: Fri Sep 15 09:48:36 CEST 2006     root@nessie:/usr/obj/usr/src/sys/NESSIE  i386

em0@pci2:1:0:   class=0x020000 card=0x10198086 chip=0x10198086 rev=0x00 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82547EI Gigabit Ethernet Controller (LOM)'
    class    = network
    subclass = ethernet

irq18: em0 uhci2                    3319          0


Another machine, also UP, but with two interfaces. The problem is not as
apparent as on the first machine, but it's there. This machine is usually
not as heavily loaded (CPU-wise) as the first one. The problem is ONLY on
em1:

FreeBSD hudson 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #48: Thu Sep 14 10:19:46 CEST 2006     root@hudson:/usr/obj/usr/src/sys/NFS-32-FBSD6  i386

em0@pci1:1:0:   class=0x020000 card=0x10758086 chip=0x10758086 rev=0x00 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82547EI Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet

em1@pci3:2:0:   class=0x020000 card=0x10768086 chip=0x10768086 rev=0x00 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82547EI Gigabit Ethernet Controller'
    class    = network
    subclass = ethernet

irq17: em1 ichsmb0             950121879        855
irq18: em0                      71437344         64


The problem appeared after the em driver updates that went into the
kernel during the last few weeks; I had not observed it before. em is
always loaded as a module in my kernels. The problem seems to occur more
often when the machine's CPU is busy.


I have several SMP machines with the following em interfaces, which
DON'T show the problem, but they also have a different chipset on the em
interface. Most of the kernels were built between Sep 7 and Sep 19.

3 times this:
em0@pci4:5:0:   class=0x020000 card=0x34248086 chip=0x10108086 rev=0x01 hdr=0x00
em1@pci4:5:1:   class=0x020000 card=0x34248086 chip=0x10108086 rev=0x01 hdr=0x00
irq23: em0                     970303432        750



3 times this:
em0@pci4:5:0:   class=0x020000 card=0x34258086 chip=0x100e8086 rev=0x02 hdr=0x00
irq23: em0                     292477376        435
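
To compare the controllers across machines, the chip= value in the
pciconf lines above can be split into its two halves. A minimal sketch in
Python, assuming the usual pciconf layout in which the low 16 bits of
chip= are the vendor ID and the high 16 bits are the device ID; the
function name split_chip_id is made up for illustration:

```python
import re

def split_chip_id(pciconf_line):
    """Return (vendor_id, device_id) from a 'chip=0x........' field, or None.

    Assumes the value packs the device ID in the high 16 bits and the
    vendor ID in the low 16 bits.
    """
    match = re.search(r"chip=0x([0-9a-fA-F]{8})", pciconf_line)
    if match is None:
        return None
    value = int(match.group(1), 16)
    vendor_id = value & 0xFFFF   # low half, 0x8086 for Intel
    device_id = value >> 16      # high half identifies the controller
    return vendor_id, device_id

if __name__ == "__main__":
    line = ("em0@pci2:1:0:   class=0x020000 card=0x10198086 "
            "chip=0x10198086 rev=0x00 hdr=0x00")
    vendor_id, device_id = split_chip_id(line)
    print("vendor=0x%04x device=0x%04x" % (vendor_id, device_id))
```

With the lines from this mail, the two UP machines come out as devices
0x1019, 0x1075 and 0x1076, and the SMP machines as 0x1010 and 0x100e.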


So I can observe at least 3 interesting differences:

- the interfaces showing the problem share their interrupt with another device
- for me it happens on UP machines only
- the chips are different
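
The shared-interrupt observation can be checked mechanically from the irq
lines quoted above. A rough sketch in Python, assuming those lines have
the form "irqN: dev [dev ...] total rate" (they look like vmstat -i
output, which is an assumption); the function names are only for
illustration and the sample strings are copied from the two UP machines:

```python
def parse_irq_lines(text):
    """Map each IRQ name to (devices, total, rate) for one machine."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("irq") or ":" not in line:
            continue
        name, rest = line.split(":", 1)
        fields = rest.split()
        if len(fields) < 3:
            continue  # need at least one device plus total and rate
        *devices, total, rate = fields
        result[name] = (devices, int(total), int(rate))
    return result

def shared_irqs(text):
    """Return only the IRQs that list more than one device."""
    return {name: devices
            for name, (devices, _total, _rate) in parse_irq_lines(text).items()
            if len(devices) > 1}

NESSIE = "irq18: em0 uhci2                    3319          0\n"
HUDSON = (
    "irq17: em1 ichsmb0             950121879        855\n"
    "irq18: em0                      71437344         64\n"
)

if __name__ == "__main__":
    print(shared_irqs(NESSIE))
    print(shared_irqs(HUDSON))
```

On hudson this flags only irq17 (em1 together with ichsmb0), which matches
the interface that shows the problem there.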

What I can't do: move the interfaces between machines; these are
                 onboard interfaces.

What I could do: I could try to unload the USB driver or the ichsmb
driver on the machines where the interrupts are shared. The USB is not
used currently anyway (I have it enabled to be prepared to hook up a USB
Mass Storage device, which has never happened since the problem occurred).
The ichsmb is usually not queried either.

Any suggestions on how I could help?

- Olli


-- 
| Oliver Brandmueller | Offenbacher Str. 1  | Germany       D-14197 Berlin |
| Fon +49-172-3130856 | Fax +49-172-3145027 | WWW:   http://the.addict.de/ |
|               I am the Internet. As surely as I help God.                |
| Commercial use of any of the addresses contained here is not permitted!  |



