Date:      Thu, 28 Sep 2006 14:47:09 -0500
From:      Alan Amesbury <amesbury@umn.edu>
To:        freebsd-stable@freebsd.org
Subject:   Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
Message-ID:  <451C26BD.2090807@umn.edu>
In-Reply-To: <20060927215554.C059316A601@hub.freebsd.org>
References:  <20060927215554.C059316A601@hub.freebsd.org>

Additional data point:  On 6.1-RELEASE I've observed the same sort of
behavior, but without any noticeable consistency.  It affects bge(4) and
em(4) systems.  In the case of the bge(4)-equipped system, there's a
very weak correlation between heavy disk activity and watchdog timeouts.
However, on that system, it doesn't look like the network card shares
its PCI bus and interrupt with any other devices:

bgehost % pciconf -l
hostb0@pci0:0:0:        class=0x060000 card=0x00000000 chip=0x00081166
rev=0x23 hdr=0x00
hostb1@pci0:0:1:        class=0x060000 card=0x00000000 chip=0x00081166
rev=0x01 hdr=0x00
hostb2@pci0:0:2:        class=0x060000 card=0x00000000 chip=0x00061166
rev=0x01 hdr=0x00
hostb3@pci0:0:3:        class=0x060000 card=0x00000000 chip=0x00061166
rev=0x01 hdr=0x00
ahc0@pci0:8:0:  class=0x010000 card=0xe2a09005 chip=0x00809005 rev=0x02
hdr=0x00
none0@pci0:14:0:        class=0x030000 card=0x00d11028 chip=0x47521002
rev=0x27 hdr=0x00
isab0@pci0:15:0:        class=0x060100 card=0x02001166 chip=0x02001166
rev=0x50 hdr=0x00
atapci0@pci0:15:1:      class=0x01018a card=0x00000000 chip=0x02111166
rev=0x00 hdr=0x00
ohci0@pci0:15:2:        class=0x0c0310 card=0x02201166 chip=0x02201166
rev=0x04 hdr=0x00
bge0@pci1:8:0:  class=0x020000 card=0x00d11028 chip=0x164414e4 rev=0x12
hdr=0x00
pcib3@pci2:2:0: class=0x060400 card=0x00000068 chip=0x09628086 rev=0x01
hdr=0x01
aac0@pci2:2:1:  class=0x010400 card=0x00d11028 chip=0x00021028 rev=0x01
hdr=0x00
fxp0@pci2:4:0:  class=0x020000 card=0x009b1028 chip=0x12298086 rev=0x08
hdr=0x00
bgehost % grep irq /var/run/dmesg.boot
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xec00-0xecff mem
0xfe102000-0xfe102fff irq 18 at device 8.0 on pci0
ohci0: <OHCI (generic) USB controller> mem 0xfe100000-0xfe100fff irq 5
at device 15.2 on pci0
bge0: <Broadcom BCM5700 Gigabit Ethernet, ASIC rev. 0x7102> mem
0xfeb00000-0xfeb0ffff irq 17 at device 8.0 on pci1
aac0: <Dell PERC 3/Di> mem 0xf0000000-0xf7ffffff irq 31 at device 2.1 on
pci2
fxp0: <Intel 82559 Pro/100 Ethernet> port 0xccc0-0xccff mem
0xfe900000-0xfe900fff,0xfe700000-0xfe7fffff irq 16 at device 4.0 on pci2
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on
acpi0
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0


This is an SMP host (a pair of Pentium IIIs).

The em(4)-equipped host emits watchdog timeout warnings far more
frequently, but not with any discernible pattern.  However, it routinely
handles a *lot* more network traffic, and that traffic is unpredictable
and bursty in nature.  Its interfaces also appear to have their own
resources allocated:

emhost % pciconf -l
hostb0@pci0:0:0:        class=0x060000 card=0x00000000 chip=0x25788086
rev=0x02 hdr=0x00
pcib1@pci0:3:0: class=0x060400 card=0x00000000 chip=0x257b8086 rev=0x02
hdr=0x01
pcib2@pci0:28:0:        class=0x060400 card=0x00000050 chip=0x25ae8086
rev=0x02 hdr=0x01
uhci0@pci0:29:0:        class=0x0c0300 card=0x01651028 chip=0x25a98086
rev=0x02 hdr=0x00
uhci1@pci0:29:1:        class=0x0c0300 card=0x01651028 chip=0x25aa8086
rev=0x02 hdr=0x00
none0@pci0:29:4:        class=0x088000 card=0x01651028 chip=0x25ab8086
rev=0x02 hdr=0x00
none1@pci0:29:5:        class=0x080020 card=0x01651028 chip=0x25ac8086
rev=0x02 hdr=0x00
ehci0@pci0:29:7:        class=0x0c0320 card=0x01651028 chip=0x25ad8086
rev=0x02 hdr=0x00
pcib3@pci0:30:0:        class=0x060400 card=0x00000000 chip=0x244e8086
rev=0x0a hdr=0x01
isab0@pci0:31:0:        class=0x060100 card=0x00000000 chip=0x25a18086
rev=0x02 hdr=0x00
atapci0@pci0:31:2:      class=0x01018a card=0x01651028 chip=0x25a38086
rev=0x02 hdr=0x00
none2@pci0:31:3:        class=0x0c0500 card=0x01651028 chip=0x25a48086
rev=0x02 hdr=0x00
em0@pci1:1:0:   class=0x020000 card=0x01651028 chip=0x10758086 rev=0x00
hdr=0x00
em1@pci2:1:0:   class=0x020000 card=0x10128086 chip=0x10108086 rev=0x01
hdr=0x00
em2@pci2:1:1:   class=0x020000 card=0x10128086 chip=0x10108086 rev=0x01
hdr=0x00
em3@pci3:2:0:   class=0x020000 card=0x01651028 chip=0x10768086 rev=0x00
hdr=0x00
amr0@pci3:3:0:  class=0x010400 card=0x05201028 chip=0x19601000 rev=0x01
hdr=0x00
none3@pci3:14:0:        class=0x030000 card=0x01651028 chip=0x47521002
rev=0x27 hdr=0x00
emhost % grep irq /var/run/dmesg.boot
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
em0: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port
0xece0-0xecff mem 0xfe3e0000-0xfe3fffff irq 18 at device 1.0 on pci1
em1: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port
0xdcc0-0xdcff mem 0xfe1e0000-0xfe1fffff irq 24 at device 1.0 on pci2
em2: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port
0xdc80-0xdcbf mem 0xfe1c0000-0xfe1dffff irq 25 at device 1.1 on pci2
uhci0: <UHCI (generic) USB controller> port 0xbce0-0xbcff irq 16 at
device 29.0 on pci0
uhci1: <UHCI (generic) USB controller> port 0xbcc0-0xbcdf irq 19 at
device 29.1 on pci0
ehci0: <Intel 6300ESB USB 2.0 controller> mem 0xfe500000-0xfe5003ff irq
23 at device 29.7 on pci0
em3: <Intel(R) PRO/1000 Network Connection Version - 3.2.18> port
0xccc0-0xccff mem 0xfdee0000-0xfdefffff irq 21 at device 2.0 on pci3
amr0: <LSILogic MegaRAID 1.53> mem 0xfbcf0000-0xfbcfffff irq 22 at
device 3.0 on pci3
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on
acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
sio1: configured irq 3 not in bitmap of probed irqs 0


Note that em1 and em2 are NOT in use on this host, are not configured,
and are not physically connected to anything.  This host is a UP host;
while its CPU has HTT capabilities, they are disabled in the BIOS.
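
The "does the NIC share a bus or interrupt with anything" check can be
scripted rather than done by eye.  What follows is only a minimal sketch
in Python, not something that was run on either host.  It assumes the
6.x `pciconf -l` format shown above (name@pciBUS:SLOT:FUNC:) and
unwrapped probe lines straight from /var/run/dmesg.boot (the listings
in this message were line-wrapped by the mail client).  The names
group_devices and shared are made up for illustration.

```python
import re
from collections import defaultdict

# "bge0@pci1:8:0: ..." -> device bge0 sits on PCI bus 1
# (assumes the FreeBSD 6.x `pciconf -l` format)
PCICONF_LINE = re.compile(r"^(\w+)@pci(\d+):\d+:\d+:")

# "bge0: <...> mem 0x... irq 17 at device 8.0 on pci1" -> bge0 uses irq 17
# Lines without a <description> (ioapic, "configured irq ...") are skipped.
DMESG_LINE = re.compile(r"^(\w+): <[^>]*>.*?\birq (\d+)\b")


def group_devices(text, pattern):
    """Map each bus or IRQ number to the devices reported on it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            groups[int(match.group(2))].append(match.group(1))
    return dict(groups)


def shared(groups):
    """Keep only the entries that list more than one device."""
    return {key: devs for key, devs in groups.items() if len(devs) > 1}
```

Feeding it the text of /var/run/dmesg.boot with DMESG_LINE, or the
output of `pciconf -l` with PCICONF_LINE, and passing the result
through shared() lists any bus or IRQ with more than one device on it.
A parent/child pair such as atkbdc0 and atkbd0 will show up as sharing
irq 1, which is expected and not a conflict.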

Both hosts are running somewhat customized kernels.  Notable options not
in GENERIC but in these kernels are DEVICE_POLLING (but
kern.polling.enable=0!), HZ=1000, and ZERO_COPY_SOCKETS.  Several
devices were removed, and missing devices (io, mem, isa, and npx) were
added in to counter the breakage caused by the silent inclusion of the
DEFAULTS file.  The UP and SMP kernel configurations are *identical*
except that the SMP one adds ALTQ_NOPCC and SMP.

Also, I've noticed that STP goes nuts for the bge(4) host, but doesn't
seem to notice when the em(4) host watchdog timer goes off.  However, I
don't have direct access to the network equipment, so I can't check for
differences there.


--
Alan Amesbury
University of Minnesota


