From: Alan Amesbury
Date: Thu, 28 Sep 2006 14:47:09 -0500
To: freebsd-stable@freebsd.org
Subject: Re: 6.2 SHOWSTOPPER - em completely unusable on 6.2
Message-ID: <451C26BD.2090807@umn.edu>
In-Reply-To: <20060927215554.C059316A601@hub.freebsd.org>

Additional data point:

On 6.1-RELEASE I've observed the same sort of behavior, but without any
noticeable consistency.  It affects bge(4) and em(4) systems.  In the
case of the bge(4)-equipped system, there's a very weak correlation
between heavy disk activity and watchdog timeouts.
However, on that system, it doesn't look like the network card shares
its PCI bus and interrupt with any other devices:

bgehost % pciconf -l
hostb0@pci0:0:0: class=0x060000 card=0x00000000 chip=0x00081166 rev=0x23 hdr=0x00
hostb1@pci0:0:1: class=0x060000 card=0x00000000 chip=0x00081166 rev=0x01 hdr=0x00
hostb2@pci0:0:2: class=0x060000 card=0x00000000 chip=0x00061166 rev=0x01 hdr=0x00
hostb3@pci0:0:3: class=0x060000 card=0x00000000 chip=0x00061166 rev=0x01 hdr=0x00
ahc0@pci0:8:0: class=0x010000 card=0xe2a09005 chip=0x00809005 rev=0x02 hdr=0x00
none0@pci0:14:0: class=0x030000 card=0x00d11028 chip=0x47521002 rev=0x27 hdr=0x00
isab0@pci0:15:0: class=0x060100 card=0x02001166 chip=0x02001166 rev=0x50 hdr=0x00
atapci0@pci0:15:1: class=0x01018a card=0x00000000 chip=0x02111166 rev=0x00 hdr=0x00
ohci0@pci0:15:2: class=0x0c0310 card=0x02201166 chip=0x02201166 rev=0x04 hdr=0x00
bge0@pci1:8:0: class=0x020000 card=0x00d11028 chip=0x164414e4 rev=0x12 hdr=0x00
pcib3@pci2:2:0: class=0x060400 card=0x00000068 chip=0x09628086 rev=0x01 hdr=0x01
aac0@pci2:2:1: class=0x010400 card=0x00d11028 chip=0x00021028 rev=0x01 hdr=0x00
fxp0@pci2:4:0: class=0x020000 card=0x009b1028 chip=0x12298086 rev=0x08 hdr=0x00

bgehost % grep irq /var/run/dmesg.boot
ioapic0 irqs 0-15 on motherboard
ioapic1 irqs 16-31 on motherboard
ahc0: port 0xec00-0xecff mem 0xfe102000-0xfe102fff irq 18 at device 8.0 on pci0
ohci0: mem 0xfe100000-0xfe100fff irq 5 at device 15.2 on pci0
bge0: mem 0xfeb00000-0xfeb0ffff irq 17 at device 8.0 on pci1
aac0: mem 0xf0000000-0xf7ffffff irq 31 at device 2.1 on pci2
fxp0: port 0xccc0-0xccff mem 0xfe900000-0xfe900fff,0xfe700000-0xfe7fffff irq 16 at device 4.0 on pci2
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
psm0: irq 12 on atkbdc0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0

This is an SMP host (a pair of Pentium
IIIs).

The em(4)-equipped host emits watchdog timeout warnings far more
frequently, but not with any discernible pattern.  However, it routinely
handles a *lot* more network traffic, and that traffic is unpredictable
and bursty in nature.  Its interfaces also appear to have their own
resources allocated:

emhost % pciconf -l
hostb0@pci0:0:0: class=0x060000 card=0x00000000 chip=0x25788086 rev=0x02 hdr=0x00
pcib1@pci0:3:0: class=0x060400 card=0x00000000 chip=0x257b8086 rev=0x02 hdr=0x01
pcib2@pci0:28:0: class=0x060400 card=0x00000050 chip=0x25ae8086 rev=0x02 hdr=0x01
uhci0@pci0:29:0: class=0x0c0300 card=0x01651028 chip=0x25a98086 rev=0x02 hdr=0x00
uhci1@pci0:29:1: class=0x0c0300 card=0x01651028 chip=0x25aa8086 rev=0x02 hdr=0x00
none0@pci0:29:4: class=0x088000 card=0x01651028 chip=0x25ab8086 rev=0x02 hdr=0x00
none1@pci0:29:5: class=0x080020 card=0x01651028 chip=0x25ac8086 rev=0x02 hdr=0x00
ehci0@pci0:29:7: class=0x0c0320 card=0x01651028 chip=0x25ad8086 rev=0x02 hdr=0x00
pcib3@pci0:30:0: class=0x060400 card=0x00000000 chip=0x244e8086 rev=0x0a hdr=0x01
isab0@pci0:31:0: class=0x060100 card=0x00000000 chip=0x25a18086 rev=0x02 hdr=0x00
atapci0@pci0:31:2: class=0x01018a card=0x01651028 chip=0x25a38086 rev=0x02 hdr=0x00
none2@pci0:31:3: class=0x0c0500 card=0x01651028 chip=0x25a48086 rev=0x02 hdr=0x00
em0@pci1:1:0: class=0x020000 card=0x01651028 chip=0x10758086 rev=0x00 hdr=0x00
em1@pci2:1:0: class=0x020000 card=0x10128086 chip=0x10108086 rev=0x01 hdr=0x00
em2@pci2:1:1: class=0x020000 card=0x10128086 chip=0x10108086 rev=0x01 hdr=0x00
em3@pci3:2:0: class=0x020000 card=0x01651028 chip=0x10768086 rev=0x00 hdr=0x00
amr0@pci3:3:0: class=0x010400 card=0x05201028 chip=0x19601000 rev=0x01 hdr=0x00
none3@pci3:14:0: class=0x030000 card=0x01651028 chip=0x47521002 rev=0x27 hdr=0x00

emhost % grep irq /var/run/dmesg.boot
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
em0: port 0xece0-0xecff mem 0xfe3e0000-0xfe3fffff irq 18 at device 1.0 on pci1
em1: port 0xdcc0-0xdcff mem 0xfe1e0000-0xfe1fffff irq 24 at device 1.0 on pci2
em2: port 0xdc80-0xdcbf mem 0xfe1c0000-0xfe1dffff irq 25 at device 1.1 on pci2
uhci0: port 0xbce0-0xbcff irq 16 at device 29.0 on pci0
uhci1: port 0xbcc0-0xbcdf irq 19 at device 29.1 on pci0
ehci0: mem 0xfe500000-0xfe5003ff irq 23 at device 29.7 on pci0
em3: port 0xccc0-0xccff mem 0xfdee0000-0xfdefffff irq 21 at device 2.0 on pci3
amr0: mem 0xfbcf0000-0xfbcfffff irq 22 at device 3.0 on pci3
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
atkbd0: irq 1 on atkbdc0
sio1: configured irq 3 not in bitmap of probed irqs 0

Note that em1 and em2 are NOT in use on this host, are not configured,
and are not physically connected to anything.  This host is a UP host;
while its CPU has HTT capabilities, they are disabled in the BIOS.

Both hosts are running somewhat customized kernels.  Notable options not
in GENERIC but in these kernels are DEVICE_POLLING (but
kern.polling.enable=0!), HZ=1000, and ZERO_COPY_SOCKETS.  Several devices
were removed, and missing devices (io, mem, isa, and npx) were added in
to counter the breakage caused by the silent inclusion of the DEFAULTS
stuff.  The UP and SMP kernel configurations are *identical* except that
the SMP one adds the ALTQ_NOPCC and SMP options.

Also, I've noticed that STP goes nuts for the bge(4) host, but doesn't
seem to notice when the em(4) host watchdog timer goes off.  However, I
don't have direct access to the network equipment, so I can't check for
differences there.

-- 
Alan Amesbury
University of Minnesota
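
For anyone who wants to repeat the "does the NIC share an interrupt?" check on a longer dmesg, here is a minimal Python sketch. It is not part of the original report. The function names (devices_by_irq, shared_irqs) are hypothetical, the sample text is a subset of the emhost lines quoted above, and the pattern only counts PCI attach lines of the form "irq N at device", which is an assumption about what should count as a shared interrupt.

```python
import re
from collections import defaultdict

# A few lines copied from the emhost "grep irq" output quoted in this message.
SAMPLE = """\
ioapic0 irqs 0-23 on motherboard
em0: port 0xece0-0xecff mem 0xfe3e0000-0xfe3fffff irq 18 at device 1.0 on pci1
em1: port 0xdcc0-0xdcff mem 0xfe1e0000-0xfe1fffff irq 24 at device 1.0 on pci2
em2: port 0xdc80-0xdcbf mem 0xfe1c0000-0xfe1dffff irq 25 at device 1.1 on pci2
uhci0: port 0xbce0-0xbcff irq 16 at device 29.0 on pci0
em3: port 0xccc0-0xccff mem 0xfdee0000-0xfdefffff irq 21 at device 2.0 on pci3
amr0: mem 0xfbcf0000-0xfbcfffff irq 22 at device 3.0 on pci3
sio1: configured irq 3 not in bitmap of probed irqs 0
"""

# Assumption: only PCI attach lines ("<dev>: ... irq N at device ...") are
# counted; ioapic, ISA and ACPI-attached lines are deliberately ignored.
PCI_IRQ = re.compile(r"^(\w+): .*\birq (\d+) at device ")


def devices_by_irq(text):
    """Map each IRQ number to the list of PCI device names attached to it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = PCI_IRQ.match(line)
        if match:
            groups[int(match.group(2))].append(match.group(1))
    return dict(groups)


def shared_irqs(text):
    """Return only the IRQs that have more than one PCI device on them."""
    return {irq: devs for irq, devs in devices_by_irq(text).items()
            if len(devs) > 1}


if __name__ == "__main__":
    shared = shared_irqs(SAMPLE)
    if shared:
        for irq, devs in sorted(shared.items()):
            print("irq %d shared by: %s" % (irq, ", ".join(devs)))
    else:
        print("no shared IRQs among PCI devices")
```

To run it against a live system, pass the contents of /var/run/dmesg.boot to shared_irqs() instead of SAMPLE. An empty result only says the boot-time assignments are distinct; it does not rule out other causes of the watchdog timeouts.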