Date: Wed, 28 Feb 2007 01:20:00 -0800
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: Dimuthu Parussalla <dparussalla@baysidegrp.com.au>
Cc: freebsd-stable@freebsd.org, 'Glen Van Lehn' <gvanlehn@ccsf.edu>
Subject: Re: Intermittent network issues with Freebsd 6.2
Message-ID: <20070228092000.GA51292@icarus.home.lan>
In-Reply-To: <000b01c750bd$5a2d55b0$d801a8c0@dimuthu>
References: <20070215043533.GA3293@icarus.home.lan> <000b01c750bd$5a2d55b0$d801a8c0@dimuthu>

On Thu, Feb 15, 2007 at 03:54:18PM +1100, Dimuthu Parussalla wrote:
> Hi,
>
> Dmesg output related to bge as follows.
>
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:11:25:e9:7f:58
> bge0: [GIANT-LOCKED]
> pcib6: <ACPI PCI-PCI bridge> at device 5.0 on pci0
> pci8: <ACPI PCI bus> on pcib6
> bge1: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xc6ff0000-0xc6ffffff irq
> 16 at device 0.0 on pci8
> miibus1: <MII bus> on bge1
> brgphy1: <BCM5750 10/100/1000baseTX PHY> on miibus1
> brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
> 1000baseTX-FDX, auto
> bge1: Ethernet address: 00:11:25:e9:7f:59
> bge1: [GIANT-LOCKED]

Interestingly enough, this problem just started haunting us too (out of
nowhere), on one of our Supermicro systems.  There haven't been any
changes to the network in literally months (no one's been to the
datacenter since December).

Here are our details:

* Upstream switch is an HP ProCurve 2626.  All ports used are 100mbit,
  with auto-select enabled (speed/duplex neg)
* Speed/duplex negotiation is being done correctly.  We have no
  throughput problems (either direction) or otherwise
* netstat -i -n shows no errors, except for two output errors, which
  are probably due to the interface being brought down and back up
  rudely (see below)
* Switch shows no errors on either interface
* Cabling is good (CAT6 no less)
* Uniprocessor system; kernel not built with SMP

What we see:

Feb 17 11:22:00 eos kernel: bge0: watchdog timeout -- resetting
Feb 17 11:22:00 eos kernel: bge0: link state changed to DOWN
Feb 17 11:22:01 eos kernel: bge0: link state changed to UP
Feb 24 11:20:56 eos kernel: bge0: watchdog timeout -- resetting
Feb 24 11:20:56 eos kernel: bge0: link state changed to DOWN
Feb 24 11:20:58 eos kernel: bge0: link state changed to UP

These timestamps are awfully suspicious; exactly 7 days apart, almost to
the hour?  And no, we have no cronjobs or anything else that runs at
that time (this box is hardly used for anything).

Applicable system information:

(I'm including ichsmb/smbus because it shares an IRQ with bge1; nothing
shares an IRQ with bge0)

bge0: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xd0100000-0xd010ffff irq 18 at device 0.0 on pci4
miibus0: <MII bus> on bge0
brgphy0: <BCM5750 10/100/1000baseTX PHY> on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:30:48:81:fc:8a
pcib5: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
pci5: <ACPI PCI bus> on pcib5
bge1: <Broadcom BCM5750 B1, ASIC rev. 0x4101> mem 0xd0200000-0xd020ffff irq 19 at device 0.0 on pci5
miibus1: <MII bus> on bge1
brgphy1: <BCM5750 10/100/1000baseTX PHY> on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:30:48:81:fc:8b
ichsmb0: <SMBus controller> port 0x500-0x51f irq 19 at device 31.3 on pci0
ichsmb0: [GIANT-LOCKED]
smbus0: <System Management Bus> on ichsmb0
smb0: <SMBus generic I/O> on smbus0

Odd that pciconf -lv shows this as a BCM5750 A1 while the kernel shows
this as a BCM5750 B1.  Is this indicative of anything?

bge0@pci4:0:0:  class=0x020000 card=0x02c615d9 chip=0x165914e4 rev=0x11 hdr=0x00
    vendor   = 'Broadcom Corporation'
    device   = 'BCM5750A1 NetXtreme Gigabit Ethernet PCI Express'
    class    = network
    subclass = ethernet
bge1@pci5:0:0:  class=0x020000 card=0x02c615d9 chip=0x165914e4 rev=0x11 hdr=0x00
    vendor   = 'Broadcom Corporation'
    device   = 'BCM5750A1 NetXtreme Gigabit Ethernet PCI Express'
    class    = network
    subclass = ethernet

[jdc@eos ~]$ vmstat -i
interrupt                  total       rate
irq4: sio0                     6          0
irq6: fdc0                    14          0
irq14: ata0               520782          0
irq15: ata1                   58          0
irq18: bge0             21839717         11
irq19: bge1+               32914          0
cpu0: timer           3638265059       1968
Total                 3660658550       1981

[jdc@eos ~]$ netstat -in
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
bge0   1500 <Link#1>      00:30:48:81:fc:8a 13841423     0 10349370     2     0
bge0   1500 72.20.106/25  72.20.106.2        3590195     - 10348720     -     -
bge0   1500 72.20.106.3/3 72.20.106.3        2075045     -        0     -     -
bge0   1500 72.20.106.4/3 72.20.106.4        2003973     -        0     -     -
bge0   1500 72.20.106.5/3 72.20.106.5        2328549     -        0     -     -
bge0   1500 72.20.106.6/3 72.20.106.6        2006174     -        0     -     -
bge1   1500 <Link#2>      00:30:48:81:fc:8b     3888     0    29600     0     0
bge1   1500 10            10.72.0.1             2605     -     2605     -     -
lo0   16384 <Link#3>                             641     0      641     0     0
lo0   16384 127           127.0.0.1              641     -      641     -     -
bridg  1500 <Link#4>      86:ec:97:73:50:03    26993     0    30885     0     0
tap0   1500 <Link#5>      00:bd:ed:13:00:00    25712     0     1286     0     0

If a developer wants access to this box, I can provide it.  No serial
console at this time (soon, soon...), but can provide root.

-- 
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |

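
For anyone who wants to check the spacing of the watchdog resets quoted in the message, here is a minimal Python sketch that pulls the "watchdog timeout" lines out of the log excerpt and measures the gap between them. The function names are hypothetical, and the year 2007 is an assumption taken from the Date header, since syslog timestamps carry no year. It also assumes an English/C locale so that `%b` matches "Feb". For the two resets above, the gap works out to 64 seconds short of exactly seven days.

```python
from datetime import datetime, timedelta

# Assumption: syslog lines omit the year; 2007 comes from the Date header.
YEAR = 2007

# Log excerpt copied from the message.
LOG_LINES = [
    "Feb 17 11:22:00 eos kernel: bge0: watchdog timeout -- resetting",
    "Feb 17 11:22:00 eos kernel: bge0: link state changed to DOWN",
    "Feb 17 11:22:01 eos kernel: bge0: link state changed to UP",
    "Feb 24 11:20:56 eos kernel: bge0: watchdog timeout -- resetting",
    "Feb 24 11:20:56 eos kernel: bge0: link state changed to DOWN",
    "Feb 24 11:20:58 eos kernel: bge0: link state changed to UP",
]


def parse_syslog_time(line, year=YEAR):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def watchdog_intervals(lines):
    """Return the time gaps between consecutive watchdog timeout lines."""
    times = [parse_syslog_time(l) for l in lines if "watchdog timeout" in l]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def seconds_off_weekly(gap):
    """Signed difference, in seconds, between a gap and exactly 7 days."""
    return (gap - timedelta(days=7)).total_seconds()


if __name__ == "__main__":
    for gap in watchdog_intervals(LOG_LINES):
        print("gap:", gap, "| seconds off 7 days:", seconds_off_weekly(gap))
```

Feeding it a longer slice of /var/log/messages would show whether later resets keep to the same weekly spacing or whether the two events above are a coincidence.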
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070228092000.GA51292>