Date: Thu, 14 Jan 2010 21:48:56 +0100
From: Floris Bos <info@je-eigen-domein.nl>
To: pyunyh@gmail.com
Cc: freebsd-net@freebsd.org
Subject: Re: kern/92090: [bge] bge: watchdog timeout -- resetting
Message-ID: <201001142148.56444.info@je-eigen-domein.nl>
In-Reply-To: <20100114201144.GA1228@michelle.cdnetworks.com>
References: <201001140140.o0E1e5hr072464@freefall.freebsd.org> <201001142108.02941.info@je-eigen-domein.nl> <20100114201144.GA1228@michelle.cdnetworks.com>

On Thursday 14 January 2010 09:11:44 pm Pyun YongHyeon wrote:
> On Thu, Jan 14, 2010 at 09:08:02PM +0100, Floris Bos wrote:
> > On Thursday 14 January 2010 06:56:03 pm Pyun YongHyeon wrote:
> > > On Thu, Jan 14, 2010 at 04:33:19AM +0100, Floris Bos wrote:
> > > > Hi,
> > > >
> > > > On Thursday 14 January 2010 03:54:52 am Pyun YongHyeon wrote:
> > > > > > ==
> > > > > > bge0: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xdf900000-0xdf90ffff irq 16 at device 0.0 on pci32
> > > > > > ==
> > > > > >
> > > > > > After boot, the network works for about 5 seconds, barely enough time to get an IP by DHCP, and send a ping or 2.
> > > > > > Then network connectivity goes down, and after some time there is a "bge0: watchdog timeout -- resetting" message.
> > > > > >
> > > > > > Then network works again for 5 seconds, and goes down again. All the time, repeatedly.
> > > > > >
> > > > > > The system works fine under Ubuntu. So I assume the hardware is ok.
> > > > > >
> > > > >
> > > > > I'm not sure but it looks like you have a BCM5784 controller. What is
> > > > > the output of "devinfo -rv | grep phy"?
> > > >
> > > > ==
> > > > ukphy0 pnpinfo oui=0x50ef model=0x3a rev=0x4 at phyno=1
> > > > ukphy1 pnpinfo oui=0x50ef model=0x3a rev=0x4 at phyno=1
> > > > ==
> > >
> > > Support for the PHY was added in r202269.
> > > Please try again after applying the change. Or you can download
> > > sys/dev/mii/miidevs and sys/dev/mii/brgphy.c from HEAD and rebuild
> > > kernel.
> >
> > Fetched the latest source using CVS on another computer, and transferred it to the system concerned by USB stick.
> > Rebuilt the kernel, but the problem is still there.
> >
> Would you show me full dmesg output including "watchdog timeout"
> messages?

===
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.0-CURRENT #0: Thu Jan 14 20:12:47 CET 2010
    root@db3.xxxxxxx.xx:/usr/obj/usr/src/sys/GENERIC amd64
WARNING: WITNESS option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz (2394.00-MHz K8-class CPU)
  Origin = "GenuineIntel" Id = 0x106e5 Stepping = 5
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x98e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT>
  AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory = 17179869184 (16384 MB)
avail memory = 16533999616 (15768 MB)
ACPI APIC Table: <HP ProLiant>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 2
 cpu2 (AP): APIC ID: 4
 cpu3 (AP): APIC ID: 6
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: <HP ProLiant> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 3.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pci0: <base peripheral> at device 8.0 (no driver attached)
pci0: <base peripheral> at device 8.1 (no driver attached)
pci0: <base peripheral> at device 8.2 (no driver attached)
pci0: <base peripheral> at device 8.3 (no driver attached)
pci0: <base peripheral> at device 16.0 (no driver attached)
pci0: <base peripheral> at device 16.1 (no driver attached)
ehci0: <Intel PCH USB 2.0 controller USB-B> mem 0xdfd02000-0xdfd023ff irq 16 at device 26.0 on pci0
ehci0: [ITHREAD]
usbus0: EHCI version 1.0
usbus0: <Intel PCH USB 2.0 controller USB-B> on ehci0
pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.0 on pci0
pci16: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.4 on pci0
pci32: <ACPI PCI bus> on pcib3
bge0: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xdf900000-0xdf90ffff irq 16 at device 0.0 on pci32
miibus0: <MII bus> on bge0
brgphy0: <BCM5784 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: f4:ce:46:0f:2a:2c
bge0: [FILTER]
pcib4: <ACPI PCI-PCI bridge> irq 16 at device 28.5 on pci0
pci34: <ACPI PCI bus> on pcib4
bge1: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xdfa00000-0xdfa0ffff irq 17 at device 0.0 on pci34
miibus1: <MII bus> on bge1
brgphy1: <BCM5784 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1: Ethernet address: f4:ce:46:0f:2a:2d
bge1: [FILTER]
pcib5: <ACPI PCI-PCI bridge> irq 18 at device 28.6 on pci0
pci36: <ACPI PCI bus> on pcib5
vgapci0: <VGA-compatible display> mem 0xde000000-0xdeffffff,0xdf800000-0xdf803fff,0xdf000000-0xdf7fffff irq 18 at device 0.0 on pci36
pcib6: <ACPI PCI-PCI bridge> irq 19 at device 28.7 on pci0
pci38: <ACPI PCI bus> on pcib6
ehci1: <Intel PCH USB 2.0 controller USB-A> mem 0xdfd02400-0xdfd027ff irq 23 at device 29.0 on pci0
ehci1: [ITHREAD]
usbus1: EHCI version 1.0
usbus1: <Intel PCH USB 2.0 controller USB-A> on ehci1
pcib7: <PCI-PCI bridge> at device 30.0 on pci0
pci48: <PCI bus> on pcib7
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel AHCI controller> port 0x1830-0x1837,0x1824-0x1827,0x1828-0x182f,0x1820-0x1823,0x1800-0x181f mem 0xdfd01000-0xdfd017ff irq 18 at device 31.2 on pci0
atapci0: [ITHREAD]
atapci0: AHCI v1.30 controller with 6 3Gbps ports, PM supported
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci0
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci0
ata5: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_button0: <Power Button> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
cpu0: <ACPI CPU> on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
cpu2: <ACPI CPU> on acpi0
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc8fff,0xdc000-0xdffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd: unable to set the command byte.
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: unable to set the command byte.
ppc0: cannot reserve I/O port range
ZFS filesystem version 3
ZFS storage pool version 14
Timecounters tick every 1.000 msec
usbus0: 480Mbps High Speed USB v2.0
usbus1: 480Mbps High Speed USB v2.0
ad4: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata2-master UDMA100 SATA 3Gb/s
ad6: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata3-master UDMA100 SATA 3Gb/s
ad8: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata4-master UDMA100 SATA 3Gb/s
ad10: 152627MB <INTEL SSDSA2M160G2GC 2CV102HD> at ata5-master UDMA100 SATA 3Gb/s
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
WARNING: WITNESS option enabled, expect reduced performance.
ugen1.1: <Intel> at usbus1
ugen0.1: <Intel> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
Root mount waiting for: usbus1 usbus0
uhub0: 2 ports with 2 removable, self powered
uhub1: 2 ports with 2 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen1.2: <vendor 0x8087> at usbus1
uhub2: <vendor 0x8087 product 0x0020, class 9/0, rev 2.00/0.00, addr 2> on usbus1
ugen0.2: <vendor 0x8087> at usbus0
uhub3: <vendor 0x8087 product 0x0020, class 9/0, rev 2.00/0.00, addr 2> on usbus0
Root mount waiting for: usbus1 usbus0
uhub3: 6 ports with 6 removable, self powered
uhub2: 8 ports with 8 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen0.3: <Logitech> at usbus0
ums0: <Logitech USB-PS/2 Optical Mouse, class 0/0, rev 2.00/27.00, addr 3> on usbus0
ums0: 8 buttons and [XYZ] coordinates ID=0
ugen1.3: <ServerEngines> at usbus1
ukbd0: <ServerEngines SE USB Device, class 0/0, rev 1.10/0.01, addr 3> on usbus1
kbd2 at ukbd0
ums1: <ServerEngines SE USB Device, class 0/0, rev 1.10/0.01, addr 3> on usbus1
ums1: 8 buttons and [XYZ] coordinates ID=0
ugen0.4: <SanDisk> at usbus0
umass0: <SanDisk Cruzer Micro, class 0/0, rev 2.00/2.00, addr 4> on usbus0
umass0: SCSI over Bulk-Only; quirks = 0x0000
Root mount waiting for: usbus0
umass0:0:0:-1: Attached to scbus0
Trying to mount root from zfs:zroot
da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
da0: <SanDisk Cruzer Micro 8.01> Removable Direct Access SCSI-0 device
da0: 40.000MB/s transfers
da0: 3839MB (7862911 512 byte sectors: 255H 63S/T 489C)
GEOM: da0: partition 1 does not end on a track boundary.
lock order reversal:
 1st 0xffffff000a372bd8 zfs (zfs) @ /usr/src/sys/kern/vfs_mount.c:1058
 2nd 0xffffff000a5bc9f8 devfs (devfs) @ /usr/src/sys/kern/vfs_subr.c:2091
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
__lockmgr_args() at __lockmgr_args+0xd10
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b
_vn_lock() at _vn_lock+0x47
vget() at vget+0x7b
devfs_allocv() at devfs_allocv+0x100
devfs_root() at devfs_root+0x48
vfs_donmount() at vfs_donmount+0xfb2
nmount() at nmount+0x63
syscall() at syscall+0x1ae
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (378, FreeBSD ELF64, nmount), rip = 0x8007afeac, rsp = 0x7fffffffdd28, rbp = 0x800a06048 ---
bge0: link state changed to UP
bge0: link state changed to DOWN
bge0: watchdog timeout -- resetting
bge0: link state changed to UP
bge0: link state changed to DOWN
bge0: watchdog timeout -- resetting
bge0: link state changed to UP
bge0: watchdog timeout -- resetting
bge0: link state changed to DOWN
bge0: link state changed to UP
===

Seconds after the link goes up the connectivity is gone, but it takes minutes before it actually shows up as "link state changed to DOWN" in dmesg.

According to the log file of the switch the server is connected to, the link goes up and down every 3 seconds or so.

==
Log Index | Message Text | Severity | Log Time | Component | Description
1700 | <14> Jan 01 09:27:45 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1701 %% Interface 9 is Link Up | Info | Jan 01 09:27:45 | NIM | Interface 9 is Link Up
1701 | <14> Jan 01 09:27:48 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1702 %% Interface 9 is Link Down | Info | Jan 01 09:27:48 | NIM | Interface 9 is Link Down
1702 | <14> Jan 01 09:27:51 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1703 %% Interface 9 is Link Up | Info | Jan 01 09:27:51 | NIM | Interface 9 is Link Up
1703 | <14> Jan 01 09:27:54 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1704 %% Interface 9 is Link Down | Info | Jan 01 09:27:54 | NIM | Interface 9 is Link Down
1704 | <14> Jan 01 09:27:57 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1705 %% Interface 9 is Link Up | Info | Jan 01 09:27:57 | NIM | Interface 9 is Link Up
1705 | <14> Jan 01 09:28:00 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1706 %% Interface 9 is Link Down | Info | Jan 01 09:28:00 | NIM | Interface 9 is Link Down
1706 | <14> Jan 01 09:28:03 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1707 %% Interface 9 is Link Up | Info | Jan 01 09:28:03 | NIM | Interface 9 is Link Up
1707 | <14> Jan 01 09:28:06 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1708 %% Interface 9 is Link Down | Info | Jan 01 09:28:06 | NIM | Interface 9 is Link Down
1708 | <14> Jan 01 09:28:09 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1709 %% Interface 9 is Link Up | Info | Jan 01 09:28:09 | NIM | Interface 9 is Link Up
1709 | <14> Jan 01 09:28:12 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1710 %% Interface 9 is Link Down | Info | Jan 01 09:28:12 | NIM | Interface 9 is Link Down
1710 | <14> Jan 01 09:28:15 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1711 %% Interface 9 is Link Up | Info | Jan 01 09:28:15 | NIM | Interface 9 is Link Up
1711 | <14> Jan 01 09:28:17 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1712 %% Interface 9 is Link Down | Info | Jan 01 09:28:17 | NIM | Interface 9 is Link Down
1712 | <14> Jan 01 09:28:20 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1713 %% Interface 9 is Link Up | Info | Jan 01 09:28:20 | NIM | Interface 9 is Link Up
1713 | <14> Jan 01 09:28:24 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1714 %% Interface 9 is Link Down | Info | Jan 01 09:28:24 | NIM | Interface 9 is Link Down
1714 | <14> Jan 01 09:28:26 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1715 %% Interface 9 is Link Up | Info | Jan 01 09:28:26 | NIM | Interface 9 is Link Up
1715 | <14> Jan 01 09:28:30 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1716 %% Interface 9 is Link Down | Info | Jan 01 09:28:30 | NIM | Interface 9 is Link Down
1716 | <14> Jan 01 09:28:32 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1717 %% Interface 9 is Link Up | Info | Jan 01 09:28:32 | NIM | Interface 9 is Link Up
1717 | <14> Jan 01 09:28:36 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1718 %% Interface 9 is Link Down | Info | Jan 01 09:28:36 | NIM | Interface 9 is Link Down
1718 | <14> Jan 01 09:28:39 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1719 %% Interface 9 is Link Up | Info | Jan 01 09:28:39 | NIM | Interface 9 is Link Up
1719 | <14> Jan 01 09:28:42 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1720 %% Interface 9 is Link Down | Info | Jan 01 09:28:42 | NIM | Interface 9 is Link Down
1720 | <14> Jan 01 09:28:45 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1721 %% Interface 9 is Link Up | Info | Jan 01 09:28:45 | NIM | Interface 9 is Link Up
1721 | <14> Jan 01 09:28:48 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1722 %% Interface 9 is Link Down | Info | Jan 01 09:28:48 | NIM | Interface 9 is Link Down
1722 | <14> Jan 01 09:28:51 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(619) 1723 %% Interface 9 is Link Up | Info | Jan 01 09:28:51 | NIM | Interface 9 is Link Up
1723 | <14> Jan 01 09:28:54 197 192.168.2.10-1 NIM[-2137017720]: nim_events.c(665) 1724 %% Interface 9 is Link Down | Info | Jan 01 09:28:54 | NIM | Interface 9 is Link Down
==

Yours sincerely,

Floris Bos