From: "Michael K. Smith - Adhost"
To: "Chad Leigh -- Shire.Net LLC", "User Questions"
Date: Tue, 6 Mar 2007 16:08:12 -0800
Subject: RE: started getting repeated "bge0: PHY read timed out" messages

Hello:

> -----Original Message-----
> From: owner-freebsd-questions@freebsd.org [mailto:owner-freebsd-questions@freebsd.org] On Behalf Of Chad Leigh -- Shire.Net LLC
> Sent: Tuesday, March 06, 2007 12:05 PM
> To: User Questions
> Subject: Re: started getting repeated "bge0: PHY read timed out" messages
>
>
> On Mar 6, 2007, at 9:20 AM, Chad Leigh -- Shire.Net LLC wrote:
>
> > Hi
> >
> > After running fine for a while, my new server running 6.2-RELEASE
> > with latest security patches as of last Thursday or Friday started
> > giving the message
> >
> > bge0: PHY read timed out
> >
> > and I found the following in the system log
>
> ok, it started happening again after about 1.5 hours after the last
> reboot. bge0 started going down and up a few times over about an
> hour, and then the read timed out messages started up again. Previous
> to the very first time that this started, the server had run for
> about 4 days since it was newly installed.
>
> Mar 6 09:09:23 server su: chad to root on /dev/ttyp0
> Mar 6 10:26:29 server kernel: bge0: link state changed to DOWN
> Mar 6 10:26:31 server kernel: bge0: link state changed to UP
> Mar 6 10:42:33 server kernel: bge0: link state changed to DOWN
> Mar 6 10:42:35 server kernel: bge0: link state changed to UP
> Mar 6 11:31:19 server kernel: bge0: PHY read timed out
> Mar 6 11:31:19 server last message repeated 3 times
> Mar 6 11:31:19 server kernel: bge0: link state changed to DOWN
> Mar 6 11:31:21 server kernel: bge0: PHY read timed out
> Mar 6 11:31:52 server last message repeated 116 times
> Mar 6 11:33:53 server last message repeated 488 times
> Mar 6 11:43:54 server last message repeated 2356 times
> Mar 6 11:53:56 server last message repeated 2372 times
> Mar 6 12:03:57 server last message repeated 2368 times
> Mar 6 12:09:53 server last message repeated 1399 times
> Mar 6 12:09:53 server kernel: bge0: watchdog timeout -- resetting
> Mar 6 12:09:53 server kernel: bge0: PHY read timed out
> Mar 6 12:09:53 server last message repeated 4 times
> Mar 6 12:09:53 server kernel: bge0: RX CPU self-diagnostics failed!
> Mar 6 12:09:53 server kernel: bge0: flow-through queue init failed
> Mar 6 12:09:53 server kernel: bge0: initialization failure
> Mar 6 12:09:54 server kernel: bge0: PHY read timed out
> Mar 6 12:10:25 server last message repeated 152 times
> Mar 6 12:12:27 server last message repeated 616 times
> Mar 6 12:22:29 server last message repeated 2540 times
> Mar 6 12:32:30 server last message repeated 2452 times
> Mar 6 12:42:31 server last message repeated 2524 times
> Mar 6 12:46:27 server last message repeated 1127 times
> Mar 6 12:46:27 server login: ROOT LOGIN (root) ON ttyv0
> Mar 6 12:46:29 server kernel: bge0: PHY read timed out
> Mar 6 12:46:41 server last message repeated 107 times
> Mar 6 12:46:40 server reboot: rebooted by root
>
> here is an ifconfig
>
> bge0: flags=8843 mtu 1500
>         options=1b
>         inet 166.70.252.128 netmask 0xffffff00 broadcast 166.70.252.255
>         inet 166.70.252.120 netmask 0xffffffff broadcast 166.70.252.120
>         inet 166.70.252.199 netmask 0xffffffff broadcast 166.70.252.199
>         ether 00:e0:81:61:e9:a0
>         media: Ethernet autoselect (1000baseTX )
>         status: active
>
> and uname
>
> # uname -a
> FreeBSD server.shire.net 6.2-RELEASE-p2 FreeBSD 6.2-RELEASE-p2 #1: Sat Mar 3 13:11:00 UTC 2007 chad@server.shire.net:/usr/obj/usr/src/sys/server i386
> #
>
> It is a TYAN S2850 single opteron system with 2.4ghz single core
> opteron.
>
> Its dmesg ID is seen below in the quoted section.
>
> I had another machine with this same MB that ran for a long time fine
> until I upgraded it to 6.0 or 6.1 last Fall and then I started to
> have the same problem (a post about it is in the archives). I
> assumed it was a HW issue and turned off the port in the BIOS and
> used the other port until I took the machine offline as the customer
> using it no longer needed it.
>
> Now this machine is having the same symptoms and I remember reading
> in the lists something about PHY and bge and some driver problems a
> while back but cannot find it now in the archives.
>
> Could this be a SW problem or is it a HW issue? Could it be related
> to the port it is connected to or the cable or something? The other
> machine that had this problem was on a different switch brand.
>
> Thanks
> Chad
>
> >
> >
> > This appears to be a HW problem at first look. But when the server
> > boots, it works fine for a while (hours, days??)
> >
> > Here is the id in the boot message
> >
> > Mar 6 09:01:21 server kernel: bge0: rev. 0x3003> mem 0xfeab0000-0xfeabffff irq 16 at device 14.0 on pci1
> > Mar 6 09:01:21 server kernel: miibus0: on bge0
> > Mar 6 09:01:21 server kernel: brgphy0: PHY> on miibus0
> > Mar 6 09:01:21 server kernel: brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> > Mar 6 09:01:21 server kernel: bge0: Ethernet address: 00:e0:81:61:e9:a0
> >
> > Is this some sort of SW driver issue or is it a HW issue at first
> > glance? I remember kind of reading about some BGE issues a while
> > back.
> >
> > Thanks
> > Chad
> >
>

Have you looked at the output of 'netstat -i' to see if there are
interface errors? Also, have you looked at the switch-side interface
for errors, buffer problems, etc. (if that's possible)? Finally, have
you swapped ports/cables on the switch?

Regards,

Mike
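
When comparing the quoted log with interface counters, it may help to
know how many timeouts were actually logged. The following is only a
minimal Python sketch, assuming that each syslog line "last message
repeated N times" stands for N further copies of the message logged
just before it. The function name tally and the sample variable are
illustrative; the sample lines are copied from the quoted log.

```python
import re
from collections import Counter

# Matches syslogd's repeat-compression line, e.g. "last message repeated 116 times".
REPEAT = re.compile(r"last message repeated (\d+) times")


def tally(lines):
    """Count each logged message, expanding 'last message repeated N times'."""
    counts = Counter()
    previous = None  # text of the most recent non-repeat message
    for line in lines:
        # Split off the prefix fields: month, day, time, hostname.
        parts = line.split(None, 4)
        if len(parts) < 5:
            continue  # not a syslog-style line, skip it
        message = parts[4].strip()
        match = REPEAT.fullmatch(message)
        if match:
            # Assumption: the repeat count refers to the preceding message.
            if previous is not None:
                counts[previous] += int(match.group(1))
        else:
            previous = message
            counts[previous] += 1
    return counts


# Excerpt copied from the log quoted in this thread.
sample = """\
Mar 6 11:31:19 server kernel: bge0: PHY read timed out
Mar 6 11:31:19 server last message repeated 3 times
Mar 6 11:31:19 server kernel: bge0: link state changed to DOWN
Mar 6 11:31:21 server kernel: bge0: PHY read timed out
Mar 6 11:31:52 server last message repeated 116 times
""".splitlines()

if __name__ == "__main__":
    for message, count in tally(sample).most_common():
        print(count, message)
```

Feeding it the whole of /var/log/messages instead of the excerpt would
give totals per message, which can then be set against the error
columns that 'netstat -i' reports for bge0.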