Date: Mon, 27 May 2013 23:49:31 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Daniel Braniss <danny@cs.huji.ac.il>
Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org
Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP
Message-ID: <20130528064931.GA61056@icarus.home.lan>
In-Reply-To: <E1UhDO4-000Dr7-PJ@kabab.cs.huji.ac.il>
References: <E1UgsL2-000DBa-El@kabab.cs.huji.ac.il> <20130528052953.GA1457@michelle.cdnetworks.com> <E1UhDO4-000Dr7-PJ@kabab.cs.huji.ac.il>

On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
> > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
> > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
> > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,
> > > >
> > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
> > > >
> > >
> > > bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem
> > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6
> > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > miibus2: <MII bus> on bge0
> > > brgphy0: <BCM5714 1000BASE-T media interface> PHY 1 on miibus2
> > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > bge0: Ethernet address: 00:1b:24:5d:5b:bd
> > > bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem
> > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6
> > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> > > miibus3: <MII bus> on bge1
> > > brgphy1: <BCM5714 1000BASE-T media interface> PHY 1 on miibus3
> > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > bge1: Ethernet address: 00:1b:24:5d:5b:be
> > >
> > > sf-10> ifconfig bge1
> > > bge1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > > options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
> > > ether 00:1b:24:5d:5b:be
> > > nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
> > > media: Ethernet autoselect (100baseTX <full-duplex>)
> > > status: active
> > >
> >
> > Because bge1 is not UP, I wonder how you get link UP/DOWN events.
> > Do you have some network script run by cron?
>
> no scripts.
> this port is shared with the ILO/IPMI, and back in March you fixed a problem
> that it was hanging soon after it was initialized by the driver,
> (r248226 - but I'm not sure if it was ever MFC'ed).
> Initially I thought it could be caused by connections to it from other
> hosts (either via the web, or ssh) so I killed them, but it didn't help.
> without that patch the connection fails, and I don't see any DOWN/UP.

Two things:

1. r248226 in head was MFC'd to stable/9 as r248858.  Validation:

http://svnweb.freebsd.org/base/stable/9/sys/dev/bge/if_bge.c?view=log

So the answer: whether or not you have that MFC in stable/9 depends on
what SVN rev your kernel was built from.

2. Is there some way to verify that the ASF/iLO/IPMI bits (i.e. the IPMI
firmware itself) are not shutting down bge1's PHY intentionally?  Unless
the IPMI module chooses to log something useful (e.g. "I'm doing this"),
I'm not sure how you'd figure that out.

Other question: is there any correlation between the time that elapses
between events and, say, ARP/MAC address expiry in "arp -a"?  I mention
this because I know some of the ASF methods have historically shown two
MAC addresses on the same physif, and I can see how this might confuse
some stacks.

<rant>
That "piggybacking" crap never should have been invented.  All it has
done is cause problems for every OS I know of (including Windows) since
its inception, and is also exactly why today almost all vendors I've
seen provide a dedicated NIC and RJ45 port for the iLO/IPMI interface.
It's an admission that the "piggybacking" method doesn't work.  And may
it rot in hell for all I care, while I simultaneously feel very sorry
for those who have to suffer/deal with it.

This is just another reason why I've always been very picky about what
hardware I'd buy for server deployments.  Vendors never actually
disclose this crap until you've shelled out money for the hardware, by
which point it's too late and you're suffering.  Really great model --
for the pocketbook.  :/
</rant>

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
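
Editorial sketch for point 1 above: one way to check whether a stable/9 kernel already carries the MFC is to compare the SVN revision embedded in the `uname -v` string against r248858. This is only a rough sketch under assumptions: the revision appears in that string only when the kernel was built from an SVN checkout with svnversion available, the comparison is meaningful only for kernels built from stable/9, and the sample string below is hypothetical, not taken from the thread. The function names are made up for illustration.

```python
import re

# stable/9 revision that merged r248226, as stated in the message above.
MFC_REVISION = 248858


def kernel_svn_revision(uname_v):
    """Return the SVN revision found in a `uname -v` string, or None.

    Looks for a token such as "r250000" (optionally followed by "M" for a
    locally modified tree). Kernels built without SVN metadata have none.
    """
    match = re.search(r"\br(\d+)M?\b", uname_v)
    return int(match.group(1)) if match else None


def has_bge_mfc(uname_v):
    """True/False if the revision is known, None if it cannot be determined.

    Assumes the kernel was built from stable/9; other branches need a
    different check.
    """
    rev = kernel_svn_revision(uname_v)
    if rev is None:
        return None
    return rev >= MFC_REVISION


if __name__ == "__main__":
    # Hypothetical example string; on a real host use the output of `uname -v`.
    sample = ("FreeBSD 9.1-STABLE #0 r250000: Mon May  6 10:00:00 IDT 2013"
              "     root@sf-10:/usr/obj/usr/src/sys/GENERIC")
    print(kernel_svn_revision(sample), has_bge_mfc(sample))
```

If the function returns None, the revision has to come from somewhere else, for example from the source tree the kernel was built from.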
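
Editorial sketch for the "other question" above: to see whether the DOWN/UP events recur at a fixed period (which could then be compared against ARP expiry), the spacing between successive DOWN events can be computed from the system log. This assumes syslog-style lines of the form "Mon DD HH:MM:SS host kernel: bge1: link state changed to DOWN"; the log lines below are invented sample data, not from the thread, and the year has to be supplied because that timestamp format omits it.

```python
import re
from datetime import datetime

# Assumed syslog layout: "May 28 09:00:00 host kernel: bge1: link state changed to DOWN"
LINK_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)"
)


def link_events(lines, ifname="bge1", year=2013):
    """Return [(timestamp, state), ...] for link-state lines of one interface."""
    events = []
    for line in lines:
        m = LINK_RE.match(line)
        if m and m.group("ifname") == ifname:
            ts = datetime.strptime(
                "%d %s" % (year, m.group("ts")), "%Y %b %d %H:%M:%S"
            )
            events.append((ts, m.group("state")))
    return events


def down_intervals(events):
    """Seconds between each pair of consecutive DOWN events."""
    downs = [ts for ts, state in events if state == "DOWN"]
    return [(b - a).total_seconds() for a, b in zip(downs, downs[1:])]


# Hypothetical sample data for illustration only.
SAMPLE = [
    "May 28 09:00:00 sf-10 kernel: bge1: link state changed to DOWN",
    "May 28 09:00:03 sf-10 kernel: bge1: link state changed to UP",
    "May 28 09:20:00 sf-10 kernel: bge1: link state changed to DOWN",
    "May 28 09:20:03 sf-10 kernel: bge1: link state changed to UP",
    "May 28 09:40:00 sf-10 kernel: bge1: link state changed to DOWN",
]

if __name__ == "__main__":
    print(down_intervals(link_events(SAMPLE)))
```

A near-constant interval would suggest a timer somewhere (in the stack or in the IPMI firmware) rather than a cabling or switch problem. As far as I know the ARP entry lifetime on FreeBSD is controlled by the net.link.ether.inet.max_age sysctl, so that value is the natural thing to compare the intervals against.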