From owner-freebsd-current@FreeBSD.ORG Thu Jul 20 15:49:22 2006
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2121016A4E1;
	Thu, 20 Jul 2006 15:49:22 +0000 (UTC)
	(envelope-from lavalamp@spiritual-machines.org)
Received: from mail.digitalfreaks.org (arbitor.digitalfreaks.org [216.151.95.158])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A368643D46;
	Thu, 20 Jul 2006 15:49:21 +0000 (GMT)
	(envelope-from lavalamp@spiritual-machines.org)
Received: by mail.digitalfreaks.org (Postfix, from userid 1022)
	id 2C8D117CBE; Thu, 20 Jul 2006 11:49:23 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1])
	by mail.digitalfreaks.org (Postfix) with ESMTP id 13B1A17CB5;
	Thu, 20 Jul 2006 11:49:23 -0400 (EDT)
Date: Thu, 20 Jul 2006 11:49:23 -0400 (EDT)
From: "Brian A. Seklecki"
X-X-Sender: lavalamp@arbitor.digitalfreaks.org
To: Peter Ross
In-Reply-To: <43767.150.101.159.26.1140420612.squirrel@mailbox.TU-Berlin.DE>
Message-ID: <20060720104238.L8726@arbitor.digitalfreaks.org>
References: <43767.150.101.159.26.1140420612.squirrel@mailbox.TU-Berlin.DE>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Geoffrey Giesemann, freebsd-current@freebsd.org, Gleb Smirnoff,
	wmoran@collaborativefusion.com, David Barbero
Subject: Dell PowerEdge 850 bge(4) RELENG_6 (WAS: Re: bge(4) problem)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Thu, 20 Jul 2006 15:49:22 -0000

On Mon, 20 Feb 2006, Peter Ross wrote:

> I installed a IBM x336, a 1U server.
>
> it came with two bge interfaces built-in:
> (from dmesg)
> [...]
> which are working fine.
>
> But I have a problem with two dual port Broadcom cards plugged in into
> this box:
>
>
> I cannot connect them to the 1000MBit switch (a Dell Powerswitch,

I have a Dell PowerEdge 850 (Generation 3 or 4, hard to tell the way Dell
rolls things out). The NICs onboard are:

bge0: mem 0xdf9f0000-0xdf9fffff irq 16 at device 0.0 on pci5
bge1: mem 0xdf7f0000-0xdf7fffff irq 17 at device 0.0 on pci6

With the MFC pull-up of if_bge.c from 1.24 -> 1.91.2.13 (RELENG_6,
RELENG_6_1) compiled in, the link speed negotiation / interface link state
change problems you describe persist on this platform.

Unfortunately, I have a RELENG_5_3 box that I haven't tried to backport the
changes to (I had to hack PCI IDs from -rHEAD just to get 5.3 to recognize
this chipset), but it doesn't exhibit this problem.

BTW, the problem persists on Dell PowerConnect switches and Cisco Catalyst
3550-48T SMIs.

I think this is a driver problem, because, at least on my system, I'm able
to run "tcpdump -n -i bge0 -ttt -e" and see incoming packets making it into
the NIC's input buffer.
The destination MAC address is the same, and the packets have the SYN flag
set, but no TCP socket is ever opened, so it's possible the data is being
dropped in the handoff from the layer2 driver to the upper-layer tcp(4) or
ip(4) layers:

This is:

# uname -a
FreeBSD tantrum.pitbpa1.priv.collaborativefusion.com 6.1-RELEASE-p3
FreeBSD 6.1-RELEASE-p3 #0: Thu Jul 20 10:31:01 EDT 2006 i386

# ident /boot/kernel/kernel |grep -i bge
     $FreeBSD: src/sys/dev/bge/if_bge.c,v 1.91.2.13 2006/03/04 09:34:48 oleg Exp $

# ifconfig bge1
bge1: flags=8843 mtu 1500
        options=1b
        inet6 fe80::213:72ff:fe3b:d8bb%bge1 prefixlen 64 scopeid 0x2
        inet 206.210.72.75 netmask 0xffffffc0 broadcast 206.210.72.127
        ether 00:13:72:3b:d8:bb
        media: Ethernet 100baseTX
        status: active

# tcpdump -n -i bge1 -ttt -e "port 22"
listening on bge1, link-type EN10MB (Ethernet), capture size 96 bytes
000000 00:05:dd:c0:9f:00 > 00:13:72:3b:d8:bb, ethertype IPv4 (0x0800),
length 62: 206.210.89.202.55952 > 206.210.72.75.22: S
4262123753:4262123753(0) win 65535

*** But: ****

# netstat -tan|grep -i 22
tcp4       0     32  192.168.126.138.22     192.168.2.50.49175     ESTABLISHED
tcp4       0      0  *.22                   *.*                    LISTEN
tcp6       0      0  *.22                   *.*                    LISTEN

There are no errors on the interfaces:

[root@tantrum ~]# netstat -i
Name  Mtu  Network        Address            Ipkts Ierrs Opkts Oerrs Coll
bge0  1500                00:13:72:3b:d8:ba   1060     0   268     0    0
bge0  1500 fe80:1::213:7  fe80:1::213:72ff:      0     -     4     -    -
bge0  1500 192.168.126.1  tantrum-v14.corp.    469     -   449     -    -
bge1  1500                00:13:72:3b:d8:bb    175     0     0     0    0
bge1  1500 fe80:2::213:7  fe80:2::213:72ff:      0     -     4     -    -
bge1  1500 206.210.72.64  unallocated.pitbp     24     -    38     -    -

Interesting things in "netstat -s":

         5 connection accepts

This number does not increment on these syn packets.
However, the following stats do update on inbound connection attempts:

         11 syncache entries added
                 14 retransmitted
                 12 dupsyn
                 69 dropped
                 6 completed
                 0 bucket overflow
                 0 cache overflow
                 0 reset
                 4 stale
                 0 aborted
                 0 badack
                 0 unreach
                 0 zone failures

The following also increment when a TCP socket does not open as expected:

# netstat -s|egrep -i "packets for this host|total packets"
ipv4:
         2078 total packets received
         1800 packets for this host
         0 total packets received
         0 packets for this host

So TCP seems to be affected; IPv4 does not. That may explain part of the
problem.

Cisco switch interface:

as0#sh int fa0/10
FastEthernet0/10 is down, line protocol is down (notconnect)
   Hardware is Fast Ethernet, address is 0013.7fd2.b88a (bia 0013.7fd2.b88a)
   Description: Tantrum Mgmnt
   MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
      reliability 255/255, txload 1/255, rxload 1/255
   Encapsulation ARPA, loopback not set
   Keepalive set (10 sec)
   Full-duplex, 100Mb/s, media type is 10/100BaseTX
   input flow-control is off, output flow-control is unsupported
   ARP type: ARPA, ARP Timeout 04:00:00
   Last input never, output 00:00:25, output hang never
   Last clearing of "show interface" counters never
   Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
   Queueing strategy: fifo
   Output queue: 0/40 (size/max)
   5 minute input rate 0 bits/sec, 0 packets/sec
   5 minute output rate 0 bits/sec, 0 packets/sec
      250653 packets input, 35479549 bytes, 0 no buffer
      Received 36 broadcasts (0 multicast)
      0 runts, 0 giants, 0 throttles
      0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
      0 watchdog, 10 multicast, 0 pause input
      0 input packets with dribble condition detected
      1753610 packets output, 260154228 bytes, 0 underruns
      0 output errors, 0 collisions, 3 interface resets
      0 babbles, 0 late collision, 0 deferred
      0 lost carrier, 0 no carrier, 0 PAUSE output
      0 output buffer failures, 0 output buffers swapped out

----

So in conclusion:

1) Manually ifconfig down, ifconfig up'ing the interface post-boot resolves the
   problem
2) The Cisco Catalyst switch shows no errors on its interface
3) The FBSD interface counters show no errors
4) The kernel seems to see IPv4 packets coming in from the L2 driver bge(4)
5) The TCP/UDP code never seems to receive the packets
6) Disabling IPMI in the Dell onboard BMC does not resolve the issue
7) Using autonegotiation instead of forced full-duplex on both the switch
   and the system does not solve the problem. It causes the system to
   negotiate at 100 Mb/s half-duplex (which is technically impossible),
   plus Cisco TAC does not support autonegotiated configurations.
8) There are no packet filters installed or enabled
9) This is the latest RELENG_6_1 with the bge(4) patch

The solution seems to be to simply not use bge(4) hardware. Use Intel
instead.

Unfortunately, Dell seems to have their heart set on replacing OEM Intel
integrated chipsets with BGE in the new 1950 and 2950s. Not good.

~BAS

PS. Also, I just wanted to say that this kind of dry humor wasn't lost on
anyone here and much appreciated. >:}

>> Are the NICs a copper ones? Do they work properly against each other,
>> or against any other equipment? Since you are connecting them to
>> a switch that is 1000 km away, I suppose you are using some media
>> converters; what equipment is the NIC connected to?
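
For anyone trying to reproduce the counter comparison described in the
message above (watching which "netstat -s" lines move during a failed
connection attempt), here is a minimal sketch in Python. The function names
parse_counters and changed_counters are hypothetical and not part of any
FreeBSD tool. It assumes output lines of the form "<count> <description>"
grouped under "proto:" headings, and it keys on the description text, so
repeated descriptions within one protocol section, or wording that changes
between singular and plural, would need extra handling.

```python
import re


def parse_counters(text):
    """Parse 'netstat -s'-style text into {(section, description): count}.

    Assumes each counter line looks like '<count> <description>' and that
    section headings are lines ending in ':' (for example 'tcp:').
    """
    counters = {}
    section = ""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # A heading such as 'tcp:' starts a new section.
        if stripped.endswith(":") and not stripped[0].isdigit():
            section = stripped[:-1]
            continue
        match = re.match(r"(\d+)\s+(.+)", stripped)
        if match:
            counters[(section, match.group(2))] = int(match.group(1))
    return counters


def changed_counters(before, after):
    """Return {key: delta} for every counter whose value differs."""
    return {
        key: value - before.get(key, 0)
        for key, value in after.items()
        if value != before.get(key, 0)
    }
```

Save the output of "netstat -s" before and after a single inbound connection
attempt, pass both texts through parse_counters, and hand the two results to
changed_counters. Only the counters that moved are returned, which makes it
easier to see, for example, syncache entries increasing while "connection
accepts" stays flat.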