From owner-freebsd-current@FreeBSD.ORG  Thu Jul 20 15:49:22 2006
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
X-Original-To: freebsd-current@freebsd.org
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2121016A4E1;
	Thu, 20 Jul 2006 15:49:22 +0000 (UTC)
	(envelope-from lavalamp@spiritual-machines.org)
Received: from mail.digitalfreaks.org (arbitor.digitalfreaks.org
	[216.151.95.158])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A368643D46;
	Thu, 20 Jul 2006 15:49:21 +0000 (GMT)
	(envelope-from lavalamp@spiritual-machines.org)
Received: by mail.digitalfreaks.org (Postfix, from userid 1022)
	id 2C8D117CBE; Thu, 20 Jul 2006 11:49:23 -0400 (EDT)
Received: from localhost (localhost [127.0.0.1])
	by mail.digitalfreaks.org (Postfix) with ESMTP id 13B1A17CB5;
	Thu, 20 Jul 2006 11:49:23 -0400 (EDT)
Date: Thu, 20 Jul 2006 11:49:23 -0400 (EDT)
From: "Brian A. Seklecki" <lavalamp@spiritual-machines.org>
X-X-Sender: lavalamp@arbitor.digitalfreaks.org
To: Peter Ross <Peter.Ross@alumni.tu-berlin.de>
In-Reply-To: <43767.150.101.159.26.1140420612.squirrel@mailbox.TU-Berlin.DE>
Message-ID: <20060720104238.L8726@arbitor.digitalfreaks.org>
References: <43767.150.101.159.26.1140420612.squirrel@mailbox.TU-Berlin.DE>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Geoffrey Giesemann <geoffwa@idkfa.ath.cx>, freebsd-current@freebsd.org,
	Gleb Smirnoff <glebius@FreeBSD.org>, wmoran@collaborativefusion.com,
	David Barbero <sico@loquefaltaba.com>
Subject: Dell PowerEdge 850 bge(4) RELENG_6 (WAS: Re: bge(4) problem)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 20 Jul 2006 15:49:22 -0000

On Mon, 20 Feb 2006, Peter Ross wrote:

> I installed a IBM x336, a 1U server.
>
> it came with two bge interfaces built-in:
> (from dmesg) <Broadcom BCM5721 Gigabit Ethernet, ASIC rev. 0x4101>
>

[...]

> which are working fine.
>
> But I have a problem with two dual port Broadcom cards plugged in into
> this box:
> <Broadcom BCM5704C Dual Gigabit Ethernet, ASIC rev. 0x2100>
>
> I cannot connect them to the 1000MBit switch (a Dell Powerswitch,

I have a Dell PowerEdge 850 (Generation 3 or 4, hard to tell the way Dell 
rolls things out).  The NICs onboard are:

bge0: <Broadcom BCM5721 Gigabit Ethernet, ASIC rev. 0x4101> mem 
0xdf9f0000-0xdf9fffff irq 16 at device 0.0 on pci5
bge1: <Broadcom BCM5721 Gigabit Ethernet, ASIC rev. 0x4101> mem 
0xdf7f0000-0xdf7fffff irq 17 at device 0.0 on pci6

With the MFC of if_bge.c from 1.24 -> 1.91.2.13 (RELENG_6, 
RELENG_6_1) compiled in, the link speed negotiation / interface link state 
change problems you describe persist on this platform.

Unfortunately, I haven't tried to backport the changes to my RELENG_5_3 
box (I had to hack in PCI IDs from HEAD just to get 5.3 to recognize 
this chipset), but that box doesn't exhibit this problem.

BTW, the problem persists on Dell PowerConnect switches and Cisco 
Catalyst 3550-48T SMIs.

I think this is a driver problem, because, at least on my system, I'm able 
to run "tcpdump -n -i bge0 -ttt -e" and see incoming packets making it 
into the NIC's input buffer.  The destination MAC address matches the 
interface, and the packets have the SYN flag set, but no TCP socket is 
ever opened, so it's possible the data is being dropped in the handoff 
from the layer 2 driver to the upper tcp(4) or ip(4) layers.

This is:

# uname -a
FreeBSD tantrum.pitbpa1.priv.collaborativefusion.com 6.1-RELEASE-p3 FreeBSD 6.1-RELEASE-p3 #0: Thu Jul 20 10:31:01 EDT 2006 i386

# ident /boot/kernel/kernel |grep -i bge
      $FreeBSD: src/sys/dev/bge/if_bge.c,v 1.91.2.13 2006/03/04 09:34:48 oleg Exp $

# ifconfig bge1
bge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
         options=1b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING>
         inet6 fe80::213:72ff:fe3b:d8bb%bge1 prefixlen 64 scopeid 0x2
         inet 206.210.72.75 netmask 0xffffffc0 broadcast 206.210.72.127
         ether 00:13:72:3b:d8:bb
         media: Ethernet 100baseTX <full-duplex>
         status: active

# tcpdump -n -i bge1 -ttt -e "port 22"
listening on bge1, link-type EN10MB (Ethernet), capture size 96 bytes

000000 00:05:dd:c0:9f:00 > 00:13:72:3b:d8:bb, ethertype IPv4 (0x0800), length 62: 206.210.89.202.55952 > 206.210.72.75.22: S 4262123753:4262123753(0) win 65535 <mss 1460,sackOK,eol>

*** But: ****

# netstat -tan|grep -i 22
tcp4       0     32  192.168.126.138.22     192.168.2.50.49175     ESTABLISHED
tcp4       0      0  *.22                   *.*                    LISTEN
tcp6       0      0  *.22                   *.*                    LISTEN
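
The comparison above (a SYN visible in tcpdump, but no matching socket in netstat) can be automated. The following is only a rough sketch, not a tested diagnostic tool: the function names are made up for illustration, and it assumes the tcpdump and netstat text is passed in unwrapped, one record per line, in the formats shown in this message.

```python
import re


def syn_destination(tcpdump_line):
    """Return the 'ip.port' a SYN was sent to, or None if the line is not a SYN.

    Assumes the older tcpdump output style shown above, where a bare 'S'
    after the destination marks a SYN segment.
    """
    m = re.search(r">\s+(\d+\.\d+\.\d+\.\d+\.\d+):\s+S\b", tcpdump_line)
    return m.group(1) if m else None


def has_connection(netstat_text, local_addr):
    """True if 'netstat -an' style text shows a non-LISTEN TCP socket on local_addr."""
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local foreign state
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            if fields[3] == local_addr and fields[5] != "LISTEN":
                return True
    return False
```

With the capture and netstat output above, syn_destination() yields the address the SYN was aimed at, and has_connection() reports whether any socket ever progressed past LISTEN for that address.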


There are no errors on the interfaces:

[root@tantrum ~]# netstat -i
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
bge0   1500 <Link#1>      00:13:72:3b:d8:ba     1060     0      268     0     0
bge0   1500 fe80:1::213:7 fe80:1::213:72ff:        0     -        4     -     -
bge0   1500 192.168.126.1 tantrum-v14.corp.      469     -      449     -     -
bge1   1500 <Link#2>      00:13:72:3b:d8:bb      175     0        0     0     0
bge1   1500 fe80:2::213:7 fe80:2::213:72ff:        0     -        4     -     -
bge1   1500 206.210.72.64 unallocated.pitbp       24     -       38     -     -

Interesting things in "netstat -s":

         5 connection accepts

This number does not increment when these SYN packets arrive.

However, the following stats do update on inbound connection attempts:

         11 syncache entries added
                 14 retransmitted
                 12 dupsyn
                 69 dropped
                 6 completed
                 0 bucket overflow
                 0 cache overflow
                 0 reset
                 4 stale
                 0 aborted
                 0 badack
                 0 unreach
                 0 zone failures


The following also increment when a TCP socket does not open as expected:

# netstat -s|egrep -i "packets for this host|total packets"
ipv4:

         2078 total packets received
         1800 packets for this host
         0 total packets received
         0 packets for this host

So TCP seems to be affected; IPv4 does not.  That may explain part of the 
problem.


Cisco switch interface:

as0#sh int fa0/10
FastEthernet0/10 is down, line protocol is down (notconnect)
   Hardware is Fast Ethernet, address is 0013.7fd2.b88a (bia 0013.7fd2.b88a)
   Description: Tantrum Mgmnt
   MTU 1500 bytes, BW 100000 Kbit, DLY 100 usec,
      reliability 255/255, txload 1/255, rxload 1/255
   Encapsulation ARPA, loopback not set
   Keepalive set (10 sec)
   Full-duplex, 100Mb/s, media type is 10/100BaseTX
   input flow-control is off, output flow-control is unsupported
   ARP type: ARPA, ARP Timeout 04:00:00
   Last input never, output 00:00:25, output hang never
   Last clearing of "show interface" counters never
   Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
   Queueing strategy: fifo
   Output queue: 0/40 (size/max)
   5 minute input rate 0 bits/sec, 0 packets/sec
   5 minute output rate 0 bits/sec, 0 packets/sec
      250653 packets input, 35479549 bytes, 0 no buffer
      Received 36 broadcasts (0 multicast)
      0 runts, 0 giants, 0 throttles
      0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
      0 watchdog, 10 multicast, 0 pause input
      0 input packets with dribble condition detected
      1753610 packets output, 260154228 bytes, 0 underruns
      0 output errors, 0 collisions, 3 interface resets
      0 babbles, 0 late collision, 0 deferred
      0 lost carrier, 0 no carrier, 0 PAUSE output
      0 output buffer failures, 0 output buffers swapped out


----

So in conclusion:

1) Manually running ifconfig down, then ifconfig up, on the interface
    post-boot resolves the problem
2) The Cisco Catalyst switch shows no errors on its interface
3) The FreeBSD interface counters show no errors
4) The kernel seems to see IPv4 packets coming in from the L2 driver
    bge(4)
5) The TCP/UDP code never seems to receive the packets
6) Disabling IPMI in the Dell onboard BMC does not resolve the issue
7) Autonegotiation instead of forced full-duplex on both the switch and
    the system does not solve the problem.  It causes the system to
    negotiate at 100Mb/s half-duplex (which is technically impossible),
    plus Cisco TAC does not support autonegotiated configurations.
8) There are no packet filters installed or enabled
9) This is the latest RELENG_6_1 with the bge(4) patch
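
As a stopgap for item 1, the down/up cycle could be scripted to run after boot. This is a hedged sketch only: the function names, the dry-run default, and the two-second pause between the commands are my own assumptions, and I have not verified that any particular delay is needed. The only commands it issues are "ifconfig <if> down" and "ifconfig <if> up".

```python
import subprocess
import time


def bounce_commands(ifname):
    """Build the two ifconfig invocations that take an interface down and up."""
    return [["ifconfig", ifname, "down"], ["ifconfig", ifname, "up"]]


def bounce(ifname, pause=2.0, dry_run=True):
    """Cycle an interface; with dry_run=True only print what would be executed.

    The pause length is an arbitrary assumption, not a measured requirement.
    """
    commands = bounce_commands(ifname)
    for index, command in enumerate(commands):
        if dry_run:
            print(" ".join(command))
            continue
        subprocess.run(command, check=True)  # needs root on a real system
        if index == 0:
            time.sleep(pause)
    return commands
```

Calling bounce("bge1", dry_run=False) as root would perform the cycle; the default dry run just prints the commands so the sequence can be checked first.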

The solution seems to be to simply not use bge(4) hardware.  Use Intel 
instead.  Unfortunately, Dell seems to have its heart set on replacing 
OEM Intel integrated chipsets with bge(4) hardware in the new 1950s and 
2950s.

Not good.

~BAS

PS. Also, I just wanted to say that this kind of dry humor wasn't lost on 
anyone here and was much appreciated. >:}

>> Are the NICs a copper ones? Do they work properly against each other,
>> or against any other equipment? Since you are connecting them to
>> a switch that is 1000 km away, I suppose you are using some media
>> converters; what equipment is the NIC connected to?