From: Vinny Abello
Organization: Tellurian Networks
Date: Tue, 29 May 2007 14:41:31 -0400
To: stable@freebsd.org
Subject: Packet Loss w/bge & BCM5703 on Dell PE2650
Message-ID: <465C73DB.3040500@tellurian.com>
List-Id: Production branch of FreeBSD source code

Hello all,

I've isolated a problem that appears to be a bug causing packet loss with FreeBSD 6.0 and later on Dell PowerEdge 2650 servers with the integrated Broadcom BCM5703 NICs. At first I thought it was the server itself, but I did a clean install of FreeBSD 6.2-RELEASE on another 2650 and see the same issue. The issue did not exist in FreeBSD 5.3 or 4.11; it was introduced going from FreeBSD 5.3 to 6.0.

I also installed FreeBSD 6.2-RELEASE on a Dell PowerEdge 2950 and don't see any loss at all. This also uses the bge driver but it has a newer chipset, specifically the BCM5708 vs the problem I'm having with the BCM5703 on the 2650.

For my tests I am running extended pings from a Cisco router on the same subnet. I have tuned net.inet.icmp.icmplim to -1 to disable ICMP rate limiting. The packet loss doesn't appear to follow any particular pattern and is generally low, but still there. Below is my dmesg output of one of the 2650's in question followed by an example of the loss I am seeing:

Copyright (c) 1992-2007 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-STABLE #0: Sat Jan 27 00:37:31 EST 2007
root@engbox.tellurian.net:/usr/obj/usr/src/sys/ENGBOX
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2781.54-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
Features=0xbfebfbff
Features2=0x4400
real memory = 4026400768 (3839 MB)
avail memory = 3942182912 (3759 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 6
ioapic0: Changing APIC ID to 8
ioapic1: Changing APIC ID to 9
ioapic2: Changing APIC ID to 10
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 irqs 0-15 on motherboard
ioapic1 irqs 16-31 on motherboard
ioapic2 irqs 32-47 on motherboard
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
cpu1: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 4.0 (no driver attached)
pci0: at device 4.1 (no driver attached)
pci0: at device 4.2 (no driver attached)
pci0: at device 14.0 (no driver attached)
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x8b0-0x8bf at device 15.1 on pci0
ata0: on atapci0
ata1: on atapci0
ohci0: mem 0xfe100000-0xfe100fff irq 5 at device 15.2 on pci0
ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: on ohci0
usb0: USB revision 1.0
uhub0: (0x1166) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
isab0: at device 15.3 on pci0
isa0: on isab0
pcib1: on acpi0
pci4: on pcib1
pcib2: at device 8.0 on pci4
pci5: on pcib2
aac0: mem 0xf0000000-0xf7ffffff irq 30 at device 8.1 on pci4
aac0: [FAST]
aac0: Adaptec Raid Controller 2.0.0-1
pcib3: on acpi0
pci3: on pcib3
bge0: mem 0xfcf10000-0xfcf1ffff irq 28 at device 6.0 on pci3
miibus0: on bge0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:0d:56:ba:73:bf
bge1: mem 0xfcf00000-0xfcf0ffff irq 29 at device 8.0 on pci3
miibus1: on bge1
brgphy1: on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:0d:56:ba:73:c1
pcib4: on acpi0
pci2: on pcib4
pcib5: on acpi0
pci1: on pcib5
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
pmtimer0 on isa0
orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xcbfff,0xec000-0xeffff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDROM at ata0-master UDMA33
aacd0: on aac0
aacd0: 34712MB (71091456 sectors)
SMP: AP CPU #1 Launched!
Trying to mount root from ufs:/dev/aacd0s1a
bge0: link state changed to UP

Router#ping 216.182.1.13 repeat 1000

Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to 216.182.1.13, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!.!!!.!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!
Success rate is 99 percent (997/1000), round-trip min/avg/max = 1/1/8 ms

--
Vinny Abello
Network Engineer
vinny@tellurian.com
(973)940-6100
PGP Key Fingerprint: 3BC5 9A48 FC78 03D3 82E0 E935 5325 FBCB 0100 977A

Tellurian Networks - The Ultimate Internet Connection
http://www.tellurian.com (888)TELLURIAN

"Courage is resistance to fear, mastery of fear - not absence of fear" -- Mark Twain
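
P.S. For anyone repeating the extended-ping test described above, here is a minimal Python sketch for tallying the router's output. It is not part of the original test procedure, and the function name `tally_ping` is made up for illustration. It assumes Cisco-style output where `!` marks a reply and `.` marks a timeout, and it ignores any other result codes. Only lines made up entirely of those two characters are counted, so the whole ping transcript can be pasted in.

```python
def tally_ping(output: str) -> dict:
    """Tally replies ('!') and timeouts ('.') in Cisco-style ping output.

    Hypothetical helper for illustration only. Lines containing anything
    other than '!' and '.' (banner, summary line) are skipped.
    """
    marks = []
    for line in output.splitlines():
        line = line.strip()
        # Keep only the result rows, e.g. "!!.!!!.!!!"
        if line and set(line) <= {"!", "."}:
            marks.extend(line)

    sent = len(marks)
    # 1-based sequence numbers of the probes that timed out
    lost_at = [i + 1 for i, c in enumerate(marks) if c == "."]
    return {
        "sent": sent,
        "received": sent - len(lost_at),
        "lost_at": lost_at,
        "loss_pct": 100.0 * len(lost_at) / sent if sent else 0.0,
    }


if __name__ == "__main__":
    # Paste the router output between the triple quotes.
    sample = """
    Sending 10, 100-byte ICMP Echos to 216.182.1.13, timeout is 2 seconds:
    !!.!!!.!!!
    """
    print(tally_ping(sample))
```

Knowing which sequence numbers were dropped, rather than only the overall success rate, makes it easier to see whether the loss clusters in bursts or is spread evenly across a run.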