From: Gregory Wright
Date: Mon, 19 Nov 2007 13:36:37 -0500
To: Andre Oppermann
Cc: freebsd-current@freebsd.org
Subject: Re: excessive TCP dulplicate acks revisted
In-Reply-To: <473CDDAC.9020503@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current

> Gregory Wright wrote:
>> On Nov 11, 2007, at 5:23 PM, Andre Oppermann wrote:
>>> Gregory Wright wrote:
>>>> On Nov 10, 2007, at 10:28 AM, Andre Oppermann wrote:
>>>>>
>>>> Hi Andre,
>>>> I also took a look at the bge (4) driver in 7.0-BETA2. As far
>>>> as I can tell,
>>>> it does not support TSO (there is no ioctl supporting TSO
>>>> enable/disable
>>>> as there is for the em(4) driver).
>>>
>>>> Might the chip --- a BCM5704_B0 --- not be completely
>>>> initialized? This
>>>> might explain why the machine with the BCM5714_B3 chips works,
>>>> while
>>>> the other machine shows the duplicate ACK bug.
>>>
>>> Perhaps. Do you see the duplicate ACKs in a tcpdump on both the
>>> sender
>>> and the receiver? If you see it on the sender too, then it must
>>> be a
>>> bug in our network stack or the driver (by requeuing the same packet
>>> over and over again).
>>>
>>> --Andre
>> The logs show that the duplicate ACKs are generated only by the
>> receiver. I suspect a bug in the driver, perhaps the ACK packet
>> is not being removed from the TX buffer ring. Examining the
>> transmitted
>> packets should be enough to rule out a network stack problem. Is
>> there any debugging infrastructure I can use or do I just have to
>> hack in on my own?
>
> We don't have an infrastructure to deal with this kind of driver
> problems. You have to instrument the driver code to report stuck
> mbufs.
>

Hi Andre,

I have some additional information that indicates this is a driver bug.
There was a report to one of the Gentoo linux mailing lists of the same
problem with BCM5704s, in which everything worked at 1 Gb/s, but
duplicate ACKs were seen at 100 Mb/s.

Link to the message:

http://forums.gentoo.org/viewtopic-t-530707-highlight-bcm5704.html

The report said that the problem was solved by upgrading the linux
kernel from 2.6.17 to 2.6.18.
I've compared the tg3 drivers in the two releases, and there were quite
a few changes, so it will take a while to track down what the key fix
was. So the bug in the bge driver for these chips can likely be fixed.

Thanks for your help.

Best Wishes,
Greg