From: Greg Eden
To: freebsd-stable@freebsd.org
Date: Mon, 11 Dec 2006 14:08:00 +0000
Subject: bge Ierr rate increase from 5.3R -> 6.1R

Hello

I recently updated two production servers from 5.3 to 6.1 via cvsup and buildworld.
Since the upgrade I've seen an increase in the number of input packet errors reported on the bge cards on both boxes. One is an HP DL360g3, the other is an HP DL380g3. Both have a pair of 2.8GHz Xeons with an SMP kernel.

So - was the old driver previously under-reporting, is the new one over-reporting or causing the errors, or is it something else? Cables and cabling have not changed and the jump in the number of errors is quite distinct. I monitor the nightly 'periodic daily' phone homes closely.

Example from the DL380g3. The 5.3->6.1 upgrade was on 8 November. 58,000 errors in 1 month compared to 2 errors in 1 year with 5.3.

Network interface status:
Name   Mtu    Network        Address            Ipkts       Ierrs  Opkts       Oerrs  Coll
bge0   1500                  00:0f:20:f6:**:**  1344182650  58292  2701993948  0      0
bge0   1500   192.168.**/25  **********         1344176611  -      2701984851  -      -
bge1*  1500                  00:0f:20:f6:**:**  0           0      0           0      0
lo0    16384                                    69549       0      69549       0      0
lo0    16384  your-net       localhost          69549       -      69549       -      -

A couple of weeks ago I turned off tx and rx checksumming on this box, as I gathered from googling that it might be contributing. That had no effect.

Upon further investigation it appears that six other boxes with bge ports (mostly HP DL360g4) started reporting errors when moved to 6.1. As they do only a small fraction of the traffic that the above box does, I hadn't noticed it.

This box (a UP HP DL360g4) is on a completely different network, with a different switch, cabling etc. Again, prior to 6.1 it had never reported an error in 18 months of service.

Network interface status:
Name   Mtu    Network        Address            Ipkts       Ierrs  Opkts       Oerrs  Coll
bge0   1500                  00:12:79:3b:**:**  781001814   1980   1056534485  0      0
bge0   1500   192.168.***    192.168.***.***    783877018   -      1061029115  -      -

I don't have a spare box with a bge interface to test 6.2 for the same behaviour, but would be interested if anyone had an explanation.

Best wishes.

Greg.
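
P.S. To put the two boxes on a common footing, the Ierrs and Ipkts counters quoted above can be expressed as errors per million input packets. The following is only a rough sketch: the host labels and function names are made up for illustration, and since the netstat counters are cumulative totals rather than counts since the upgrade, the result is an overall ratio, not a post-upgrade rate.

```python
# Counters copied from the two link-level bge0 rows quoted above.
# The host labels are hypothetical names used only for this illustration.
COUNTERS = {
    "DL380g3 bge0": {"ipkts": 1344182650, "ierrs": 58292},
    "DL360g4 bge0": {"ipkts": 781001814, "ierrs": 1980},
}


def ierr_ratio(ipkts: int, ierrs: int) -> float:
    """Return input errors as a fraction of input packets (0.0 if no packets)."""
    if ipkts <= 0:
        return 0.0
    return ierrs / ipkts


def errors_per_million(ipkts: int, ierrs: int) -> float:
    """Return input errors per million input packets."""
    return ierr_ratio(ipkts, ierrs) * 1_000_000


if __name__ == "__main__":
    for host, c in COUNTERS.items():
        rate = errors_per_million(c["ipkts"], c["ierrs"])
        print(f"{host}: {rate:.1f} Ierrs per million Ipkts")
```

Both ratios are small in absolute terms, so what stands out is the change from the near-zero counts seen under 5.3 rather than the size of the error rate itself.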