Date: Thu, 21 Jun 2007 09:07:43 -0700
From: Steve Kargl
To: freebsd-current@freebsd.org
Message-ID: <20070621160742.GA10264@troutmask.apl.washington.edu>
Subject: Which GigE NIC for reliable use?

I've been experiencing problems with the bge device for several weeks. In that time, I've tried tuning every parameter I could find.
There appear to be several related problems:

node10:kargl[203] netstat -I bge1
Name   Mtu Network     Address              Ipkts  Ierrs    Opkts Oerrs  Coll
bge1  9000             00:e0:81:40:48:93 81505160 238721 81933513     9     0
bge1  9000 192.168.0.0 node10            81504878      - 81933689     -     -

Notice that the Ierrs value grows continuously while my MPI application runs. In /var/log/messages one finds:

Jun 20 23:20:42 node10 kernel: bge1: watchdog timeout -- resetting
Jun 20 23:20:42 node10 kernel: bge1: link state changed to DOWN
Jun 20 23:20:46 node10 kernel: bge1: link state changed to UP

This DOWN/UP cycle breaks the MPI application and leads to several additional messages of the form:

Jun 20 23:22:33 node10 kernel: TCP: [10.208.78.111]:54801 to [10.208.78.111]:49376 tcpflags 0x10; syncache_expand: Segment failed SYNCOOKIE authentication, segment rejected (probably spoofed)

So, I plan to replace all of the bge devices with a reliable, robust GigE NIC. Does anyone have a suggestion for such a card?

-- 
Steve