Date: Mon, 28 Jan 2002 13:14:01 -0800
From: David Greenman
To: Brooks Davis
Cc: hackers@freebsd.org
Subject: Re: bge + hardware checksum hangs
Message-ID: <20020128131401.D64333@nexus.root.com>
In-Reply-To: <20020128124050.A13399@Odin.AC.HMC.Edu>; from brooks@one-eyed-alien.net on Mon, Jan 28, 2002 at 12:40:50PM -0800

>It looks like the TCP receive checksum issues weren't the only ones we
>had to contend with. I've got a couple of new iXsystems 2650's with
>3Com 3C996-T's in them and while running cvsup I get long hangs usually
>resulting in a lost connection. When the machines recover I see
>watchdog timeout messages in /var/log/messages. The current system
>configuration is a bit weird in that I've got the NIC hooked up to a
>10/100 hub so I'm currently running 100 half-duplex.
>
>Acting on the theory that HW checksumming had already failed in some
>situations, I modified the BGE_CSUM_FEATURES define to 0 and so far things
>seem to be working. I'm in the middle of a ports cvsup and I completed
>a cvsup over the 4.5 branch and tagging without a hitch. This seems to
>imply that at least TCP checksumming is broken across the board.
>
>The really odd thing is that I haven't had any real problems with local
>connections, only cvsups and possibly one hang due to a whole lot of
>console output over ssh. I've been able to do 10 minute long netperf
>runs in both TCP_STREAM and TCP_RR modes to local hosts without any
>hangs.
>
>Does anyone have any ideas other than the current disabling of hardware
>checksumming? That's probably fine for now, but it's really going to
>suck on the core NFS server for this cluster once we're up and running.

   I think the brokenness is chipset revision dependent. We're using
SysKonnect cards here at Download Technologies extensively and have only
seen the input checksum bug (which we worked around prior to deployment of
the servers) - no hangs, and this is with typically 30-50Mbps sustained per
server out to the Internet over a two-month period.

-DG

David Greenman
Co-founder, The FreeBSD Project - http://www.freebsd.org
President, TeraSolutions, Inc. - http://www.terasolutions.com
President, Download Technologies, Inc. - http://www.downloadtech.com

Pave the road of life with opportunities.