Date: Sun, 12 Sep 2004 10:57:45 -0400 (EDT)
From: Robert Watson
To: Andre Guibert de Bruet
cc: Kris Kennaway
cc: current@FreeBSD.ORG
In-Reply-To: <20040912025037.Y84468@alpha.siliconlandmark.com>
Subject: Re: 6-CURRENT Network stack issues w/SMP? (Was: Re: TreeListfailed: Network write failure: ChannelMux.ProtocolError)

On Sun, 12 Sep 2004, Andre Guibert de Bruet wrote:

> On Sun, 12 Sep 2004, Kris Kennaway wrote:
>
> > On Sun, Sep 12, 2004 at 02:42:03AM -0400, Andre Guibert de Bruet wrote:
> >
> >>> I've also noticed data corruption in the form of failed CRCs (and hence
> >>> dropped SSH connections) while transferring large amounts of data via SSH
> >>> over gige to a machine on its subnet. These problems started occurring
> >>> after the giant-less networking megacommit. Older kernels check out
> >>> without any such issues.
> >
> > Does it go away if you turn off debug.mpsafenet? If not, it's
> > probably not related to that commit.
>
> Setting debug.mpsafenet to 0 allows the SSH transfers to complete. The
> MD5 checksums and sizes match. Where do we go from here?

I think I'd look at the following next:

- Does your network interface driver support checksum offload? If so,
  what happens if you disable that?

- Is the network interface driver marked as INTR_MPSAFE and/or not
  IFF_NEEDSGIANT? If either, try setting the driver to run with Giant by
  removing INTR_MPSAFE and adding IFF_NEEDSGIANT.

After that I think we want to try and produce a non-SSH reproduction
scenario using a very simple test program...

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research
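
For the non-SSH reproduction scenario, a very simple test program might
look something like the sketch below. It is only an illustration, not
something from this thread: it assumes Python 3.8 or later on both
machines, and the port number, block size, and all function names are
made up for the example. The sender streams deterministic blocks over a
plain TCP connection; the receiver regenerates the same blocks locally
and reports the index of any block that arrives altered.

```python
import hashlib
import socket
import sys
import threading

CHUNK = 64 * 1024   # bytes per block; arbitrary choice for this sketch
PORT = 5001         # hypothetical port; use any free one


def make_chunk(index):
    """Build a deterministic block that both ends can regenerate from its index."""
    seed = hashlib.md5(index.to_bytes(8, "big")).digest()  # 16 bytes
    return seed * (CHUNK // len(seed))


def send_stream(host, port, chunks):
    """Connect to the receiver, announce the block count, then send the blocks."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(chunks.to_bytes(8, "big"))
        for i in range(chunks):
            sock.sendall(make_chunk(i))


def recv_exact(conn, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        part = conn.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed the connection early")
        buf += part
    return bytes(buf)


def receive_stream(listener):
    """Accept one connection and return the indices of blocks that differ."""
    conn, _ = listener.accept()
    bad = []
    with conn:
        chunks = int.from_bytes(recv_exact(conn, 8), "big")
        for i in range(chunks):
            if recv_exact(conn, CHUNK) != make_chunk(i):
                bad.append(i)
    return bad


def loopback_selftest(chunks=4):
    """Sanity-check the program itself by running both ends over 127.0.0.1."""
    result = []
    with socket.create_server(("127.0.0.1", 0)) as srv:  # port 0: OS picks one
        port = srv.getsockname()[1]
        worker = threading.Thread(
            target=lambda: result.extend(receive_stream(srv)))
        worker.start()
        send_stream("127.0.0.1", port, chunks)
        worker.join()
    return result


if __name__ == "__main__" and len(sys.argv) >= 2:
    if sys.argv[1] == "recv":
        with socket.create_server(("", PORT)) as srv:
            print("corrupted blocks:", receive_stream(srv))
    elif sys.argv[1] == "send" and len(sys.argv) == 4:
        send_stream(sys.argv[2], PORT, int(sys.argv[3]))
```

Run it with the single argument "recv" on one machine, then with
"send <host> <blocks>" on the other. Comparing runs with debug.mpsafenet
on and off, and with checksum offload on and off, should show whether
the corruption depends on SSH at all.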