From owner-freebsd-current@FreeBSD.ORG Sun Oct 17 21:27:30 2004
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id B3C4016A4CE;
    Sun, 17 Oct 2004 21:27:30 +0000 (GMT)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 65FC043D2F;
    Sun, 17 Oct 2004 21:27:30 +0000 (GMT)
    (envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
    by fledge.watson.org (8.13.1/8.13.1) with ESMTP id i9HLRPSm034142;
    Sun, 17 Oct 2004 17:27:25 -0400 (EDT)
    (envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)i9HLRP81034139;
    Sun, 17 Oct 2004 17:27:25 -0400 (EDT)
    (envelope-from robert@fledge.watson.org)
Date: Sun, 17 Oct 2004 17:27:25 -0400 (EDT)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: Vlad
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: scottl@freebsd.org
cc: current@freebsd.org
cc: Marc UBM Bocklet
Subject: Re: [BETA7-panic] sodealloc(): so_count 1
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Sun, 17 Oct 2004 21:27:30 -0000

On Sun, 17 Oct 2004, Vlad wrote:

> is there a specific condition when that happens? I tried to simulate
> heavy tcp traffic from number of sources but could not induct the panic
> by such artificial traffic. It happened to me only in 'natural' way ;)
>
> so maybe if you know exactly how to trigger it, and share that with us,
> we could do some workaround on live production servers so it doesn't
> happen, until it's fixed in the code?

Yeah -- I'm cleaning up the reproduction pieces and will commit them to the
FreeBSD source tree in the regression area this evening or tomorrow.
Basically, the race that currently seems to be triggering occurs when
close() is called on a socket at the same time as a RST is received on it,
causing the protocol to also try to close the socket.  I reproduce it by
running the tcpconnect server (in the regression tree) on a TCP port, then
running the test tool so that it connects and immediately resets the
connection.  I.e., it sounds like a reference count issue due to sockets
using a slightly aberrant reference model.  I'll try to come up with a
workaround sometime in the next 12-24 hours, and hopefully also a proper
fix.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research

> >
> > The good news and the bad news: after spending a day or two hacking up an
> > IP stack simulator to simulate various nasty combinations of TCP packets,
> > I've managed to reproduce the problem, and am able to get a core.  I'm
> > currently working on tracking down the problem.
>
> --
> Vlad
>
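
For readers who want to see what "connects and immediately resets" can look
like in practice, here is a minimal sketch in Python.  It is not the
tcpconnect tool from the regression tree; the names connect_and_reset and
demo, and the loopback listener, are assumptions made for illustration.  It
relies on the standard socket behaviour that closing a TCP socket with
SO_LINGER enabled and a zero linger time sends a RST instead of a FIN.

```python
import socket
import struct


def connect_and_reset(host: str, port: int) -> None:
    """Connect to host:port, then abort the connection with a RST.

    Hypothetical helper for illustration; not the tcpconnect regression tool.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    # struct linger { l_onoff = 1, l_linger = 0 }: close() drops the
    # connection with a RST rather than the normal FIN handshake.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    sock.close()


def demo() -> bool:
    """Return True if a loopback listener observes the client's reset."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(2.0)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    try:
        connect_and_reset("127.0.0.1", port)
        try:
            conn, _ = server.accept()
        except ConnectionError:
            # Some stacks report the aborted connection from accept() itself.
            return True
        except socket.timeout:
            return False
        try:
            conn.settimeout(2.0)
            conn.recv(1)  # a reset surfaces here as ConnectionResetError
            return False  # got data or a clean EOF: no reset was seen
        except ConnectionResetError:
            return True
        except socket.timeout:
            return False
        finally:
            conn.close()
    finally:
        server.close()


if __name__ == "__main__":
    print("reset observed by server:", demo())
```

This covers only the client half of the reproduction.  The race described
above also needs the server side to call close() on the accepted socket at
about the same moment the RST arrives, so a single run of this sketch should
not be expected to trigger the panic.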