From owner-freebsd-bugs@FreeBSD.ORG Tue Mar 14 12:25:27 2006
Date: Tue, 14 Mar 2006 12:25:27 GMT
From: Robert Watson
Message-Id: <200603141225.k2ECPRtd062605@freefall.freebsd.org>
To: rwatson@FreeBSD.org, freebsd-bugs@FreeBSD.org, rwatson@FreeBSD.org
Subject: Re: kern/94433: [panic] panic: sbdrop

Synopsis: [panic] panic: sbdrop

Responsible-Changed-From-To: freebsd-bugs->rwatson
Responsible-Changed-By: rwatson
Responsible-Changed-When: Tue Mar 14 12:20:30 UTC 2006
Responsible-Changed-Why:
Grab ownership of this PR. I've been tracking a number of related reports for a bug that has the same symptoms as this for over a year now, and it has proved very hard to track down.
I have a fairly large set of changes outstanding against CVS HEAD that likely resolve this, or at least make it easier to track down, but it will probably be a while before they hit the RELENG_6 branch, as they will require extensive testing. The problem is that by the time an assertion fires, the socket buffer memory corruption has long since occurred.

If you're able to (relatively) easily reproduce the panic, we can likely make some progress. The first thing to do, performance permitting, is to turn on the INVARIANTS, INVARIANT_SUPPORT, and SOCKBUF_DEBUG options in the kernel. These will have a noticeable performance impact, and I don't know if that is compatible with your workload. They also change the timing of the socket code significantly, so they may cause the race window to close, meaning we can't reproduce the panic with the debugging options on. However, it's worth a try.

BTW, are you using IPv6 at all on the box?

Sorry I can't be more helpful, other than to say that we know there's an issue, that we've invested quite a bit of time trying (and thus far failing) to track it down, and that we are now redesigning the code in question to avoid the problems that are likely the cause, but that will take a while to come to fruition. Further debugging to see if we can identify the specific cause would be good, if it's possible!

http://www.freebsd.org/cgi/query-pr.cgi?pr=94433
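
For reference, the three debugging options named above would normally be enabled by adding lines like the following to a copy of the kernel configuration file and then rebuilding and installing the kernel. This is a minimal sketch, not a tested configuration: it assumes a custom config derived from GENERIC, and the rest of that file is left unchanged.

```
# Debugging options suggested in the message above.
# Expect a noticeable slowdown and changed timing in the socket code.
options 	INVARIANTS		# enable run-time consistency checks (assertions)
options 	INVARIANT_SUPPORT	# support code required by INVARIANTS
options 	SOCKBUF_DEBUG		# extra consistency checks on socket buffers
```

With a config file named, say, MYKERNEL (a hypothetical name), the usual rebuild from /usr/src is `make buildkernel KERNCONF=MYKERNEL` followed by `make installkernel KERNCONF=MYKERNEL` and a reboot.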