From: Peter Holm
To: Robert Watson
cc: current@FreeBSD.org
Date: Wed, 16 Feb 2005 16:35:08 +0100
Subject: Re: panic: tcp_input: TCPS_LISTEN in netinet/tcp_input.c:1016
Message-ID: <20050216153508.GA8590@peter.osted.lan>
References: <20050207083459.GA95394@peter.osted.lan>

On Wed, Feb 16, 2005 at 02:01:48PM +0000, Robert Watson wrote:
>
> On Mon, 7 Feb 2005, Peter Holm wrote:
>
> > While stress testing GENERIC HEAD from Feb 5 09:19 UTC + mpsafe_vfs = 1
> > I got:
> >
> > panic(c0832f7d,0,0,1,0) at panic+0x14b
> > tcp_input(c27fca00,14,c27fca00,0,0) at tcp_input+0xbf6
> > ip_input(c27fca00) at ip_input+0x50d
> > netisr_processqueue(c0944cd8) at netisr_processqueue+0x6e
> > swi_net(0) at swi_net+0xbe
> > ithread_loop(c154d180,cbc90d48,c154d180,c0601f84,0) at ithread_loop+0x120
> > fork_exit(c0601f84,c154d180,cbc90d48) at fork_exit+0xa4
> > fork_trampoline() at fork_trampoline+0x8
> >
> > Details at http://www.holm.cc/stress/log/cons115.html
> > A KTR dump with KTR_LOCK|KTR_BUF is available.
> >
> > Related reports:
> >
> > -rw-r--r-- 1 holm users 10641 Dec 20 16:51 cons96.html
> > -rw-r--r-- 1 holm users  9906 Dec 26 15:28 cons98.html
> > -rw-r--r-- 1 holm users 15189 Dec 29 22:17 cons99.html
>
> This would appear to be an inter-layer race between the socket code and
> the TCP code. In particular, it looks like a SYN has come in during the
> call to listen() on another CPU (or perhaps a preempted thread), after the
> TCP state has been set up for the listening tcpcb, but before the
> SO_ACCEPTCONN flag is set in the socket state. The TCP code panics
> because it expects that if a tcpcb is in TCPS_LISTEN, the matching socket
> should be in SO_ACCEPTCONN. I'm working on a patch and hope to put it
> together for you today. However, this patch substantially tears up the
> current listen code for several protocols, so it will need a fair amount
> of testing.
>
> If this problem should occur again, it would be very interesting to know
> if ps, show threads, trace, et al, showed either a preempted thread on the
> current CPU, or a thread on another CPU, in the listen() system call.
>

I had saved a copy of the kernel + core so here's a ps + backtraces:

http://www.holm.cc/stress/log/cons115a.html

This is a single CPU box, but process 75980 has a listen() in the
backtrace.

> Thanks for the (as usual) excellent bug report!
>
> Robert N M Watson

You're welcome. It is always great to get feedback on the testing
I'm doing.

--
Peter Holm
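
The window described in the quoted analysis can be illustrated with a small userland model. This is only a sketch under stated assumptions, not kernel code: ModelSocket, ModelTcpcb, listen_steps, syn_arrives and outcome_when_syn_arrives_after are hypothetical names invented for the illustration, and the only thing taken from the thread is the ordering (the tcpcb reaches TCPS_LISTEN before SO_ACCEPTCONN is set on the socket) and the check that panics when it sees one without the other.

```python
from dataclasses import dataclass

# All names below are hypothetical stand-ins for this sketch, not kernel code.
TCPS_CLOSED = "TCPS_CLOSED"
TCPS_LISTEN = "TCPS_LISTEN"


@dataclass
class ModelSocket:
    acceptconn: bool = False  # models the SO_ACCEPTCONN flag


@dataclass
class ModelTcpcb:
    state: str = TCPS_CLOSED  # models the tcpcb state


def listen_steps(so, tp):
    """Return the two writes of the modelled listen(), TCP state first."""

    def set_tcp_state():
        tp.state = TCPS_LISTEN

    def set_socket_flag():
        so.acceptconn = True

    return [set_tcp_state, set_socket_flag]


def syn_arrives(so, tp):
    """Model the input-path check applied to an incoming SYN."""
    if tp.state != TCPS_LISTEN:
        return "not listening"
    if not so.acceptconn:
        # The inconsistent view: TCPS_LISTEN without SO_ACCEPTCONN.
        return "panic"
    return "accept"


def outcome_when_syn_arrives_after(steps_done):
    """Run the first steps_done writes of listen(), then deliver a SYN."""
    so, tp = ModelSocket(), ModelTcpcb()
    for step in listen_steps(so, tp)[:steps_done]:
        step()
    return syn_arrives(so, tp)


if __name__ == "__main__":
    for steps_done in range(3):
        print(steps_done, outcome_when_syn_arrives_after(steps_done))
```

Run directly, the model prints "0 not listening", "1 panic" and "2 accept": only a SYN processed between the two writes sees the inconsistent state. On a single CPU box that interleaving would need the thread in listen() to be preempted between the two writes, which fits the question about a preempted thread in the listen() system call.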