From: Norbert Papke
Organization: Archaeological Filing
To: Robert Watson
Cc: Gavin Atkinson, freebsd-stable@freebsd.org
Date: Wed, 17 Sep 2008 08:12:04 -0700
Subject: Re: Possible UDP related deadlock in 7.1-PRERELEASE
Message-Id: <200809170812.05338.fbsd-ml@scrapper.ca>
References: <200809141219.24943.fbsd-ml@scrapper.ca> <200809151813.58749.fbsd-ml@scrapper.ca>

On September 17, 2008, Robert Watson wrote:
> On Mon, 15 Sep 2008, Norbert Papke wrote:
> > With WITNESS enabled, I now experience panics and could not follow your
> > instructions.  There is no core dump.  The following gets logged to
> > /var/log/messages:
> >
> > shared lock of (rw) udpinp @ /usr/src/sys/netinet/udp_usrreq.c:864
> > while exclusively locked from /usr/src/sys/netinet6/udp6_usrreq.c:940
> > panic: share->excl
> > KDB: stack backtrace:
> > db_trace_self_wrapper(c06fda7c,f6b96978,c052046a,c06fbb5d,c07695c0,...) at db_trace_self_wrapper+0x26
> > kdb_backtrace(c06fbb5d,c07695c0,c06febd1,f6b96984,f6b96984,...) at kdb_backtrace+0x29
> > panic(c06febd1,c070c409,3ac,c0709eee,360,...) at panic+0xaa
> > witness_checkorder(ccd5209c,1,c0709eee,360,8,...) at witness_checkorder+0x17c
> > _rw_rlock(ccd5209c,c0709eee,360,c07780e0,cd4652c8,...) at _rw_rlock+0x2a
> > udp_send(d3942000,0,c580f400,c68faa00,0,...) at udp_send+0x197
> > udp6_send(d3942000,0,c580f400,c68faa00,0,...) at udp6_send+0x140
> > sosend_generic(d3942000,c68faa00,f6b96be8,0,0,...) at sosend_generic+0x50d
> > sosend(d3942000,c68faa00,f6b96be8,0,0,...) at sosend+0x3f
> > kern_sendit(cd465230,f,f6b96c64,0,0,...) at kern_sendit+0x106
> > sendit(0,871b9fe,0,c68faa00,1c,...) at sendit+0x182
> > sendto(cd465230,f6b96cfc,18,cd465230,c072bab8,...) at sendto+0x4f
> > syscall(f6b96d38) at syscall+0x293
> >
> > Note that I do not use IPv6, none of my network interfaces is configured
> > for it.
> To clarify what you're seeing a bit: some applications that are adapted to
> use both IPv4 and IPv6 open combined v4/v6 sockets.  This is possible
> because there is a section of the IPv6 address space that "contains" the v4
> address space.
> When an application sends to a v4 address using a v6 socket
> (wave hands here) the kernel actually calls the v4 UDP code from within the
> v6 socket code, and it turns out there's a locking bug in that path.  So
> likely some application you are running is using this compatibility mode,
> and hence triggering this bug.

Thank you for this explanation.  It helps my peace of mind to understand the
context.

> I need to think for a bit about the best way to fix it (it's easy to hack
> around, but obviously "hacking around" is not the desired solution), and
> I'll get back to you later this week with a patch.

I am certainly happy to try a patch when it becomes available.

> For my reference, it would probably be helpful to know what the application
> is, since apparently this didn't arise in our testing.  You can type "show
> pcpu" at the DDB prompt after this panic to show what thread is currently
> running.

This may be difficult.  I was not entirely clear in my description of the
panic.  I experience spontaneous reboots when the panic occurs.  DDB is not
invoked, nor is a core generated.

My suspicion is that "ktorrent", the KDE3 torrent client, is triggering this
condition.  When I broke into DDB with a non-WITNESS kernel, I observed that
one of the "ktorrent" threads was locked on "*udpinp".  Additionally, "hald",
"ntpd" and the NIC interrupt thread had "*udp" locked.  Not sure if this
information is helpful.

Cheers,

-- Norbert.
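
For anyone following along, the compatibility mode Robert describes can be
sketched from userland roughly as below.  This is only an illustration under
stated assumptions, not the code path of any particular application: the
helper names v4_mapped and send_via_v6_socket are made up for this note, and
nothing here is known about what ktorrent actually does.  The idea is that an
IPv4 destination is written as an IPv4-mapped IPv6 address (::ffff:a.b.c.d)
and sent through an AF_INET6 datagram socket with IPV6_V6ONLY cleared.

```python
import ipaddress
import socket


def v4_mapped(v4_addr: str) -> str:
    """Return the IPv4-mapped IPv6 form (::ffff:a.b.c.d) of an IPv4 address.

    Hypothetical helper for illustration only.
    """
    v4 = ipaddress.IPv4Address(v4_addr)  # raises ValueError on malformed input
    return "::ffff:" + str(v4)


def send_via_v6_socket(v4_addr: str, port: int, payload: bytes) -> int:
    """Send one UDP datagram to an IPv4 destination through an AF_INET6 socket.

    Hypothetical helper for illustration only. Returns the number of bytes
    handed to the kernel.
    """
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
        # The default for IPV6_V6ONLY differs between systems, so clear it
        # explicitly to allow IPv4-mapped destinations on this socket.
        s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        return s.sendto(payload, (v4_mapped(v4_addr), port))
```

Calling send_via_v6_socket("127.0.0.1", 9, b"ping") on a kernel built with
IPv6 support would exercise a v6-socket-to-v4-destination send of the kind
seen in the backtrace (udp6_send calling udp_send), even when no interface
has an IPv6 address configured.  Whether it reproduces the panic is untested.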