Date: Wed, 17 Sep 2008 08:12:04 -0700
From: Norbert Papke <fbsd-ml@scrapper.ca>
To: Robert Watson <rwatson@freebsd.org>
Cc: Gavin Atkinson <gavin@freebsd.org>, freebsd-stable@freebsd.org
Subject: Re: Possible UDP related deadlock in 7.1-PRERELEASE
Message-ID: <200809170812.05338.fbsd-ml@scrapper.ca>
In-Reply-To: <alpine.BSF.1.10.0809171219270.64176@fledge.watson.org>
References: <200809141219.24943.fbsd-ml@scrapper.ca> <200809151813.58749.fbsd-ml@scrapper.ca> <alpine.BSF.1.10.0809171219270.64176@fledge.watson.org>
On September 17, 2008, Robert Watson wrote:
> On Mon, 15 Sep 2008, Norbert Papke wrote:
> > With WITNESS enabled, I now experience panics and could not follow your
> > instructions.  There is no core dump.  The following gets logged to
> > /var/log/messages:
> >
> > shared lock of (rw) udpinp @ /usr/src/sys/netinet/udp_usrreq.c:864
> > while exclusively locked from /usr/src/sys/netinet6/udp6_usrreq.c:940
> > panic: share->excl
> > KDB: stack backtrace:
> > db_trace_self_wrapper(c06fda7c,f6b96978,c052046a,c06fbb5d,c07695c0,...)
> > at db_trace_self_wrapper+0x26
> > kdb_backtrace(c06fbb5d,c07695c0,c06febd1,f6b96984,f6b96984,...) at
> > kdb_backtrace+0x29
> > panic(c06febd1,c070c409,3ac,c0709eee,360,...) at panic+0xaa
> > witness_checkorder(ccd5209c,1,c0709eee,360,8,...) at
> > witness_checkorder+0x17c
> > _rw_rlock(ccd5209c,c0709eee,360,c07780e0,cd4652c8,...) at _rw_rlock+0x2a
> > udp_send(d3942000,0,c580f400,c68faa00,0,...) at udp_send+0x197
> > udp6_send(d3942000,0,c580f400,c68faa00,0,...) at udp6_send+0x140
> > sosend_generic(d3942000,c68faa00,f6b96be8,0,0,...) at
> > sosend_generic+0x50d sosend(d3942000,c68faa00,f6b96be8,0,0,...) at
> > sosend+0x3f
> > kern_sendit(cd465230,f,f6b96c64,0,0,...) at kern_sendit+0x106
> > sendit(0,871b9fe,0,c68faa00,1c,...) at sendit+0x182
> > sendto(cd465230,f6b96cfc,18,cd465230,c072bab8,...) at sendto+0x4f
> > syscall(f6b96d38) at syscall+0x293
> >
> > Note that I do not use IPv6, none of my network interfaces is configured
> > for it.
>
> To clarify what you're seeing a bit: some applications that are adapted to
> use both IPv4 and IPv6 open combined v4/v6 sockets.  This is possible
> because there is a section of the IPv6 address space that "contains" the v4
> address space.  When an application sends to a v4 address using a v6 socket
> (wave hands here) the kernel actually calls the v4 UDP code from within the
> v6 socket code, and it turns out there's a locking bug in that path.  So
> likely some application you are running is using this compatibility mode,
> and hence triggering this bug.

Thank you for this explanation.  It helps my peace of mind to understand the
context.

> I need to think for a bit about the best way to fix it (it's easy to hack
> around, but obviously "hacking around" is not the desired solution), and
> I'll get back to you later this week with a patch.

I am certainly happy to try a patch when it becomes available.

> For my reference, it would probably be helpful to know what the application
> is, since apparently this didn't arise in our testing.  You can type "show
> pcpu" at the DDB prompt after this panic to show what thread is currently
> running.

This may be difficult.  I was not entirely clear in my description of the
panic.  I experience spontaneous reboots when the panic occurs.  DDB is not
invoked, nor is a core generated.

My suspicion is that "ktorrent", the KDE3 torrent client, is triggering this
condition.  When I broke into DDB with a non-WITNESS kernel, I observed that
one of the "ktorrent" threads was locked on "*udpinp".  Additionally, "hald",
"ntpd" and the NIC interrupt thread had "*udp" locked.  Not sure if this
information is helpful.

Cheers,

-- Norbert.
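
The combined v4/v6 socket behaviour described in the quoted explanation can be sketched from userland. What follows is only a minimal illustration in Python, assuming a host where dual-stack sockets are permitted. It is not the code of ktorrent or of any other application named in this thread, and it is not a reproduction recipe for the panic. The helper names to_v4_mapped and send_v4_over_v6_socket are made up for the example.

```python
import ipaddress
import socket


def to_v4_mapped(v4_text: str) -> str:
    """Return the IPv4-mapped IPv6 text form (::ffff:a.b.c.d) of an IPv4 address.

    The mapped range ::ffff:0:0/96 is the part of the IPv6 address space
    that "contains" the IPv4 address space.
    """
    v4 = ipaddress.IPv4Address(v4_text)  # raises ValueError on bad input
    return "::ffff:" + str(v4)


def send_v4_over_v6_socket(v4_text: str, port: int, payload: bytes) -> int:
    """Send one UDP datagram to an IPv4 host through an AF_INET6 socket.

    Hypothetical helper: it shows the kind of call that makes the kernel
    hand an IPv6 socket's send over to the IPv4 UDP code.
    """
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as sock:
        # Clear IPV6_V6ONLY so the socket accepts IPv4-mapped destinations.
        # Platform defaults for this option differ, so set it explicitly.
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        return sock.sendto(payload, (to_v4_mapped(v4_text), port))
```

Calling send_v4_over_v6_socket("192.0.2.1", 9, b"x") would hand the kernel an IPv6 socket with an IPv4 destination, which is the situation the udp6_send -> udp_send frames in the backtrace correspond to. An application written this way uses the mapped path even on a machine with no IPv6 configured on any interface.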