From owner-freebsd-stable@FreeBSD.ORG Wed Sep 17 11:23:12 2008
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 315431065679;
    Wed, 17 Sep 2008 11:23:12 +0000 (UTC)
    (envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
    by mx1.freebsd.org (Postfix) with ESMTP id E05998FC1D;
    Wed, 17 Sep 2008 11:23:11 +0000 (UTC)
    (envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [65.122.17.41])
    by cyrus.watson.org (Postfix) with ESMTP id 64EFF46B2C;
    Wed, 17 Sep 2008 07:23:11 -0400 (EDT)
Date: Wed, 17 Sep 2008 12:23:11 +0100 (BST)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: Norbert Papke
In-Reply-To: <200809151813.58749.fbsd-ml@scrapper.ca>
References: <200809141219.24943.fbsd-ml@scrapper.ca>
    <1221471431.49328.5.camel@buffy.york.ac.uk>
    <200809151813.58749.fbsd-ml@scrapper.ca>
User-Agent: Alpine 1.10 (BSF 962 2008-03-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Gavin Atkinson, freebsd-stable@freebsd.org
Subject: Re: Possible UDP related deadlock in 7.1-PRERELEASE
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Wed, 17 Sep 2008 11:23:12 -0000

On Mon, 15 Sep 2008, Norbert Papke wrote:

> With WITNESS enabled, I now experience panics and could not follow your
> instructions.  There is no core dump.  The following gets logged to
> /var/log/messages:
>
> shared lock of (rw) udpinp @ /usr/src/sys/netinet/udp_usrreq.c:864
> while exclusively locked from /usr/src/sys/netinet6/udp6_usrreq.c:940
> panic: share->excl
> KDB: stack backtrace:
> db_trace_self_wrapper(c06fda7c,f6b96978,c052046a,c06fbb5d,c07695c0,...) at db_trace_self_wrapper+0x26
> kdb_backtrace(c06fbb5d,c07695c0,c06febd1,f6b96984,f6b96984,...) at kdb_backtrace+0x29
> panic(c06febd1,c070c409,3ac,c0709eee,360,...) at panic+0xaa
> witness_checkorder(ccd5209c,1,c0709eee,360,8,...) at witness_checkorder+0x17c
> _rw_rlock(ccd5209c,c0709eee,360,c07780e0,cd4652c8,...) at _rw_rlock+0x2a
> udp_send(d3942000,0,c580f400,c68faa00,0,...) at udp_send+0x197
> udp6_send(d3942000,0,c580f400,c68faa00,0,...) at udp6_send+0x140
> sosend_generic(d3942000,c68faa00,f6b96be8,0,0,...) at sosend_generic+0x50d
> sosend(d3942000,c68faa00,f6b96be8,0,0,...) at sosend+0x3f
> kern_sendit(cd465230,f,f6b96c64,0,0,...) at kern_sendit+0x106
> sendit(0,871b9fe,0,c68faa00,1c,...) at sendit+0x182
> sendto(cd465230,f6b96cfc,18,cd465230,c072bab8,...) at sendto+0x4f
> syscall(f6b96d38) at syscall+0x293
>
> Note that I do not use IPv6, none of my network interfaces is configured for
> it.

Dear Norbert,

Thanks for this report -- the additional WITNESS debugging information is
very helpful, and the above warning may well be the source of the problem
you're experiencing.

To clarify what you're seeing a bit: some applications that are adapted to
use both IPv4 and IPv6 open combined v4/v6 sockets.  This is possible
because there is a section of the IPv6 address space that "contains" the v4
address space.  When an application sends to a v4 address using a v6 socket
(wave hands here) the kernel actually calls the v4 UDP code from within the
v6 socket code, and it turns out there's a locking bug in that path.  So
likely some application you are running is using this compatibility mode,
and hence triggering this bug.

I need to think for a bit about the best way to fix it (it's easy to hack
around, but obviously "hacking around" is not the desired solution), and
I'll get back to you later this week with a patch.  For my reference, it
would probably be helpful to know what the application is, since apparently
this didn't arise in our testing.  You can type "show pcpu" at the DDB
prompt after this panic to show what thread is currently running.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> Also, since I enabled WITNESS, I get the following logged during system
> startup:
>
> Enabling pf.
> lock order reversal:
>  1st 0xc09af92c pf task mtx (pf task mtx) @ /usr/src/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1394
>  2nd 0xc07b4d68 ifnet (ifnet) @ /usr/src/sys/net/if.c:1558
> KDB: stack backtrace:
> db_trace_self_wrapper(c06fda7c,f4914a60,c0552c75,c06fed11,c07b4d68,...) at db_trace_self_wrapper+0x26
> kdb_backtrace(c06fed11,c07b4d68,c0703ca2,c0703ca2,c0703c73,...) at kdb_backtrace+0x29
> witness_checkorder(c07b4d68,9,c0703c73,616,572,...) at witness_checkorder+0x5e5
> _mtx_lock_flags(c07b4d68,0,c0703c73,616,c0104414,...) at _mtx_lock_flags+0x34
> ifunit(c6ef5c20,0,c09adfb5,572,c0703a71,...) at ifunit+0x2f
> pfioctl(c566ce00,c0104414,c6ef5c20,3,c60c38c0,...) at pfioctl+0x2b43
> devfs_ioctl_f(c588bb94,c0104414,c6ef5c20,c54bb900,c60c38c0,...) at devfs_ioctl_f+0xe6
> kern_ioctl(c60c38c0,3,c0104414,c6ef5c20,1000000,...) at kern_ioctl+0x243
> ioctl(c60c38c0,f4914cfc,c,c0718d59,c072b350,...) at ioctl+0x134
> syscall(f4914d38) at syscall+0x293
> Xint0x80_syscall() at Xint0x80_syscall+0x20
> --- syscall (54, FreeBSD ELF32, ioctl), eip = 0x281ab6f3, esp = 0xbfbfde3c, ebp = 0xbfbfde68 ---
> pf enabled
>
> I tried to unload 'pf' to see if it was the culprit.  However, even without pf
> loaded, I experience the panic.
>
> Is there anything else I can try to provide better insight into what might be
> going on?
>
> Cheers,
>
> -- Norbert.
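
For illustration only (this is not code from the thread): the compatibility
mode Robert describes can be sketched in Python as below.  It builds the
IPv4-mapped IPv6 address (::ffff:a.b.c.d) that "contains" a given v4
address, and shows how an application might send a UDP datagram to it over
an AF_INET6 socket.  The function names to_v4_mapped and send_via_v6_socket
are hypothetical, and it is an assumption that the application triggering
the panic does something similar; the thread does not identify it.

```python
import ipaddress
import socket


def to_v4_mapped(v4: str) -> ipaddress.IPv6Address:
    """Return the IPv4-mapped IPv6 address (::ffff:a.b.c.d) for a v4 address.

    The mapped form is 80 zero bits, 16 one bits, then the 32-bit v4 address.
    """
    packed_v4 = ipaddress.IPv4Address(v4).packed
    return ipaddress.IPv6Address(b"\x00" * 10 + b"\xff\xff" + packed_v4)


def send_via_v6_socket(v4: str, port: int, payload: bytes) -> int:
    """Send a UDP datagram to a v4 destination through an AF_INET6 socket.

    Hypothetical helper: it is not called here, since it needs a host with
    IPv6 socket support.  Returns the number of bytes sent.
    """
    with socket.socket(socket.AF_INET6, socket.SOCK_DGRAM) as s:
        # Clear IPV6_V6ONLY so the socket accepts v4-mapped destinations;
        # the default for this option varies between systems.
        s.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        return s.sendto(payload, (f"::ffff:{v4}", port))


mapped = to_v4_mapped("192.0.2.1")
# The embedded v4 address can be recovered from the mapped form.
print(mapped.ipv4_mapped)
```

A call such as send_via_v6_socket("127.0.0.1", 9, b"ping") is the kind of
send that would presumably reach the v4 UDP code from within the v6 socket
code, matching the udp6_send -> udp_send frames in the first backtrace.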