Message-ID: <4C2E598D.2000201@delphij.net>
Date: Fri, 02 Jul 2010 14:26:37 -0700
From: Xin LI
Organization: The Geek China Organization
To: "Bjoern A. Zeeb"
Cc: "freebsd-net@freebsd.org", Sam Leffler, Chao Shin, Qing Li
Subject: Re: panic: rtqkill route really not free on freebsd 8.0-release update
In-Reply-To: <20100702083902.D14969@maildrop.int.zabbadoz.net>

Hi, Bjoern,

On 2010/07/02 01:39, Bjoern A. Zeeb wrote:
> On Sat, 5 Jun 2010, Chao Shin wrote:
>
> Hey,
>
>> We add kdb/ddb and extra panic info printing into kernel and catch
>> this panic again.
>>
>> We have instrumented the kernel and found that this panic happens
>> when draining == 1, but seems to be confused with the fact that all
>> access to radix trees are protected by locks. Can anyone familiar
>> with these code shed us some light on this?
>>
>> below is url to screenshot in ddb:
>> http://www.delphij.net/zhao/1.png
>> http://www.delphij.net/zhao/2.png
>
> Did anyone pick this up?

I don't think so. Currently we believe that there are some call paths
that would exhibit the following interleaving (time runs downward):

Thread A                        Thread B
(...)
RT_LOCK(rt)
rt->ref--;
[ref drops to 0 now]
                                (obtain rnh_lock)
                                (in in_matroute)
                                saw rt->ref == 0
                                rt->rt_flags & RTPRF_OURS == 0
                                (return from in_matroute())
                                RT_LOCK(rt) <-- blocks here
rt->rt_flags |= RTPRF_OURS
RT_UNLOCK(rt);
                                RT_LOCK(rt) <-- got a wakeup
                                rt->ref++
                                (ref == 1 && rt->rt_flags & RTPRF_OURS)

That is, Thread B tests the reference count and RTPRF_OURS before it
holds the rtentry lock, so by the time it takes its reference the entry
has already been marked RTPRF_OURS by Thread A.

With the attached workaround they have not seen this type of panic so
far, but that doesn't seem ideal. Kip and Qing's paper titled
"Optimizing the BSD routing system for parallel processing" suggests
copying the route entry rather than referencing it, but I have not yet
worked out how to implement that and benchmark it...

Cheers,
-- 
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die

Attachment: in_rmx-v2.diff

Index: in_rmx.c
===================================================================
--- in_rmx.c	(revision 208681)
+++ in_rmx.c	(working copy)
@@ -121,12 +121,12 @@
 	struct radix_node *rn = rn_match(v_arg, head);
 	struct rtentry *rt = (struct rtentry *)rn;
 
-	/*XXX locking? */
-	if (rt && rt->rt_refcnt == 0) {		/* this is first reference */
-		if (rt->rt_flags & RTPRF_OURS) {
-			rt->rt_flags &= ~RTPRF_OURS;
-			rt->rt_rmx.rmx_expire = 0;
-		}
+	if (rt && rt->rt_refcnt == 0 &&		/* this is first reference */
+	    rt->rt_flags & RTPRF_OURS) {
+		RT_LOCK(rt);
+		rt->rt_flags &= ~RTPRF_OURS;
+		rt->rt_rmx.rmx_expire = 0;
+		RT_UNLOCK(rt);
 	}
 	return rn;
 }
@@ -206,6 +206,7 @@
 
 	RADIX_NODE_HEAD_WLOCK_ASSERT(ap->rnh);
 
+	RT_LOCK(rt);
 	if (rt->rt_flags & RTPRF_OURS) {
 		ap->found++;
 
@@ -234,6 +235,7 @@
 					    rt->rt_rmx.rmx_expire);
 		}
 	}
+	RT_UNLOCK(rt);
 
 	return 0;
 }
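
For anyone who wants to step through the Thread A / Thread B interleaving described in the message, the following is a minimal user-space sketch that replays it in program order. It is not kernel code and not the attached patch: `struct toy_rtentry`, `TOY_RTPRF_OURS`, `toy_would_panic()` and `toy_replay()` are hypothetical names invented for illustration, the flag value is arbitrary, and the lock is represented only by comments. The optional re-check under the lock is an assumption added to show one conceivable way to close the window in this toy model; it has not been tested against the real routing code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's struct rtentry and RTPRF_OURS. */
#define TOY_RTPRF_OURS 0x1

struct toy_rtentry {
	int refcnt;	/* models rt->rt_refcnt ("rt->ref" in the message) */
	int flags;	/* models rt->rt_flags */
};

/* The end state named in the message: referenced, yet still marked OURS. */
static bool
toy_would_panic(const struct toy_rtentry *rt)
{
	return (rt->flags & TOY_RTPRF_OURS) != 0 && rt->refcnt > 0;
}

/*
 * Replay the interleaving from the message, one step at a time.
 * If recheck_under_lock is true, "Thread B" clears the flag again after
 * it has the lock (an assumption of this sketch, not the attached patch).
 */
static struct toy_rtentry
toy_replay(bool recheck_under_lock)
{
	struct toy_rtentry rt = { .refcnt = 1, .flags = 0 };
	bool b_saw_unreferenced, b_saw_ours;

	/* Thread A: holds RT_LOCK(rt) and drops the last reference. */
	rt.refcnt--;

	/* Thread B: in in_matroute(), looking at rt without RT_LOCK(rt). */
	b_saw_unreferenced = (rt.refcnt == 0);
	b_saw_ours = (rt.flags & TOY_RTPRF_OURS) != 0;
	assert(b_saw_unreferenced && !b_saw_ours);	/* nothing to clear yet */
	/* Thread B returns from in_matroute() and blocks in RT_LOCK(rt). */

	/* Thread A: marks the entry as OURS, then RT_UNLOCK(rt). */
	rt.flags |= TOY_RTPRF_OURS;

	/* Thread B: wakes up holding the lock and takes its reference. */
	rt.refcnt++;
	if (recheck_under_lock)
		rt.flags &= ~TOY_RTPRF_OURS;	/* decision repeated under the lock */

	return rt;
}
```

Calling `toy_replay(false)` and passing the result to `toy_would_panic()` reproduces the `ref == 1 && rt->rt_flags & RTPRF_OURS` state from the diagram; `toy_replay(true)` shows that, in this toy model only, repeating the check once the lock is held leaves the entry referenced and unmarked.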