From owner-freebsd-bugs@FreeBSD.ORG Fri Aug 13 05:00:14 2010
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D7AA01065696
	for ; Fri, 13 Aug 2010 05:00:14 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 8E7A78FC17
	for ; Fri, 13 Aug 2010 05:00:14 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o7D50E4J096146
	for ; Fri, 13 Aug 2010 05:00:14 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o7D50E0U096145;
	Fri, 13 Aug 2010 05:00:14 GMT
	(envelope-from gnats)
Resent-Date: Fri, 13 Aug 2010 05:00:14 GMT
Resent-Message-Id: <201008130500.o7D50E0U096145@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Chris Luke
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 434D810656A3
	for ; Fri, 13 Aug 2010 04:51:42 +0000 (UTC)
	(envelope-from nobody@FreeBSD.org)
Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21])
	by mx1.freebsd.org (Postfix) with ESMTP id 25DC28FC19
	for ; Fri, 13 Aug 2010 04:51:42 +0000 (UTC)
Received: from www.freebsd.org (localhost [127.0.0.1])
	by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o7D4pfv7060506
	for ; Fri, 13 Aug 2010 04:51:41 GMT
	(envelope-from nobody@www.freebsd.org)
Received: (from nobody@localhost)
	by www.freebsd.org (8.14.3/8.14.3/Submit) id o7D4pfkN060505;
	Fri, 13 Aug 2010 04:51:41 GMT
	(envelope-from nobody)
Message-Id: <201008130451.o7D4pfkN060505@www.freebsd.org>
Date: Fri, 13 Aug 2010 04:51:41 GMT
From: Chris Luke
To:
	freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: kern/149608: Deadlock with netinet6/raw_ip6.c when passing over a multicast ipv6 packet our raw socket is not interested in
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 13 Aug 2010 05:00:14 -0000

>Number:         149608
>Category:       kern
>Synopsis:       Deadlock with netinet6/raw_ip6.c when passing over a multicast ipv6 packet our raw socket is not interested in
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Aug 13 05:00:14 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Chris Luke
>Release:        8.1, 8.0
>Organization:
>Environment:
FreeBSD castaway.xxx 8.0-RELEASE-p3 FreeBSD 8.0-RELEASE-p3 #1: Thu May 27 13:15:32 EDT 2010 root@castaway.xxx:/usr/src/sys/i386/compile/Castaway i386
FreeBSD chestnut.xxx 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Thu Aug 12 11:40:49 EDT 2010 root@chestnut.xxx:/usr/src/sys/i386/compile/Chestnut i386

Also occurred on a GENERIC kernel. The 8.1 kernel above is the one I patched to cure this issue.
>Description:
Observed with the Quagga and Bird routing daemons running OSPFv3 over tap(4) based tunnels.

The process would hang reproducibly, within a few seconds. ps or top would show the process waiting on a kernel lock named "rawinp". It never recovers and the process cannot be killed; a reboot is the only cure.

> ps alxww | grep ospf6d
101 15165 1 0 44 0 2760 2104 rawinp Ls ?? 0:01.84 /usr/local/sbin/ospf6d -d

Most of the time the deadlock appears to hang only the process I was observing; however, on about 1 in 10 occasions the entire system would hang.
>How-To-Repeat:
Any time I run either Quagga or Bird, it deadlocks quickly.
Based on my analysis, triggering the deadlock requires three things: at least one raw socket in INET6; the stack having joined at least one IPv6 multicast group that at least one raw socket has not also joined; and a packet arriving for that group.

It is noteworthy that non-root processes can create IPv6 multicast sockets. Thus, if an IPv6 raw socket already exists (there are many valid reasons for one), a non-root user can cause a system deadlock simply by joining a multicast group, even if the raw socket does not participate in any multicast.

Also, since various fundamental IPv6 mechanisms use multicast, it seems likely that this explains, for example, the complete system hangs I observed.

I am not sure whether there is something racy about running OSPFv3 over tap tunnels (neighbor discovery timing, perhaps), but it has reliably deadlocked on me since 8.0-RELEASE. I had assumed the cause was immature userland code, but decided to look at the kernel instead.
>Fix:
Reviewing rip6_input() in raw_ip6.c, I noted that when the pcb loop skips over a multicast datagram

    if (blocked != MCAST_PASS) {
        IP6STAT_INC(ip6s_notmember);
        continue;
    }

the in6p never gets INP_RUNLOCK()'ed. The unlock is normally performed by leaving the current in6p in 'last', which gets mopped up the next time round the loop. The multicast code hits 'continue' before the current in6p ever gets into 'last', so it is never unlocked.

The fix I have successfully tested is to add INP_RUNLOCK(in6p); before the continue. See the attached patch against 8.1-RELEASE.

Patch attached with submission follows:

*** raw_ip6.c-original	Fri Aug 13 00:33:56 2010
--- raw_ip6.c	Fri Aug 13 00:34:10 2010
***************
*** 248,253 ****
--- 248,254 ----
  				}
  				if (blocked != MCAST_PASS) {
  					IP6STAT_INC(ip6s_notmember);
+ 					INP_RUNLOCK(in6p);
  					continue;
  				}
  			}

>Release-Note:
>Audit-Trail:
>Unformatted:
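
The lock leak described under >Fix: can be illustrated with a small userland model. This is only a sketch under stated assumptions, not the kernel code: `struct fake_pcb`, `pcb_rlock()`, `pcb_runlock()` and `deliver()` are hypothetical names invented for this illustration, a plain counter stands in for the inpcb read lock, and the loop is reduced to the lock, filter, and deferred-unlock steps mentioned in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inpcb: a counter models the read lock. */
struct fake_pcb {
	int  rlocks;       /* outstanding read-lock holds */
	bool wants_group;  /* false: multicast filter rejects the packet */
};

static void pcb_rlock(struct fake_pcb *p) { p->rlocks++; }

static void pcb_runlock(struct fake_pcb *p)
{
	assert(p->rlocks > 0);  /* unlocking an unheld lock would be a bug */
	p->rlocks--;
}

/*
 * Simplified model of the pcb walk: each pcb is read-locked, filtered,
 * and, if accepted, parked in 'last' so it is unlocked on a later
 * iteration (or after the loop). Returns the number of read locks
 * still held once the walk has finished.
 */
static int deliver(struct fake_pcb *pcbs, size_t n, bool apply_fix)
{
	struct fake_pcb *last = NULL;
	int leaked = 0;

	for (size_t i = 0; i < n; i++) {
		struct fake_pcb *p = &pcbs[i];

		pcb_rlock(p);
		if (!p->wants_group) {
			/* Models: if (blocked != MCAST_PASS) { ...; continue; } */
			if (apply_fix)
				pcb_runlock(p);  /* the one-line fix */
			continue;            /* without the fix, p stays locked */
		}
		if (last != NULL)
			pcb_runlock(last);   /* deferred unlock of previous pcb */
		last = p;
	}
	if (last != NULL)
		pcb_runlock(last);

	for (size_t i = 0; i < n; i++)
		leaked += pcbs[i].rlocks;
	return leaked;
}
```

Calling `deliver()` on an array in which one entry has `wants_group` set to false leaves one lock held when `apply_fix` is false and none when it is true. In the model the leaked hold is just a nonzero counter; in the kernel, the next thread that needs the write lock on that pcb would block indefinitely, which is consistent with the "rawinp" wait shown in the ps output.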