From: Craig Boston
To: freebsd-net@freebsd.org
Date: Thu, 26 Jan 2006 20:05:28 -0600
Subject: Re: Race condition in ip6_getpmtu (actually gif)?
Message-ID: <20060127020528.GA18728@nowhere>
References: <20060125152032.GA40581@nowhere>
In-Reply-To: <20060125152032.GA40581@nowhere>

On Wed, Jan 25, 2006 at 09:20:33AM -0600, craig@olyun.gank.org wrote:
> I seem to be running into a race condition in ip6_getpmtu.  I've been
> having sporadic panics recently -- sometimes the machine will last a
> week, sometimes it'll panic twice in a day.  The backtrace is always the
> same:
>
> -- snip --

After some more analysis I think this is a problem in in6_gif_output.
It keeps a cached route in its softc.  After ip6_output completes, if
IFF_LINK0 is not set, the cached route is freed.  This works fine so
long as in6_gif_output is not reentered.

My current theory is that a higher priority kernel thread is preempting
while we're somewhere in ip6_getpmtu.
Say, an incoming IPv4 ICMP packet might cause the NIC driver to call
ether_input from an ithread.  Since IPv4 is marked NETISR_MPSAFE, it
will be dispatched from the ithread and filter all the way down to
icmp_input, which decides that an ICMP reply needs to be sent to a host
across the tunnel.  It goes to icmp_send, which passes it to ip_output.
The destination is a gif interface, so into gif_output we go, and BAM!
We just re-entered in6_gif_output while still in the ithread.

When this happens, the route cached in the sc is still valid, so a new
one is not allocated.  After ip6_output completes, the route is freed
and the pointer set to NULL.  Later, context returns to the original
thread, and ip6_getpmtu (called from ip6_output) has just had its route
pulled out from under it...

It's a long shot, but I think it is possible, and it would certainly
explain why it sometimes takes millions of packets to trigger.

Attached is a quick hack to protect the cached route with a mutex.  A
better fix with less overhead would be to allocate the route in a local
variable on the stack, and only copy it to the softc if route caching
is enabled.

I'll run for a couple of weeks with the patch and file a PR if that
fixes it.  If I have time I'll also try to set up a test machine and
attempt to detect whether in6_gif_output is indeed reentered, and if so
how.

I think this should only be a problem for gif when IPv4 is the inner
protocol and IPv6 is the outer.  Since IPv4 is MPSAFE and v6 is not,
gif might sometimes inadvertently cause v6 code that hasn't been fully
locked to be re-entered or otherwise called without Giant held.  There
may be other problems that are less likely to occur...

Craig
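
For illustration only, and not the attached patch: a small user-space C
sketch of the re-entry described above and of the "local route" idea.
Everything here is a hypothetical stand-in (toy_route, toy_softc,
output_shared, output_local, and the preempt callback are invented for
the example), the preemption is simulated by a plain function call, and
cache_enabled merely plays the role of IFF_LINK0 being set.  It assumes
nothing about the real in6_gif_output beyond what is described in this
message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MTU 1280            /* IPv6 minimum link MTU, used as a dummy value */

/* Hypothetical stand-ins; these are NOT the real kernel structures. */
struct toy_route { int mtu; };

struct toy_softc {
    struct toy_route *cached;   /* plays the role of the route cached in the softc */
    bool cache_enabled;         /* plays the role of IFF_LINK0 being set */
};

/* Callback used to simulate a preempting thread re-entering the output path. */
typedef void (*reenter_fn)(struct toy_softc *);

struct toy_route *toy_route_alloc(void)
{
    struct toy_route *rt = malloc(sizeof(*rt));
    if (rt != NULL)
        rt->mtu = TOY_MTU;
    return rt;
}

/* Variant 1: every caller works directly on the route stored in the softc. */
int output_shared(struct toy_softc *sc, reenter_fn preempt)
{
    int mtu;

    if (sc->cached == NULL)
        sc->cached = toy_route_alloc();
    if (preempt != NULL)
        preempt(sc);            /* simulated preemption in the middle of output */
    /* Stand-in for the PMTU lookup: -1 means the route vanished under us. */
    mtu = (sc->cached != NULL) ? sc->cached->mtu : -1;
    if (!sc->cache_enabled && sc->cached != NULL) {
        free(sc->cached);
        sc->cached = NULL;
    }
    return mtu;
}

/* Variant 2: work on a local route; publish it only if caching is enabled. */
int output_local(struct toy_softc *sc, reenter_fn preempt)
{
    struct toy_route *rt = sc->cache_enabled ? sc->cached : NULL;
    bool owned = false;
    int mtu;

    if (rt == NULL) {
        rt = toy_route_alloc();
        owned = true;           /* this call is responsible for rt */
    }
    if (rt == NULL)
        return -1;              /* allocation failed */
    if (preempt != NULL)
        preempt(sc);            /* a nested call cannot reach our local rt */
    mtu = rt->mtu;
    if (owned) {
        if (sc->cache_enabled && sc->cached == NULL)
            sc->cached = rt;    /* copy to the softc only when caching is on */
        else
            free(rt);
    }
    return mtu;
}

/* Nested calls standing in for the ithread re-entering the output routine. */
void reenter_shared(struct toy_softc *sc) { (void)output_shared(sc, NULL); }
void reenter_local(struct toy_softc *sc)  { (void)output_local(sc, NULL); }

/* Runs both variants with caching disabled (IFF_LINK0 clear in the analogy). */
void toy_demo(void)
{
    struct toy_softc sc = { NULL, false };

    assert(output_shared(&sc, reenter_shared) == -1);    /* route pulled away */
    assert(output_local(&sc, reenter_local) == TOY_MTU); /* route survives */
    assert(sc.cached == NULL);
}
```

Calling toy_demo() exercises both variants.  In the shared variant the
nested call frees the route and NULLs the pointer, so the outer call
finds nothing to look at; in the local variant each call owns its own
route, so the nested call cannot affect the outer one.  With caching
enabled the softc pointer is still shared between callers, so real code
would presumably still need some form of locking around it.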