From: Mark Johnston
Date: Thu, 10 Dec 2015 12:02:15 -0800
To: "Andrey V. Elsukov"
Cc: Jason, freebsd-net@freebsd.org, kevin.bowling@kev009.com, hiren@strugglingcoder.info
Subject: Re: Multiple cores/race conditions in IPv6 RA
Message-ID: <20151210200215.GB34692@wkstn-mjohnston.west.isilon.com>
References: <50cff74ea38f155ae616cf49f5ffb5ae@m.nitrology.com> <5667EA3A.8050200@FreeBSD.org>
In-Reply-To: <5667EA3A.8050200@FreeBSD.org>

On Wed, Dec 09, 2015 at 11:45:46AM +0300, Andrey V. Elsukov wrote:
> On 08.12.15 08:32, Jason wrote:
> > Hi,
> >
> > It appears the IPv6 router advertisement code paths were written fairly
> > lockless, assuming you would never process multiples concurrently. We
> > are seeing multiple page faults in various places processing the
> > messages and modifying the routing table. We have multiple L3 devices
> > and multiple v6 blocks broadcasting these messages to hardware with dual
> > uplinks in the same VLAN, which I believe is making us susceptible to
> > this. Though I believe the dual uplink is all that's required for this,
> > as it can be seen in configurations with a single v6 block.
> >
> > We are running stable/10 @ r285800, and it doesn't appear anything
> > relevant has changed since then. Our other widely deployed version is
> > 8.3-RELEASE, which does not see this issue. Upon bumping a machine from
> > 8.3 -> 10 we can see it start to exhibit this behavior. The only change
> > I see that might be relevant is r243148, but these cores are relatively
> > rare, so testing is tough without a considerable deployment. So
> > basically I'm hoping someone with a trained eye can send us in the right
> > direction before we go down that road.
>
> Hi,
>
> some time ago Mark Johnston has published there the patch related to
> this problem:
> https://lists.freebsd.org/pipermail/freebsd-net/2013-February/034682.html
>
> Maybe Mark has something to say about it.

I started trying to push this work in with D2254, which fixes some of
the global IPv6 addr list locking. It turns out to be pretty tricky to
lock both NDP and the global address list properly; the patch in my
public directory fixes only some of the issues with NDP and is thus
incomplete, which is why I'd been reluctant to push it in. I'll return
to D2254 and try and make some further progress on this.