Date: Fri, 2 Jul 2010 20:53:25 -0700
From: Garrett Cooper
To: Ian Smith
Cc: Luigi Rizzo, net@freebsd.org
Subject: Re: Deterministic lockup / panic in networking stack with ipfw / natd enabled on recent amd64 STABLE / CURRENT
In-Reply-To: <20100703125804.P54166@sola.nimnet.asn.au>
References: <20100702234212.B54166@sola.nimnet.asn.au> <20100703125804.P54166@sola.nimnet.asn.au>
List-Id: Networking and TCP/IP with FreeBSD

On Fri, Jul 2, 2010 at 8:26 PM, Ian Smith wrote:
> On Sat, 3 Jul 2010, Ian Smith wrote:
>  > On Tue, 15 Jun 2010, Garrett Cooper wrote:
>  >  > Hi,
>  >  >     I'm experiencing a deterministic situation on a development box I
>  >  > manage when I do the following to enable ipfw and natd to bridge a
>  >  > network with two bce(4) enabled NICs, where if I do the following
>  >  > steps below, then try to push a few tcp frames through, the kernel
>  >  > either hardlocks, or panics in the bce(4) code, ipfw(4) code or
>  >  > networking stack code.
>  >  >     My kernel is relatively vanilla (I just turned off a number of
>  >  > drivers that I don't use because the hardware support isn't there),
>  >  > and all of the networking options available in GENERIC are enabled as
>  >  > well. I have ipfw, ipfw_nat, and libalias built as modules, along with
>  >  > bce and em.
>  >  >     I've included the stats on the machine. Note that it is a dual
>  >  > SMT-enabled quad core machine with 8GB of RAM. I haven't done anything
>  >  > to pimp the box settings via make.conf whatsoever. I would provide a
>  >  > crashdump, but dumpon is broken on the box (which is extremely
>  >  > annoying). Please note that pf doesn't have any issues pushing packets
>  >  > with similar rules.
>  >  >     This has occurred on both 8-STABLE (r209169), and 9-CURRENT (r208809).
>  >  >     Here's the manual procedure for reproducing the issue:
>  >  >
>  >  > # Do the following steps (this isn't automated apparently as it
>  >  > completely blocks off a running box, when using ipfw restart is run).
>  >  >
>  >  > # Copy the 8.0-RELEASE copy of rc.firewall over
>  >  > cp -p /usr/src/etc/rc.firewall /etc
>  >  >
>  >  > # Make sure you have access via ssh being redirected via natd.
>  >  > echo "redirect_port tcp 192.168.10.1:22 22" > /etc/natd.conf
>  >  >
>  >  > # Enable all of the required services and knobs
>  >  > cat >> /etc/rc.conf <<EOF
>  >  > firewall_enable="YES"
>  >  > firewall_logging="YES"
>  >  > firewall_nat_enable="YES"
>  >  > firewall_nat_interface="bce1"
>  >  > firewall_type="open"
>  >  > gateway_enable="YES"
>  >  > ipfw_enable="YES"
>  >  > natd_enable="YES"
>  >  > natd_interface="bce1"
>  >  > natd_flags="-dynamic -d -m"
>  >  > EOF
>  >
>  > Garrett I missed this earlier; here from your ref in the TSO thread.
>  >
>  > If you enable both firewall_nat and natd as above, on that config you
>  > should have wound up with two of ipfw rule 50, like
>  >
>  >  50 divert 8668 ip4 from any to any via bce1
>  >  50 nat 123 ip4 from any to any via bce1
>  >
>  > but I don't think you really wanted to run natd then firewall_nat again
>  > like that?
>
> Oh, sorry .. that's not right; I quite forgot the discussions in ipfw@
> about this a while ago, until I re-browsed natd(8):
>
>          After translation by natd, packets re-enter the firewall at the rule
>          number following the rule number that caused the diversion (not the
>          next rule if there are several at the same number).
>
> so in this case only natd should be invoked and the ipfw nat skipped.
>
>  > Also I'm pretty sure you'd need to include '-f /etc/natd.conf' in your
>  > natd_flags for your redirect_port config, there's no default config file
>  > for natd (AFAIK)
>
> I think that's right - or you can specify -redirect_port in natd_flags.
>
>  > I guess rc.firewall ought to be checking that natd_enable and
>  > firewall_nat_enable aren't both YES ..
>
> .. and that becomes irrelevant, though it's still an ambiguous config.

I'll look into this more closely on Sunday when I come in to repro the
issue. Thanks for the feedback :)!
-Garrett
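
As a footnote to the point about rc.firewall checking that natd_enable and
firewall_nat_enable aren't both YES: a rough sketch of what such a check
might look like is below. It is not taken from rc.firewall or rc.subr. The
helper names (is_yes, check_nat_conflict) are hypothetical, and the set of
values treated as "enabled" is an assumption. It simply sources an
rc.conf-style file in a subshell and warns when both knobs are on.

```shell
#!/bin/sh
# Sketch only: warn when an rc.conf-style file enables both natd and
# in-kernel ipfw NAT. Both helpers are hypothetical, not part of
# rc.firewall or rc.subr.

# is_yes VALUE: succeed if VALUE looks like an enabled knob.
# (Assumption: YES/TRUE/ON/1, case-insensitive, count as enabled.)
is_yes() {
    case "$1" in
        [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1) return 0 ;;
        *) return 1 ;;
    esac
}

# check_nat_conflict FILE: return 1 and print a warning on stderr if FILE
# sets both natd_enable and firewall_nat_enable to an enabled value.
# FILE should contain a slash so that "." does not search PATH for it.
check_nat_conflict() {
    (
        # Defaults used when the file does not set a knob at all.
        natd_enable=NO
        firewall_nat_enable=NO
        # Source inside a subshell so the file's variables do not leak out.
        . "$1" || exit 2
        if is_yes "$natd_enable" && is_yes "$firewall_nat_enable"; then
            echo "warning: natd_enable and firewall_nat_enable are both set in $1" >&2
            exit 1
        fi
        exit 0
    )
}
```

Something like `check_nat_conflict /etc/rc.conf || echo "pick one NAT method"`
would flag the config quoted above. As noted in the thread, natd's re-entry
behaviour means the duplicate rule is skipped in practice, so this would only
catch the ambiguity rather than a functional bug.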