From owner-freebsd-hackers@FreeBSD.ORG Sat Aug 22 10:45:56 2009
Date: Sat, 22 Aug 2009 03:45:37 -0700
From: Brian Somers
To: Kip Macy
Message-ID: <20090822034537.76b16271@dev.lan.Awfulhak.org>
In-Reply-To: <20090821232313.21a9a7f9@dev.lan.Awfulhak.org>
References: <20090821164312.641fe2bd@dev.lan.Awfulhak.org>
 <3c1674c90908211713j36415b96q58b0ed66cc82713f@mail.gmail.com>
 <20090821215503.3eec9a15@dev.lan.Awfulhak.org>
 <20090821224134.11d9a2a1@dev.lan.Awfulhak.org>
 <20090821232313.21a9a7f9@dev.lan.Awfulhak.org>
Cc: freebsd-hackers@FreeBSD.org, Brian Somers
Subject: Re: kernel panics in in_lltable_lookup (with INVARIANTS)
List-Id: Technical Discussions relating to FreeBSD

On Fri, 21 Aug 2009 23:23:13 -0700 Brian Somers wrote:
> On Fri, 21 Aug 2009 22:41:34 -0700 Brian Somers wrote:
> > On Fri, 21 Aug 2009 21:55:03 -0700 Brian Somers wrote:
> > > On Fri, 21 Aug 2009 17:13:45 -0700 Kip Macy wrote:
> > > > Try this:
> > > >
> > > > Index: sys/net/flowtable.c
> > > > ===================================================================
> > > > --- sys/net/flowtable.c (revision 196382)
> > > > +++ sys/net/flowtable.c (working copy)
> > > > @@ -688,6 +688,12 @@
> > > >         struct rtentry *rt = ro->ro_rt;
> > > >         struct ifnet *ifp = rt->rt_ifp;
> > > >
> > > > +       if (ifp->if_flags & IFF_POINTOPOINT) {
> > > > +               RTFREE(rt);
> > > > +               ro->ro_rt = NULL;
> > > > +               return (ENOENT);
> > > > +       }
> > > > +
> > > >         if (rt->rt_flags & RTF_GATEWAY)
> > > >                 l3addr = rt->rt_gateway;
> > > >         else
> > > >
> > > > You'll need to apply this by hand as gmail munges the formatting.
> > > >
> > > > -Kip
> > >
> > > Hi,
> > >
> > > That certainly stops the panic, however data routed to the tun
> > > interface doesn't come out the back end and data written
> > > to the back end doesn't come out the tun interface.
> > [.....]
> > > Maybe this problem isn't a routing problem. I'll
> > > look into it further and figure out if the packet is getting to the tun
> > > driver and if so, what it thinks it's doing with it.
> >
> > I wasn't correct - the data *IS* being read out of the back of
> > the tunnel device. When I send the ICMP, it goes into the tun
> > device and comes out the back end as an AF_LINK packet. ppp
> > silently discards this (ironically I have a comment noting
> > that I should really track unidentified packet counts).
> >
> > I'll try to figure out what in if_tun.c is corrupting the family next...
>
> if_tun.c is fine. The data passed from if_output() has family
> AF_LINK - hence the original panic from flowtable_lookup().
>
> So the question is "why is ip_output() sending AF_LINK traffic
> instead of AF_INET traffic?".
>
> Still looking....

From what I can tell, this is what is happening:

ip_output() is called with ro == NULL.
ip_output() calls flowtable_lookup() with a zeroed 'ro'.
flowtable_lookup() calls ft->ft_rtalloc() (really rtalloc1_fib()) to
initialise 'ro' and ends up with ro->ro_rt->rt_gateway->sa_family
set to AF_LINK.

Your original patch frees ro->ro_rt and fails before calling
llentry_update() with ro->ro_rt->rt_gateway->sa_family != AF_INET.

Now, when flowtable_lookup() fails, ro->ro_rt is NULL and ip_output()'s
'dst' gets set up with family AF_INET. Unfortunately, right after this,
after checking for IP_SENDONES, IP_ROUTETOIF and IN_MULTICAST, the
ip_output() code decides to call in_rtalloc_ign() (which eventually just
calls rtalloc1_fib()) to initialise ro->ro_rt and then sets dst to be
ro->ro_rt->rt_gateway -- which is *still* an AF_LINK address!
Finally ip_output() calls ifp->if_output() (really tunoutput()) with
dst's family set to AF_LINK, tunoutput() queues it to the tun character
device, ppp reads it and drops it on the floor 'cos it doesn't know what
to do with AF_LINK.

The tun driver is more or less the same as the -stable version, so it
seems that ip_output() is to blame. The only relevant part that seems
substantially different is rtalloc1_fib(), so right now I'm guessing that
the RTF_CLONING code in -stable always clones the route with a gw family
of AF_INET and expectations are met after that.

I'll look some more on the weekend...

> > > > On Fri, Aug 21, 2009 at 16:43, Brian Somers wrote:
> > > > > Hi,
> > > > >
> > > > > I've been working on a fix to address an issue that came up with
> > > > > our update of openssh-5. The issue is that openssh-5 now uses
> > > > > pipe() to create stdin/stdout channels between sshd and the server
> > > > > side program where it used to use socketpair(). Because it uses
> > > > > pipe(), stdin is no longer bi-directional and cannot be used for both
> > > > > input and output by a child process. This breaks the use of ssh
> > > > > as a tunnel with ppp on either end (set device "!ssh -e none host
> > > > > ppp -direct label")
> > > > >
> > > > > I talked with des@ for a while and then with the openssh folks and
> > > > > have not been able to resolve the issues in openssh that made them
> > > > > choose to enforce the use of pipe() over socketpair(). I now have a
> > > > > patch to ppp that makes ppp detect that it's connected via pipe() and
> > > > > causes it to use stdin for input and stdout for output (usually it expects
> > > > > just one descriptor).
> > > > > Although I'm happy with the patch and planned on
> > > > > requesting permission to commit, I've bumped into a show-stopper
> > > > > that seems unrelated, so I thought I'd ask here if anyone has seen
> > > > > this or has any suggestions as to what the problem might be.
> > > > >
> > > > > The issue....
> > > > >
> > > > > I'm seeing a panic when I send traffic through a ppp link:
> > > > >
> > > > > panic string is: sin_family 18
> > > > > Stack trace starts:
> > > > >    in_lltable_lookup()
> > > > >    llentry_update()
> > > > >    flowtable_lookup()
> > > > >    ip_output()
> > > > >    ....
> > > > >
> > > > > The panic is due to a KASSERT in in_lltable_lookup() that expects the
> > > > > sockaddr to be AF_INET. Number 18 is AF_LINK.
> > > > >
> > > > > AFAICT this is happening while setting up a temporary route for the
> > > > > first outbound packet. I haven't been able to do much investigation
> > > > > yet due to other patches in my tree that seem to have broken all my
> > > > > kernel symbols, but once I get a clean rebuild I should be back in
> > > > > business.
> > > > >
> > > > > If anyone has any suggestions, I'm all ears!
> > > > >
> > > > > Cheers.

-- 
Brian Somers
Don't _EVER_ lose your sense of humour !