Date: Wed, 12 Oct 2016 15:24:17 -0400
From: Zaphod Beeblebrox <zbeeble@gmail.com>
To: Donald Baud <donaldbaud@yahoo.com>
Cc: "net@freebsd.org" <net@freebsd.org>
Subject: Re: FreeBSD10.3-RELEASE. Kernel panic.
Message-ID: <CACpH0McW4KkDbCnfL4DKc4aQiOhnuMYC0q%2B8ELJn6dtDs0HW3A@mail.gmail.com>
In-Reply-To: <86183ea5-5855-5fb3-22f6-d25454859186@yahoo.com>
References: <CAAFYNruF4gFAiTCAhyRUQzcovW2osrKn4ehiuNR0btJCZbnOGg@mail.gmail.com> <57FC859F.5000200@grosbein.net> <CAJajdNUXOrzWDKVmSB1Xm_G6zqBhMsZ2vesDcAw2CPGFBU0xtg@mail.gmail.com> <2033449965.65391.1476244568309@mail.yahoo.com> <a450f0eb-378a-2bd5-2f24-a0eb6b941856@freebsd.org> <86183ea5-5855-5fb3-22f6-d25454859186@yahoo.com>
While my mpd5 servers are possibly less busy (I haven't had frequent
crashes), I have noticed a "group" of problems.

1. When the carrier drops communication on the L2TP streams (i.e. a fiber
   cut or L2 switch breakage), mpd5 can be left in a state where it will
   not die and will not destroy its interfaces (a reboot is required to
   clear it).

2. There are race conditions between quagga and mpd5 when adding/dropping
   routes.

3. If A is a PPPoE client and B is the mpd5 server, A cannot access TCP
   services on B. It can access TCP services _beyond_ B, but not on B
   (there is a ticket open for this).

On Wed, Oct 12, 2016 at 10:51 AM, Donald Baud via freebsd-net <
freebsd-net@freebsd.org> wrote:

>
> On 10/12/16 1:13 AM, Julian Elischer wrote:
>
>> On 11/10/2016 8:56 PM, Donald Baud via freebsd-net wrote:
>>
>>> I've been plagued with these =daily= panics until I tried the following
>>> recipes and the server has been up for 30 days so far:
>>>
>>> Normally I should experiment more to see which one of the recipes is
>>> really the fix, but I'm just glad that the server is stable for now.
>>>
>>
>> this is really great information.
>> It makes debugging a lot more possible.
>> I know it is a hard question, but do you have a way to simulate this
>> workload?
>>
>> I have no real way to simulate this kind of workload
>>
>
> Sadly, I don't have a way to simulate the workload but I am very
> interested to help fix these crashes since as Cassiano said, this makes
> mpd5/freebsd useless for pppoe/l2tp termination.
>
> At this point, I would suggest that Cassiano and Андрей confirm that they
> don't get panics when they apply the recipes that I am using.
>
> I am still running many other cisco-vpdn gateways that I would convert
> into mpd5/freebsd but my plan was stalled with the daily crashes.
> I'll wait a couple of weeks to be sure that my recipes are a valid
> workaround before converting my remaining cisco gateways to mpd5.
>
> -Dbaud
>
>
>>>
>>> recipe-1: Don't let mpd5 start automatically when server boots:
>>> i.e. in: /etc/rc.conf
>>> mpd5_enable="NO"
>>> and wait about 5 minutes after server boots then issue:
>>> /usr/local/etc/rc.d/mpd5 onestart
>>>
>>>
>>> recipe-2: recompile the kernel with the NETGRAPH_DEBUG option:
>>> options NETGRAPH
>>> options NETGRAPH_DEBUG
>>> options NETGRAPH_KSOCKET
>>> options NETGRAPH_L2TP
>>> options NETGRAPH_SOCKET
>>> options NETGRAPH_TEE
>>> options NETGRAPH_VJC
>>> options NETGRAPH_PPP
>>> options NETGRAPH_IFACE
>>> options NETGRAPH_MPPC_COMPRESSION
>>> options NETGRAPH_MPPC_ENCRYPTION
>>> options NETGRAPH_TCPMSS
>>> options IPFIREWALL
>>>
>>> recipe-3: recompile the kernel and disable the IPv6 and SCTP options:
>>> nooptions INET6
>>> nooptions SCTP
>>>
>>> recipe-4: Don't use any of the sysctl optimizations
>>> in other words I commented out all values in sysctl.conf:
>>> # net.graph.maxdgram=20480 (this is the default)
>>> # net.graph.recvspace=20480 (this is the default)
>>>
>>> recipe-5: Don't use any of the loader.conf optimizations
>>> in other words I commented out all values in loader.conf
>>> # net.graph.maxdata=4096 (this is the default)
>>> # net.graph.maxalloc=4096 (this is the default)
>>>
>>> ================================
>>> In my case, I had the panics with 10.3 and 11-PRERELEASE
>>> 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #2 r305587
>>>
>>> With those recipes, I have been running without any crash for a month
>>> and counting. That's 300 l2tp tunnels and 1400 l2tp sessions generating
>>> 700Mbit/s.
>>>
>>>
>>> -DBaud
>>>
>>>
>>> On Tuesday, October 11, 2016 7:30 AM, Cassiano Peixoto <
>>> peixotocassiano@gmail.com> wrote:
>>> Hi,
>>>
>>> There are many users complaining about this:
>>>
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186114
>>>
>>> I've been dealing with this issue for one year with no solution.
>>> mpd5 as pppoe server on FreeBSD is useless with this bug.
>>>
>>> I really would like to see it working again, I think it's quite important
>>> to both project and many users.
>>>
>>> Thanks.
>>>
>>> On Tue, Oct 11, 2016 at 3:24 AM, Eugene Grosbein <eugen@grosbein.net>
>>> wrote:
>>>
>>>> 11.10.2016 11:02, Андрей Леушкин wrote:
>>>>
>>>>> Hello. I have problem with "FreeBSD nas 10.3-RELEASE FreeBSD
>>>>> 10.3-RELEASE
>>>>> #0: Fri Oct 7 21:12:56 YEKT 2016 nas@nas:/usr/obj/usr/src/sys/nasv3
>>>>> amd64"
>>>>>
>>>>> Kernel panic is repeated at intervals of 2-3 days. At first I thought
>>>>> that
>>>>> the problem is in the hardware, but the problem did not go away after
>>>>> replacing the server platform.
>>>>>
>>>>> Coredumps and more info on link
>>>>> https://drive.google.com/open?id=0BxciMy2q7ZjTTkIxem9wTE1tM2M
>>>>>
>>>>> Sorry for my English.
>>>>> I'll wait for an answer.
>>>>>
>>>> This is a known and long-standing problem in the FreeBSD network stack.
>>>> It shows up when you have lots of network interfaces created/removed
>>>> frequently,
>>>> as in your case of a Network Access Server (PPTP, PPPoE etc).
>>>>
>>>> Generally, people run into this problem using the mpd5 network daemon.
>>>> mpd5 uses the NETGRAPH kernel subsystem to process traffic and
>>>> if an interface disappears (e.g., a user disconnected)
>>>> while the kernel still processes traffic obtained from this interface, it
>>>> panics.
>>>>
>>>> There were lots of reports of this problem. No one seems to be working on
>>>> it at the moment.
>>>> You should file a PR using Bugzilla and attach your logs to it.
>>>>
>>>> Eugene Grosbein
>>>>
>>>>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
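
For anyone wanting to try recipe-1 from the quoted message without typing the command by hand after every boot, here is a minimal sketch of the delayed start. It assumes mpd5_enable="NO" is already set in /etc/rc.conf as described; the function name delayed_onestart and the idea of calling it from a boot-time script are my own illustration, not something from the thread, and the 300 seconds simply mirrors the "about 5 minutes" mentioned there.

```shell
#!/bin/sh
# Hypothetical helper (not part of mpd5 or the FreeBSD base system):
# wait a number of seconds, then run an rc.d script with "onestart".
# "onestart" starts a service even when its *_enable knob is "NO".
delayed_onestart() {
    delay="$1"      # seconds to wait before starting
    rc_script="$2"  # path to the rc.d script to run
    sleep "$delay"
    "$rc_script" onestart
}

# Example call (commented out so that sourcing this file has no side
# effects): wait about 5 minutes, then start mpd5 in the background.
# delayed_onestart 300 /usr/local/etc/rc.d/mpd5 &
```

Whether the delay itself is what avoids the panic is not established in the thread; this only automates the manual step.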