Date:      Wed, 12 Oct 2016 15:24:17 -0400
From:      Zaphod Beeblebrox <zbeeble@gmail.com>
To:        Donald Baud <donaldbaud@yahoo.com>
Cc:        "net@freebsd.org" <net@freebsd.org>
Subject:   Re: FreeBSD10.3-RELEASE. Kernel panic.
Message-ID:  <CACpH0McW4KkDbCnfL4DKc4aQiOhnuMYC0q%2B8ELJn6dtDs0HW3A@mail.gmail.com>
In-Reply-To: <86183ea5-5855-5fb3-22f6-d25454859186@yahoo.com>
References:  <CAAFYNruF4gFAiTCAhyRUQzcovW2osrKn4ehiuNR0btJCZbnOGg@mail.gmail.com> <57FC859F.5000200@grosbein.net> <CAJajdNUXOrzWDKVmSB1Xm_G6zqBhMsZ2vesDcAw2CPGFBU0xtg@mail.gmail.com> <2033449965.65391.1476244568309@mail.yahoo.com> <a450f0eb-378a-2bd5-2f24-a0eb6b941856@freebsd.org> <86183ea5-5855-5fb3-22f6-d25454859186@yahoo.com>

While my mpd5 servers are possibly less busy (I haven't had common crashes),
I have noticed a "group" of problems.

1. The carrier dropping communication (i.e., a fiber cut or L2 switch
breakage) of the L2TP streams can leave mpd5 in a state where it will not
die and will not destroy interfaces (requires a reboot to clear).
2. There are race conditions between quagga and mpd5 for adding/dropping
routes.
3. If A is a PPPoE client and B is the mpd5 server, A cannot access TCP
services on B.  It can access TCP services _beyond_ B, but not on B. (There
is a ticket open for this.)

On Wed, Oct 12, 2016 at 10:51 AM, Donald Baud via freebsd-net <
freebsd-net@freebsd.org> wrote:

>
> On 10/12/16 1:13 AM, Julian Elischer wrote:
>
>> On 11/10/2016 8:56 PM, Donald Baud via freebsd-net wrote:
>>
>>> I've been plagued with these =daily= panics until I tried the following
>>> recipes and the server has been up for 30 days so far:
>>>
>>> Normally I should experiment more to see which one of the recipes is
>>> really the fix, but I'm just glad that the server is stable for now.
>>>
>>
>> this is really great information.
>> It makes debugging a lot more possible.
>> I know it is a hard question, but do you have a way to simulate this
>> workload?
>>
>> I have no real way to simulate this kind of workload
>>
>
> Sadly, I don't have a way to simulate the workload but I am very
> interested to help fix these crashes since as Cassiano said, this makes
> mpd5/freebsd useless for pppoe/l2tp termination.
>
> At this point, I would suggest that Cassiano and Андрей confirm that they
> don't get panics when they apply the recipes that I am using.
>
> I am still running many other cisco-vpdn gateways that I would convert
> into mpd5/freebsd but my plan was stalled with the daily crashes.
> I'll wait a couple of weeks to be sure that my recipes are a valid
> workaround before converting my remaining cisco gateways to mpd5.
>
> -Dbaud
>
>
>>>
>>> recipe-1: Don't let mpd5 start automatically when server boots:
>>> i.e. in: /etc/rc.conf
>>> mpd5_enable="NO"
>>> and wait about 5 minutes after server boots then issue:
>>> /usr/local/etc/rc.d/mpd5 onestart
>>>
>>>
>>> recipe-2: recompile the kernel with the NETGRAPH_DEBUG option:
>>> options         NETGRAPH
>>> options         NETGRAPH_DEBUG
>>> options         NETGRAPH_KSOCKET
>>> options         NETGRAPH_L2TP
>>> options         NETGRAPH_SOCKET
>>> options         NETGRAPH_TEE
>>> options         NETGRAPH_VJC
>>> options         NETGRAPH_PPP
>>> options         NETGRAPH_IFACE
>>> options         NETGRAPH_MPPC_COMPRESSION
>>> options         NETGRAPH_MPPC_ENCRYPTION
>>> options         NETGRAPH_TCPMSS
>>> options         IPFIREWALL
>>>
>>> recipe-3: recompile the kernel and disable the IPv6 and SCTP options:
>>> nooptions       INET6
>>> nooptions       SCTP
>>>
>>> recipe-4: Don't use any of the sysctl optimizations
>>> in other words I commented out all values in sysctl.conf:
>>> # net.graph.maxdgram=20480  (this is the default)
>>> # net.graph.recvspace=20480  (this is the default)
>>>
>>> recipe-5: Don't use any of the loader.conf optimizations
>>> in other words I commented out all values in loader.conf
>>> # net.graph.maxdata=4096  (this is the default)
>>> # net.graph.maxalloc=4096 (this is the default)
>>>
>>> ================================
>>> In my case, I had the panics with 10.3 and 11-PRERELEASE
>>> 11.0-PRERELEASE FreeBSD 11.0-PRERELEASE #2 r305587
>>>
>>> With those recipes, I have been running without any crash for a month
>>> and counting.  That's 300 l2tp tunnels and 1400 l2tp sessions generating
>>> 700Mbit/s.
>>>
>>>
>>> -DBaud
>>>
>>>
>>> On Tuesday, October 11, 2016 7:30 AM, Cassiano Peixoto <
>>> peixotocassiano@gmail.com> wrote:
>>> Hi,
>>>
>>> There are many users complaining about this:
>>>
>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=186114
>>>
>>> I've been dealing with this issue for one year with no solution. mpd5 as
>>> a pppoe server on FreeBSD is useless with this bug.
>>>
>>> I really would like to see it working again; I think it's quite important
>>> to both the project and many users.
>>>
>>> Thanks.
>>>
>>> On Tue, Oct 11, 2016 at 3:24 AM, Eugene Grosbein <eugen@grosbein.net>
>>> wrote:
>>>
>>> 11.10.2016 11:02, Андрей Леушкин wrote:
>>>>
>>>> Hello. I have problem with "FreeBSD nas 10.3-RELEASE FreeBSD
>>>>> 10.3-RELEASE
>>>>> #0: Fri Oct  7 21:12:56 YEKT 2016 nas@nas:/usr/obj/usr/src/sys/nasv3
>>>>>    amd64"
>>>>>
>>>>> Kernel panic is repeated at intervals of 2-3 days. At first I thought
>>>>> that
>>>>> the problem is in the hardware, but the problem did not go away after
>>>>> replacing the server platform.
>>>>>
>>>>> Coredumps and more info on link
>>>>> https://drive.google.com/open?id=0BxciMy2q7ZjTTkIxem9wTE1tM2M
>>>>>
>>>>> Sorry for my english.
>>>>> I'll wait for an answer.
>>>>>
>>>>> This is a known and long-standing problem in the FreeBSD network stack.
>>>> It shows up when you have lots of network interfaces created/removed
>>>> frequently
>>>> like in your case of a Network Access Server (PPTP, PPPoE etc).
>>>>
>>>> Generally, people run into this problem using the mpd5 network daemon.
>>>> mpd5 uses the NETGRAPH kernel subsystem to process traffic and
>>>> if an interface disappears (e.g., a user disconnected)
>>>> while the kernel still processes traffic obtained from this interface, it
>>>> panics.
>>>>
>>>> There were lots of reports of this problem. No one seems to be working on
>>>> it at the moment.
>>>> You should file a PR using Bugzilla and attach your logs to it.
>>>>
>>>> Eugene Grosbein
>>>>
>>>>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
>
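
For anyone trying the quoted recipes, recipe-1, recipe-4 and recipe-5 can be
checked mechanically before starting mpd5 by hand. What follows is only a
rough sketch in POSIX sh, not something from the thread: the function names
are made up for illustration, and the file paths are passed in as arguments
(the usual locations would be /etc/rc.conf, /etc/sysctl.conf and
/boot/loader.conf). It does not look at the kernel options from recipe-2 and
recipe-3, and it does not do the 5 minute wait.

```shell
#!/bin/sh
# Hypothetical helpers (not part of mpd5 or FreeBSD) for checking that
# recipe-1, recipe-4 and recipe-5 from the thread are in effect.

# Print active (uncommented) net.graph.* settings found in the file $1.
# Succeeds only if at least one such line exists.
active_netgraph_tunables() {
    [ -r "$1" ] || return 1
    grep -E '^[[:space:]]*net\.graph\.[A-Za-z0-9_.]+[[:space:]]*=' "$1"
}

# Succeed if the rc.conf-style file $1 enables mpd5 at boot.
mpd5_starts_at_boot() {
    grep -Eiq '^[[:space:]]*mpd5_enable="?YES"?' "$1" 2>/dev/null
}

# check_recipes RC_CONF SYSCTL_CONF LOADER_CONF
# Returns 0 if the three recipes appear to be applied, 1 otherwise,
# and prints one line per problem found.
check_recipes() {
    rc_conf=$1
    sysctl_conf=$2
    loader_conf=$3
    status=0

    if mpd5_starts_at_boot "$rc_conf"; then
        echo "recipe-1: mpd5_enable is YES in $rc_conf"
        status=1
    fi

    for f in "$sysctl_conf" "$loader_conf"; do
        if active_netgraph_tunables "$f" >/dev/null; then
            echo "recipe-4/5: active net.graph tunables in $f"
            status=1
        fi
    done

    return $status
}
```

A possible use, once the box has been up for a few minutes, would be to run
check_recipes /etc/rc.conf /etc/sysctl.conf /boot/loader.conf
and only then issue /usr/local/etc/rc.d/mpd5 onestart if it returns 0.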


