FreeBSD Mail Archives

Date:      Wed, 10 Jan 2007 20:21:07 +0100
From:      Thomas Herrlin <junics-fbsdstable@atlantis.maniacs.se>
To:        freebsd-stable@freebsd.org
Cc:        "Bruce A. Mah" <bmah@freebsd.org>
Subject:   Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.
Message-ID:  <45A53CA3.7070302@atlantis.maniacs.se>
In-Reply-To: <45A3BB4E.3@freebsd.org>
References:  <1167247246.96863.23.camel@opus.cse.buffalo.edu>	<4593DC34.2030308@atlantis.maniacs.se>	<1167320870.52842.20.camel@opus.cse.buffalo.edu>	<4593E790.8080509@delphij.net> <45A3BB4E.3@freebsd.org>


Bruce A. Mah wrote:
> If memory serves me right, LI Xin wrote:
>> Ken Smith wrote:
>>> On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
>>>> It still runs networking daemons into a frozen zoneli state under
>>>> heavy/(D)DoS network loads. Such processes can't be killed with
>>>> kill -9, so there is no way to recover from it (think frozen sshd
>>>> and a very remote/headless server).
>>>> See the stress test panic called 'Ran out of "128 Bucket"
>>>> <http://people.FreeBSD.org/%7Epho/stress/log/cons210.html>' on the 6.2
>>>> todo list and my own latest test here:
>>>> http://www.maniacs.se/~junics/temp/vmstat-z.txt
>>>> This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
>>>> sbsize limits in /etc/login.conf.
>>>> I just made a vm disk image with replication instructions; however,
>>>> Peter Holm has replicated it with his own tools, so I have not
>>>> bothered with it until now.
>>> That problem is being worked on but won't be fixed for 6.2-REL.
>>> Depending on how complex the fix winds up being it may be an Errata
>>> candidate when the time comes.
>> Perhaps we should mention some known workarounds in the errata
>> documentation.  E.g. raising nmbclusters limit, etc.?
> 
> That's a good idea.  Do you have more specifics (e.g. any particular
> nmbclusters value, other workarounds, etc.)?
> 
> Thanks,
> 
> Bruce.
> 

According to my tests, the most reliable way of avoiding zoneli is to
set an sbsize limit in /etc/login.conf to a value lower than the
mbuf_cluster zone size limit. Note that each cluster is 2048 bytes
(see vmstat -z for details).
Alternatively, set the login.conf sbsize to a fraction of available RAM
and combine this with the 0/unlimited setting that some recommend.
Combining these two workarounds would probably be best, as letting mbufs
use unlimited RAM for networking would cause a panic or freeze sooner
or later anyway. I have not tested the combination yet, as my system has
been running stably for some time now with my current workarounds.
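
To make the sizing rule above concrete, here is a rough sketch of the
arithmetic in Python. The function name, the cluster limit of 25600 and
the safety fraction of 0.5 are illustrative assumptions only, not values
from my tests; take the real limit for the mbuf_cluster zone from
vmstat -z on your own system.

```python
CLUSTER_BYTES = 2048  # bytes per mbuf cluster, as noted in the text


def sbsize_cap_bytes(cluster_limit: int, fraction: float = 0.5) -> int:
    """Return an sbsize value (in bytes) below the mbuf cluster zone size.

    cluster_limit: the zone limit (number of clusters) read from vmstat -z.
    fraction: hypothetical safety factor; must be strictly between 0 and 1
              so the result stays lower than the zone size.
    """
    if cluster_limit <= 0:
        raise ValueError("cluster_limit must be a positive cluster count")
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    zone_bytes = cluster_limit * CLUSTER_BYTES
    return int(zone_bytes * fraction)


# Illustrative numbers only: 25600 clusters * 2048 bytes = 52428800 bytes,
# and half of that is 26214400 bytes.
print(sbsize_cap_bytes(25600, 0.5))
```

The printed number is what would go into the sbsize capability of the
relevant login class in /etc/login.conf. Which fraction is safe depends
on the workload, so treat 0.5 as a starting point rather than a
recommendation.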

Problems with the sbsize limit:
With sbsize set in login.conf, some processes will be unable to
allocate socket buffers in extreme cases. However, this does not affect
overall system stability, which is my first priority.

I have also thrown together a small executable that attempts a local
connection to the box's own sshd, including the preliminary ssh
handshake. It can be used with the watchdogd -e parameter to reboot the
box. This is mainly for headless/remote servers that MUST NOT have their
sshd frozen.
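
The following is not that executable, just a minimal Python sketch of
the same idea under some assumptions: sshd listens on 127.0.0.1 port 22,
the identification string ("SSH-...") is the first data the server
sends, and a 5 second timeout is acceptable. The function names are
made up for illustration.

```python
import socket


def sshd_responds(host: str = "127.0.0.1", port: int = 22,
                  timeout: float = 5.0) -> bool:
    """Return True if the server sends an SSH identification string in time."""
    try:
        # A frozen sshd either never accepts or never sends its banner,
        # so both the connect and the read are bounded by the timeout.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            banner = conn.recv(255)
    except OSError:
        # Covers refused connections and timeouts alike.
        return False
    return banner.startswith(b"SSH-")


def main() -> int:
    """Exit status for a watchdog check: 0 if sshd answered, 1 otherwise."""
    return 0 if sshd_responds() else 1
```

A script built on this would finish with sys.exit(main()), so that the
command given to watchdogd -e returns non-zero when sshd does not
answer. This only reads the banner and does not go further into the
handshake, so it is a weaker check than a full key exchange.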

You can also read my mail to the freebsd-current list with the subject
"Re: zonelimit livelock, some possable workarounds".

/Thomas Herrlin
