Date: Wed, 10 Jan 2007 20:21:07 +0100
From: Thomas Herrlin <junics-fbsdstable@atlantis.maniacs.se>
To: freebsd-stable@freebsd.org
Cc: "Bruce A. Mah" <bmah@freebsd.org>
Subject: Re: FreeBSD 6.2-RC2 Available - networking zoneli freeze problem still exist.
Message-ID: <45A53CA3.7070302@atlantis.maniacs.se>
In-Reply-To: <45A3BB4E.3@freebsd.org>
References: <1167247246.96863.23.camel@opus.cse.buffalo.edu> <4593DC34.2030308@atlantis.maniacs.se> <1167320870.52842.20.camel@opus.cse.buffalo.edu> <4593E790.8080509@delphij.net> <45A3BB4E.3@freebsd.org>
Bruce A. Mah wrote:
> If memory serves me right, LI Xin wrote:
>> Ken Smith wrote:
>>> On Thu, 2006-12-28 at 16:01 +0100, Thomas Herrlin wrote:
>>>> It still runs networking daemons into a frozen zoneli state on
>>>> heavy/(D)DOS network loads. Such processes cant be kill-9ed so there is
>>>> no way to recover from it. (think frozen sshd and a very remote/headless
>>>> server).
>>>> See the stress test panic called 'Ran out of "128 Bucket"
>>>> <http://people.FreeBSD.org/%7Epho/stress/log/cons210.html>' on the 6.2
>>>> todo list and my own latest test here:
>>>> http://www.maniacs.se/~junics/temp/vmstat-z.txt
>>>> This test was on a new 6.2-RC2 install with no zone limit tweaks nor any
>>>> sbsize limits in /etc/login.conf.
>>>> I just made a vm disk image with replication instructions, however Peter
>>>> Holm have replicated it with his own tools so i have not bothered with
>>>> it until now.
>>> That problem is being worked on but won't be fixed for 6.2-REL.
>>> Depending on how complex the fix winds up being it may be an Errata
>>> candidate when the time comes.
>> Perhaps we should mention some known workarounds in the errata
>> documentation. E.g. raising nmbclusters limit, etc.?
>
> That's a good idea. Do you have more specifics (e.g. any particular
> nmbclusters value, other workarounds, etc.)?
>
> Thanks,
>
> Bruce.
>

According to my tests, the most reliable way to avoid zoneli is to set an
sbsize limit in /etc/login.conf to a value lower than the mbuf_cluster zone
size limit. Note that each cluster is 2048 bytes (see vmstat -z for details).

Alternatively, set the login.conf sbsize to a fraction of available RAM and
combine this with the 0/unlimited setting that some recommend. Combining
these two workarounds would probably be best, since letting mbufs use
unlimited RAM for networking would cause a panic or freeze sooner or later
anyway. I have not tested the combination yet, as my system has been running
stable for some time now with my current workarounds.
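
To make the sizing rule above concrete, here is a minimal Python sketch of the
arithmetic. It assumes the cluster limit is the value reported by
`sysctl kern.ipc.nmbclusters`. The figure 25600 and the 0.5 safety fraction
are illustrative assumptions, not values taken from the tests described in
this message.

```python
# Sketch: derive an sbsize ceiling from the mbuf cluster zone limit.
# The only figure taken from the message is 2048 bytes per cluster.

MBUF_CLUSTER_BYTES = 2048  # bytes per mbuf cluster, as stated in the message


def cluster_zone_bytes(nmbclusters: int) -> int:
    """Total bytes the mbuf cluster zone can hold at its limit."""
    if nmbclusters <= 0:
        # 0 means "unlimited"; there is no zone limit to stay below,
        # so sbsize would have to be sized against RAM instead.
        raise ValueError("nmbclusters must be a positive limit")
    return nmbclusters * MBUF_CLUSTER_BYTES


def suggested_sbsize(nmbclusters: int, fraction: float = 0.5) -> int:
    """An sbsize strictly below the zone limit (fraction is hypothetical)."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1, exclusive")
    return int(cluster_zone_bytes(nmbclusters) * fraction)


if __name__ == "__main__":
    limit = 25600  # illustrative value, not a measured one
    print("zone limit in bytes:", cluster_zone_bytes(limit))
    print("login.conf capability: :sbsize=%d:" % suggested_sbsize(limit))
```

The printed capability would go into the relevant login class in
/etc/login.conf, followed by `cap_mkdb /etc/login.conf` so the change takes
effect. Which fraction is actually safe depends on the load and was not
measured here.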
Problems with the sbsize limit:
Setting sbsize in login.conf means that, in some extreme cases, some
processes will be unable to allocate socket buffers. However, this does not
affect overall system stability, which is my first priority.

I have also thrown together a small executable that attempts a local
connection to the machine's sshd, including the preliminary ssh handshake,
and that can be used with the watchdogd -e parameter to reboot the box. This
is mainly for headless/remote servers that MUST NOT have their sshd frozen.

You can also read my mail to the fbsd-current list with the subject
"Re: zonelimit livelock, some possable workarounds".

/Thomas Herrlin
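
For illustration, the sshd probe described in the message could look roughly
like the Python sketch below. This is not the author's executable, which is
not included in the mail. It assumes sshd listens on 127.0.0.1 port 22 and
treats the server's identification line (the one starting with "SSH-") as
the preliminary handshake. The function names and the 5 second timeout are
hypothetical.

```python
# Sketch of a liveness probe for a local sshd, meant for "watchdogd -e".
# watchdogd only resets the watchdog when the command exits with status 0.
import socket


def sshd_alive(host: str = "127.0.0.1", port: int = 22,
               timeout: float = 5.0) -> bool:
    """Return True if the server sends an SSH identification line in time."""
    data = b""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Read until an identification line shows up, the peer closes,
            # or a small size cap is reached.
            while len(data) < 4096:
                if any(l.startswith(b"SSH-") for l in data.splitlines()):
                    return True
                chunk = sock.recv(256)
                if not chunk:
                    break
                data += chunk
    except OSError:
        # Covers refused connections and timeouts (a frozen sshd).
        return False
    return any(l.startswith(b"SSH-") for l in data.splitlines())


def main() -> int:
    """Exit status for watchdogd: 0 means sshd answered, 1 means it did not."""
    return 0 if sshd_alive() else 1

# When saved as a script, finish with: raise SystemExit(main())
```

Run through watchdogd -e, a probe like this stops the watchdog from being
reset once sshd no longer answers, so the box reboots after the watchdog
timeout expires. Whether a daemon stuck in zoneli still accepts the TCP
connection was not checked here; the timeout is meant to cover both cases.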