From owner-freebsd-net  Mon Apr  2  3: 4:38 2001
Delivered-To: freebsd-net@freebsd.org
Received: from germes.levi.spb.ru (ip65.levi.spb.ru [212.119.175.65])
	by hub.freebsd.org (Postfix) with ESMTP
	id CEF4737B720; Mon,  2 Apr 2001 03:04:27 -0700 (PDT)
	(envelope-from dms@wplus.net)
Received: from wplus.net (IDENT:dms@pike.levi.spb.ru [10.246.8.43])
	by germes.levi.spb.ru (8.11.1/8.11.1) with ESMTP id f32A3b725793;
	Mon, 2 Apr 2001 14:03:38 +0400
Message-ID: <3AC84E79.12762A22@wplus.net>
Date: Mon, 02 Apr 2001 14:03:37 +0400
From: Dmitry Samersoff <dms@wplus.net>
Organization: LeviSoft
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.2.18 i686)
X-Accept-Language: en, ru
MIME-Version: 1.0
To: "Daniel O'Connor" <doconnor@gsoft.com.au>
Cc: freebsd-hackers@FreeBSD.ORG, freebsd-net@FreeBSD.ORG
Subject: Dynamic routing table (problem solved, was: server continue dies)
References: <XFMail.010327190918.doconnor@gsoft.com.au>
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-net@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

My servers were dying every 12 hours and I spent a lot of time tracking
down the problem. I hope the result of my work is of interest to the
community.

The main cause of the server failures is an overloaded dynamic routing
table (see netstat -nra | grep W3).
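To watch how full the table gets, the grep above can be turned into a
count. This is only a sketch: count_cloned_routes is a name made up for
this example, and it assumes the route flags are the third
whitespace-separated column of the netstat output.

```sh
# Count routing table rows whose flags contain "W3", reading
# `netstat -nra` output from stdin.
# Assumption: the flags are the third whitespace-separated column.
count_cloned_routes() {
    awk '$3 ~ /W3/ { n++ } END { print n + 0 }'
}

# Usage on the server itself:
#   netstat -nra | count_cloned_routes
```

Running it from cron every few minutes and logging the number should
show whether the table keeps growing before a crash.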

Another point: the same software running on a non-Intel server
(a no-name PC with an AHA SCSI controller and a DEC network card) works
without problems.


I.
The behavior of the dynamic routing table is controlled by these sysctl
variables:
 net.inet.ip.rtexpire
 net.inet.ip.rtminexpire
 net.inet.ip.rtmaxcache
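
The current values can be read with sysctl. A small sketch follows;
show_route_cache_settings and the SYSCTL_CMD override are hypothetical
names added only so the function can be dry-run on a machine that does
not have these variables.

```sh
# Print the three route-cache tunables listed above.
# SYSCTL_CMD is a hypothetical override for dry runs; by default the
# real sysctl command is used.
show_route_cache_settings() {
    for name in net.inet.ip.rtexpire \
                net.inet.ip.rtminexpire \
                net.inet.ip.rtmaxcache; do
        ${SYSCTL_CMD:-sysctl} "$name"
    done
}

# Usage on the server itself:
#   show_route_cache_settings
```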

IMHO, the default values of these variables should be changed to make
heavily loaded servers more reliable, or at least this should be
documented.

  1. net.inet.ip.rtexpire should be set to 10, not 3600, by default.

  This value slows down intranet servers a bit, but makes heavily
loaded www servers more reliable. I checked this variable on some www
servers around me and found that all the really loaded ones have
net.inet.ip.rtexpire=10.

  2. net.inet.ip.rtmaxcache should depend on maxusers.

  3. The kernel should drop the first entries of the dynamic routing
     table regardless of their age, and print an appropriate error
     message to the console if the table is overloaded.
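
Until the defaults change, the value from point 1 can be applied by
hand. This is a minimal sketch, assuming the usual sysctl -w and
/etc/sysctl.conf mechanisms are available on the system;
rtexpire_conf_line is a helper name invented for this example.

```sh
# Emit an /etc/sysctl.conf line for net.inet.ip.rtexpire.
# The value defaults to 10, the setting proposed in point 1.
rtexpire_conf_line() {
    printf 'net.inet.ip.rtexpire=%s\n' "${1:-10}"
}

# Usage (as root on the server):
#   sysctl -w net.inet.ip.rtexpire=10        # apply to the running kernel
#   rtexpire_conf_line >> /etc/sysctl.conf   # keep the value across reboots
```

Printing the line instead of editing the file directly keeps the helper
harmless to run; whether 10 suits a given server still has to be
checked under its own load.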

II.     
 I'm not sure whether my problem depends on the fxp driver,
 but it's possible.

  
Daniel O'Connor wrote:
> 
> On 27-Mar-01 Dmitry Samersoff wrote:
> >  I also have a kernel crash dump and could post it here if no one can
> >  give me a good advice without it ;-)))
> 
> If you haven't compiled the kernel with debugging symbols then you should do so..
> 
> After that get a crash dump and do..
> 
> cd /var/crash
> gdb -k kernel.0 vmcore.0
> bt
> 
> And post the output.
> 
> When you do post info like your dmesg output and hardware specs.
> 
> >  I'm terribly sorry to waste your time but this is critical problem
> >  and unfortunately I have no ideas how to solve it or at least
> >  find reason of such behavior.
> 
> It does seem odd given the machine doesn't look _too_ busy.
> 
> What sort of processes are you running on it?
> Web server, ftp server, etc?
> 
> Can you run top or ps and find out what particular processes are running at the time
> it crashes?
> 
> ---
> Daniel O'Connor software and network engineer
> for Genesis Software - http://www.gsoft.com.au
> "The nice thing about standards is that there
> are so many of them to choose from."
>   -- Andrew Tanenbaum
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

-- 
Dmitry Samersoff, dms@wplus.net, ICQ:3161705
http://devnull.wplus.net
* There will come soft rains ...