From owner-freebsd-stable  Sat Sep  4 15: 2: 8 1999
Delivered-To: freebsd-stable@freebsd.org
Received: from dingo.cdrom.com (castles519.castles.com [208.214.165.83])
	by hub.freebsd.org (Postfix) with ESMTP id C297015208
	for <stable@freebsd.org>; Sat,  4 Sep 1999 15:02:04 -0700 (PDT)
	(envelope-from mike@dingo.cdrom.com)
Received: from dingo.cdrom.com (LOCALHOST [127.0.0.1])
	by dingo.cdrom.com (8.9.3/8.8.8) with ESMTP id OAA07496;
	Sat, 4 Sep 1999 14:54:44 -0700 (PDT)
	(envelope-from mike@dingo.cdrom.com)
Message-Id: <199909042154.OAA07496@dingo.cdrom.com>
X-Mailer: exmh version 2.0.2 2/24/98
To: Paul Saab <paul@mu.org>
Cc: stable@freebsd.org
Subject: Re: analyzing a crash of 3.2-RELEASE 
In-reply-to: Your message of "Sat, 04 Sep 1999 16:41:12 CDT."
             <19990904164112.A47315@elvis.mu.org> 
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 04 Sep 1999 14:54:44 -0700
From: Mike Smith <mike@smith.net.au>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> I have been trying to figure out why exactly this panic is occurring
> and I am stumped.  The scenario is like this:
> 
> We currently have 40 machines to serve up small graphics for our
> web site and we are currently evaluating other solutions to retire
> the current architecture, and one of them is NT.  I for one do not
> want to let NT onto the site, so I built a 3.2-RELEASE machine
> running thttpd to replace all 40 servers, and not surprisingly,
> the machine was able to handle the entire load.  Let's just say this
> pissed off our NT people a lot and has scared them, because this
> was their attempt to get it onto the site.

8)  Paul, I'm happy to hear you're doing battle for us on this one.  
You can count on our support for this.

> Now onto the problem..  After about an hour or so in production
> (20-30 minutes if running in dual-proc) the machine panics with
> "free: multiple frees".  Below is the backtrace.  If someone can
> help me out, or point me in the right direction I'd appreciate it.
> This is really the only thing stopping us from putting it into
> production across the site.  I also looked at the commit logs and
> mailing lists and I could not find if this problem has already been
> fixed.

This looks like a race in the route handling code that was fixed a 
while back.  You can either disable path MTU discovery (which also 
prevents the massive routing table growth that you may see in your 
application) or update to 3.2-STABLE, in which I _believe_ this has 
been fixed.  You could also search the list archives for other threads 
on this problem; it was discussed at some length a while back.
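
For anyone following along, here is a minimal user-space sketch of the 
general failure mode behind a "multiple frees" panic: a shared, 
reference-counted object released once more than it was held.  This is 
not the code in route.c; every name below (shared_obj, obj_hold, 
obj_release and the demo functions) is hypothetical, and the 
assumption that the real bug has this shape is mine, not something the 
backtrace proves.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared, reference-counted object
 * (for example, an address structure shared by several routes). */
struct shared_obj {
	int refcnt;	/* number of current holders */
	int freed;	/* stand-in for the allocator's "already free" check */
};

static struct shared_obj *
obj_alloc(void)
{
	struct shared_obj *o = calloc(1, sizeof(*o));

	if (o != NULL)
		o->refcnt = 1;	/* the creator holds the first reference */
	return (o);
}

/* Every additional holder must take its own reference. */
static void
obj_hold(struct shared_obj *o)
{
	o->refcnt++;
}

/*
 * Drop one reference.  Returns 0 normally, or -1 if the object had
 * already been released (where a kernel allocator might panic instead).
 * The object is only marked as freed here, so the sketch itself never
 * touches freed memory.
 */
static int
obj_release(struct shared_obj *o)
{
	if (o->freed)
		return (-1);
	if (--o->refcnt <= 0)
		o->freed = 1;
	return (0);
}

/* Two holders, one reference: the second release is a multiple free. */
static int
demo_unbalanced_release(void)
{
	struct shared_obj *o = obj_alloc();
	int rc;

	if (o == NULL)
		return (0);
	/* Bug being illustrated: the second holder never called obj_hold(). */
	(void)obj_release(o);	/* first holder: refcnt 1 -> 0, marked freed */
	rc = obj_release(o);	/* second holder: detected, returns -1 */
	free(o);
	return (rc);
}

/* Two holders, two references: both releases succeed. */
static int
demo_balanced_release(void)
{
	struct shared_obj *o = obj_alloc();
	int rc;

	if (o == NULL)
		return (-1);
	obj_hold(o);		/* second holder takes its own reference */
	rc = obj_release(o);	/* refcnt 2 -> 1 */
	rc |= obj_release(o);	/* refcnt 1 -> 0, marked freed */
	free(o);
	return (rc);
}
```

Called from a small main(), demo_unbalanced_release() reports the 
second release as an error while demo_balanced_release() does not.  A 
race between two paths that both drop what they think is the last 
reference ends up in the same place as the missing hold shown here.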

> (kgdb) bt
> #0  boot (howto=256) at ../../kern/kern_shutdown.c:285
> #1  0xc015bab9 in panic (fmt=0xc020f3a3 "free: multiple frees")
>     at ../../kern/kern_shutdown.c:446
> #2  0xc0158863 in free (addr=0xc40b6600, type=0xc024f270)
>     at ../../kern/kern_malloc.c:333
> #3  0xc01955d2 in ifafree (ifa=0xc40b6600) at ../../net/route.c:262
> #4  0xc0195556 in rtfree (rt=0xc4ea9d00) at ../../net/route.c:236
> #5  0xc0195960 in rtrequest (req=2, dst=0xc4ea8de0, gateway=0xc4ea8df0, 
>     netmask=0x0, flags=393223, ret_nrt=0x0) at ../../net/route.c:536
> #6  0xc019a031 in in_rtqkill (rn=0xc4ea9d00, rock=0xcd7d2f74)
>     at ../../netinet/in_rmx.c:242
> #7  0xc0194d64 in rn_walktree (h=0xc409b080, f=0xc0199fe0 <in_rtqkill>, 
>     w=0xcd7d2f74) at ../../net/radix.c:956
> #8  0xc019a0de in in_rtqtimo (rock=0xc409b080) at ../../netinet/in_rmx.c:283
> #9  0xc015fea3 in softclock () at ../../kern/kern_timeout.c:124
> #10 0xc01db813 in doreti_swi ()
> #11 0x8049761 in ?? ()
> 
> paul
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message
> 

-- 
\\  The mind's the standard       \\  Mike Smith
\\  of the man.                   \\  msmith@freebsd.org
\\    -- Joseph Merrick           \\  msmith@cdrom.com

