Date: Sat, 2 Nov 2002 11:27:12 +0100 (CET)
From: Michal Mertl <mime@traveller.cz>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: Bill Fenner <fenner@research.att.com>, <current@freebsd.org>
Subject: Re: crash with network load (in tcp syncache ?)
Message-ID: <Pine.BSF.4.41.0211020937210.87031-100000@prg.traveller.cz>
In-Reply-To: <3DC32598.A0D0909A@mindspring.com>

On Fri, 1 Nov 2002, Terry Lambert wrote:
> Bill Fenner wrote:
> > >I think this can still crash (just like my patch); the problem is in
> > >what happens when it fails to allocate memory. Unless you set one of
> > >the flags, it's still going to panic in the same place, I think, when
> > >you run out of memory.
> >
> > No. The flags are only checked when so_head is not NULL. sonewconn()
> > was handing sofree() an inconsistent struct so (so_head was set without
> > being on either queue), i.e. sonewconn() was creating an invalid data
> > structure.
>
> You're right... I missed that; I was thinking too hard on the other
> situations (e.g. soabort()) that could trigger that code, and not
> enough on the code itself.
>
> > The call in sonewconn() used to be to sodealloc(), which didn't care
> > about whether or not the data structure was self-consistent. The code
> > was refactored to do reference counting, but the fact that the socket
> > was inconsistent at that point wasn't noticed until now.
>
> Yeah; I looked at doing a ref() of the thing as a partial fix,
> but the unref() did the sotryfree() anyway.
>
>
> > The problem is not at all based on what happens in the allocation or
> > protocol attach failure cases. The SYN cache is not involved, this is
> > a bug in sonewconn(), plain and simple.
>
> I still think there is a potential failure case, but the amount of
> code you'd have to read through to follow it is immense. It has to
> do with the connection completing at NETISR, instead of in a process
> context, in the allocation failure case. I ran into the same issue
> when trying to run connections to completion up to the accept() at
> interrupt, in the LRP case. The SYN cache case is very similar, in
> the case of a cookie that hits when there are no resources remaining.
> He might be able to trigger it with his setup, by setting the cache
> size way, way down, and thus relying on cookies, and then flooding it
> with connection requests until he runs it out of resources.
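
For anyone following along, Bill's point quoted above (the flags are only checked when so_head is not NULL, and sonewconn() set so_head without putting the socket on either queue) can be sketched with a toy model. This is only an illustration in Python, not the kernel code: the Socket class, the Panic exception, the return strings and the panic message are made up for the example, and the flag names SS_INCOMP and SS_COMP are assumed names for the "on the incomplete queue" and "on the completed queue" states.

```python
SS_INCOMP = 0x1  # assumed flag name: socket sits on head's incomplete queue
SS_COMP = 0x2    # assumed flag name: socket sits on head's completed queue


class Panic(Exception):
    """Stands in for a kernel panic in this toy model."""


class Socket:
    """Hypothetical, heavily simplified stand-in for struct socket."""

    def __init__(self):
        self.so_head = None   # listening socket this one belongs to, if any
        self.so_state = 0     # bit flags
        self.so_incomp = []   # only meaningful on a listening socket
        self.so_comp = []     # only meaningful on a listening socket


def sofree(so):
    """Toy version of the check described in the thread."""
    head = so.so_head
    if head is not None:
        # The queue flags are consulted only when so_head is set.
        if so.so_state & SS_INCOMP:
            head.so_incomp.remove(so)
            so.so_state &= ~SS_INCOMP
            so.so_head = None
        elif so.so_state & SS_COMP:
            return "deferred"  # left alone while on the completed queue
        else:
            # so_head set, but the socket is on neither queue.
            raise Panic("sofree: not queued")
    return "freed"


listener = Socket()

# The inconsistent state: so_head set, no queue flag, not on any queue.
broken = Socket()
broken.so_head = listener
try:
    sofree(broken)
except Panic as exc:
    print("panic:", exc)

# A socket with so_head == None never reaches the flag check.
print(sofree(Socket()))
```

In this model the failure depends only on the socket being handed over in an inconsistent state, not on why the caller decided to free it, which is how I read Bill's explanation.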

Do I read you correctly that Bill's patch is probably better than yours?
(I tested both, and both fix the problem.)

If you still believe there's a bug I might trigger with some setting,
please tell me. I don't know how to make syncookies kick in. I set
net.inet.tcp.cachelimit to 100, but it doesn't seem to make a
difference, though I don't really know what I'm doing :-). I imagine the
syncache doesn't grow much when I'm connecting from a single IP and
connections are quickly established. I'll be able to do some tests on
Monday; this is a computer at work.

FWIW, netstat -m during the benchmark run shows the following (I read it
as showing no problem, even just before the crash):

mbuf usage:
GEN list: 0/0 (in use/in pool)
CPU #0 list: 71/160 (in use/in pool)
CPU #1 list: 79/160 (in use/in pool)
Total: 150/320 (in use/in pool)
Maximum number allowed on each CPU list: 512
Maximum possible: 34560
Allocated mbuf types:
80 mbufs allocated to data
70 mbufs allocated to packet headers
0% of mbuf map consumed
mbuf cluster usage:
GEN list: 0/0 (in use/in pool)
CPU #0 list: 38/114 (in use/in pool)
CPU #1 list: 41/104 (in use/in pool)
Total: 79/218 (in use/in pool)
Maximum number allowed on each CPU list: 128
Maximum possible: 17280
1% of cluster map consumed
516 KBytes of wired memory reserved (37% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
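
As a sanity check on reading the output above, the summary figures can be reproduced from the per-pool counts. The sketch below assumes 256-byte mbufs and 2048-byte clusters; those sizes are not printed by netstat and are my assumption, and the function names are made up for the example. Percentages are truncated, which appears to match what netstat prints.

```python
MSIZE = 256      # assumed size of one mbuf, in bytes
MCLBYTES = 2048  # assumed size of one mbuf cluster, in bytes


def wired_kbytes(mbufs_pool, clusters_pool):
    """Memory held by both pools, in KBytes."""
    return (mbufs_pool * MSIZE + clusters_pool * MCLBYTES) // 1024


def percent_in_use(mbufs_used, mbufs_pool, clusters_used, clusters_pool):
    """Share of the wired memory that is actually in use (truncated)."""
    used = mbufs_used * MSIZE + clusters_used * MCLBYTES
    total = mbufs_pool * MSIZE + clusters_pool * MCLBYTES
    return used * 100 // total


def percent_of_map(pool, maximum):
    """Share of the maximum possible count currently in the pool (truncated)."""
    return pool * 100 // maximum


# Numbers taken from the "Total" and "Maximum possible" lines above.
print(wired_kbytes(320, 218))             # 516
print(percent_in_use(150, 320, 79, 218))  # 37
print(percent_of_map(320, 34560))         # 0
print(percent_of_map(218, 17280))         # 1
```

Under those assumptions the numbers agree with the 516 KBytes, 37%, 0% and 1% lines, so the pools look nowhere near their limits.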
--
Michal Mertl
mime@traveller.cz
