From owner-freebsd-current  Sat Nov  2 02:27:16 2002
Date: Sat, 2 Nov 2002 11:27:12 +0100 (CET)
From: Michal Mertl <mime@traveller.cz>
To: Terry Lambert
Cc: Bill Fenner, freebsd-current@FreeBSD.ORG
Subject: Re: crash with network load (in tcp syncache ?)
In-Reply-To: <3DC32598.A0D0909A@mindspring.com>

On Fri, 1 Nov 2002, Terry Lambert wrote:

> Bill Fenner wrote:
> > >I think this can still crash (just like my patch); the problem is in
> > >what happens when it fails to allocate memory.  Unless you set one of
> > >the flags, it's still going to panic in the same place, I think, when
> > >you run out of memory.
> >
> > No.  The flags are only checked when so_head is not NULL.  sonewconn()
> > was handing sofree() an inconsistent struct so (so_head was set without
> > being on either queue), i.e. sonewconn() was creating an invalid data
> > structure.
>
> You're right...  I missed that; I was thinking too hard about the other
> situations (e.g. soabort()) that could trigger that code, and not
> enough about the code itself.
>
> > The call in sonewconn() used to be to sodealloc(), which didn't care
> > about whether or not the data structure was self-consistent.  The code
> > was refactored to do reference counting, but the fact that the socket
> > was inconsistent at that point wasn't noticed until now.
>
> Yeah; I looked at doing a ref() of the thing as a partial fix,
> but the unref() did the sotryfree() anyway.
>
> > The problem is not at all based on what happens in the allocation or
> > protocol attach failure cases.  The SYN cache is not involved; this is
> > a bug in sonewconn(), plain and simple.
>
> I still think there is a potential failure case, but the amount of
> code you'd have to read through to follow it is immense.  It has to
> do with the connection completing at NETISR, instead of in a process
> context, in the allocation failure case.  I ran into the same issue
> when trying to run connections to completion up to the accept() at
> interrupt, in the LRP case.  The SYN cache case is very similar, in
> the case of a cookie that hits when there are no resources remaining.
> He might be able to trigger it with his setup, by setting the cache
> size way, way down, and thus relying on cookies, and then flooding it
> with connection requests until he runs it out of resources.

Do I read you correctly that Bill's patch is probably better than yours
(I tested both; both fix the problem)?  If you still believe there's a
problem (a bug) I might be able to trigger with some setting, please
tell me.
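For reference, here is a rough stand-alone model of the invariant Bill
describes being violated: so_head gets set while the socket is on neither
the incomplete nor the complete queue, and a sofree()-style release then
trips over it.  The names mirror struct socket fields, but this is only an
illustration I put together, not the actual kernel code:

    /*
     * Stand-alone model of the sonewconn()/sofree() inconsistency.
     * so_head, SQ_INCOMP and SQ_COMP mirror the kernel names; the
     * logic is simplified for illustration only.
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define SQ_INCOMP 0x1   /* on the listen socket's incomplete queue */
    #define SQ_COMP   0x2   /* on the listen socket's complete queue   */

    struct sock {
            struct sock *so_head;   /* listening socket, if any */
            int          so_qstate; /* SQ_INCOMP / SQ_COMP      */
    };

    /* Models sofree(): only legal if a child socket is actually queued. */
    static void
    sofree_model(struct sock *so)
    {
            if (so->so_head != NULL &&
                (so->so_qstate & (SQ_INCOMP | SQ_COMP)) == 0) {
                    /* This is the panic I was hitting. */
                    fprintf(stderr,
                        "panic: so_head set but socket not queued\n");
                    abort();
            }
            free(so);
    }

    int
    main(void)
    {
            struct sock head = { NULL, 0 };
            struct sock *so = calloc(1, sizeof(*so));

            so->so_head = &head;    /* sonewconn() sets so_head first... */
            /*
             * ...then pretend protocol attach failed before the socket
             * was ever put on head's queue, and cleanup calls sofree().
             */
            sofree_model(so);       /* aborts: inconsistent socket */
            return (0);
    }

The old sodealloc() path freed the memory unconditionally, which is why the
inconsistency went unnoticed until the reference-counting rework.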
I don't know how to make syncookies kick in.  I set net.inet.tcp.cachelimit
to 100, but it doesn't seem to make a difference; then again, I don't really
know what I'm doing :-).  I imagine the syncache doesn't grow much when I'm
connecting from a single IP and the connections are established quickly.
I'll be able to do some more tests on Monday - this is a computer at work.

FWIW, netstat -m during the benchmark run shows the following (I read it as
not indicating a problem, even just before the crash):

mbuf usage:
        GEN list:       0/0 (in use/in pool)
        CPU #0 list:    71/160 (in use/in pool)
        CPU #1 list:    79/160 (in use/in pool)
        Total:          150/320 (in use/in pool)
        Maximum number allowed on each CPU list: 512
        Maximum possible: 34560
        Allocated mbuf types:
          80 mbufs allocated to data
          70 mbufs allocated to packet headers
        0% of mbuf map consumed
mbuf cluster usage:
        GEN list:       0/0 (in use/in pool)
        CPU #0 list:    38/114 (in use/in pool)
        CPU #1 list:    41/104 (in use/in pool)
        Total:          79/218 (in use/in pool)
        Maximum number allowed on each CPU list: 128
        Maximum possible: 17280
        1% of cluster map consumed
516 KBytes of wired memory reserved (37% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

--
Michal Mertl <mime@traveller.cz>
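P.S. If I remember the syncache code correctly (worth verifying against
sys/netinet/tcp_syncache.c or "sysctl net.inet.tcp.syncache"), the limits
live under net.inet.tcp.syncache.* and cachelimit/bucketlimit are read-only
loader tunables, which could be why setting a sysctl at runtime seemed to
make no difference.  A minimal sketch of reading them from userland with
sysctlbyname(3) - the OID names here are my assumption, not confirmed:

    /*
     * Dump the (assumed) syncache limits and the syncookie switch.
     * Verify the OID names with "sysctl net.inet.tcp" before relying
     * on them.
     */
    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdio.h>

    static void
    show(const char *oid)
    {
            int val;
            size_t len = sizeof(val);

            if (sysctlbyname(oid, &val, &len, NULL, 0) == -1)
                    perror(oid);
            else
                    printf("%s = %d\n", oid, val);
    }

    int
    main(void)
    {
            show("net.inet.tcp.syncache.cachelimit");  /* loader tunable */
            show("net.inet.tcp.syncache.bucketlimit"); /* loader tunable */
            show("net.inet.tcp.syncookies");           /* runtime switch */
            return (0);
    }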