From owner-freebsd-current Fri Nov 1 13:24: 2 2002
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 649C037B401
	for ; Fri, 1 Nov 2002 13:23:59 -0800 (PST)
Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id BC7FE43E77
	for ; Fri, 1 Nov 2002 13:23:58 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0005.cvx22-bradley.dialup.earthlink.net ([209.179.198.5] helo=mindspring.com)
	by avocet.mail.pas.earthlink.net with esmtp (Exim 3.33 #1)
	id 187jGg-0000HF-00; Fri, 01 Nov 2002 13:23:50 -0800
Message-ID: <3DC2F094.D9C117DA@mindspring.com>
Date: Fri, 01 Nov 2002 13:22:28 -0800
From: Terry Lambert
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Michal Mertl
Cc: current@freebsd.org
Subject: [PATCH]Re: crash with network load (in tcp syncache ?)
References:
Content-Type: multipart/mixed; boundary="------------D2E3FDA2D4CFBE7C101BE3E0"
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

This is a multi-part message in MIME format.
--------------D2E3FDA2D4CFBE7C101BE3E0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Michal Mertl wrote:
> I'm getting panics on SMP -CURRENT while running apachebench (binary ab
> from apache distribution, not the Perl one) against httpd on the machine.
>
> The panics don't occur when I have WITNESS and INVARIANTS turned on.

[ ...
]

> #10 0xc01bd46f in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:503
> #11 0xc01f7e1e in sofree (so=0xc58f05d0) at
> /usr/src/sys/kern/uipc_socket.c:312
> #12 0xc01fa508 in sonewconn (head=0xc43874d8, connstatus=2)
>     at /usr/src/sys/kern/uipc_socket2.c:208
> #13 0xc023f42f in syncache_socket (sc=0x2, lso=0xc43874d8, m=0xc1662200)
>     at /usr/src/sys/netinet/tcp_syncache.c:564
> #14 0xc023f748 in syncache_expand (inc=0xd6a62b3c, th=0xc1f6c834,
>     sop=0xd6a62b10, m=0xc1662200)
>     /usr/src/sys/netinet/tcp_syncache.c:783
> #15 0xc0239978 in tcp_input (m=0xc1f6c834, off0=20)
>     at /usr/src/sys/netinet/tcp_input.c:713

soreserve() is called to reserve mbufs for the socket, it in turn calls
sbreserve(), and this fails because you have too few mbufs in your
system for the number of connections you have configured.

This is a problem because the sotryfree() in sonewconn() (see the
definition in sys/socketvar.h) sees a so_count of zero, and calls
sofree() directly. The sofree() then panics because the socket is not
enqueued as an incomplete connection, and not enqueued as a complete
connection (it is not on a queue, and so_state has neither the
SS_INCOMP nor the SS_COMP flag set).

Basically, this code does not expect to be called in this case, and
the call occurs because the SYN cache code runs at NETISR.

Personally, I do not understand why a prereservation for mbufs is
necessary in this particular case: if you are out of mbufs, the
packets should end up dropped in any case, so it should not matter.
I guess it's an attempt to "protect you" from massive connection
attempts acting as a denial of service attack.

One "fix" would be to reference the socket before making the call, in
syncache_socket(). The basically correct way to do this would be to
invert the order of the "if" test in sonewconn() (see attached patch).
This can also fail, though: if the protocol attach fails, then it will
still panic.
Also, if the protocol attach doesn't fail, and there's an soabort(),
then if the protocol detach fails, it will still call sotryfree() in
the abort... and, once again, panic.

My suggestion:

1)	Try the attached patch; it will probably cover up the problem
	for you.

2)	Make sure the number of connections you allow is no larger
	than the number of mbufs, divided by 2, divided by the number
	of mbufs you have set in net.inet.tcp.recvspace (i.e.: Do Not
	Overcommit Mbufs).

3)	Disable the use of "SYN cookies", e.g.:

		sysctl net.inet.tcp.syncookies=0

	SYN cookies are incredibly evil, and will put pressure on your
	resources by drastically increasing pool retention time, if
	they end up being invoked.

-- Terry
--------------D2E3FDA2D4CFBE7C101BE3E0
Content-Type: text/plain; charset=us-ascii; name="uipc.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="uipc.diff"

Index: uipc_socket2.c
===================================================================
RCS file: /cvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.104
diff -c -r1.104 uipc_socket2.c
*** uipc_socket2.c	18 Sep 2002 19:44:11 -0000	1.104
--- uipc_socket2.c	1 Nov 2002 17:16:39 -0000
***************
*** 203,210 ****
  #ifdef MAC
  	mac_create_socket_from_socket(head, so);
  #endif
! 	if (soreserve(so, head->so_snd.sb_hiwat, head->so_rcv.sb_hiwat) ||
! 	    (*so->so_proto->pr_usrreqs->pru_attach)(so, 0, NULL)) {
  		sotryfree(so);
  		return ((struct socket *)0);
  	}
--- 203,210 ----
  #ifdef MAC
  	mac_create_socket_from_socket(head, so);
  #endif
! 	if ((*so->so_proto->pr_usrreqs->pru_attach)(so, 0, NULL) ||
! 	    soreserve(so, head->so_snd.sb_hiwat, head->so_rcv.sb_hiwat)) {
  		sotryfree(so);
  		return ((struct socket *)0);
  	}

--------------D2E3FDA2D4CFBE7C101BE3E0--
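
Suggestion 2 above amounts to a small piece of arithmetic. Here is a
minimal sketch of it in C, assuming the mbuf pool and the per-socket
reservation are counted in the same units. max_safe_connections() and
its parameters are hypothetical names used for illustration only, not
kernel interfaces, and the buffer size passed in is an assumption the
caller has to supply.

```c
#include <stddef.h>

/*
 * Hypothetical helper (NOT a FreeBSD kernel interface): upper bound on
 * the number of connections under the "do not overcommit mbufs" rule.
 *
 *   total_mbufs     - size of the mbuf pool
 *   recvspace_bytes - value of net.inet.tcp.recvspace, in bytes
 *   mbuf_bytes      - assumed usable bytes per buffer (caller's guess)
 *
 * Returns 0 when the inputs make the bound meaningless.
 */
size_t
max_safe_connections(size_t total_mbufs, size_t recvspace_bytes,
    size_t mbuf_bytes)
{
	size_t mbufs_per_socket;

	if (mbuf_bytes == 0)
		return (0);

	/* Round up: a partially used buffer still costs a whole one. */
	mbufs_per_socket = (recvspace_bytes + mbuf_bytes - 1) / mbuf_bytes;
	if (mbufs_per_socket == 0)
		return (0);

	/* Number of mbufs, divided by 2, divided by the per-socket share. */
	return (total_mbufs / 2 / mbufs_per_socket);
}
```

For example, with a hypothetical pool of 65536 buffers, a recvspace of
57344 bytes and an assumed 2048 bytes per buffer, each socket accounts
for 28 buffers and max_safe_connections(65536, 57344, 2048) gives 1170.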