From: Terry Lambert
Date: Mon, 24 Feb 2003 13:51:40 -0800
To: G-der
Cc: freebsd-hackers@freebsd.org
Subject: Re: Properly reaping children from a fork()
Message-ID: <3E5A93EC.E8083957@mindspring.com>
References: <20030224183953.GB80651@gder.net>

G-der wrote:
> This is a first attempt for me but I seem to have problems when it comes
> to ensuring that all the children exit like they should. What happens is
> that each child process remains in a zombied state (as seen through ps).
> Also if you check sockstat you can see that each zombied process still has
> a connection open.

Unless you explicitly close the connection, it's open. In the case of
a zombie, though, there is a zombie status structure, and the files are
in fact closed.
Probably what you are seeing is that the TCP connection is not being
properly closed by the client, or the connection rate is high enough
that the 2MSL timer has not expired by the time you look, so the
connections still appear open.

> I'm not sure if the problem is how I am trying to close the socket or if
> it is with my signal handling. I'm currently reading through intro(2) to
> see if there is something simple I've missed.

Who is closing the socket, the client or the server?

If the client is closing the socket, it must call shutdown(2) on the
socket, and then explicitly call close(2). This is because certain user
space TCP implementations, such as those in Windows, do not have proper
resource tracking for sockets, which are implemented in user space
instead of in kernel space.

There are also a number of "test" applications which are nothing more
than "SYN-guns", i.e. they do not implement the full handshake, and
their only purpose is to connection-load the server. Many of these send
RST and then drop all knowledge of the connection. Over time, this will
result in a large number of outstanding open sockets on the server:
RST packets do not require ACK'ing, so they do not time out and repeat,
and if one gets lost, it is lost forever. This can happen even on a
local wire, for example if you get a collision.

> /* The commented code was supposed to reap the children in but it didn't
>    the call to signal() is supposed to do the same thing, but it doesn't */
>
> /* sa.sa_handler = sigchld_handler;
>    sigemptyset(&sa.sa_mask);
>    sa.sa_flags = SA_RESTART;
>    if (sigaction(SIGCHLD, &sa, NULL) == -1) {
>        perror("main:sigaction");
>        exit(-1);
>    } */
>
> signal(SIGCHLD, sigchld_handler);

This is bogus. You don't want SA_RESTART behaviour here, since it's not
like you have to worry about a system call. Calling the old signal()
code overrides pretty much everything you do with sigaction() in any
case.
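
For illustration only, here is a minimal sketch of installing a reaping
handler through sigaction() alone, without SA_RESTART and without a
later signal() call. It assumes POSIX sigaction() and waitpid(); the
name install_sigchld_reaper is hypothetical and does not come from the
original program.

```c
#define _POSIX_C_SOURCE 200809L  /* POSIX declarations under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Reap every child that has already exited, without ever blocking. */
static void sigchld_handler(int signo)
{
    int saved_errno = errno;    /* waitpid() may overwrite errno */

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;                       /* several exits can share one SIGCHLD */
    errno = saved_errno;
}

/*
 * Hypothetical helper: install the handler with sigaction() only.
 * Returns 0 on success, -1 on error (errno set by sigaction()).
 */
int install_sigchld_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;            /* no SA_RESTART */
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Without SA_RESTART, a blocking call such as accept(2) can fail with
EINTR when a child exits, so the calling loop has to check for that
and retry.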
Your main() exits, instead of hanging in wait() or sleep() and looping
forever in order to reap connections.

Your fork() returns the wrong way to main() (i.e. child exits and parent
stays around forever), so you have some identity confusion.

NB: An alternate way of setting up automatic reaping, since your handler
seems to not care about exit status, is to set SIG_IGN as the handler
for SIGCHLD.

Actually, you'd do well to get a copy of "UNIX Network Programming" by
Stevens, or, minimally, download the example source code from the
publisher's web site. It would help you out by providing you with
working "simple server" example source code.

-- Terry
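
As a rough sketch of the SIG_IGN approach, combined with the
conventional forking-server arrangement in which the parent stays in its
accept loop and each child serves one connection and then calls _exit():
the names ignore_sigchld, serve_client and spawn_for_connection are
hypothetical, and serve_client is only a stand-in for real
per-connection work. It assumes a system where ignoring SIGCHLD
suppresses zombies, as POSIX.1-2001 specifies.

```c
#define _POSIX_C_SOURCE 200809L  /* POSIX declarations under strict -std= modes */
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ask the kernel to discard child exit status so no zombies are left. */
int ignore_sigchld(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical placeholder for the real per-connection work. */
static void serve_client(int fd)
{
    static const char msg[] = "hello\n";

    if (write(fd, msg, sizeof(msg) - 1) < 0)
        return;                 /* peer already gone; nothing more to do */
}

/*
 * Fork one child for an accepted connection. The child serves it,
 * shuts the socket down, closes it, and exits without returning to
 * the caller. The parent drops its copy of conn_fd and gets back the
 * child's pid, or -1 if fork() failed.
 */
pid_t spawn_for_connection(int listen_fd, int conn_fd)
{
    pid_t pid = fork();

    if (pid == 0) {
        close(listen_fd);       /* the child never accepts */
        serve_client(conn_fd);
        shutdown(conn_fd, SHUT_RDWR);
        close(conn_fd);
        _exit(0);               /* never fall back into the parent's loop */
    }
    close(conn_fd);             /* parent, or failed fork: drop our reference */
    return pid;
}
```

The parent would call ignore_sigchld() once at startup and then call
spawn_for_connection() for each descriptor returned by accept(2),
looping forever rather than returning from main().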