From owner-freebsd-stable Tue Mar 28 21:13:38 2000
Delivered-To: freebsd-stable@freebsd.org
Received: from fw.wintelcom.net (ns1.wintelcom.net [209.1.153.20])
	by hub.freebsd.org (Postfix) with ESMTP id D456D37B95B
	for ; Tue, 28 Mar 2000 21:13:30 -0800 (PST)
	(envelope-from bright@fw.wintelcom.net)
Received: (from bright@localhost)
	by fw.wintelcom.net (8.10.0/8.10.0) id e2T5bsL26138;
	Tue, 28 Mar 2000 21:37:54 -0800 (PST)
Date: Tue, 28 Mar 2000 21:37:54 -0800
From: Alfred Perlstein
To: gerti-freebsds@bitart.com
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: Random signal 9 (SIGKILL), please help!
Message-ID: <20000328213754.L21029@fw.wintelcom.net>
References: <20000329041104.3028.qmail@camelot.bitart.com> <20000328204948.K21029@fw.wintelcom.net> <20000329043747.3094.qmail@camelot.bitart.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: <20000329043747.3094.qmail@camelot.bitart.com>; from gerti@bitart.com on Tue, Mar 28, 2000 at 10:37:46PM -0600
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

* Gerd Knops [000328 21:03] wrote:
> Alfred Perlstein wrote:
> > * Gerd Knops [000328 20:36] wrote:
> > > Only on the FreeBSD systems I see that child processes occasionally
> > > get killed by a signal 9, and I just can't figure out why.
> > >
> > > Syslog does not give any indication. The machines do not swap (I
> > > know processes may get killed when the systems run out of swap
> > > space). The times at which the processes are killed seem to
> > > be random, meaning it does not seem to be some housekeeping code
> > > that causes it.
> > >
> > > The processes are spawned from various daemons, and are killed
> > > at different points in their existence, even when just barely
> > > started and no resources to mention are consumed yet.
> > >
> > > All processes run as root, so 'limit' should not be the cause.
> > >
> > > Is there anything else but the swapper that can trigger a 'signal
> > > 9' to be sent to processes?
> > >
> > > The systems in question run a variety of versions, starting from
> > > 3.2 Release to a fairly recent (4 weeks) 3.4 stable.
> > >
> > This is on all the FreeBSD systems? This is really confusing; I've
> > _never_ heard of this happening. Do you have any machines built
> > with the same _exact_ hardware exhibiting the same problems or not?
> >
> Nope, different hardware, all Intel CPUs, some Pentium Pro, some
> Pentium II, ASUS and Gigabyte motherboards.
>
> > Have you tried 4.0? Without some sample code this is going to
> > be very hard to reproduce.
> >
> The code is >50k lines of perl... No, I have not tried 4.0 yet. And
> I cannot reproduce the problem either; it just randomly appears
> at a very low rate. 23 machines running FreeBSD, and I see about 1
> to 3 of those a day.
>
> > Are you sure you aren't running out of process slots? What is
> > maxusers set to in the kernel?
>
> 64.

Try maybe 128?

> > How many processes typically run at the same time?
> >
> Varying, the busiest machine peaks at about 100 processes, but I
> have seen it on machines running only 50 processes.
>
> Thanks for responding!

I've never heard of signal 9 "by accident", and since this problem
happens on a variety of 3.x systems, 3.2-3.4 (3.4-stable also?), it
seems really weird. I don't think I can be of very much help without
access to the code and the machines running it, and without knowing
how it is being run. apache+cgi?

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message