From owner-freebsd-current Tue Nov 10 19:02:38 1998
Return-Path:
Received: (from majordom@localhost)
        by hub.freebsd.org (8.8.8/8.8.8) id TAA27331
        for freebsd-current-outgoing; Tue, 10 Nov 1998 19:02:38 -0800 (PST)
        (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from allegro.lemis.com (allegro.lemis.com [192.109.197.134])
        by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id TAA27315
        for ; Tue, 10 Nov 1998 19:02:31 -0800 (PST)
        (envelope-from grog@freebie.lemis.com)
Received: from freebie.lemis.com (freebie.lemis.com [192.109.197.137])
        by allegro.lemis.com (8.9.1/8.9.0) with ESMTP id NAA20772;
        Wed, 11 Nov 1998 13:32:13 +1030 (CST)
Received: (from grog@localhost)
        by freebie.lemis.com (8.9.1/8.9.0) id NAA20401;
        Wed, 11 Nov 1998 13:32:12 +1030 (CST)
Message-ID: <19981111133212.B20374@freebie.lemis.com>
Date: Wed, 11 Nov 1998 13:32:12 +1030
From: Greg Lehey
To: Alexander Litvin
Cc: current@FreeBSD.ORG
Subject: Re: The infamous dying daemons bug
References: <199811101456.QAA28210@grape.carrier.kiev.ua> <199811110038.CAA01861@grape.carrier.kiev.ua>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.91.1i
In-Reply-To: <199811110038.CAA01861@grape.carrier.kiev.ua>; from Alexander Litvin on Wed, Nov 11, 1998 at 02:38:38AM +0200
WWW-Home-Page: http://www.lemis.com/~grog
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-41-739-7062
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Wednesday, 11 November 1998 at  2:38:38 +0200, Alexander Litvin wrote:
> In article <199811101456.QAA28210@grape.carrier.kiev.ua> you wrote:
>
>>>> Nov 10 03:15:34 grape /kernel: swap_pager: suggest more swap space: 61 MB
>>>> Nov 10 03:16:26 grape /kernel: pid 310 (sendmail), uid 0: exited on signal 11
>>>> Nov 10 03:17:26 grape /kernel: pid 311 (sendmail), uid 0: exited on signal 11
>>>> Nov 10 03:18:25 grape /kernel: pid 313 (sendmail), uid 0: exited on signal 11
>>>> Nov 10 03:19:25 grape /kernel: pid 353 (sendmail), uid 0: exited on signal 11
>>>> Nov 10 03:20:26 grape /kernel: pid 394 (sendmail), uid 0: exited on signal 11
>
> GL>> Ah, now that's one that I've been getting without exhausting memory.
> GL>> I'm assuming that these dying sendmails are children of the daemon.
> GL>> What happens when you kill -1 the daemon ("accepting connections on
> GL>> port 25 (sendmail)")?  In my experience, it *always* dies with a
> GL>> SIGSEGV after these messages have occurred.
>
> AL> Well, as I understand, 'swap_pager: suggest more swap space' does
> AL> not mean that memory is exhausted, but only that it is about to
> AL> be exhausted. At least, in this case it didn't come to any processes
> AL> being killed by kernel.
>
> AL> You're right -- that sendmails were children of a daemon (queue runners).
> AL> I'm not sure about what happens if I send SIGHUP to the daemon. I
> AL> think it may or may not restart -- it depends. Last time I examined
> AL> a 'diseased' daemon (it was not sendmail, but a dummy daemon written
> AL> specially for testing), it appeared that some range of process memory,
> AL> where code of dynamic library lives, was corrupt (zeroed in that case).
>
> AL> I'll try later to kill -1 such daemon. Now I'm in the process of testing
> AL> Dima's kludge. Until now I was unable to reproduce a problem. Daemons
> AL> keep living ;)
>
> Brought up old kernel without kludge.
>
> It appears that memory corruption leading to 'daemons dying' may take
> different forms. E.g., once it appears that sendmail continues to
> fork for queue runs successfully, but when I do 'telnet localhost 25',
> it just accepts connection, forks, changes proctitle ('startup with ...'),
> and goes into some strange state -- no EHLO, just accepts all I type
> in telnet and that's all. In that state kill -1 restarts sendmail ok.
> Other time I exhaust memory, sendmail segfaults every child forked
> for queue run, again restarts ok on SIGHUP. Once I even got in response
> to 'telnet localhost 25':
>
> Trying 127.0.0.1...
> Connected to localhost.carrier.kiev.ua.
> Escape character is '^]'.
> archer... Recipient names must be specified
>
> As if I started sendmail without arguments on command prompt!
>
> I think it is enough evidence that 'daemons dying' is caused by
> memory corruption.

Well, no, I had an alternative explanation: for me, this problem
started with sendmail 8.9.  I think I even went back and tried
sendmail 8.8, and it didn't cause any problems.  It could be a bug in
sendmail, possibly related to the config I'm using (it often refuses
connections because it thinks some test on the domain name succeeds,
when in fact it should have failed).

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message