Date:      Wed, 11 Nov 1998 02:38:38 +0200 (EET)
From:      Alexander Litvin <archer@lucky.net>
To:        Alexander Litvin <archer@lucky.net>
Cc:        current@FreeBSD.ORG
Subject:   Re: The infamous dying daemons bug
Message-ID:  <199811110038.CAA01861@grape.carrier.kiev.ua>
In-Reply-To: <199811101456.QAA28210@grape.carrier.kiev.ua>

In article <199811101456.QAA28210@grape.carrier.kiev.ua> you wrote:

>>> Nov 10 03:15:34 grape /kernel: swap_pager: suggest more swap space: 61 MB
>>> Nov 10 03:16:26 grape /kernel: pid 310 (sendmail), uid 0: exited on signal 11
>>> Nov 10 03:17:26 grape /kernel: pid 311 (sendmail), uid 0: exited on signal 11
>>> Nov 10 03:18:25 grape /kernel: pid 313 (sendmail), uid 0: exited on signal 11
>>> Nov 10 03:19:25 grape /kernel: pid 353 (sendmail), uid 0: exited on signal 11
>>> Nov 10 03:20:26 grape /kernel: pid 394 (sendmail), uid 0: exited on signal 11

GL>> Ah, now that's one that I've been getting without exhausting memory.
GL>> I'm assuming that these dying sendmails are children of the daemon.
GL>> What happens when you kill -1 the daemon ("accepting connections on
GL>> port 25 (sendmail)")?  In my experience, it *always* dies with a
GL>> SIGSEGV after these messages have occurred.

AL> Well, as I understand, 'swap_pager: suggest more swap space' does
AL> not mean that memory is exhausted, but only that it is about to
AL> be exhausted. At least, in this case it didn't come to any processes
AL> being killed by kernel.

AL> You're right -- those sendmails were children of the daemon (queue runners).
AL> I'm not sure about what happens if I send SIGHUP to the daemon. I
AL> think it may or may not restart -- it depends. Last time I examined
AL> a 'diseased' daemon (it was not sendmail, but a dummy daemon written
AL> specially for testing), it appeared that some range of process memory,
AL> where the code of a dynamic library lives, was corrupt (zeroed in that case).

AL> I'll try later to kill -1 such daemon. Now I'm in the process of testing
AL> Dima's kludge. Until now I was unable to reproduce a problem. Daemons
AL> keep living ;)

I brought up the old kernel without the kludge.

It appears that the memory corruption leading to 'daemons dying' can take
different forms. For example, in one case sendmail continued to fork
for queue runs successfully, but when I did 'telnet localhost 25',
it just accepted the connection, forked, changed its proctitle
('startup with ...'), and went into some strange state -- no EHLO, it
just accepted everything I typed in telnet and that was all. In that
state kill -1 restarts sendmail ok. Another time, when I exhausted
memory, every child that sendmail forked for a queue run segfaulted;
again it restarted ok on SIGHUP. Once I even got this in response
to 'telnet localhost 25':

Trying 127.0.0.1...
Connected to localhost.carrier.kiev.ua.
Escape character is '^]'.
archer... Recipient names must be specified

As if I had started sendmail without arguments from the command prompt!
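
The 'telnet localhost 25' check can be scripted so that it can be repeated
after every test run. What follows is only a minimal sketch in Python, not
something I ran on the affected machine. The names classify_greeting and
probe are made up for illustration. It assumes that a healthy SMTP server
sends a "220" greeting right after the connection is accepted, and that the
'strange state' described above shows up as a connection that stays silent
until the read times out.

```python
import socket


def classify_greeting(first_line: str) -> str:
    """Classify the first line a server sends after accepting a connection.

    Returns one of:
      "ready"      - a normal SMTP greeting (reply code 220)
      "smtp-error" - a well-formed SMTP reply with some other code
      "garbage"    - text that is not an SMTP reply at all
      "silent"     - nothing was received
    """
    line = first_line.strip()
    if not line:
        return "silent"
    # An SMTP reply line starts with a three-digit code, followed by
    # a space, a '-' (continuation), or nothing.
    code, sep = line[:3], line[3:4]
    if len(code) == 3 and code.isascii() and code.isdigit() and sep in ("", " ", "-"):
        return "ready" if code == "220" else "smtp-error"
    return "garbage"


def probe(host: str = "localhost", port: int = 25, timeout: float = 10.0) -> str:
    """Connect to host:port, read whatever arrives first, and classify it."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        try:
            data = conn.recv(512)
        except socket.timeout:
            # Connection accepted but the server never spoke (assumption:
            # this corresponds to the hung child described above).
            data = b""
    first_line = data.decode("ascii", errors="replace").split("\n", 1)[0]
    return classify_greeting(first_line)


if __name__ == "__main__":
    try:
        print(probe())
    except OSError as exc:
        print(f"connect failed: {exc}")
```

With this classification, the reply quoted above ("archer... Recipient names
must be specified") would come out as "garbage", which makes that kind of
misbehaviour easy to catch in a loop.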

I think this is enough evidence that 'daemons dying' is caused by
memory corruption.

GL>> Greg
GL>> --
GL>> See complete headers for address, home page and phone numbers
GL>> finger grog@lemis.com for PGP public key

--- 
I really hate this damned machine
I wish that they would sell it.
It never does quite what I want
But only what I tell it.



