Date:      Sat, 5 Dec 1998 01:41:33 -0600 (CST)
From:      Joel Ray Holveck <joelh@gnu.org>
To:        freebsd-current@FreeBSD.ORG
Subject:   Cron getting malloc warning in grandchild
Message-ID:  <199812050741.BAA01578@detlev.UUCP>

I'm having a problem with cron.  Is anybody else seeing this?

Summary:

After swap space was filled, a cron on an 18 Oct -current system
started getting free() warnings, always in the same segment of code.
A reboot, an upgrade to yesterday's -current, and a zeroed-out swap
partition have failed to have any effect.

Details:

While I was away over Thanksgiving holidays and the early part of this
week, a box of mine (heimdall by name) somehow filled up its swap.  I
don't particularly see why: nobody could log in, so it was just
acting as a small ppp gateway and doing its normal cron jobs, the most
taxing of which is a cvs that it's been doing every night for months.
This is the only time in this narrative that any swap issues were
either noticed by my own swapinfo checks or logged in
/var/log/messages.  I left on 25 Nov, and swap filled up on the 28th.
Over the next six hours, several small cron subjobs were killed, as
well as ppp.  One cron dying message was apparently doubled:

Nov 28 11:00:02 heimdall /kernel: pid 8401 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall /kernel: pid 8403 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall /kernel: pid 8402 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall last message repeated 2 times
Nov 28 11:00:04 heimdall /kernel: pid 8399 (cron), uid 0, was killed: out of swap space
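
For anyone who wants to total up such kills from their own logs, here
is a rough sketch in Python.  It assumes the kernel message format shown
above and assumes that a "last message repeated N times" line stands
for N further copies of the line immediately before it.  The names
tally_swap_kills and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Kernel line format as seen in the excerpt above, e.g.
#   ... /kernel: pid 8402 (cron), uid 0, was killed: out of swap space
KILLED = re.compile(r"pid (\d+) \(([^)]+)\), uid (\d+), was killed: out of swap space")
# Collapsed-duplicate marker written by syslogd.
REPEATED = re.compile(r"last message repeated (\d+) times?")

def tally_swap_kills(lines):
    """Count out-of-swap kill messages per command name (hypothetical helper)."""
    counts = Counter()
    last_cmd = None  # command named by the previous line, if it was a kill
    for line in lines:
        killed = KILLED.search(line)
        if killed:
            last_cmd = killed.group(2)
            counts[last_cmd] += 1
            continue
        repeated = REPEATED.search(line)
        if repeated and last_cmd is not None:
            # Assumption: the repeat count refers to the preceding kill line.
            counts[last_cmd] += int(repeated.group(1))
            continue
        last_cmd = None  # any other line breaks the "repeated" association
    return counts

SAMPLE = [
    "Nov 28 11:00:02 heimdall /kernel: pid 8401 (cron), uid 0, was killed: out of swap space",
    "Nov 28 11:00:04 heimdall /kernel: pid 8403 (cron), uid 0, was killed: out of swap space",
    "Nov 28 11:00:04 heimdall /kernel: pid 8402 (cron), uid 0, was killed: out of swap space",
    "Nov 28 11:00:04 heimdall last message repeated 2 times",
    "Nov 28 11:00:04 heimdall /kernel: pid 8399 (cron), uid 0, was killed: out of swap space",
]

if __name__ == "__main__":
    for cmd, n in tally_swap_kills(SAMPLE).items():
        print(cmd, n)
```

To run it against a real log, pass an open file instead of SAMPLE,
e.g. tally_swap_kills(open("/var/log/messages")).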

By the time I heard about this, it was 1 Dec at 18:00.  Calling
somebody with access to the console, I discovered that they could not
log in; no prompt.  No shock.  I asked them to give it a 120 reset.
The machine rebooted apparently normally, with no fsck problems.  At
02:05 on 2 Dec, I started receiving messages from cron:

  X-From-Line: daemon Wed Dec  2 02:05:04 1998
  Received: (from root@localhost)
	  by heimdall.UUCP (8.9.1/8.9.1) id CAA02259;
	  Wed, 2 Dec 1998 02:05:03 -0600 (CST)
	  (envelope-from root)
  Date: Wed, 2 Dec 1998 02:05:03 -0600 (CST)
  Message-Id: <199812020805.CAA02259@heimdall.UUCP>
  From: root (Cron Daemon)
  To: root
  Subject: Cron <root@heimdall> /usr/libexec/atrun
  X-Cron-Env: <SHELL=/bin/sh>
  X-Cron-Env: <PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin>
  X-Cron-Env: <HOME=/root>
  X-Cron-Env: <LOGNAME=root>
  X-Cron-Env: <USER=root>

  CRON in malloc(): warning: pointer to wrong page.

I arrived ten hours later and was greeted by several hundred of these
messages.  Checking the logs, I saw that no swap space warnings had
been issued since the cold boot on 1 Dec.  Since I didn't have time to
troubleshoot it immediately, and no hugely important cron tasks were
on that machine, I killed cron.  I don't know precisely when I did
this; the last such messages were sent at 15:00.  Meanwhile, every
cron task scheduled from 02:05 to 15:00 issued such a message.  All of
these tasks were scheduled from /etc/crontab.  (The only user crontab
entry runs at 02:00.)

Today (4 Dec, 13:30), I installworld'd and made a new kernel from a
-current based on sources from cvsup2, as of 3 Dec 05:11.

This evening, at 22:02, I rebooted into single user mode, dd'd over
the swap partition, and rebooted again.  In short order, the same
messages began to appear again.  Again, I killed cron.

I've now instrumented cron to report all relevant pids in its mail, and
am running it with MALLOC_OPTIONS=AX and with -x proc, so if it fails
tonight I can start tracking down the problem.  Stay tuned for more
information as news breaks.

Since the first reboot, I've only seen three unexplained problems.
The first was a sendmail error that got logged night before last, and
never recurred:

Dec  3 00:38:19 heimdall sendmail[7858]: SAA22698: SYSERR: putoutmsg (mail.camalott.com.): error on output channel sending "451 fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 0 not open: Bad file descriptor": Input/output error
Dec  3 00:38:19 heimdall sendmail[7858]: SAA22698: SYSERR(joelh): fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 0 not open: Bad file descriptor
Dec  3 00:38:20 heimdall sendmail[7858]: SAA22698: SYSERR(joelh): fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 1 not open: Bad file descriptor
Dec  3 00:38:20 heimdall sendmail[7858]: SAA22698: SYSERR(joelh): fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 2 not open: Bad file descriptor

The second came while I was instrumenting cron: a shell script
segfaulted on me:

Dec  4 22:53:04 heimdall /kernel: pid 606 (sh), uid 0: exited on signal 11 (core dumped)

The third unexplained problem looks like I've misconfigured ppp.  Ever
since the installworld, ppp starts up with an error:

Dec  4 22:17:21 heimdall ppp[234]: Error: iface_inAdd: ioctl(SIOCAIFADDR): 0.0.0.0: Destination address required 

This doesn't appear to affect operation, so I haven't yet investigated
it.  I only mention it for completeness.

Happy hacking,
joelh

-- 
Joel Ray Holveck - joelh@gnu.org
   Fourth law of programming:
   Anything that can go wrong wi
sendmail: segmentation violation - core dumped

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message