Date: Sat, 5 Dec 1998 01:41:33 -0600 (CST)
From: Joel Ray Holveck <joelh@gnu.org>
To: freebsd-current@FreeBSD.ORG
Subject: Cron getting malloc warning in grandchild
Message-ID: <199812050741.BAA01578@detlev.UUCP>

I'm having a problem with cron. Is anybody else seeing this?

Summary:

After swap space was filled, a cron on an 18 Oct -current system started getting free() warnings, always in the same segment of code. A reboot, an upgrade to yesterday's -current, and a zeroed-out swap partition have failed to have any effect.

Details:

While I was away over Thanksgiving holidays and the early part of this week, a box of mine (heimdall by name) somehow filled up its swap. (I don't particularly see why, since nobody could log in, so it was just acting as a small ppp gateway and doing its normal cron jobs, the most taxing of which is a cvs that it's been doing every night for months). (This is the only time in this narrative that any swap issues were either noticed by my own swapinfo checks, or logged in /var/log/messages.)

I left on 25 Nov, and swap filled up on the 28th. Over the next six hours, several small cron subjobs were killed, as well as ppp. One cron dying message was apparently doubled:

Nov 28 11:00:02 heimdall /kernel: pid 8401 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall /kernel: pid 8403 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall /kernel: pid 8402 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall last message repeated 2 times
Nov 28 11:00:04 heimdall /kernel: pid 8399 (cron), uid 0, was killed: out of swap space

By the time I heard about this, it was 1 Dec at 18:00. Calling somebody with access to the console, I discovered that they could not log in; no prompt. No shock. I asked them to give it a 120 reset. The machine rebooted apparently normally, with no fsck problems.

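For anyone wanting to check their own /var/log/messages for the same pattern, the kill lines quoted above can be tallied with a few lines of Python. This is only a rough sketch: the regular expression is an assumption based on the five lines shown in this message, and the names `KILL_RE`, `swap_kills`, and `kills_per_program` are made up for illustration. It does not expand "last message repeated N times" lines, so a doubled message is counted once.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt quoted in this message:
#   Nov 28 11:00:02 heimdall /kernel: pid 8401 (cron), uid 0, was killed: out of swap space
KILL_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) /kernel: "
    r"pid (?P<pid>\d+) \((?P<prog>[^)]+)\), uid (?P<uid>\d+), "
    r"was killed: out of swap space$"
)


def swap_kills(lines):
    """Yield (timestamp, program, pid) for each out-of-swap kill line."""
    for line in lines:
        m = KILL_RE.match(line.strip())
        if m:
            yield m.group("stamp"), m.group("prog"), int(m.group("pid"))


def kills_per_program(lines):
    """Count kill lines per program name; syslog 'repeated' lines are skipped."""
    return Counter(prog for _, prog, _ in swap_kills(lines))


# The excerpt from this message, used as sample input.
SAMPLE = """\
Nov 28 11:00:02 heimdall /kernel: pid 8401 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall /kernel: pid 8403 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall /kernel: pid 8402 (cron), uid 0, was killed: out of swap space
Nov 28 11:00:04 heimdall last message repeated 2 times
Nov 28 11:00:04 heimdall /kernel: pid 8399 (cron), uid 0, was killed: out of swap space
""".splitlines()

if __name__ == "__main__":
    for stamp, prog, pid in swap_kills(SAMPLE):
        print(stamp, prog, pid)
    print(kills_per_program(SAMPLE))
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `kills_per_program(open("/var/log/messages"))`.
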
At 02:05 on 2 Dec, I started receiving messages from cron:

X-From-Line: daemon Wed Dec 2 02:05:04 1998
Received: (from root@localhost) by heimdall.UUCP (8.9.1/8.9.1) id CAA02259; Wed, 2 Dec 1998 02:05:03 -0600 (CST) (envelope-from root)
Date: Wed, 2 Dec 1998 02:05:03 -0600 (CST)
Message-Id: <199812020805.CAA02259@heimdall.UUCP>
From: root (Cron Daemon)
To: root
Subject: Cron <root@heimdall> /usr/libexec/atrun
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin>
X-Cron-Env: <HOME=/root>
X-Cron-Env: <LOGNAME=root>
X-Cron-Env: <USER=root>

CRON in malloc(): warning: pointer to wrong page.

I arrived ten hours later and was greeted by several hundred of these messages. Checking the logs, I saw that no swap space warnings had been issued since the cold boot on 1 Dec.

Since I didn't have time to troubleshoot it immediately, and no hugely important cron tasks were on that machine, I killed cron. I don't know precisely when I did this; the last such messages were sent at 15:00. Meanwhile, every cron task scheduled from 02:05 to 15:00 issued such a message. All of these tasks were scheduled from /etc/crontab. (The only user crontab entry runs at 02:00.)

Today (4 Dec, 13:30), I installworld'd and made a new kernel from a -current based on sources from cvsup2, as of 3 Dec 05:11. This evening, at 22:02, I rebooted into single user mode, dd'd over the swap partition, and rebooted again. In short order, the same messages began to appear again. Again, I killed cron.

I've now instrumented cron to include all relevant pids in its mail, and am running it with MALLOC_OPTIONS=AX and with -x proc, so if it fails tonight I can start tracking down the problem. Stay tuned for more information as news breaks.

Since the first reboot, I've only seen three unexplained problems.

The first was a sendmail error that got logged the night before last, and never recurred:

Dec 3 00:38:19 heimdall sendmail[7858]: SAA22698: SYSERR: putoutmsg (mail.camalott.com.): error on output channel sending "451 fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 0 not open: Bad file descriptor": Input/output error
Dec 3 00:38:19 heimdall sendmail[7858]: SAA22698: SYSERR(joelh): fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 0 not open: Bad file descriptor
Dec 3 00:38:20 heimdall sendmail[7858]: SAA22698: SYSERR(joelh): fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 1 not open: Bad file descriptor
Dec 3 00:38:20 heimdall sendmail[7858]: SAA22698: SYSERR(joelh): fill_fd: detlev-ip@gnu.org... end of deliver(relay): fd 2 not open: Bad file descriptor

The second was a shell script that segfaulted on me while I was instrumenting cron:

Dec 4 22:53:04 heimdall /kernel: pid 606 (sh), uid 0: exited on signal 11 (core dumped)

The third unexplained problem looks like I've misconfigured ppp. Ever since the installworld, ppp starts up with an error:

Dec 4 22:17:21 heimdall ppp[234]: Error: iface_inAdd: ioctl(SIOCAIFADDR): 0.0.0.0: Destination address required

This doesn't appear to affect operation, so I haven't yet investigated it. I only mention it for completeness.

Happy hacking,
joelh

-- 
Joel Ray Holveck - joelh@gnu.org
   Fourth law of programming:
   Anything that can go wrong wi
sendmail: segmentation violation - core dumped

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message