From: Malcolm Kay
Organization: at home
To: freebsd-questions@freebsd.org
Date: Tue, 24 Aug 2004 01:07:22 +0930
Message-Id: <200408240107.22461.malcolm.kay@internode.on.net>
Subject: cron and vfork

I'm having a problem with cron and vfork.
Here are a couple of samples from the cron log:

Aug 22 16:13:00 central /usr/sbin/cron[89748]: (root) CMD ( /usr/local/sbin/nwmail2linux)
Aug 22 16:15:00 central /usr/sbin/cron[89749]: (CRON) error (can't vfork)
Aug 22 16:15:00 central /usr/sbin/cron[89750]: (CRON) error (can't vfork)
Aug 22 16:17:00 central /usr/sbin/cron[89752]: (root) CMD ( /usr/local/sbin/nwmail2linux)
Aug 22 16:19:00 central /usr/sbin/cron[89754]: (root) CMD ( /usr/local/sbin/nwmail2linux)
Aug 22 16:20:00 central /usr/sbin/cron[89756]: (root) CMD (/usr/libexec/atrun)

Aug 22 18:59:00 central /usr/sbin/cron[89952]: (root) CMD ( /usr/local/sbin/nwmail2linux)
Aug 22 19:00:01 central /usr/sbin/cron[89953]: (CRON) error (can't vfork)
Aug 22 19:00:01 central /usr/sbin/cron[89954]: (CRON) error (can't vfork)
Aug 22 19:01:00 central /usr/sbin/cron[89956]: (root) CMD ( /usr/local/sbin/nwmail2linux)

The problem started with the first error entry shown here. Note that this is a work machine and the problem appeared at 4:15pm on a Sunday, when all should be quiet, which seems to make any of the vfork failure mechanisms mentioned in the man pages vfork(2) and fork(2) unlikely.

The problem became evident when I was unable to log into the machine on Monday morning, neither at the console nor through ssh on the LAN. The console puts up a login prompt and accepts the name entry, but that is all -- no password prompt and no further activity. I had to reboot the machine with a brutal physical reset. The same thing happened during the quiet part of the previous weekend.

The cron jobs are all in /etc/crontab, and the problem, when it occurs, is always at a time when 2 jobs clash:

  /usr/local/sbin/nwmail2linux and /usr/libexec/atrun
or
  newsyslog and /usr/libexec/atrun

/usr/local/sbin/nwmail2linux is a process that picks up e-mail for three people from a Novell system and delivers it to 3 linux machines via ssh and mail.local. It is scheduled to run at each odd minute of the hour.
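To make the clash pattern concrete, here is a minimal Python sketch that lists the minutes of the hour at which two or more of these jobs start together. Only the odd-minute rule for nwmail2linux comes from the description above; the rules for atrun (every five minutes) and newsyslog (on the hour) are assumptions based on the stock /etc/crontab and may not match this machine. The names SCHEDULES and clashes are made up for the illustration.

```python
# Minute-of-hour rules for each job.
# Only the odd-minute rule is taken from the message; the atrun and
# newsyslog rules are ASSUMED stock /etc/crontab defaults.
SCHEDULES = {
    "nwmail2linux": lambda m: m % 2 == 1,  # each odd minute
    "atrun": lambda m: m % 5 == 0,         # assumed: every five minutes
    "newsyslog": lambda m: m == 0,         # assumed: on the hour
}


def clashes(schedules):
    """Return {minute: [job names]} for minutes where two or more jobs start."""
    result = {}
    for minute in range(60):
        jobs = sorted(name for name, fires in schedules.items() if fires(minute))
        if len(jobs) >= 2:
            result[minute] = jobs
    return result


if __name__ == "__main__":
    for minute, jobs in clashes(SCHEDULES).items():
        print(f"minute {minute:02d}: {', '.join(jobs)}")
```

Under those assumed schedules the clashing minutes include :15 (nwmail2linux with atrun) and :00 (newsyslog with atrun), which is consistent with the 16:15 and 19:00 errors in the log excerpt.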
The machine is running FreeBSD 4.9 with vinum disk mirroring. Its main reason for being is to manage backups of a number of FreeBSD and Linux machines, all of which happen in the late evening and early morning hours and in any case not on Sunday night/Monday morning. It is also used as a gateway between 2 physically isolated LANs and between IPX and TCP/IP and, as mentioned above, for some mail management.

The problem seems to have begun with a change to the e-mail management. Previously the machine was used to retrieve mail from Novell for only two people - one delivered locally to a conventional unix mailbox (and accessed via ssh and kmail) and the other passed on via rsh to an HP-UX machine.

I've not found anomalies in any other log files.

The only explanation I can think of sounds rather far-fetched: that the linux machines somehow take a long time to waken from slumber in the quiet of Sunday afternoon, and the e-mail jobs begin overlapping until too many processes exist -- but surely it could not be that many.

Even when not accessible through login, the machine seems for the most part to be carrying out its intended roles.

Does anyone have any ideas? Help would be appreciated.

Malcolm