Date: Thu, 2 Mar 2000 12:40:45 -0800 (PST)
From: Todd Hansen <tshansen@amp.nlanr.net>
To: FreeBSD-gnats-submit@freebsd.org
Subject: bin/17134: problem with cron forgetting jobs
Message-ID: <200003022040.MAA08259@amp.nlanr.net>

>Number:         17134
>Category:       bin
>Synopsis:       problem with 3.0-RELEASE cron forgetting jobs
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Mar 2 12:50:02 PST 2000
>Closed-Date:
>Last-Modified:
>Originator:     Todd Hansen
>Release:        FreeBSD 3.0-RELEASE i386
>Organization:
National Laboratory for Applied Network Research (NLANR)
>Environment:

host info:
> uname -a
FreeBSD amp.nlanr.net 3.2-RELEASE FreeBSD 3.2-RELEASE #4: Wed Jul 28 21:22:56 PDT 1999 hwb@amp.nlanr.net:/usr/src/sys/compile/NAI-AMP i386
>

crontab entries affected:
joule.nlanr.net.actmon> crontab -l
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (/tmp/crontab.eTKvHXl788 installed on Fri Dec 31 11:14:06 1999)
# (Cron version -- $Id: crontab.c,v 1.11 1997/09/15 06:39:15 charnier Exp $)
# DO NOT EDIT THIS FILE - edit the master and reinstall.
# (/tmp/crontab.ZOyLKT3056 installed on Tue Oct 27 16:51:55 1998)
# (Cron version -- $Id: crontab.c,v 1.6.2.3 1998/03/09 11:42:00 jkh Exp $)
* * * * * cd $HOME/src/pinger/vBNS ; sleep `jot -r 1 1 15` ; ./docollector
0,10,20,30,40,50 * * * * cd $HOME/src/pinger/vBNS ; sleep `jot -r 1 1 15` ; nice ./dogentrace ; cd $HOME/src/pinger/ ; ./watchdog -ko -t 600 -w watchdog.file "./am_master -n 10 -w watchdog.file amp volt &"
10 2 * * * find $HOME/src/pinger/vBNS/data -type f -mtime +5 -exec rm {} \;
joule.nlanr.net.actmon>

>Description:

We run a distributed system of currently 102 active measurement probes
around the Internet (all running FreeBSD 3.0). We are noticing that
periodically (almost regularly) the cron daemon forgets about some of our
jobs, even though it still lists them with the crontab -l command. This
happens on about 10 systems in about 2 months. The problem is related to
what was mentioned in bin/6004, except that we have more information and a
greater need to work with you to figure this out.
Unfortunately, we are still running 3.0 until we can figure out whether this
is fixed in 3.4, since it is a big deal to upgrade 102 sites remotely.

Eventually, when cron forgets about a job, it still tries to execute the job,
but instead of actually executing it we see something like this in the log:

Mar 2 12:20:00 nai-a-odun /USR/SBIN/CRON[6248]: (actmon) CMD ()

Here the CMD field is blank, but the job still fires at the correct time. The
interesting thing is that other commands continue to run fine while this one
does not. The line most affected by this problem is the one in the above
crontab that runs ./dogentrace every 10 minutes.

Thanks,
Todd

>How-To-Repeat:

It seems to repeat within a reasonable amount of time on our systems,
probably because we have so many.

>Fix:

We would love one, if it can be found.

>Release-Note:
>Audit-Trail:
>Unformatted:
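
The symptom described above (a cron log entry whose CMD field is empty) can be searched for mechanically across many probes. The following is a minimal sketch, not part of the original report: it assumes the log lines follow the syslog layout of the single example quoted in the report, and the function name `find_empty_commands` and the second sample line are hypothetical, added only for illustration.

```python
import re

# Matches cron log lines whose CMD field is empty, for example:
#   Mar 2 12:20:00 nai-a-odun /USR/SBIN/CRON[6248]: (actmon) CMD ()
# The layout is assumed from the one log line quoted in the report.
EMPTY_CMD = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+\S*CRON\[(?P<pid>\d+)\]:\s+"
    r"\((?P<user>[^)]*)\)\s+CMD\s+\(\s*\)\s*$"
)


def find_empty_commands(lines):
    """Return (timestamp, host, user) for each log line with an empty CMD."""
    hits = []
    for line in lines:
        match = EMPTY_CMD.match(line)
        if match:
            hits.append(
                (match.group("stamp"), match.group("host"), match.group("user"))
            )
    return hits


if __name__ == "__main__":
    sample = [
        # Line quoted in the report: empty command.
        "Mar 2 12:20:00 nai-a-odun /USR/SBIN/CRON[6248]: (actmon) CMD ()",
        # Hypothetical healthy entry, made up for contrast.
        "Mar 2 12:21:00 nai-a-odun /USR/SBIN/CRON[6251]: (actmon) CMD (./docollector)",
    ]
    for stamp, host, user in find_empty_commands(sample):
        print(stamp, host, user)
```

Feeding each probe's cron log through `find_empty_commands` would give a rough count of how often, and for which user, the blank-command entries appear; it does not identify which crontab line was lost, since the log entry itself carries no command text.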