Date: Sun, 10 Dec 2000 23:48:18 -0500
From: James FitzGibbon <james@ehlo.com>
To: David O'Brien <obrien@FreeBSD.org>
Cc: Robert Watson <rwatson@FreeBSD.org>, cvs-committers@FreeBSD.org, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/etc crontab
Message-ID: <20001210234818.A73780@ehlo.com>
In-Reply-To: <20001210165229.A84706@dragon.nuxi.com>; from obrien@FreeBSD.org on Sun, Dec 10, 2000 at 04:52:29PM -0800
References: <rwatson@FreeBSD.org> <200012110043.eBB0hYV06366@hak.lan.Awfulhak.org> <20001210165229.A84706@dragon.nuxi.com>

* David O'Brien (obrien@FreeBSD.org) [001210 20:23]:

> I don't understand what your position is.  I think queueing the jobs is
> the correct thing to do.  My daily run can take 2+ hours occasionally.
> If that happens at the end of the week, I don't want to skip the weekly
> run, just delay it until after the daily run.

This can lead to problems, depending on the local programs that the site
runs.  CVSup is notorious for this -- if the server it is attempting to
connect to is unavailable, it will just sit and retry for hours.  I've had
systems that were found to have 20+ copies of cvsup running, all retrying a
server whose name had inadvertently been removed from local DNS.
Thankfully, it was the highest-numbered script in the daily directory, so it
wasn't preventing anything else from running (save the mailing of the entire
daily output to root).

I'm not sure what the happy medium is here.  I agree that running lockf with
a 'lock-or-quit' behaviour is bad, but if a program in the middle of the
daily sequence gets buggered, you could have major problems.

Take 320.rdist for example.  If it hangs, then 450.status-security never
runs.  Once the hang is cleared, you would end up getting x copies of the
security output (where x is the number of days you took to learn of the
problem), all based upon the same state, without any interim reports having
been sent out.  Granted, any good admin should notice that a machine isn't
sending reports, but at large sites there are always a few machines not
properly monitored that slip through the cracks.

What about putting a variable in rc.conf (is that the right place for
non-startup-related variables?) that represents the timeout value?  Have it
default to 0 (or some other acceptable limit).  That gives acceptable
behaviour plus an easily accessible place for admins to increase or even
remove the timeout.

To implement this, a shell variable extractor util would be handy (i.e.
something along the lines of "variable=`confvar periodic_daily_timeout`",
which would return the string "-t 3600 " in a default install, but a null
string in your environment to queue concurrent runs of the script).

The crontab entry would then become something like this:

    lockf `confvar periodic_daily_timeout` periodic daily 2>&1 | sendmail root
    lockf `confvar periodic_weekly_timeout` periodic weekly 2>&1 | sendmail root

modulo the other timing changes we are discussing.

Thoughts?

-- 
j.
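
For illustration only, here is a minimal sketch of what the `confvar` helper proposed above might look like.  No such utility exists in the tree; the function body, the list of files it reads, and the CONFVAR_FILES override are all assumptions made for this example.  It assumes the timeout is stored as an ordinary sh-syntax assignment such as periodic_daily_timeout="-t 3600".

```shell
#!/bin/sh
# confvar: hypothetical helper (not an existing FreeBSD utility).
# Prints the value of one sh-style variable, or nothing if it is unset.
# CONFVAR_FILES is an assumed override for the list of files to read.
confvar() {
    # Accept only plain identifiers so the eval below cannot be abused.
    case "$1" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    (
        # Source the files in a subshell so nothing leaks into the caller.
        # Later files override earlier ones.
        for f in ${CONFVAR_FILES:-/etc/defaults/rc.conf /etc/rc.conf}; do
            [ -r "$f" ] && . "$f"
        done
        # Print the named variable; an unset variable prints an empty string.
        eval "printf '%s' \"\${$1-}\""
    )
}
```

Used unquoted inside backticks, as in the crontab lines above, a value of "-t 3600" splits into two arguments for lockf, while an empty value contributes no argument at all, which is the queue-and-wait behaviour.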