From: Kevin Day <toasty@home.dragondata.com>
Message-Id: <199901281826.MAA06446@home.dragondata.com>
Subject: Re: High Load cron patches - comments?
In-Reply-To: <199901281817.KAA09891@apollo.backplane.com> from Matthew Dillon at "Jan 28, 1999 10:17:37 am"
To: dillon@apollo.backplane.com (Matthew Dillon)
Date: Thu, 28 Jan 1999 12:26:12 -0600 (CST)
Cc: dyson@iquest.net, wes@softweyr.com, hackers@FreeBSD.ORG

> :Here's my problem. 
> :
> :Cron turned into a massive forkbomb every minute, and especially every 10
> :minutes. Not only did the system nearly go dead at those points, but at
> :times, it took 5 minutes to catch up.
> :
> :Suppose you have to run 60 jobs per minute, and they all take around a
> :second to execute. If you run them one second at a time, you're likely to
> :...
> :
> :My only goal was to spread cron's jobs out a bit, so I didn't saturate my
> :nfs server's ethernet every 10 mins. When users are allowed to submit their
> :...
> :While I think that how busy the CPU is, rather than how busy cron is,
> :would be a better metric to go by, it's obviously not as simple as it
> :...
> :My patches have a feature where they continually increase the fork
> :speed if it's obvious that the backlog is getting to some silly
> :proportions. Perhaps this is wrong, and it should just drop new jobs. In my
> :case this probably wouldn't be bad, but I think that's definitely 'breaking'
> :cron, and should be an optional feature.
> :...
> :What I came up with sounds a lot like John Dyson's sample piece of code,
> :except I used integer math, and he's using floating point. (He's also using
> :...
> :Kevin
> 
>     I think a rate limited cron is a good solution, but I would also (if you
>     haven't already) supply a max-parallel-jobs option.  Increasing the
>     fork rate works to a degree, but you also have to make sure that cron
>     (A) cannot kill the machine, and (B) cannot fall into a fork cascade 
>     failure by overloading the machine so much that the jobs can't be
>     retired faster than new jobs are queued.
> 
>     So, for example, you might have a feedback parameter X but you should
>     also have an absolute limit Y, which you set relatively high. 
> 
>     Let's see... here's a good example.  Let's say that every 10 minutes cron
>     decides to fork off 50 jobs simultaneously, but at midnight and noon
>     cron wants to fork off 200 jobs simultaneously.
> 
>     Let's say that every 10 minutes, with nominal delaying tactics and no hard
>     limits, you are able to limit the maximum number of parallel jobs to,
>     say, 35.  Say you want a relatively sharp feedback to bump up the fork
>     rate to get the jobs done before the next 10 minute period occurs.
> 
>     These same parameters, however, could fail utterly at noon and midnight.
>     At noon and midnight the rate parameters that worked for the 10 minute
>     jobs might result, say, in 120 parallel jobs.
> 
>     This is where the hard limit comes in.  If you specified a hard limit
>     that was nominally greater than the 10 minute parallel job load, but
>     less than the midnight and noon job load, you effectively allow your
>     nominal case through but force the jobs that get run at midnight
>     and noon to 'spread out' a little more.
> 
>     You might specify a hard limit of, for example, 60 parallel jobs.  This
>     leaves comfortable headroom above the 35 parallel jobs that the
>     fork-rate limit produces on the 10 minute jobs but prevents the midnight
>     and noon jobs from overloading the system.
> 
>     In effect, your feedback parameter solves your NFS burstiness problem
>     under 'normal' load conditions and the absolute limit handles the more 
>     severe noon & midnight cases.
> 
> 					-Matt
> 					Matthew Dillon 
> 					<dillon@backplane.com>
> 
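
As a rough illustration of the feedback-plus-hard-limit idea quoted above,
here is a minimal Python sketch. It is not cron code and is not taken from
anyone's patches. The function name simulate(), the backlog_step feedback
rule, and the 30-second job duration are all assumptions made up for the
example; only the 50-job and 200-job bursts and the limit of 60 come from
the quoted text.

```python
def simulate(num_jobs, job_duration, backlog_step, max_parallel=None):
    """Simulate one burst of identical jobs; return (peak_parallel, last_finish).

    Hypothetical model: time advances in 1-second ticks, and on each tick the
    scheduler starts 1 + pending // backlog_step jobs (a larger backlog means
    a faster fork rate), never exceeding max_parallel running jobs if set.
    """
    if job_duration < 1 or backlog_step < 1:
        raise ValueError("job_duration and backlog_step must be >= 1")
    if max_parallel is not None and max_parallel < 1:
        raise ValueError("max_parallel must be >= 1")

    running = []                      # finish tick of each running job
    pending = num_jobs
    now = peak = last_finish = 0
    while pending > 0:
        # Retire jobs whose finish tick has been reached.
        running = [t for t in running if t > now]
        # Feedback: the fork rate grows with the size of the backlog.
        burst = min(pending, 1 + pending // backlog_step)
        # Hard limit: never run more than max_parallel jobs at once.
        if max_parallel is not None:
            burst = min(burst, max_parallel - len(running))
        running.extend([now + job_duration] * burst)
        pending -= burst
        if burst:
            last_finish = now + job_duration
        peak = max(peak, len(running))
        now += 1
    return peak, last_finish


if __name__ == "__main__":
    for jobs in (50, 200):
        print(jobs, "jobs, no cap:", simulate(jobs, 30, 10))
        print(jobs, "jobs, cap 60:", simulate(jobs, 30, 10, max_parallel=60))
```

In this toy model the cap never binds for the 50-job burst, so the nominal
case passes through unchanged, while the 200-job burst is held to 60
parallel jobs and simply takes longer to drain.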

I considered a 'maximum children' limit.

How do you prevent a user from breaking cron by executing 100 shell scripts
that have 'sleep 10000' in them?
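
To make the question concrete, here is a small Python sketch of the
starvation problem, plus one possible mitigation: a per-user cap on top of
the global one. The per-user cap is only an assumption for illustration and
was not proposed in this thread; admit(), max_per_user, and the user and job
names are all made up.

```python
from collections import Counter


def admit(queue, running, max_total, max_per_user=None):
    """Return the queued jobs that may start now, in arrival order.

    queue        -- list of (user, job_name) tuples waiting to run
    running      -- list of user names, one entry per running job
    max_total    -- global 'maximum children' limit
    max_per_user -- hypothetical per-user limit (None disables it)
    """
    per_user = Counter(running)
    total = len(running)
    started = []
    for user, name in queue:
        if total >= max_total:
            break                     # global limit reached, nothing else fits
        if max_per_user is not None and per_user[user] >= max_per_user:
            continue                  # this user is at their cap, skip the job
        started.append((user, name))
        per_user[user] += 1
        total += 1
    return started


if __name__ == "__main__":
    # 100 long sleepers from one user, then one job from another user.
    queue = [("mallory", "sleep-%d" % i) for i in range(100)]
    queue.append(("root", "housekeeping"))

    only_global = admit(queue, [], max_total=60)
    with_user_cap = admit(queue, [], max_total=60, max_per_user=5)
    print(("root", "housekeeping") in only_global)
    print(("root", "housekeeping") in with_user_cap)
```

With only the global limit, the sleepers take every slot and the other
user's job cannot start until one of them exits. With the hypothetical
per-user cap, the sleepers are held to their share and the other job is
admitted.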

Kevin
