From owner-freebsd-hackers Thu Jan 28 10:26:31 1999
From: Kevin Day
Message-Id: <199901281826.MAA06446@home.dragondata.com>
Subject: Re: High Load cron patches - comments?
In-Reply-To: <199901281817.KAA09891@apollo.backplane.com> from Matthew Dillon at "Jan 28, 1999 10:17:37 am"
To: dillon@apollo.backplane.com (Matthew Dillon)
Date: Thu, 28 Jan 1999 12:26:12 -0600 (CST)
Cc: dyson@iquest.net, wes@softweyr.com, hackers@FreeBSD.ORG

> :Here's my problem.
> :
> :Cron turned into a massive forkbomb every minute, and especially every 10
> :minutes. Not only did the system nearly go dead at those points, but at
> :times, it took 5 minutes to catch up.
> :
> :Suppose you have to run 60 jobs per minute, and they all take around a
> :second to execute. If you run them one second at a time, you're likely to
> :...
> :
> :My only goal was to spread cron's jobs out a bit, so I didn't saturate my
> :NFS server's ethernet every 10 mins. When users are allowed to submit their
> :...
> :While I think a method that looked at how busy the CPU is, rather than how
> :busy cron is, would be a better metric to go by, it's obviously not as
> :simple as it
> :...
> :My patches have a feature where they'll continually increase the fork
> :speed, if it's obvious that the backlog is getting to some silly
> :proportions. Perhaps this is wrong, and it should just drop new jobs. In my
> :case this probably wouldn't be bad, but I think that's definitely 'breaking'
> :cron, and should be an optional feature.
> :...
> :What I came up with sounds a lot like John Dyson's sample piece of code,
> :except I used integer math, and he's using floating point. (He's also using
> :...
> :Kevin
>
> I think a rate-limited cron is a good solution, but I would also (if you
> haven't already) supply a max-parallel-jobs option. Increasing the
> fork rate works to a degree, but you also have to make sure that cron
> (A) cannot kill the machine, and (B) cannot fall into a fork cascade
> failure by overloading the machine so much that the jobs can't be
> retired faster than new jobs are queued.
>
> So, for example, you might have a feedback parameter X but you should
> also have an absolute limit Y, which you set relatively high.
>
> Let's see... here's a good example. Let's say that every 10 minutes cron
> decides to fork off 50 jobs simultaneously, but at midnight and noon
> cron wants to fork off 200 jobs simultaneously.
>
> Let's say that every 10 minutes, with nominal delaying tactics and no hard
> limits, you are able to limit the maximum number of parallel jobs to,
> say, 35. Say you want a relatively sharp feedback to bump up the fork
> rate to get the jobs done before the next 10 minute period occurs.
>
> These same parameters, however, could fail utterly at noon and midnight.
> At noon and midnight the rate parameters that worked for the 10 minute
> jobs might result, say, in 120 parallel jobs.
>
> This is where the hard limit comes in.
> If you specified a hard limit
> that was nominally greater than the 10 minute parallel job load, but
> less than the midnight and noon job load, you effectively allow your
> nominal case through but force the jobs that get run at midnight
> and noon to 'spread out' a little more.
>
> You might specify a hard limit of, for example, 60 parallel jobs. This
> is well above the 35 parallel jobs that the fork-rate limit produces
> on the 10 minute jobs but prevents the midnight and noon jobs from
> overloading the system.
>
> In effect, your feedback parameter solves your NFS burstiness problem
> under 'normal' load conditions and the absolute limit handles the more
> severe noon & midnight cases.
>
> -Matt
> Matthew Dillon
>

I considered a 'maximum children' limit. How do you prevent a user from
breaking cron by executing 100 shell scripts that have 'sleep 10000' in them?

Kevin
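
The feedback parameter X and absolute limit Y described in the quoted message
can be sketched as a small tick-based simulation. This is only an illustration
in Python, not code from cron or from either patch. The function `simulate`,
its parameters (`job_ticks`, `backlog_step`, `max_parallel`) and the feedback
rule itself are assumptions made up for the example. The burst sizes (50 and
200) and the limit of 60 are the example figures from the message.

```python
def simulate(num_jobs, job_ticks, backlog_step, max_parallel=None):
    """Return (peak_parallel, finish_tick) for one burst of identical jobs.

    Hypothetical model: all jobs become due at tick 0 and each one runs
    for job_ticks ticks once forked.
    """
    pending = num_jobs   # jobs queued but not yet forked
    running = []         # finish tick of every job currently running
    tick = 0
    peak = 0
    while True:
        # Reap jobs whose finish time has arrived.
        running = [end for end in running if end > tick]
        if pending == 0 and not running:
            return peak, tick
        # Feedback (X, assumed rule): one fork per tick, plus one extra
        # fork for every backlog_step jobs still waiting in the queue.
        allowed = 1 + pending // backlog_step
        for _ in range(min(allowed, pending)):
            # Hard limit (Y): never exceed max_parallel concurrent jobs.
            if max_parallel is not None and len(running) >= max_parallel:
                break
            running.append(tick + job_ticks)
            pending -= 1
        peak = max(peak, len(running))
        tick += 1


if __name__ == "__main__":
    for label, jobs in (("10-minute burst", 50), ("noon/midnight burst", 200)):
        free_peak, _ = simulate(jobs, job_ticks=20, backlog_step=25)
        capped_peak, _ = simulate(jobs, job_ticks=20, backlog_step=25,
                                  max_parallel=60)
        print(label, "- peak without limit:", free_peak,
              "- peak with limit 60:", capped_peak)
```

With these assumed parameters the 50-job burst never reaches the limit, so it
runs exactly as it would without one, while the 200-job burst is held to at
most 60 concurrent jobs. The sketch gives every job the same fixed duration,
so it does not address the question raised in the reply, where long-running
jobs could occupy every slot.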