Date: Thu, 28 Jan 1999 10:36:04 -0800 (PST)
From: Matthew Dillon
Message-Id: <199901281836.KAA10067@apollo.backplane.com>
To: Kevin Day
Cc: dyson@iquest.net, wes@softweyr.com, hackers@FreeBSD.ORG
Subject: Re: High Load cron patches - comments?
References: <199901281826.MAA06446@home.dragondata.com>
Sender: owner-freebsd-hackers@FreeBSD.ORG

:I considered a 'maximum children' limit.
:
:How do you prevent a user from breaking cron by executing 100 shell scripts
:that have 'sleep 10000' in them?
:
:Kevin

By closing his account.

No, really... by closing his account. If a user abuses his privilege, there isn't much you can do about it no matter what kind of rate limiting you have. All you can do is try to set the limits such that you can still log in as root and turn off the account.

About once a month, some user on some BEST machine makes a mistake and does something that causes a huge load. It is usually NOT intentional. Sometimes it's a CGI runaway on a heavily-accessed site, sometimes it's a shell script gone awry. We've seen loads of 600.

The funny thing is that even with a load of 600, people can still log in to the machine and do stuff. This is because either the user or the subsystem involved has hit a hard limit. Without hard limits, such screwups would take down the machine.

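
To make "hard limit" concrete, here is a minimal sketch of a parent process lowering a child's resource limits before running it. It assumes a POSIX system and uses Python's standard resource and subprocess modules. The function name run_with_hard_limit and the 10-second CPU cap are illustrative assumptions, not anything taken from this thread or from cron itself.

```python
import resource
import subprocess
import sys


def run_with_hard_limit(argv, cpu_seconds=10):
    """Run argv as a child whose CPU-time limit is capped before exec.

    Hypothetical helper for illustration only; not part of cron.
    """
    def lower_limits():
        # Runs in the child between fork() and exec(), so the parent's
        # own limits are untouched.
        _soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
        value = cpu_seconds
        if hard != resource.RLIM_INFINITY:
            # An unprivileged process may only lower its hard limit.
            value = min(value, hard)
        # soft == hard: the child cannot raise the limit again unless
        # it is privileged.
        resource.setrlimit(resource.RLIMIT_CPU, (value, value))

    return subprocess.run(argv, preexec_fn=lower_limits,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # The child reports the hard CPU limit it inherited.
    child = [sys.executable, "-c",
             "import resource; "
             "print(resource.getrlimit(resource.RLIMIT_CPU)[1])"]
    print(run_with_hard_limit(child).stdout.strip())
```

A runaway job started this way is killed by the kernel once it has used up its CPU allowance, which is the property that keeps the machine reachable for root. Per-user limits of the kind described above are normally set by the system's login configuration rather than by each program.
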
Given the choice between a machine going down and being able to log in and fix the problem, I'll choose the latter every time. I would rather the web server slow down for 10 minutes while we fix the problem than have the machine take 20 minutes to die and then have to reboot it.

It is not possible to handle these situations automatically... no amount of load balancing or rate limiting software will prevent a user's mistake from loading down the system or interfering with other users. One has alarm points - if the load goes over a hundred, something is obviously wrong and bells start ringing :-).

-Matt

Matthew Dillon