From owner-freebsd-hackers  Thu Jan 28 10:36:09 1999
Date: Thu, 28 Jan 1999 10:36:04 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199901281836.KAA10067@apollo.backplane.com>
To: Kevin Day <toasty@home.dragondata.com>
Cc: dyson@iquest.net, wes@softweyr.com, hackers@FreeBSD.ORG
Subject: Re: High Load cron patches - comments?
References:  <199901281826.MAA06446@home.dragondata.com>

:I considered a 'maximum children' limit.
:
:How do you prevent a user from breaking cron by executing 100 shell scripts
:that have 'sleep 10000' in them?
:
:Kevin

    By closing his account.

    No, really... by closing his account.  If a user abuses his privilege
    there isn't much you can do about it no matter what kind of rate limiting
    you have.  All you can do is try to set the limits such that you can
    still log in as root and turn off the account.

    About once a month, some user on some BEST machine makes a mistake and
    does something that causes a huge load.  It is usually NOT intentional.
    Sometimes it's a CGI runaway on a heavily-accessed site, sometimes it's 
    a shell script gone awry.

    We've seen loads of 600.

    The funny thing is that even with a load of 600, people can still log in
    to the machine and do stuff.  This is because either the user or the
    subsystem involved has hit a hard limit.

    Without hard limits, such screwups would take down the machine.  Given
    the choice between a machine going down and being able to log in and fix
    the problem, I'll choose the latter every time.  I would rather the
    web server slow down for 10 minutes while we fix the problem than have
    the machine take 20 minutes to die and then have to reboot it.
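
    As a rough illustration of what a hard limit looks like in practice,
    here is a minimal sketch that applies resource limits to a child
    process before it execs.  The helper name run_limited and the choice
    of limits are assumptions made for this example; it is not the cron
    patch under discussion, only the general setrlimit mechanism.

```python
import resource
import subprocess
import sys


def run_limited(argv, limits):
    """Run argv with hard resource caps applied in the child before exec.

    `limits` maps resource.RLIMIT_* constants to integer caps.
    (Hypothetical helper for illustration only.)
    """
    def apply_limits():
        for which, value in limits.items():
            _soft, hard = resource.getrlimit(which)
            # Never try to raise the existing hard limit; only lower it.
            if hard != resource.RLIM_INFINITY:
                value = min(value, hard)
            # Soft == hard, so the job cannot lift the cap again.
            resource.setrlimit(which, (value, value))

    return subprocess.run(argv, preexec_fn=apply_limits,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Cap the job at 30 seconds of CPU time and show what the child sees.
    done = run_limited(
        [sys.executable, "-c",
         "import resource; print(*resource.getrlimit(resource.RLIMIT_CPU))"],
        {resource.RLIMIT_CPU: 30},
    )
    print(done.stdout.strip())
```

    For the "100 sleeping shell scripts" case the relevant cap would be a
    per-user process limit (RLIMIT_NPROC on systems that provide it), set
    low enough that root can still get a shell.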

    It is not possible to handle these situations automatically... no amount
    of load balancing or rate limiting software will prevent a user's mistake
    from loading down the system or interfering with other users.

    One has alarm points - if the load goes over a hundred, something
    is obviously wrong and bells start ringing :-).
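
    An alarm point of this kind can be sketched in a few lines.  The
    function name load_alarm and the default threshold of 100 are
    assumptions taken from the figure above, not a description of any
    particular monitoring setup.

```python
import os


def load_alarm(threshold=100.0, loadavg=None):
    """Return True when the 1-minute load average exceeds threshold.

    `loadavg` may be passed in as a (1min, 5min, 15min) tuple for testing;
    otherwise the current system value is read with os.getloadavg().
    """
    if loadavg is None:
        loadavg = os.getloadavg()
    return loadavg[0] > threshold


if __name__ == "__main__":
    if load_alarm():
        print("load alarm: something is obviously wrong")
```

    This only raises the alarm; as noted above, fixing the problem still
    takes a person who can log in.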

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
