From owner-freebsd-arch Tue Nov 7 15:15:50 2000
Message-ID: <3A088D1B.96157C0D@telehouse.ch>
Date: Wed, 08 Nov 2000 00:15:39 +0100
From: Andre Oppermann
To: Poul-Henning Kamp
Cc: arch@freebsd.org
Subject: Re: Green/Yellow/Red state for the VM system.
References: <28041.973635706@critter>

I don't think this is necessary at all now that Matt's and Paul's code
has been committed. Why? Because the box would not wedge solid or panic
anymore.

Also, what point does it make to free up some KVM space if some
userland app is going crazy? As far as I know there is no dynamic
sharing of KVM/User memory.

The (no longer) problem you try to solve here would have been a hack
to work around dead-locking due to 'bad' code-paths in the VM-I/O
system until the real fix (fixing dead-locks) can be made. Now that
fix to avoid dead-locking has been made and is under testing. This
obsoletes the VM emergency states you propose here entirely.

The only optimizations that IMO would make sense would be local to a
sub-system. Let's take the IP route code as an example. The moment it
sees problems allocating more space for new route entries, it would
early-expire the cloned entries to make room for new ones. In no case
would it ever stop forwarding packets.
This would lead to non-deterministic behaviour, make DoS attacks easy,
and make figuring out
why-the-heck-this-god-damn-system-isn't-forwarding-any-packets-any-more
*really* hard. Instead, the right fix would be to use something like
RED or BLUE to drop packets like we do on any overloaded link, except
that in this case it's the system itself which is the bottleneck.

To summarize:

1) IMO your approach here is very flawed and the wrong solution to
   the right problem

2) any detection of low-KVM situations should be done on a sub-system
   level and handled in-the-right-way on a local sub-system basis,
   not on a global basis

3) IMO the actions you propose are not-the-right-thing most of the
   time, especially in the IP and TCP cases

my 2c
--
Andre

Poul-Henning Kamp wrote:
>
> Various people have heard about this in private conversations before,
> so it's probably time for me to pull the plug and plug it publicly.
>
> To all of you northern Californians, this proposal should come as
> nothing new at all, you've seen your electrical grid do this again
> this summer: You cut some load entirely in order to keep the
> majority of the grid intact.
>
> We need to do the same thing about resource-shortages in the kernel.
>
> The easiest way to deal with overload and DoS attacks is to recognize
> them as such, go into defensive mode and weather it.
>
> If we implement a generic facility for this, we will have a lot less
> worries about future DoS attacks, because our generic mechanism will
> deal with a lot of them on its own.
>
> The facility I hinted at in an earlier email is this:
>
> The VM system maintains an enum variable:
>
>       enum {GREEN, YELLOW, RED} kvm_light;
>
> When kvm_light changes value, a kernel event-handler list is activated.
>
> GREEN means "No worries". And signals that all kernel code can go
> about their business as usual.
>
> YELLOW means "Don't make it any worse".
> This state is set as a high water mark, and tries to prevent us
> from ever entering:
>
> RED which means "Disaster on our hands.".
>
> Various pieces of code can use these state changes to modify their
> behaviour according to the state of the kvm_light. Here are
> some straw-man examples, just to show the concept:
>
>       vfs_name_cache:
>
>               Yellow:
>                       Drop two entries every time we make one.
>
>               Red:
>                       Drop all entries, make no new ones.
>
>       Vnode system:
>
>               Yellow:
>                       Allocate no new vnodes.
>
>               Red:
>                       Garbage collect and free vnodes.
>                       (yes, this *is* possible, but only as a last
>                       resort thing.)
>
>       IP:
>               Yellow:
>                       Expire cloned routes faster.
>                       Stop generating ICMP packets.
>                       Stop forwarding packets.
>
>               Red:
>                       Expire all cloned routes now.
>
>       TCP:
>               Yellow:
>                       Accept no new TCP connections.
>                       Reduce outgoing TCP windows.
>                       Drop all sessions which have not passed
>                       a packet in the last N seconds.
>
>               Red:
>                       Drop all un-assembled fragments.
>                       Drop all "final-stages" TCP pcbs.
>                       Drop all sessions which have not passed
>                       a packet in the last M seconds. (M << N)
>
> Now, before anyone starts pointing indignant fingers at RFCs and
> other such moral high-ground, let me just make it perfectly clear
> that YELLOW isn't set until the system detects the risk of meltdown
> and RED is the meltdown.
>
> Think of YELLOW as being set a little while before you cannot log
> into the console in finite time anymore and usually decide to give
> the machine a reboot/reset to get it back into action.
>
> RED is when the machine would normally wedge solid or panic.
>
> As can be seen from the above, the only and entire impact on the
> VM system is to maintain the kvm_light variable and call the
> eventhandler when it is changed. The rest of this is about getting
> individual parts of the kernel to behave more responsibly in YELLOW,
> and to start pumping water all they can in RED.
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-arch" in the body of the message
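
For readers following the thread, the mechanism quoted above (one
enum maintained by the VM system, plus a handler list that runs
whenever the value changes) can be sketched in a few lines of C. This
is only an illustration under stated assumptions, not code from
FreeBSD or from either poster. Apart from the GREEN/YELLOW/RED values
and the kvm_light name, every identifier here (kvm_light_register,
kvm_light_set, struct route_cache, the TTL numbers) is hypothetical,
and a real kernel version would at least need locking.

```c
#include <assert.h>
#include <stddef.h>

/* The three states come from the quoted proposal. */
typedef enum { GREEN, YELLOW, RED } kvm_light_t;

/* Hypothetical callback type: invoked with the new state. */
typedef void (*kvm_light_handler)(kvm_light_t new_state, void *arg);

#define KVM_LIGHT_MAX_HANDLERS 8	/* arbitrary limit for this sketch */

static struct {
	kvm_light_handler fn;
	void *arg;
} handlers[KVM_LIGHT_MAX_HANDLERS];
static size_t n_handlers;
static kvm_light_t kvm_light = GREEN;

/* Hypothetical: subscribe a sub-system to state changes. */
void
kvm_light_register(kvm_light_handler fn, void *arg)
{
	assert(n_handlers < KVM_LIGHT_MAX_HANDLERS);
	handlers[n_handlers].fn = fn;
	handlers[n_handlers].arg = arg;
	n_handlers++;
}

/* Hypothetical: called by the VM code; handlers run only on a change. */
void
kvm_light_set(kvm_light_t new_state)
{
	if (new_state == kvm_light)
		return;
	kvm_light = new_state;
	for (size_t i = 0; i < n_handlers; i++)
		handlers[i].fn(new_state, handlers[i].arg);
}

/* Made-up subscriber modelling the cloned-route expiry from the IP example. */
struct route_cache {
	int clone_ttl_secs;	/* lifetime of cloned routes (invented values) */
	int flush_requested;	/* set when all cloned routes should go now */
};

void
route_cache_on_light(kvm_light_t state, void *arg)
{
	struct route_cache *rc = arg;

	switch (state) {
	case GREEN:	/* business as usual */
		rc->clone_ttl_secs = 3600;
		rc->flush_requested = 0;
		break;
	case YELLOW:	/* expire cloned routes faster */
		rc->clone_ttl_secs = 60;
		break;
	case RED:	/* expire all cloned routes now */
		rc->clone_ttl_secs = 0;
		rc->flush_requested = 1;
		break;
	}
}
```

A caller would register route_cache_on_light together with a pointer
to its struct route_cache and leave the rest to kvm_light_set(). Note
that the policy itself still lives inside the subscriber, so the
sketch takes no side on whether the trigger should be global, as
proposed, or detected locally by each sub-system, as the reply argues.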