From owner-freebsd-current@FreeBSD.ORG Sat Aug 14 06:24:47 2004 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A61BA16A4CE; Sat, 14 Aug 2004 06:24:47 +0000 (GMT) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 79F3D43D31; Sat, 14 Aug 2004 06:24:47 +0000 (GMT) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) i7E6Ojla012365; Fri, 13 Aug 2004 23:24:45 -0700 (PDT) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.9p2/8.12.9/Submit) id i7E6OjiU012364; Fri, 13 Aug 2004 23:24:45 -0700 (PDT) (envelope-from dillon) Date: Fri, 13 Aug 2004 23:24:45 -0700 (PDT) From: Matthew Dillon Message-Id: <200408140624.i7E6OjiU012364@apollo.backplane.com> To: Don Lewis References: <200408140523.i7E5NO2x003922@gw.catspoiler.org> cc: jroberson@chesapeake.net cc: freebsd-current@freebsd.org cc: rwatson@freebsd.org Subject: Re: nice handling in ULE (was: Re: SCHEDULE and high load situations) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 14 Aug 2004 06:24:47 -0000 :The best idea that I can come up with is for sched_add(), :sched_add_internal(), kseq_runq_add(), and runq_add() to grow another :parameter that would tell them whether to prepend to the beginning of :the run queue or append to the end. If setrunqueue() detects that :TD_IS_RUNNING(td) is true, it would pass the flag to sched_add() that :would cause the thread to be added to the beginning of the queue. : :I don't know if this is appropriate in the PRI_ITHD and PRI_REALTIME :cases, or if we want to continue to to round-robin. : :Comments? I've written and played with schedulers for a very long time, and pretty much every time I've tried a insert-at-beginning-or-insert-at-end hack it's always introduced more problems then it has solved. This isn't to say that one is better then the other, just that having a conditional that tries to figure out whether to add at the beginning or add to the end is bad in general. The scheduler should be designed either for front-loading or for rear-loading. Typically speaking, a fair-share scheduler (which I think ULE is) should be front-loaded. That is, for any particular priority a thread should be added to the beginning of the run queue (but not preempt the currently running thread, just be the next runnable thread). The reason for this is because the thread's fair-share quantum calculation is expected to be correct and when it *IS* correct front loading almost always yields the best performance. A good example of this is a heavily loaded system running, say, a thousand simultanious ssh'd logins. Whenever a user types over their ssh connection you get this sort of pattern: sshd reads from socket, writes to pty, blocks csh on pty reads from pty, echos char, blocks sshd unblocks, reads the echo, and writes it back to the socket. Now imagine that situation in an extreme loading situation for both the front loading and the rear loading case. In the front loading case sshd will get the cpu quickly for both the initial read and for the echo response. In the rear loading case sshd will have to wait for the entire round robin before it can pass the character to the pty, the csh will have to wait before it can read the character and echo it, and sshd will have to wait before it can read the echo and return the response. Again, that's for a fair-share scheduler, i.e. ULE. For something like 4BSD you do NOT usually want to front-load because the priority mechanism is expected to reflect the interactivity of the process and so the process should get the cpu quickly in the above case with rear loading. Front loading 4BSD could potentially lead to starvtion due to the fairly long period of time it takes for a process's priority to be reduced (at least 4 ticks). A fair share scheduler, on the otherhand, is theoretically capable of much finer-grained quantums but also tends to have much longer run queues because it depends on the fair share algorithm to figure out the dynamic quantum more then it depends on multi-queuing priorities. I hope that all made sense. The jist is that if you find yourself trying to optimize a scheduler by guessing whether to add to the front or the rear of the queue, it usually means that you've already lost the game and are now resorting to hacks to try to fix things. I don't mean that in a bad way, just that it is my own experience that that is the case. -Matt