Date: Sun, 26 Jan 2003 00:35:54 -0800 From: Terry Lambert <tlambert2@mindspring.com> To: Matthew Dillon <dillon@apollo.backplane.com> Cc: Steve Kargl <sgk@troutmask.apl.washington.edu>, Jeff Roberson <jroberson@chesapeake.net>, Robert Watson <rwatson@FreeBSD.ORG>, Gary Jennejohn <garyj@jennejohn.org>, arch@FreeBSD.ORG Subject: Re: New scheduler (#3) Message-ID: <3E339DEA.E1380FCE@mindspring.com> References: <20030125171217.D18109-100000@mail.chesapeake.net> <200301252320.h0PNKVoq090077@apollo.backplane.com> <200301252350.h0PNo6xO009489@apollo.backplane.com> <200301260114.h0Q1EXuu017546@apollo.backplane.com> <20030126013234.GA19891@troutmask.apl.washington.edu> <200301260218.h0Q2IlkX024483@apollo.backplane.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Matthew Dillon wrote: > I expect Jeff can fix the more obvious bugs in a few seconds. Dealing > with the cpu-thread-stealing issue is a much harder problem. I saw > the flip-flop and reported it, but did not try to fix it. My fuzzy logic suggestion is to make the CPU migration model "push" instead of "pull". This also gets rid of the need to lock in the scheduler, in all cases (whereas a pull model requires locking each time). The idea is that you create a process migration list for each CPU, and when you migrate, you do it by "pushing" the process onto the least-loaded CPU's migration queue (doing this requires grabbing a queue lock, but checking that the queue is non-NULL does not require grabbing a queue lock, so it is non-locking in the common case, which is the non-migration case). A "pull" model requires locking your queue all the time, since it permits another CPU to access, which requires a lock to keep someone from pulling from your queue while you are accessing your queue. By only permitting access to an exceptional queue, and not having read access to the normal queue for other processes, you save yourself from the locking of your own queue. Relatively simple, and obvious, once explained. The "flip flopping" is solved because "push" only occurs as a result of relative load... in the "pull" case, you have a virtual global queue, so there's no way to avoid it. Basically, "pull" prevents you from using a fuzzy difference in a "figure of merit" indicating CPU load, and from which relative load can be calculated. The only missing thing here is that the figure of merit needs to be atomically readable without a lock -- sizeof(int), in the general case -- which allows it to be calculated locally to the CPU in question, and read (and acted upon) without a lock. My suggestion is to calculate "percentage load". The easiest way to do this is to keep a count of the number of processes on the ready-to-run queue for a given CPU, and just use that number as the basis of a moving average, which is multiplied by 100 (or 1000) to get the integer figure-of-merit. The hysteresis values, which you could allow to be set via sysctl, provide high and low *differential* values, so that the flip-flopping can be damped, as necessary, to get to the point where the effect no longer impacts overall performance. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3E339DEA.E1380FCE>