From owner-freebsd-hackers Sat Jul 29 15:30:39 1995
Return-Path: hackers-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.11/8.6.6) id PAA01061 for hackers-outgoing; Sat, 29 Jul 1995 15:30:39 -0700
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16]) by freefall.cdrom.com (8.6.11/8.6.6) with SMTP id PAA01055 for ; Sat, 29 Jul 1995 15:30:36 -0700
Received: by cs.weber.edu (4.1/SMI-4.1.1) id AA08430; Sat, 29 Jul 95 16:22:20 MDT
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9507292222.AA08430@cs.weber.edu>
Subject: Re: Scheduling Algorithms (was: Re: panic in brelse() ... )
To: kwong@fathergoose.net6c.io.org (Ken Wong)
Date: Sat, 29 Jul 95 16:22:20 MDT
Cc: hackers@freebsd.org
In-Reply-To: <199507291324.JAA00264@fathergoose.net6c.io.org> from "Ken Wong" at Jul 29, 95 09:24:35 am
X-Mailer: ELM [version 2.4dev PL52]
Sender: hackers-owner@freebsd.org
Precedence: bulk

> > The main issue in both of these tasks that are on the table is a
> > divorce of the ready-to-run state from the act of running the
> > code.  I think that this really implies separate queues based
> > on priority.
>
> I think a single queue is best served here, if we want realtime and
> SMP.  I know some RT SMP systems do it by dividing priority into 2
> sections:
>
> 1) RT 0-7
> 2) The rest is timesharing
>
> Within each group, round robin is used to dispatch processes with
> the same priority.  Within group 2) some other algorithm can still
> be used to change the priority, but it never goes up into 1).  1) is
> fixed priority and preemptive.  The process table can include the
> CPU on which the process will execute.

SVR4 ES/MP runs multiple queues, as does Solaris.

The big issue here is per-processor queue equivalency.  It buys you
the ability to have one processor pull a process off the ready-to-run
queue and run it without hitting a global mutex.  For SMP performance,
the ability to avoid global synchronizations is the biggest win you
can have.  The RT issue is addressed by queue prioritization.
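As a rough illustration of that point (hypothetical structure and
function names, not code from any of the systems mentioned), each
processor dequeues from its own ready queue under a per-queue lock,
so dequeues on different CPUs never serialize on one global mutex:

```c
#include <stddef.h>
#include <pthread.h>

/* Hypothetical per-processor ready queue: one of these per CPU.
 * The lock is per queue, not global, so two CPUs pulling work at
 * the same time do not contend with each other. */

struct proc {
    int pid;
    struct proc *next;
};

struct runq {
    pthread_mutex_t lock;   /* per-queue lock, not a global mutex */
    struct proc *head;
};

void runq_init(struct runq *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
}

/* Make a process ready to run on this queue's CPU. */
void runq_put(struct runq *q, struct proc *p)
{
    pthread_mutex_lock(&q->lock);
    p->next = q->head;
    q->head = p;
    pthread_mutex_unlock(&q->lock);
}

/* A CPU pulls the next ready process off its own queue only;
 * returns NULL if the queue is empty. */
struct proc *runq_get(struct runq *q)
{
    pthread_mutex_lock(&q->lock);
    struct proc *p = q->head;
    if (p != NULL)
        q->head = p->next;
    pthread_mutex_unlock(&q->lock);
    return p;
}
```

The RT side then falls out of giving each CPU an array of these,
one per priority band, and scanning from the highest band down.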
In effect, this is the same as your queue insertion model, with the
increased concurrency on get coming from multiplying the available
semaphore hierarchy by the number of queues.

In terms of RT, the divorce of the processor as a resource (which is
what divorcing the ready-to-run state setting from the context switch
process itself is, in effect) turns RT events into kernel preemption
points, with processors as anonymous resources.

The per-processor queue ownership also has the side effect of saving
the processor instruction and data caches, and TLB settings, from
undue thrashing (as a process migrates from one processor to another,
causing the thrash).

Finally, a multiple queue model allows queueing to be order
independent on the basis of priority.  This is an important effect,
since an RT system must invoke priority lending if the system
resources are to be shared instead of block-committed for RT tasks.
What that means is that the scheduling priority that would normally
determine the queue onto which the process was pushed when ready to
run is not the sole determiner of which queue the process can go on.

If a non-RT process holds a resource that a high priority RT process
needs while a ready-to-run medium priority RT process exists, the
medium priority RT process will block the non-RT process, in effect
blocking the high priority RT process.  To avoid this, the act of the
high priority RT process blocking on the resource held by the lower
priority process will cause the low priority process to be "lent"
the higher priority process's priority until such time as it unblocks
the needed resource.  This prevents the high priority process from
suffering starvation at the hands of a medium priority process
because a low priority process holds a resource that the high
priority process wants.
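The lending rule can be sketched in a few lines (again with
hypothetical names; this is a minimal model of the idea, not an
implementation from any of the systems discussed).  Higher number
means higher priority; a process runs at the maximum of its own
priority and any priority lent to it by a blocked waiter:

```c
#include <stddef.h>

/* Hypothetical priority-lending model.  Higher number = higher
 * priority. */

struct lproc {
    int base_pri;   /* priority assigned by scheduling policy */
    int lent_pri;   /* highest priority lent by a blocked waiter */
};

/* The priority the process actually competes with. */
int effective_pri(const struct lproc *p)
{
    return p->lent_pri > p->base_pri ? p->lent_pri : p->base_pri;
}

/* "waiter" blocks on a resource held by "holder": lend the
 * waiter's effective priority to the holder if it is higher. */
void lend_priority(struct lproc *holder, const struct lproc *waiter)
{
    int w = effective_pri(waiter);
    if (w > holder->lent_pri)
        holder->lent_pri = w;
}

/* Holder releases the resource: the loan is returned and the
 * holder drops back to its base priority. */
void release_priority(struct lproc *holder)
{
    holder->lent_pri = 0;
}
```

With a high priority RT waiter at 90 blocked on a holder whose base
priority is 10, the holder temporarily runs at 90, so a ready medium
priority process at 50 can no longer keep it off the processor; once
the resource is released, the holder falls back to 10.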
> > It's necessary because multiple processes can be in the run state
> > (for SMP, or even for kernel preemption), and RT processes have to
> > be able to be pigs as much as they want (if they are supported).
>
> The above would solve this problem.

I think it's both a scheduling and a concurrency issue.  I think that
the ordered queue mechanism would have to have multiple priority-based
entry order pointers to support priority lending, and that that's
really a more complex design at a higher concurrency cost than you'd
want if you only have one queue in which to implement the ordering.

Look at the complexity of ordered insertion into the LRU list in page
replacement algorithms (to implement per-vnode working sets, for
instance, to avoid cache thrashing) just to avoid staging the queues.
It's really not worth it.


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.