From: Julian Elischer
Date: Fri, 27 Oct 2006 12:27:14 -0700
To: current@freebsd.org
Subject: Comments on the KSE option
Message-ID: <45425D92.8060205@elischer.org>

John,

I appreciate that you have made KSE an option, but the way you have done it shows a complete misunderstanding of what is there.

What you are calling "KSE" is in fact several different facilities that are orthogonal. The one that you have the most trouble with is in fact not SA-based threading (referred to by most people as "KSE") but rather the fair scheduling code.

The aim of the fair scheduling code is to ensure that if you, as a user, make a process that starts 1000 threads, and I, as a user, make an unthreaded process, then I can still get to the CPU at a somewhat similar rate to you. A naive scheduler would give you 1000 CPU slots and me 1.
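As a rough illustration of that arithmetic, here is a minimal sketch in C. It is not FreeBSD code: the function names are made up for this example, and it assumes a single CPU with every thread always runnable at equal priority.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not FreeBSD code. */

/* CPU share, in parts per million, of a process with `mine` runnable
 * threads when every thread in the system competes as an equal. */
static long long naive_share_ppm(int mine, int others)
{
    assert(mine > 0 && others >= 0);
    return (long long)mine * 1000000 / (mine + others);
}

/* CPU share, in parts per million, of any one process when each
 * process is treated as a single unit of fairness. */
static long long fair_share_ppm(int nprocs)
{
    assert(nprocs > 0);
    return 1000000LL / nprocs;
}
```

Under these assumptions, naive_share_ppm(1, 1000) leaves the unthreaded process 999 parts per million of the CPU, while fair_share_ppm(2) gives each of the two processes 500000.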
The current fair scheduler tries to make sure that each process gets a fair crack at the CPU by holding back some of the runnable threads from the threaded process until the ones it has in the run queue have completed. A bit like telling a young child, "Yes, you can have more ice cream, when you've finished the ice cream you already have."

I note that David recently (in the last year) disabled the fair scheduling capacity of the libthr code, but he didn't do it quite right: it still does all the work for it and then disregards the result. This means that not only does a 1000-thread process (libthr) completely push a nonthreaded process out of the system, but it also pays all the costs in the scheduler for working out how NOT to do that.

The fairness algorithm that you have made 'optional' is a very crude one and I had thought that by now someone would have written a better one, but no-one has.

I suggest that you fix your patch in two ways:

1/ You need (at least) 2 options: KSE and FAIR_THREADS. Most of the improvements you are seeing come from the second one, especially all your changes that are in the scheduler. This removes the fair scheduling capability. It affects all threading libraries that do not deliberately knacker it. In other words, it should be orthogonal to what threading library is running.

If it is made a project goal that threads should be unfair, then I have no objections to removing the code, but it needs to be a decision that is deliberately taken. It was an initial project goal that threads should be fair, and the fact that David has made it ineffective for libthr (though he still pays the full price for it) is not a reason to throw it out. (What he does is assign a new KSEGRP for each thread, but he doesn't label it as exempt from fairness, so it does all the work only to discover at the end that it is the only thread in the ksegrp, and therefore always eligible to run.)
If the correct flags were set, then David's threads could probably get the same speedup as seen with the KSE option removed, as all the overhead would be skipped, but then we would be officially condoning unfair threading.

The change to do this would be to add a ksegrp or thread flag (possibly thread) called TDF_FAIR_SCHED and change the few lines in the scheduler that do:

    if ((td->td_proc->p_flag & P_HADTHREADS) == 0) {

to be:

    if ((td->td_flags & TDF_FAIR_SCHED) == 0) {

and set that flag in the threading libraries when threads should be made fair. Then probably the entire advantage seen by David in the supersmack tests from unsetting KSE would be seen by simply not setting that bit. (It might also just look for:

    if (td->td_ksegrp->kg_numthreads == 1)

and achieve the same thing automatically.)

So, the question is: do we as a project want to have fair threading or unfair threading? Should processes with a lot of threads be able to push out processes that do the same thing by using a state machine or an event loop?

BTW, another alternative would be to write a different scheduler, called sched_4bsd-unfair (or similar), and just strip out the fairness code. It would be another way of doing much the same thing.

This is a completely different question from whether there should be an M:N threading library, the existence of which should make no noticeable difference to the speed of processes that don't use it.

My moral for this story is: "If you don't understand the bigger picture and you modify things, then you can expect that your modifications may have unforeseen consequences." I, as well as most other people, fall foul of this at various times in our careers.

============

Technical note:

The current fairness code relies on a substructure of the proc, called a ksegrp. This structure represents the "unit of fairness". Most processes have one of these, so they act as if the unit of fairness is the entire process.
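The three tests discussed above (the existing P_HADTHREADS check, the proposed TDF_FAIR_SCHED check, and the kg_numthreads shortcut) can be compared on a toy model. This is only a sketch: the toy_ structures are stand-ins rather than the kernel's definitions, the flag values are arbitrary, the field names are assumed to follow the kernel's td_/kg_ prefix convention, and TDF_FAIR_SCHED is the flag proposed in this message, not an existing one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
#define P_HADTHREADS   0x1  /* illustrative value only */
#define TDF_FAIR_SCHED 0x1  /* flag proposed in this message; illustrative value */

struct toy_proc   { int p_flag; };
struct toy_ksegrp { int kg_numthreads; };
struct toy_thread {
    int                td_flags;
    struct toy_proc   *td_proc;
    struct toy_ksegrp *td_ksegrp;
};

/* Existing test: skip the fairness path only if the process never had threads. */
static int skip_fairness_current(const struct toy_thread *td)
{
    assert(td != NULL && td->td_proc != NULL);
    return (td->td_proc->p_flag & P_HADTHREADS) == 0;
}

/* Proposed test: skip unless the threading library asked for fairness. */
static int skip_fairness_flag(const struct toy_thread *td)
{
    assert(td != NULL);
    return (td->td_flags & TDF_FAIR_SCHED) == 0;
}

/* Alternative test: a thread alone in its ksegrp has nothing to be held back behind. */
static int skip_fairness_single(const struct toy_thread *td)
{
    assert(td != NULL && td->td_ksegrp != NULL);
    return td->td_ksegrp->kg_numthreads == 1;
}
```

For a libthr-style thread (process marked P_HADTHREADS, one thread per ksegrp, TDF_FAIR_SCHED not set), the existing test still sends the thread down the fairness path, while both alternatives skip it.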
The concept was that a threaded process would have one ksegrp for its directly allocated threads, and that they would act as a group, fairly towards the rest of the system. A process could also have a library that, unbeknownst to the program proper, would create its own ksegrp, with its own threads that would act independently and have their own 'fairness' characteristics, priorities, etc. Of the threads in a ksegrp, only the top N (usually N = ncpu) are allowed onto the system run queue to compete with other processes.

By assigning a separate KSEGRP to each thread, the libthr code ensures that each thread is immediately promoted to the system run queue. However, because the system code doesn't realise that he is trying to subvert the fairness code, it still takes the code path that looks at the ksegrp run queues and does all sorts of other checks.

If someone can come up with a better fairness method (please!) then I'm happy to see all that code in the scheduler replaced by whatever else is chosen (nothing, if we REALLY want to see thread unfairness).

I think that libthr should be moved back to be "fair" by default, and that unfair mode should be made optional (if you are root) so that dedicated servers, where the administrator wants to get all the performance and is willing to state explicitly that fairness is not important to him, can do just that (and for benchmarks).
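The top-N rule from the technical note can be sketched as a toy calculation. This is not kernel code: the function name is hypothetical, threads are assumed to be spread evenly over the ksegrps, and all of them are assumed runnable.

```c
#include <assert.h>

/* Toy model, not kernel code: how many of a process's runnable threads
 * reach the system run queue when each ksegrp may expose at most
 * `concurrency` threads at a time (typically the number of CPUs). */
static int threads_on_system_runq(int nthreads, int nksegrps, int concurrency)
{
    assert(nthreads >= 0 && nksegrps > 0 && concurrency > 0);
    int exposed = 0;
    for (int i = 0; i < nksegrps; i++) {
        /* Spread the threads over the ksegrps as evenly as possible. */
        int in_group = nthreads / nksegrps + (i < nthreads % nksegrps ? 1 : 0);
        /* Each ksegrp holds back everything beyond its top `concurrency`. */
        exposed += in_group < concurrency ? in_group : concurrency;
    }
    return exposed;
}
```

With 1000 threads in a single ksegrp on a 4-CPU machine, threads_on_system_runq(1000, 1, 4) lets 4 threads compete system-wide. With one ksegrp per thread, as libthr does, threads_on_system_runq(1000, 1000, 4) lets all 1000 through.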