From owner-freebsd-arch Thu Sep 5 9:40:29 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AA58A37B400 for ; Thu, 5 Sep 2002 09:40:13 -0700 (PDT) Received: from sccrmhc01.attbi.com (sccrmhc01.attbi.com [204.127.202.61]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0031043E6A for ; Thu, 5 Sep 2002 09:40:12 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by sccrmhc01.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020905164011.CUMD11061.sccrmhc01.attbi.com@InterJet.elischer.org>; Thu, 5 Sep 2002 16:40:11 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id JAA36858; Thu, 5 Sep 2002 09:27:34 -0700 (PDT) Date: Thu, 5 Sep 2002 09:27:33 -0700 (PDT) From: Julian Elischer To: Terry Lambert Cc: Giorgos Keramidas , Bruce Evans , arch@FreeBSD.ORG Subject: Re: Process/thread states. In-Reply-To: <3D770DF1.623A84B0@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 5 Sep 2002, Terry Lambert wrote: > Julian Elischer wrote: > > the state of being on the sleep queue can occur in parallel with the other > > states, e.g. on the run queue, or running or on the suspend queue. > > Sleeping is really only being on the sleep queue without any other > > state taking precedence. > > This means that to truely represent the queue and run states of a thread > > the state variable consists of a state field with several values > > and a separate bit that represents the background reality of it also being > > on the sleep queue, and thus able to be awoken. There are also > > possibilities that a thread having been swapped out (as part of a process) > > that must also be kept orthogonal to whether the thread is suspended or > > sleeping or runnable. if this really does turn out to be the case > > REAL thread state tracking requires 3 orthognal tests. For this reason I > > want touse a macro to allow, > > 1/ experimentation on how thisis best done > > 2/ hiding of the sometime messy tests with meaningful named abstractions. > > To me this seems to introduce a bitfield/other-field-contents > synchroniztion problem that wasn't there before. Effectively, > you are saying that the value of the bit follows the change from > NULL-to-non-NULL and non-NULL-to-NULL. There is an implicit race > that happens, unless both objects are protected by the same mutex. > > In the case of a direct compare with the object itself, then there > is not race in the following field. > > Bruce made the point that debugging such macros is arbitrarily > hard. This is true because, even if you ensure that the order > of operation is always set A/set B, unset B/unset A, your macro > used in the comparator functions has to be consistent... and also > used consistently. In other words, there is an implied semantic > on use, when there is a combined state comparison, which is not > explicit in the transitions between states themselves (assuming > your state is represented by the "and" or "xor" of two pieces of > information). There are several cases in the code in question > where there are comparisons which fail. Specifically, the negation > of the macro operation requires inverse ordering of the compared > values, in order to avoid introducing a "follower" race window. state changes are all done under the sched lock > > It's possible to get this right, but it also means a lot of work; > IMO, it's not worth the implied semantics to get the simplified > "if" statements; you would basically need two macros, with one > order for the assertion, and the other order for the assertion > of the negation. The compiler can't enforce the use of the > correct semantics here, so it's easy for programmers to make > errors, e.g.: > > if (!ASSERTION_STATEMENT(x)) > > vs. > > if (NEGATION_STATEMENT(x)) > > that end up introducing incredibly small -- but still there -- > race windows in the code, which will pop up as unrepeatable > errors suffered by one or two users, which "disappear" when a > printf() (or other timing-changing function) is introduced, > etc.. > > To my mind, it is better to multiply the states explicitly, > and use the product of the multiplication in the code. For > anything more than a two-by-two, this introduces more states > (2x3 := 5->6, 4x5 := 9->20, etc.). This means that a lot of > care will have to be taken in the code. On the plus side, it > means that someone who comes after you is not going to screw > things up accidently by missing some required semantics that > the compiler is unable to enforce. It's also possible to > reduce the total number of states by eliminating "impossible" > states (e.g. with a Karnot map), so the increase in total > states is not necessarily really any greater than it would > have been the other way. > > > > > Neither one of these works properly if there is more than > > > one bit hidden in either of these; you have to get a lot more > > > complicated: > > > > this one could be real: > > if ((td->td_wchan == 0) && (td->td_state & TD_ST_MASK == TDS_UNQUEUED) && > > ((td->td_state & TDS_SWAPPED) == 0)) > > setrunnable(td); > > > > I'd rather make it > > > > if (td_on_sleepq_only(td) && !td_is_swapped(td)) > > setrunnable(td); > > > > Isn't that somewhat easier to understand? > > Yes. But it falls into the negation case race problem. A similar > condition elsewhere: > > if (!td_on_sleepq_only(td) || td_is_swapped(td)) > do_something(td); > > is actually a potential race, unless it's: > > if(td_is_swapped(td) || !td_on_sleepq_only(td)) > > ...but it's worse than that. By compressing the state compare into a > single macro, you lock the order of operation. The only legitimate > way to do this is if you have a mutex that protects the contents of > td->td_state, since there is no way that you can guarantee that both > elements will be updated atomically otherwise. The order of the > update becomes important, if the condition is not exactly equal to > the example. Further, I would argue that the use of a lowercase > macro obfuscates the idea of atmoicity -- atomicity is implied, but > not actually guaranteed over (possible) kernel preemption (e.g. a > thread on another CPU explcitly killing the thread in question > between one compare of td->td_state contents, and the other). > > > > > That *still* doesn't work if, say, TDS_XXX is a superset of the > > > bits in TDS_SUSPENDED. > > > > > I personally dislike the idea omf combined bits, without explicit > > > equality tests. > > > > I have no idea what you mean by that. > > When you defined your manifest constants, you defined them as the > logical OR of a number of bits in your enumerated type declaration; > this means that one set of bits could be a subset of another. Even > if you have a mask and two flag bits on a value, this is a problem. > To make something up: > > enum { > foo = 75 | 0x10000000, > fum = 75 | 0x10000000 | 0x20000000, > ... > } > > This is actually *likely*, given that people will tend to add bits > as time goes on. > > > > > The problem that's trying to be solved here is the one of how > > > do you represent a combinatorial state in a single comparison, > > > and the equivalency of one combinatorial state and another for > > > the purposes of whether or not some code should be run (as in > > > this specific case). > > > > > > The reality is probably that this state should be explicitly > > > multiplied out in the code, so that it becomes a single value > > > compare. > > > > sometimes you want otherwise.. e.g. > > What is the base state REGARDLESS of whether it is on the sleep queue. > > I would argue that the compare of the sleep queue entry pointer > to NULL is the *ideal* was to get this information. 8-). > > -- Terry > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message