From owner-freebsd-arch Sat Jan 20 10:28:34 2001 Delivered-To: freebsd-arch@freebsd.org Received: from VL-MS-MR002.sc1.videotron.ca (relais.videotron.ca [24.201.245.36]) by hub.freebsd.org (Postfix) with ESMTP id 50E3437B400 for ; Sat, 20 Jan 2001 10:28:13 -0800 (PST) Received: from jehovah ([24.201.144.31]) by VL-MS-MR002.sc1.videotron.ca (Netscape Messaging Server 4.15) with SMTP id G7H4MR02.59T; Sat, 20 Jan 2001 13:28:03 -0500 Message-ID: <007c01c0830e$f1c84a70$1f90c918@jehovah> From: "Bosko Milekic" To: "Jason Evans" Cc: References: <000701c0829c$e0378d60$1f90c918@jehovah> <20010120015720.K69199@canonware.com> Subject: Re: wakeup_one() return value Date: Sat, 20 Jan 2001 13:29:37 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2919.6700 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk X-Loop: FreeBSD.ORG Jason Evans wrote: > On Fri, Jan 19, 2001 at 11:53:05PM -0500, Bosko Milekic wrote: > > Here's a diff: http://people.freebsd.org/~bmilekic/code/wakeupone.diff > > that makes wakeup_one() return an integer specifying whether a process was > > actually taken off the sleep queue, or not. The reason for the proposed > > change as to do with SMP and potential race conditions in certain situations. > > For example, wakeup_one() is sometimes (and quite effectively) used in this > > scenario: > > > > thread 1: > > --------- > > > > mtx_enter(lock_A, foo); > > do stuff; > > slp_cnt++; > > if (msleep(foobar, lock_A, ..., some_time_period) == EWOULDBLOCK) > > slp_cnt--; > > do more stuff; > > mtx_exit(lock_A, foo); > > > > > > thread 2: > > --------- > > > > mtx_enter(lock_A, foo); > > do stuff; > > if (slp_cnt >= 0) { > > wakeup_one(foobar); > > slp_cnt--; > > } > > mtx_exit(lock_A, foo); > > > > > > The problem is that under some conditions, there is a race and slp_cnt can > > get decremented twice. For example, consider slp_cnt == 0, and thread 1 comes > > in and increments it to 1. It goes to sleep but wakes up and grabs sched_lock > > (it hasn't yet regrabbed lock_A). At this time, thread 2 has lock_A and is > > spinning on sched_lock in wakeup_one(). But, thread 1 has timed out, and > > removes itself from the sleep queue; it returns EWOULDBLOCK, so slp_cnt goes > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > down to 0. sched_lock is dropped, the thread blocks waiting for thread 2 to > ^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > let go of lock_A so it can "do more stuff." Meanwhile, thread 2 gets > ^^^^^^^^^^^^^^^^ > > sched_lock and does a wakeup_one() (effectively a no-op) and decrements > > slp_cnt again. There's an underflow. > > I don't quite follow this paragraph, but I think that part of it rests on > the assumption that slp_cnt is decremented before lock_A is reacquired, > which is impossible. The ordering that you describe in the underscored > text is not possible, since msleep() returns with lock_A held. It is > possible for msleep() to return EWOULDBLOCK _after_ thread 2 has released > lock_A though, so the race condition you describe is still possible. Your That's exactly what I described. See "thread 1 blocks waiting for thread 2 to let go of lock_A so [that it can return EWOULDBLOCK finally] and do more stuff." Perhaps it was a tad misleading. > example can be "fixed" as follows: > > thread 1: > --------- > > mtx_enter(lock_A, foo); > do stuff; > slp_cnt++; > msleep(foobar, lock_A, ..., some_time_period); > slp_cnt--; > do more stuff; > mtx_exit(lock_A, foo); > > > thread 2: > --------- > > mtx_enter(lock_A, foo); > do stuff; > if (slp_cnt >= 0) > wakeup_one(foobar); > mtx_exit(lock_A, foo); > > Now, slp_cnt will always be correct. However, if the example is actually > more complex than what you present and there are multiple threads that can > potentially call wakeup_one(), it is possible to call wakeup_one() more > times than there are sleeping threads. If the example is even more > complex, perhaps that means that too many threads were woken (i.e. it's a > bad thing to have woken two threads instead of just one). > > This new problem can be "fixed" by introducing another variable that keeps > track of the number of times wakeup_one() is called. I won't go there now > though. =) > > > Any objections? If this goes in, we should probably do similar to the > > cond-var interface function that wakes up one sleeper. > > I object to this change (even if you do come up with an example that would > benefit from the change =) ). The example tries to use slp_cnt to pass > information between two threads in an unsafe way. Perhaps you > oversimplified the example; there is no reason I can see for even using > slp_cnt. The following works fine: > > thread 1: > --------- > > mtx_enter(lock_A, foo); > do stuff; > msleep(foobar, lock_A, ..., some_time_period); > do more stuff; > mtx_exit(lock_A, foo); > > > thread 2: > --------- > > mtx_enter(lock_A, foo); > do stuff; > wakeup_one(foobar); > mtx_exit(lock_A, foo); > > If thread 1 isn't sleeping, the call to wakeup_one() has no effect, which > is fine. > > Basically, all my rambling can be summarized thus: > > 1) The example code is broken (as you point out), but it may fail to > highlight the point you were trying to make. > > 2) The methodology behind any similar example that does highlight a race > condition would be demonstrating a flawed programming idiom, not a > weakness of wakeup_one(). > > Thread synchronization rests on the foundation of mutexes and condition > variables (msleep() is an equivalent). You are suggesting a fundamental > change to one of those synchronization primitives. That should be a pretty > strong indicator by itself that the change isn't needed. =) > > Jason Okay, this is much better. Your first solution is better, because: * msleep is called with A held. * wakeup_one only happens when A is held. so wakeup_one() may be called multiple times if, during the period between thread1 dropping sched_lock and thread2 doing it's stuff and releasing A, some thread3 jumps in with perfect timing and gets A before thread1 does (this is probably highly unlikely). But even if it's called multiple times, as long as slp_cnt is only decremented once, there's no problem. As for your last solution, I would prefer calling wakeup_one() [and traversing the sleep queue with sched_lock held for no good reason] only if need be. Conclusion: I'll scrap the diff and use the method described in your first solution (i.e. sleeping thread does all tampering with the slp_cnt). Note: I think there used to be a reason why I didn't do it this way before, I just can't remember it right now. Maybe it just slipped through. Sorry about that and thanks for the pointer. :-) Cheers, Bosko. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message