From owner-freebsd-current Thu Jul 4 3:56:50 2002 Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DF67A37B400 for ; Thu, 4 Jul 2002 03:56:46 -0700 (PDT) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id CF0A143E3B for ; Thu, 4 Jul 2002 03:56:45 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id UAA06509; Thu, 4 Jul 2002 20:56:29 +1000 Date: Thu, 4 Jul 2002 21:02:30 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: David Xu Cc: Julian Elischer , Subject: Re: Timeout and SMP race In-Reply-To: <00f401c22337$c6c94d90$ef01a8c0@davidwnt> Message-ID: <20020704202730.X21619-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-current@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 4 Jul 2002, David Xu wrote: > ----- Original Message ----- > From: "Julian Elischer" > To: "David Xu" > Cc: > Sent: Thursday, July 04, 2002 4:36 PM > Subject: Re: Timeout and SMP race > > > > On Thu, 4 Jul 2002, David Xu wrote: > > > > > while we are getting rid of Giant, current race condition between softclock() > > > and callout_stop() is unacceptable. the race causes two many places in source > > > code would be modified to fit this new behaviour, besides this, everywhere > > > callout_stop() is used need to hold sched_lock and do a mi_switch() and > > > modify td_flags is also unacceptable, this SMP race should be resolved in > > > kern_timeout.c. > > > > > > David Xu > > > > This is probably true.. > > the current hacks for this are rather horrible. I think there msut be > > better ways. Your suggestion sounds plausible. > > > > > > if another thread other than softclock itself is calling callout_stop(), > and callout_stop() detected that softclock is currently running the > callout, it should wait until softclock finishes the work, then return. softclock() intentionally releases callout_lock() to allow other processes to manipulate callouts. What is the race exactly? Concurrent calls to softclock() seem to be possible but seem to be handled correctly (internal locking prevents problems). Well, I can see one race in softclock(): % c_func = c->c_func; % c_arg = c->c_arg; % c_flags = c->c_flags; This caches some values, as is needed since the 'c' pointer may become invalid after we release the lock ... but the things pointed to may become invalid too. % c->c_func = NULL; % if (c->c_flags & CALLOUT_LOCAL_ALLOC) { % c->c_flags = CALLOUT_LOCAL_ALLOC; % SLIST_INSERT_HEAD(&callfree, c, % c_links.sle); % } else % c->c_flags &= ~CALLOUT_PENDING; % mtx_unlock_spin(&callout_lock); callout_stop() may stop 'c' here. It won't do much, since we have already set c->c_func to NULL, but its caller may want the callout actually stopped so that it can do things like unloading the old c->c_func. % if ((c_flags & CALLOUT_MPSAFE) == 0) % mtx_lock(&Giant); % c_func(c_arg); This calls through a possibly-invalid function pointer. % if ((c_flags & CALLOUT_MPSAFE) == 0) % mtx_unlock(&Giant); % mtx_lock_spin(&callout_lock); This seems to be an old bug. In RELENG_4, splsoftclock() gives a more global lock, but there is nothing to prevent callout_stop() being run at splsoftclock(). In fact, it must be able to run when called nested from inside softclock(), since it might be called from the handler. Waiting in callout_stop() for softclock() to finish would deadlock in this case. It's interesting that this case is (always?) avoided in untimeout() by not calling callout_stop() when c->c_func == NULL. softclock() can't do anything about c->c_func going away after it is called. Clients must somehow avoid killing it. I think c->c_func rarely goes away, and the race that you noticed is lost more often. Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-current" in the body of the message