Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 13 Oct 2003 11:00:39 -0700 (PDT)
From:      Bruce Evans <bde@zeta.org.au>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: kern/57945: [patch] Add locking to kqueue to make it MP-safe
Message-ID:  <200310131800.h9DI0duG043149@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help
The following reply was made to PR kern/57945; it has been noted by GNATS.

From: Bruce Evans <bde@zeta.org.au>
To: Stefan Farfeleder <stefan@fafoe.narf.at>
Cc: FreeBSD-gnats-submit@freebsd.org, freebsd-bugs@freebsd.org
Subject: Re: kern/57945: [patch] Add locking to kqueue to make it MP-safe
Date: Tue, 14 Oct 2003 03:56:19 +1000 (EST)

 On Mon, 13 Oct 2003, Stefan Farfeleder wrote:
 
 > >Description:
 > The current kqueue implementation does not seem to be MP-safe.  The
 > kqueue facility does not having its own locks, I believe it was
 > intended to rely on Giant.  As Giant gets locked down, kqueue was
 > apparently forgotten.  The KNOTE() macro is spread over the whole
 > kernel, sometimes called with Giant hold, sometimes not.  The function
 > knote_enqueue() (called via knote() and KNOTE_ACTIVATE()) inserts a node
 > into a TAILQ, this operation isn't atomic.  If another processor
 > executes KNOTE() or kevent() at the same time, the queue might get
 > corrupted.
 > ...
 > >Fix:
 > I'm using patches similar to this one since about a year, the
 > kqueue-enabled make(1) runs without problems with it.  Unfortunately the
 > mutex is (I'm not sure it has to be) acquired and released rather often
 > in the loop in kqueue_scan() to avoid deadlocks due to LORs.  Maybe a
 > spin lock would increase the performance?  (I didn't do any benchmarks.)
 
 The small critical sections seem to be mostly required.
 
 Spin mutexes are actually less efficient, since in the usual uncontested
 case they do the same things plus extra calls to critical_enter() and
 critical exit.
 
 > --- kqueue_lock.diff begins here ---
 > Index: src/sys/kern/kern_event.c
 > ===================================================================
 > RCS file: /usr/home/ncvs/src/sys/kern/kern_event.c,v
 > retrieving revision 1.60
 > diff -u -r1.60 kern_event.c
 > --- src/sys/kern/kern_event.c	18 Jun 2003 18:16:39 -0000	1.60
 > +++ src/sys/kern/kern_event.c	13 Oct 2003 00:35:42 -0000
 > ...
 > @@ -704,15 +705,16 @@
 >
 >  start:
 >  	kevp = kq->kq_kev;
 > -	s = splhigh();
 > +	mtx_lock(&kq->kq_mtx);
 >  	if (kq->kq_count == 0) {
 >  		if (timeout < 0) {
 >  			error = EWOULDBLOCK;
 > +			mtx_unlock(&kq->kq_mtx);
 >  		} else {
 >  			kq->kq_state |= KQ_SLEEP;
 > -			error = tsleep(kq, PSOCK | PCATCH, "kqread", timeout);
 > +			error = msleep(kq, &kq->kq_mtx, PSOCK | PCATCH | PDROP,
 > +			    "kqread", timeout);
 >  		}
 > -		splx(s);
 
 Not copying the spl locking seems to give a bug here.  msleep() returns with
 the mutex held.
 
 >  		if (error == 0)
 >  			goto retry;
 >  		/* don't restart after signals... */
 > @@ -728,20 +730,22 @@
 >  		kn = TAILQ_FIRST(&kq->kq_head);
 >  		TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe);
 >  		if (kn == &marker) {
 > -			splx(s);
 > +			mtx_unlock(&kq->kq_mtx);
 >  			if (count == maxevents)
 >  				goto retry;
 >  			goto done;
 >  		}
 > +		kq->kq_count--;
 > +		mtx_unlock(&kq->kq_mtx);
 >  		if (kn->kn_status & KN_DISABLED) {
 >  			kn->kn_status &= ~KN_QUEUED;
 > -			kq->kq_count--;
 > +			mtx_lock(&kq->kq_mtx);
 >  			continue;
 >  		}
 
 I don't see any reason to unlock/lock here.  There were no splx()/splhigh()'s.
 splhigh() is very global but kq_mtx is very local so keeping the
 critical sections short should be much less needed than before.
 
 >  		if ((kn->kn_flags & EV_ONESHOT) == 0 &&
 >  		    kn->kn_fop->f_event(kn, 0) == 0) {
 >  			kn->kn_status &= ~(KN_QUEUED | KN_ACTIVE);
 > -			kq->kq_count--;
 > +			mtx_lock(&kq->kq_mtx);
 >  			continue;
 >  		}
 >  		*kevp = kn->kn_kevent;
 
 Here dropping the mutex seems to be be necessary for calling the
 function, but the function call was locked by splhigh() before.
 
 BTW, what locks kn?
 
 Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200310131800.h9DI0duG043149>