Date:      Thu, 17 Dec 1998 00:05:29 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        freebsd-current@FreeBSD.ORG
Subject:   asleep()/await(), M_AWAIT, etc...
Message-ID:  <199812170805.AAA89361@apollo.backplane.com>

    Ok, I've spent a couple of hours thinking about this, and I think I can
    trivially bring into FreeBSD an awesome OS feature I've used in the past
    in my own OS work.

    Here are the semantics.  I am presuming people are familiar with 
    tsleep().

    We add an asleep() kernel function to complement tsleep().  asleep()
    works like tsleep() in that it adds the process to the appropriate
    slpque, but asleep() does *not* put the process to sleep.  Instead it
    returns immediately.  The process stays runnable.  Additional calls
    to asleep() (or a call to tsleep()) remove the proc from any slpque
    and re-add it to the new one, i.e. only the most recent call is
    effective.

    We add an await() kernel function.  This function initiates any timeout
    and puts the process to sleep, but only if it is still on a sleep queue.
    If something (e.g. an interrupt) wakes up the sleep address after the
    process calls asleep() but before it calls await(), the slpque is
    cleared and the await() winds up being a NOP.
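
    To make the ordering concrete, here is a toy userspace model of the
    semantics described above.  It is only a sketch: there is one pretend
    process and no scheduler, and every toy_* name is made up for
    illustration (none of them are in the diff or in the kernel).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: one "process", no real scheduler. */
struct toy_proc {
	const void *wchan;	/* non-NULL while queued on a wait channel */
	int blocked;		/* times toy_await() would really have slept */
};

/* Model of asleep(): register interest in ident and return immediately. */
static void
toy_asleep(struct toy_proc *p, const void *ident)
{
	p->wchan = ident;	/* replaces any earlier registration */
}

/* Model of wakeup(): clear the registration if it matches ident. */
static void
toy_wakeup(struct toy_proc *p, const void *ident)
{
	if (p->wchan != NULL && p->wchan == ident)
		p->wchan = NULL;
}

/*
 * Model of await(): returns true if the process would have blocked,
 * false if the wakeup already arrived and the call is a NOP.
 */
static bool
toy_await(struct toy_proc *p)
{
	if (p->wchan == NULL)
		return (false);
	p->blocked++;
	p->wchan = NULL;	/* pretend a later wakeup() resumed us */
	return (true);
}

/* Walk through both cases; returns how many times we "blocked". */
int
toy_demo(void)
{
	struct toy_proc p = { NULL, 0 };
	int chan_a, chan_b;	/* only their addresses are used */
	bool slept;

	/* Wakeup lands between asleep and await: await is a NOP. */
	toy_asleep(&p, &chan_a);
	toy_wakeup(&p, &chan_a);
	slept = toy_await(&p);
	assert(!slept);

	/* Second asleep replaces the first; waking chan_a has no effect. */
	toy_asleep(&p, &chan_a);
	toy_asleep(&p, &chan_b);
	toy_wakeup(&p, &chan_a);
	slept = toy_await(&p);
	assert(slept);

	return (p.blocked);
}
```

    The first half of toy_demo() is the interesting case: the wakeup
    arrives in the window between the two calls and is not lost, so the
    later await does not block.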

    I have included a unified diff (DON'T APPLY THIS!, FOR REFERENCE ONLY!)
    at the bottom.

    The purpose of the new routines is to allow blocking conditions to
    propagate up a subroutine chain and get handled at a higher level rather
    than at a lower level in those areas of code that cannot afford to 
    leave exclusive locks sitting around.  For example, if bread() blocks
    waiting for a low level disk I/O on a block device, the vnode remains
    locked throughout which badly mars potential parallelism when multiple
    programs are accessing the same file.  There is no reason to leave the
    high level vnode locked while bringing a page into the VM buffer cache!
    Another example:  If a piece of critically locked code needs to allocate
    memory but cannot afford to block with the lock intact, we can implement
    M_AWAIT.  The code would allocate memory using M_AWAIT and, if the
    allocation fails, would be able to unwind the lock(s), await(), and retry.
    This is something the current code cannot do at all.
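
    Roughly, the unwind/await/retry loop would have the following shape.
    This is a userspace sketch under loud assumptions: the sim_* lock,
    allocator, and await stand-ins are all hypothetical, M_AWAIT itself
    does not exist yet, and the "memory becomes free while we wait" step
    is faked so the loop terminates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct sim_lock {
	int held;
};

static int sim_free_pages = 0;	/* pretend memory pool, initially empty */
static int sim_awaits = 0;	/* how often the caller had to wait */

static void
sim_lock_acquire(struct sim_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void
sim_lock_release(struct sim_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/*
 * Stand-in for an M_AWAIT style allocation: it never blocks.  On failure
 * the real thing would have called asleep() on the allocator's wait
 * channel before returning NULL.
 */
static void *
sim_alloc_await(size_t size)
{
	if (sim_free_pages == 0)
		return (NULL);
	sim_free_pages--;
	return (malloc(size));
}

/*
 * Stand-in for await(): must be entered with the lock dropped.  Here we
 * simply pretend that memory was freed while we were asleep.
 */
static void
sim_await(const struct sim_lock *l)
{
	assert(!l->held);
	sim_awaits++;
	sim_free_pages++;
}

/* Allocate under the lock without ever blocking while holding it. */
void *
sim_locked_alloc(struct sim_lock *l, size_t size)
{
	void *mem;

	for (;;) {
		sim_lock_acquire(l);
		mem = sim_alloc_await(size);
		if (mem != NULL)
			break;
		sim_lock_release(l);	/* unwind the lock(s) */
		sim_await(l);		/* block with nothing held */
	}
	/* ... critical work using mem would go here ... */
	sim_lock_release(l);
	return (mem);
}
```

    The point of the shape is that the only place the caller can block is
    sim_await(), and by then the lock has already been released.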

    We could start 'fixing' not only bread(), but getnewbuf() (adding a 
    new flag for slpflag), malloc(), kmem_malloc(), vm_page_alloc(), and
    many other routines to support the new mechanism without breaking
    backwards compatibility with the existing tsleep()/M_WAITOK mechanism
    (which, of course, is still useful).  I also believe we can fix
    known deadlocks with mmap()ed files and vnode interlock situations
    trivially.

    There are many other cool things you can do with this sort of 
    functionality.... fixing possible deadlock situations allows us to more
    easily pull SMP locks deeper into the kernel, for example.

						-Matt

    Matthew Dillon  Engineering, HiWay Technologies, Inc. & BEST Internet 
                    Communications & God knows what else.
    <dillon@backplane.com> (Please include original email in any response)    


Index: kern/kern_synch.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.69
diff -u -r1.69 kern_synch.c
--- kern_synch.c	1998/11/27 11:44:22	1.69
+++ kern_synch.c	1998/12/17 07:46:39
@@ -407,9 +407,21 @@
 	if (ident == NULL || p->p_stat != SRUN)
 		panic("tsleep");
 	/* XXX This is not exhaustive, just the most common case */
+#ifdef NOTDEF
+	/*
+	 * This can happen legitimately now with asleep()/await()
+	 */
 	if ((p->p_procq.tqe_prev != NULL) && (*p->p_procq.tqe_prev == p))
 		panic("sleeping process already on another queue");
 #endif
+#endif
+	/*
+	 * Process may be sitting on a slpque if asleep() was called, remove
+	 * it before re-adding.
+	 */
+	if (p->p_wchan != NULL)
+		unsleep(p);
+
 	p->p_wchan = ident;
 	p->p_wmesg = wmesg;
 	p->p_slptime = 0;
@@ -475,7 +487,166 @@
 }
 
 /*
- * Implement timeout for tsleep.
+ * asleep() - async sleep call.  Place process on wait queue and return 
+ * immediately without blocking.  The process stays runnable until await() 
+ * is called.
+ *
+ * Only the most recent sleep condition is effective when making successive
+ * calls to asleep() or when calling tsleep().
+ *
+ * The timeout, if any, is not initiated until await() is called.  The sleep
+ * priority, signal, and timeout are specified in the asleep() call but may be
+ * overridden in the await() call.
+ */
+
+int
+asleep(void *ident, int priority, const char *wmesg, int timo)
+{
+	struct proc *p = curproc;
+	int s;
+
+	/*
+	 * splhigh() while manipulating sleep structures and slpque.
+	 *
+	 * Remove preexisting wait condition (if any) and place process
+	 * on appropriate slpque, but do not put process to sleep.
+	 */
+
+	s = splhigh();
+
+	if (p->p_wchan != NULL)
+		unsleep(p);
+
+	if (ident) {
+		p->p_wchan = ident;
+		p->p_wmesg = wmesg;
+		p->p_slptime = 0;
+		p->p_asleep.as_priority = priority;
+		p->p_asleep.as_timo = timo;
+		TAILQ_INSERT_TAIL(&slpque[LOOKUP(ident)], p, p_procq);
+	}
+
+	splx(s);
+
+	return(0);
+}
+
+/*
+ * await() - wait for async condition to occur.   The process blocks until
+ * wakeup() is called on the most recent asleep() address.  If wakeup() is
+ * called prior to await(), await() winds up being a NOP.
+ *
+ * If await() is called more than once (without an intervening asleep() call),
+ * await() is still effectively a NOP but it calls mi_switch() to give other
+ * processes some cpu before returning.  The process is left runnable.
+ */
+
+int
+await(int priority, int timo)
+{
+	struct proc *p = curproc;
+	int s;
+
+	s = splhigh();
+
+	if (p->p_wchan != NULL) {
+		struct callout_handle thandle;
+		int sig;
+		int catch;
+
+		/*
+		 * The call to await() can override defaults specified in
+		 * the original asleep().
+		 */
+		if (priority < 0)
+			priority = p->p_asleep.as_priority;
+		if (timo < 0)
+			timo = p->p_asleep.as_timo;
+
+		/*
+		 * Install timeout
+		 */
+
+		if (timo)
+			thandle = timeout(endtsleep, (void *)p, timo);
+
+		sig = 0;
+		catch = priority & PCATCH;
+
+		if (catch) {
+			p->p_flag |= P_SINTR;
+			if ((sig = CURSIG(p))) {
+				if (p->p_wchan)
+					unsleep(p);
+				p->p_stat = SRUN;
+				goto resume;
+			}
+			if (p->p_wchan == NULL) {
+				catch = 0;
+				goto resume;
+			}
+		}
+		p->p_stat = SSLEEP;
+		p->p_stats->p_ru.ru_nvcsw++;
+		mi_switch();
+resume:
+		curpriority = p->p_usrpri;
+
+		splx(s);
+		p->p_flag &= ~P_SINTR;
+		if (p->p_flag & P_TIMEOUT) {
+			p->p_flag &= ~P_TIMEOUT;
+			if (sig == 0) {
+#ifdef KTRACE
+				if (KTRPOINT(p, KTR_CSW))
+					ktrcsw(p->p_tracep, 0, 0);
+#endif
+				return (EWOULDBLOCK);
+			}
+		} else if (timo)
+			untimeout(endtsleep, (void *)p, thandle);
+		if (catch && (sig != 0 || (sig = CURSIG(p)))) {
+#ifdef KTRACE
+			if (KTRPOINT(p, KTR_CSW))
+				ktrcsw(p->p_tracep, 0, 0);
+#endif
+			if (p->p_sigacts->ps_sigintr & sigmask(sig))
+				return (EINTR);
+			return (ERESTART);
+		}
+#ifdef KTRACE
+		if (KTRPOINT(p, KTR_CSW))
+			ktrcsw(p->p_tracep, 0, 0);
+#endif
+	} else {
+		/*
+		 * If as_priority is 0, await() has been called without an 
+		 * intervening asleep().  We are still effectively a NOP, 
+		 * but we call mi_switch() for safety.
+		 */
+
+		if (p->p_asleep.as_priority == 0) {
+			p->p_stats->p_ru.ru_nvcsw++;
+			mi_switch();
+		}
+		splx(s);
+
+	}
+
+	/*
+	 * clear p_asleep.as_priority as an indication that await() has been
+	 * called.  If await() is called again without an intervening asleep(),
+	 * await() is still effectively a NOP but the above mi_switch() code
+	 * is triggered.
+	 */
+	p->p_asleep.as_priority = 0;
+
+	return (0);
+}
+
+/*
+ * Implement timeout for tsleep or asleep()/await()
+ *
  * If process hasn't been awakened (wchan non-zero),
  * set timeout flag and undo the sleep.  If proc
  * is stopped, just unsleep so it will remain stopped.
@@ -532,9 +703,15 @@
 restart:
 	for (p = qp->tqh_first; p != NULL; p = p->p_procq.tqe_next) {
 #ifdef DIAGNOSTIC
+#ifdef NOTDEF
+		/*
+		 * The process can legitimately be running now with 
+		 * asleep()/await().
+		 */
 		if (p->p_stat != SSLEEP && p->p_stat != SSTOP)
 			panic("wakeup");
 #endif
+#endif
 		if (p->p_wchan == ident) {
 			TAILQ_REMOVE(qp, p, p_procq);
 			p->p_wchan = 0;
@@ -577,8 +754,14 @@
 
 	for (p = qp->tqh_first; p != NULL; p = p->p_procq.tqe_next) {
 #ifdef DIAGNOSTIC
+#ifdef NOTDEF
+		/*
+		 * The process can legitimately be running now with 
+		 * asleep()/await().
+		 */
 		if (p->p_stat != SSLEEP && p->p_stat != SSTOP)
 			panic("wakeup_one");
+#endif
 #endif
 		if (p->p_wchan == ident) {
 			TAILQ_REMOVE(qp, p, p_procq);
