From owner-freebsd-current Thu Dec 17 00:05:37 1998
Return-Path:
Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id AAA21309 for freebsd-current-outgoing; Thu, 17 Dec 1998 00:05:37 -0800 (PST) (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id AAA21302 for ; Thu, 17 Dec 1998 00:05:36 -0800 (PST) (envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost) by apollo.backplane.com (8.9.1/8.9.1) id AAA89361; Thu, 17 Dec 1998 00:05:29 -0800 (PST) (envelope-from dillon)
Date: Thu, 17 Dec 1998 00:05:29 -0800 (PST)
From: Matthew Dillon
Message-Id: <199812170805.AAA89361@apollo.backplane.com>
To: freebsd-current@FreeBSD.ORG
Subject: asleep()/await(), M_AWAIT, etc...
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

    Ok, I've spent a couple of hours thinking about this and I think I can
    trivially implement in FreeBSD an awesome OS feature I've used in the
    past in my own OS work.  Here are the semantics.  I am presuming people
    are familiar with tsleep().

    We add an asleep() kernel function to complement tsleep().  asleep()
    works like tsleep() in that it adds the process to the appropriate
    slpque, but asleep() does *not* put the process to sleep.  Instead it
    returns immediately.  The process stays runnable.  Additional calls to
    asleep() (or a call to tsleep()) remove the proc from any slpque and
    re-add it to the new one.  i.e. only the most recent call is effective.

    We add an await() kernel function.  This function initiates any timeout
    and puts the process to sleep, but only if it is still on a sleep queue.
    If someone (i.e. an interrupt) wakes up the sleep address after the
    process calls asleep() but before it calls await(), the slpque is
    cleared and the await() winds up being a NOP.

    I have included a unified diff (DON'T APPLY THIS!, FOR REFERENCE ONLY!)
    at the bottom.
    The purpose of the new routines is to allow blocking conditions to
    propagate up a subroutine chain and get handled at a higher level rather
    than at a lower level, in those areas of code that cannot afford to
    leave exclusive locks sitting around.  For example, if bread() blocks
    waiting for a low-level disk I/O on a block device, the vnode remains
    locked throughout, which badly mars potential parallelism when multiple
    programs are accessing the same file.  There is no reason to leave the
    high-level vnode locked while bringing a page into the VM buffer cache!

    Another example:  If a piece of critically locked code needs to allocate
    memory but cannot afford to block with the lock intact, we can implement
    M_AWAIT.  The code would allocate memory using M_AWAIT and, if the
    allocation fails, would be able to unwind the lock(s), await(), and
    retry.  This is something the current code cannot do at all.

    We could start 'fixing' not only bread(), but getnewbuf() (adding a new
    flag for slpflag), malloc(), kmem_malloc(), vm_page_alloc(), and many
    other routines to support the new mechanism without breaking backwards
    compatibility with the existing tsleep()/M_WAITOK mechanism (which, of
    course, is still useful).  I also believe we can fix known deadlocks
    with mmap()ed files and vnode interlock situations trivially.

    There are many other cool things you can do with this sort of
    functionality....  fixing possible deadlock situations allows us to more
    easily pull SMP locks deeper into the kernel, for example.

					-Matt

    Matthew Dillon
    Engineering, HiWay Technologies, Inc. & BEST Internet Communications &
    God knows what else.
    (Please include original email in any response)

Index: kern/kern_synch.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.69
diff -u -r1.69 kern_synch.c
--- kern_synch.c	1998/11/27 11:44:22	1.69
+++ kern_synch.c	1998/12/17 07:46:39
@@ -407,9 +407,21 @@
 	if (ident == NULL || p->p_stat != SRUN)
 		panic("tsleep");
 	/* XXX This is not exhaustive, just the most common case */
+#ifdef NOTDEF
+	/*
+	 * This can happen legitimately now with asleep()/await()
+	 */
 	if ((p->p_procq.tqe_prev != NULL) && (*p->p_procq.tqe_prev == p))
 		panic("sleeping process already on another queue");
 #endif
+#endif
+	/*
+	 * Process may be sitting on a slpque if asleep() was called, remove
+	 * it before re-adding.
+	 */
+	if (p->p_wchan != NULL)
+		unsleep(p);
+
 	p->p_wchan = ident;
 	p->p_wmesg = wmesg;
 	p->p_slptime = 0;
@@ -475,7 +487,166 @@
 }
 
 /*
- * Implement timeout for tsleep.
+ * asleep() - async sleep call.  Place process on wait queue and return
+ * immediately without blocking.  The process stays runnable until await()
+ * is called.
+ *
+ * Only the most recent sleep condition is effective when making successive
+ * calls to asleep() or when calling tsleep().
+ *
+ * The timeout, if any, is not initiated until await() is called.  The sleep
+ * priority, signal, and timeout is specified in the asleep() call but may be
+ * overridden in the await() call.
+ */
+
+int
+asleep(void *ident, int priority, const char *wmesg, int timo)
+{
+	struct proc *p = curproc;
+	int s;
+
+	/*
+	 * splhigh() while manipulating sleep structures and slpque.
+	 *
+	 * Remove preexisting wait condition (if any) and place process
+	 * on appropriate slpque, but do not put process to sleep.
+	 */
+
+	s = splhigh();
+
+	if (p->p_wchan != NULL)
+		unsleep(p);
+
+	if (ident) {
+		p->p_wchan = ident;
+		p->p_wmesg = wmesg;
+		p->p_slptime = 0;
+		p->p_asleep.as_priority = priority;
+		p->p_asleep.as_timo = timo;
+		TAILQ_INSERT_TAIL(&slpque[LOOKUP(ident)], p, p_procq);
+	}
+
+	splx(s);
+
+	return(0);
+}
+
+/*
+ * await() - wait for async condition to occur.  The process blocks until
+ * wakeup() is called on the most recent asleep() address.  If wakeup is
+ * called prior to await(), await() winds up being a NOP.
+ *
+ * If await() is called more than once (without an intervening asleep() call),
+ * await() is still effectively a NOP but it calls mi_switch() to give other
+ * processes some cpu before returning.  The process is left runnable.
+ */
+
+int
+await(int priority, int timo)
+{
+	struct proc *p = curproc;
+	int s;
+
+	s = splhigh();
+
+	if (p->p_wchan != NULL) {
+		struct callout_handle thandle;
+		int sig;
+		int catch;
+
+		/*
+		 * The call to await() can override defaults specified in
+		 * the original asleep().
+		 */
+		if (priority < 0)
+			priority = p->p_asleep.as_priority;
+		if (timo < 0)
+			timo = p->p_asleep.as_timo;
+
+		/*
+		 * Install timeout
+		 */
+
+		if (timo)
+			thandle = timeout(endtsleep, (void *)p, timo);
+
+		sig = 0;
+		catch = priority & PCATCH;
+
+		if (catch) {
+			p->p_flag |= P_SINTR;
+			if ((sig = CURSIG(p))) {
+				if (p->p_wchan)
+					unsleep(p);
+				p->p_stat = SRUN;
+				goto resume;
+			}
+			if (p->p_wchan == NULL) {
+				catch = 0;
+				goto resume;
+			}
+		}
+		p->p_stat = SSLEEP;
+		p->p_stats->p_ru.ru_nvcsw++;
+		mi_switch();
+resume:
+		curpriority = p->p_usrpri;
+
+		splx(s);
+		p->p_flag &= ~P_SINTR;
+		if (p->p_flag & P_TIMEOUT) {
+			p->p_flag &= ~P_TIMEOUT;
+			if (sig == 0) {
+#ifdef KTRACE
+				if (KTRPOINT(p, KTR_CSW))
+					ktrcsw(p->p_tracep, 0, 0);
+#endif
+				return (EWOULDBLOCK);
+			}
+		} else if (timo)
+			untimeout(endtsleep, (void *)p, thandle);
+		if (catch && (sig != 0 || (sig = CURSIG(p)))) {
+#ifdef KTRACE
+			if (KTRPOINT(p, KTR_CSW))
+				ktrcsw(p->p_tracep, 0, 0);
+#endif
+			if (p->p_sigacts->ps_sigintr & sigmask(sig))
+				return (EINTR);
+			return (ERESTART);
+		}
+#ifdef KTRACE
+		if (KTRPOINT(p, KTR_CSW))
+			ktrcsw(p->p_tracep, 0, 0);
+#endif
+	} else {
+		/*
+		 * If as_priority is 0, await() has been called without an
+		 * intervening asleep().  We are still effectively a NOP,
+		 * but we call mi_switch() for safety.
+		 */
+
+		if (p->p_asleep.as_priority == 0) {
+			p->p_stats->p_ru.ru_nvcsw++;
+			mi_switch();
+		}
+		splx(s);
+
+	}
+
+	/*
+	 * clear p_asleep.as_priority as an indication that await() has been
+	 * called.  If await() is called again without an intervening asleep(),
+	 * await() is still effectively a NOP but the above mi_switch() code
+	 * is triggered.
+	 */
+	p->p_asleep.as_priority = 0;
+
+	return (0);
+}
+
+/*
+ * Implement timeout for tsleep or asleep()/await()
+ *
  * If process hasn't been awakened (wchan non-zero),
  * set timeout flag and undo the sleep.  If proc
  * is stopped, just unsleep so it will remain stopped.
@@ -532,9 +703,15 @@
 restart:
 	for (p = qp->tqh_first; p != NULL; p = p->p_procq.tqe_next) {
 #ifdef DIAGNOSTIC
+#ifdef NOTDEF
+		/*
+		 * The process can legitimately be running now with
+		 * asleep()/await().
+		 */
 		if (p->p_stat != SSLEEP && p->p_stat != SSTOP)
 			panic("wakeup");
 #endif
+#endif
 		if (p->p_wchan == ident) {
 			TAILQ_REMOVE(qp, p, p_procq);
 			p->p_wchan = 0;
@@ -577,8 +754,14 @@
 	for (p = qp->tqh_first; p != NULL; p = p->p_procq.tqe_next) {
 #ifdef DIAGNOSTIC
+#ifdef NOTDEF
+		/*
+		 * The process can legitimately be running now with
+		 * asleep()/await().
+		 */
 		if (p->p_stat != SSLEEP && p->p_stat != SSTOP)
 			panic("wakeup_one");
+#endif
 #endif
 		if (p->p_wchan == ident) {
 			TAILQ_REMOVE(qp, p, p_procq);

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message