Date:      Thu, 16 Nov 2000 15:34:43 -0800 (PST)
From:      John Baldwin <jhb@FreeBSD.org>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        Jake Burkholder <jburkhol@home.com>, jake@io.yi.org, cp@bsdi.com, smp@FreeBSD.org
Subject:   Re: cvs commit: src/sys/kern kern_timeout.c
Message-ID:  <XFMail.001116153443.jhb@FreeBSD.org>
In-Reply-To: <XFMail.001116145928.jhb@FreeBSD.org>


On 16-Nov-00 John Baldwin wrote:
> 
> On 16-Nov-00 John Baldwin wrote:
>>> I think we need a separate spin lock for the callout wheel, ala BSD/OS's
>>> callout_mtx.  Hardclock looks at the callout wheel and is now a fast
>>> interrupt, so it can't acquire a sleep mutex.  It's a little paranoid
>>> because hardclock doesn't actually traverse any lists, it just checks
>>> if the current callout bucket is empty, and potentially schedules
>>> softclock, but you could miss a very short timeout on an SMP system.
>>> ticks could also get incremented in the middle of softclock's test
>>> for whether the callout's time has come.
>>> 
>>> I have patches that do this and make softclock INTR_MPSAFE, I just need
>>> to test them.
>> 
>> Ok.  I was about to check the BSD/OS code to see how this was done there.
>> 
>>> There's actually another major problem with this.  The run queue and
>>> sleep queue use the same list linkage in struct proc, so it's not
>>> safe to release sched_lock while you're on the sleep queue.  If
>>> the process blocks on Giant in CURSIG, the sleep queue will get
>>> corrupted.  We really need to split the run queue/sleep queue
>>> linkage.
>> 
>> Ugh, ok.  I'll do this next then.  Grrrr.
> 
> Grr, wouldn't you know it, bar just died with a double fault because
> 
> panic: cpu_switch has wchan
> 
> Happened when I Ctrl-C'd a process. :-P
> 
> *sigh*

I actually don't like the concept of CURSIG() forcing a context switch because
it needs to grab Giant.  For one thing, it breaks the nice assertion that
running processes have p->p_wchan == NULL, which is what caused my machine to
panic.  I'm trying a patch right now that grabs Giant in msleep() before we
grab sched_lock so that the call to CURSIG() before mi_switch() won't need to
block.  It then releases Giant after CURSIG().  For the CURSIG() after
mi_switch(), doing another context switch due to blocking on Giant isn't a
problem, so the patch doesn't touch that call.  (Not that there is anything one
could do to work around it.)
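
To make the ordering argument concrete, here is a rough user-space sketch in C.
It is only a toy model under stated assumptions, not the patch itself:
model_proc, block_on_giant, model_cursig, msleep_old and msleep_new are made-up
names, the locks are reduced to comments and a "contended" flag, and the only
thing tracked is whether a context switch caused by blocking on Giant ever
happens while p_wchan is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of struct proc. */
struct model_proc {
    const void *p_wchan;   /* non-NULL while the process is on a sleep queue */
    int bad_switches;      /* switches taken while p_wchan was still set */
};

/*
 * Model of blocking on a contended Giant: the process gets switched out.
 * If it is already on a sleep queue, this is the case that would trip
 * "panic: cpu_switch has wchan" in the real kernel.
 */
static void block_on_giant(struct model_proc *p)
{
    if (p->p_wchan != NULL)
        p->bad_switches++;
}

/* Model of CURSIG(): it needs Giant and blocks only if we don't hold it. */
static void model_cursig(struct model_proc *p, bool have_giant,
    bool giant_contended)
{
    if (!have_giant && giant_contended)
        block_on_giant(p);
}

/* Old ordering: sched_lock first, Giant acquired inside CURSIG(). */
static int msleep_old(struct model_proc *p, const void *chan,
    bool giant_contended)
{
    assert(p->p_wchan == NULL);    /* a running process has no wchan */
    p->bad_switches = 0;
    /* sched_lock would be taken here */
    p->p_wchan = chan;             /* now on the sleep queue */
    model_cursig(p, false, giant_contended);
    /* mi_switch() would run here; a later wakeup clears p_wchan */
    p->p_wchan = NULL;
    return p->bad_switches;
}

/* Proposed ordering: Giant first, then sched_lock, then CURSIG(). */
static int msleep_new(struct model_proc *p, const void *chan,
    bool giant_contended)
{
    assert(p->p_wchan == NULL);
    p->bad_switches = 0;
    if (giant_contended)
        block_on_giant(p);         /* harmless: p_wchan is still NULL */
    /* Giant is held from here; sched_lock would be taken next */
    p->p_wchan = chan;
    model_cursig(p, true, giant_contended);
    /* Giant would be released here, followed by mi_switch() */
    p->p_wchan = NULL;
    return p->bad_switches;
}
```

With a contended Giant, msleep_old() records one switch taken while p_wchan was
set, and msleep_new() records none, because the only place it can block is
before the process goes on the sleep queue.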

-- 

John Baldwin <jhb@FreeBSD.org> -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message



