Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 29 Mar 2002 14:48:30 -0500 (EST)
From:      John Baldwin <jhb@FreeBSD.org>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        freebsd-smp@freebsd.org
Subject:   RE: Syscall contention tests return, userret() bugs/issues.
Message-ID:  <XFMail.20020329144830.jhb@FreeBSD.org>
In-Reply-To: <200203291929.g2TJTIi59785@apollo.backplane.com>

next in thread | previous in thread | raw e-mail | index | archive | help

On 29-Mar-2002 Matthew Dillon wrote:
>     * system call counter (and counters in the kernel in general)
> 
>       My current hack does not properly protect the counter.  Properly
>       speaking, due to not being a single instruction, the per-cpu
>       must be protected by critical_enter()/critical_exit().
> 
>       I am thinking of adding a PCPU_ADD_INT() macro to all architectures
>       and putting all 'global' system counters in the per-cpu area, i.e.
>       embed the struct vmmeter structure in the per-cpu area.  sysctl
>       would combine the information from all the per-cpu areas (i.e. for
>       systat / vmstat reporting).
> 
>       ( I am not covering things like interface counters here, just global
>       system counters ).

PCPU_SET(foo, PCPU_GET(foo) + 1) would work w/o a critical section actually.
To be honest, stats such as these don't have to be perfect, so using a simple
non-atomic increment is probably ok. :)

>     * userret / CURSIG case.
> 
>       My current hack is somewhat messy.  Presumably we will eventually be
>       able to remove Giant from this path.  The question is:  When?  And
>       what is the current state of work in that area?  Also, it seems to
>       me that a signal can come in after the check (with or without my
>       hack) and not get recognized until the next system call or interrupt.
>       This is bad.
> 
>       Comments? John?

The only reason Giant is in that path right now is for ktrace. :)  When ktrace
is fixed (coming as soon as I get all of td_ucred cleared and out of the way)
then this won't need Giant anymore.  Alternatively, could tweak it right now to
make the Giant lock/unlock #ifdef KTRACE.

>     * userret / NEEDRESCHED case.
> 
>       This case seems rather odd to me.  I understand why we might need to
>       obtain the sched_lock as a memory barrier but what happens if 
>       NEEDRESCHED is set after we release the sched_lock?  Interrupts
>       are enabled at the point where userret() is called, there is
>       nothing preventing either a signal or a NEEDRESCHED from occuring
>       after we've tested it but before we've returned to usermode,
>       causing us to not honor either until the next interrupt or syscall.
> 
>       Has anyone grappled with this issue yet?

Yes, that is what the interrupt disable stuff in the old version of ast() (that
is now MD) is about.  If another process or thread on another CPU triggers an
AST on this process, it will send an IPI when it does so.  Thus, if we don't
see the flag being set here, we will get the IPI via an interrupt after
resuming in user mode.  If you don't lock the access to ASTPENDING and
NEEDRESCHED there is a small race on archs w/ weaker memory models (i.e. !i386)
and SMP where you could miss the flag, however, since we always send the IPI in
that case and interrupts usually implicitly enforce the equivalence of membars
it will probably never occur.  Also, if it did, all it would mean is possibly
missing a reschedule or signal (the only AST's we get async) until the next
kernel entry which really isn't that big of a deal.

All of the MD code currently checks NEEDRESCHED and ASTPENDING w/o a lock right
now anyways thanks to Jake's changes.

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?XFMail.20020329144830.jhb>