Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 8 May 2018 12:15:52 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Mateusz Guzik <mjg@freebsd.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org,  svn-src-head@freebsd.org
Subject:   Re: svn commit: r333344 - in head/sys: kern sys
Message-ID:  <20180508113755.S1350@besplex.bde.org>
In-Reply-To: <201805072336.w47NaGXH001864@repo.freebsd.org>
References:  <201805072336.w47NaGXH001864@repo.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 7 May 2018, Mateusz Guzik wrote:

> Log:
>  Inlined sched_userret.
>
>  The tested condition is rarely true and it induces a function call
>  on each return to userspace.
>
>  Bumps getuid rate by about 1% on Broadwell.
> ...
> Modified: head/sys/kern/sched_4bsd.c
> ==============================================================================
> --- head/sys/kern/sched_4bsd.c	Mon May  7 23:23:11 2018	(r333343)
> +++ head/sys/kern/sched_4bsd.c	Mon May  7 23:36:16 2018	(r333344)
> @@ -1481,25 +1481,13 @@ sched_preempt(struct thread *td)
> }
>
> void
> -sched_userret(struct thread *td)
> +sched_userret_slowpath(struct thread *td)
> {
> -	/*
> -	 * XXX we cheat slightly on the locking here to avoid locking in
> -	 * the usual case.  Setting td_priority here is essentially an
> -	 * incomplete workaround for not setting it properly elsewhere.
> -	 * Now that some interrupt handlers are threads, not setting it
> -	 * properly elsewhere can clobber it in the window between setting
> -	 * it here and returning to user mode, so don't waste time setting
> -	 * it perfectly here.
> -	 */

This moves the comment to where it mostly doesn't apply.  It was already
slightly rotted.

> -	KASSERT((td->td_flags & TDF_BORROWING) == 0,
> -	    ("thread with borrowed priority returning to userland"));
> -	if (td->td_priority != td->td_user_pri) {

The first sentence in the comment applies to not locking for this test.
I wonder if there is a larger problem with the newer TDF_BORROWING test.

> -		thread_lock(td);
> -		td->td_priority = td->td_user_pri;
> -		td->td_base_pri = td->td_user_pri;
> -		thread_unlock(td);
> -	}
> +
> +	thread_lock(td);
> +	td->td_priority = td->td_user_pri;
> +	td->td_base_pri = td->td_user_pri;
> +	thread_unlock(td);
> }

No cheating here.  The comment should have started a new paragraph (or
be split up) for this part.

> ...
> Modified: head/sys/sys/sched.h
> ...
> +static inline void
> +sched_userret(struct thread *td)
> +{
> +
> +	/*
> +	 * XXX we cheat slightly on the locking here to avoid locking in
> +	 * the usual case.  Setting td_priority here is essentially an
> +	 * incomplete workaround for not setting it properly elsewhere.
> +	 * Now that some interrupt handlers are threads, not setting it
> +	 * properly elsewhere can clobber it in the window between setting
> +	 * it here and returning to user mode, so don't waste time setting
> +	 * it perfectly here.
> +	 */
> +	KASSERT((td->td_flags & TDF_BORROWING) == 0,
> +	    ("thread with borrowed priority returning to userland"));
> +	if (__predict_false(td->td_priority != td->td_user_pri))
> +		sched_userret_slowpath(td);
> +}

Only the first sentence in the comment applies here.

I don't like the __predict_false() obfuscation.  It is especially useless
here, since we have manually moved the slow path to a function, so all
the compiler can do is change the sense of a branch and it should get
that right anyway.

I have used the following more hackish optimization for this for 20-25
years, in i386/trap.c up to FreeBSD-~5.2 only:

  #define	userret(td, framep, sticks) do					\
XX 	if ((td)->td_proc->p_flag & P_PROFIL)				\
XX 		(userret)((td), (framep), (sticks));			\
XX 	else								\
XX 		(td)->td_priority = (td)->td_ksegrp->kg_user_pri;	\
XX while (0)

The whole sched_userret() API is fairly pessimal.  sched_userret() is the
same for SCHED_ULE as for SCHED_4BSD except for 1 more statement in the
former.  Apart from that, scheduler layering is not needed for this function.
The layering is currently:

     syscall return -> userret() -> sched_userret()

with the last call now optimized to a tail call, and much more overhead
than the P_PROFIL test in userret() (but "only" 21 instructions with
lots of branches before the tail call).  Doing the tail call depends
on RACCT and some other options not being configured.  The P_PROFIL
check was moved earlier, perhaps to allow the tail call, but doing
RACCT later defeats this.  Anyway, it makes most sense to check the
priorities last.  RACCT sleeps after checking the priorities, so they
could change a lot.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20180508113755.S1350>