Date: Wed, 17 Jan 2007 14:43:14 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200701172243.l0HMhECY054399@apollo.backplane.com>
To: Ivan Voras <ivoras@fer.hr>
References: <3bbf2fe10607250813w8ff9e34pc505bf290e71758@mail.gmail.com>	<3bb
	f2fe10607281004o6727e976h19ee7e054876f914@mail.gmail.com>	<3bbf2fe107011608
	51r79b04464m2cbdbb7f644b22b6@mail.gmail.com>	<20070116154258.568e1aaf@pleia
	des.nextvenue.com>	<b1fa29170701161355lc021b90o35fa5f9acb5749d@mail.gmail.c
	om>	<eoji7s$cit$2@sea.gmane.org>	<b1fa29170701161425n7bcfe1e5m1b8c671caf375
	8db@mail.gmail.com>	<eojlnb$qje$1@sea.gmane.org>	<3bbf2fe10701161525j6ad929
	2y93502b8df0f67aa9@mail.gmail.com>	<45AD6DFA.6030808@FreeBSD.org>	<3bbf2fe1
	0701161655p5e686b52n7340b3100ecfab93@mail.gmail.com> 
	<200701172022.l0HKMYV8053837@apollo.backplane.com>
	<45AE8DDC.8030402@fer.hr>
Cc: freebsd-current@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: [PATCH] Maintaining turnstile aligned to 128 bytes in i386 CPUs


:Does the same hold true with kernel threads in FreeBSD (e.g. two threads
:using FPU)?

    Preemption and pinning make the issue a bit more difficult for FreeBSD,
    but the basic idea remains valid.

    From the point of view of NPXTHREAD the situation is very simple:

    * NPXTHREAD = NULL

	Nobody owns the FP and nobody is using it.  If the kernel wants
	to use the FP it just does FNCLEX + CLTS and sets NPXTHREAD =
	curthread.  When it is finished, it undoes that sequence
	(NPXTHREAD = NULL, set CR0_TS again).

	PLUSES: FP state does not need to be saved or restored

	ISSUES: Due to cpu migration and preemption, the setup and teardown
	sequences must be done with the cpu pinned, inside a critical section.
	The actual use of the FP, however, does not need to occur inside a
	critical section or with the cpu pinned.

    * NPXTHREAD = other_thread

	Some other thread owns the FP, but it isn't our thread so we can
	safely save the FP state for the other thread without worrying
	about creating a situation where we thrash the T_DNA exception.
	Save FP state, FNCLEX, CLTS, set NPXTHREAD = curthread.
	When finished, NPXTHREAD = NULL, set CR0_TS, do *not* restore
	the 'other' thread's FP state.

	PLUSES: The FP state probably had to be saved anyway, so it's no big
	deal, or at least not as big a deal as the NPXTHREAD = curthread
	case.

	ISSUES: Same as above.

    * NPXTHREAD = curthread

	The current thread (either userland or a pushed kernel FP context)
	is using the FP.  If the kernel decides it needs the FP it
	must save the FP state, FNCLEX, CLTS, do its thing.  When finished
	it can decide to set NPXTHREAD = NULL and set CR0_TS, or it 
	can restore the previously saved state and leave NPXTHREAD = curthread.

	PLUSES: Very few

	ISSUES: Same as above, but here the kernel must decide whether it
	is worth stealing the FP or not, because it might get into a 
	thrashing situation with the T_DNA exception under certain
	userland loads.

	Note that there are many cases where userland may use the FP unit
	only very occasionally.  In such cases you *DO* want to be able to
	steal it, so perhaps some heuristic is needed to determine the cost
	of stealing the FP unit dynamically.
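
    The three cases above can be sketched as a small userland model.  This
    is only an illustration, not DragonFly or FreeBSD code: the structs,
    kern_fp_begin(), kern_fp_end() and the integer standing in for the FP
    registers are all made-up names.  The model clears CR0_TS before it
    touches FP state, on the assumption that an FP instruction executed
    with CR0_TS set would trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userland model only.  Every name here is hypothetical, not a kernel symbol. */
struct thread {
    int  fp_saved_regs;   /* stand-in for the thread's FP save area */
    bool fp_saved_valid;  /* true once state has been saved there */
};

struct cpu {
    struct thread *npxthread;  /* models NPXTHREAD (FP owner on this cpu) */
    bool cr0_ts;               /* models CR0_TS (set: next FP use traps, T_DNA) */
    bool fp_exc_pending;       /* models pending x87 exception flags */
    int  fp_regs;              /* stand-in for the live FP registers */
    int  full_saves;           /* how many full state saves were paid for */
};

enum fp_case { FP_UNOWNED, FP_OTHER_THREAD, FP_CURTHREAD };

static enum fp_case
fp_classify(const struct cpu *c, const struct thread *cur)
{
    if (c->npxthread == NULL)
        return FP_UNOWNED;
    return (c->npxthread == cur) ? FP_CURTHREAD : FP_OTHER_THREAD;
}

/*
 * Setup.  A real kernel would run this pinned, inside a critical section.
 * Returns which case applied so the caller can pick a teardown policy.
 */
static enum fp_case
kern_fp_begin(struct cpu *c, struct thread *cur)
{
    enum fp_case k = fp_classify(c, cur);

    c->cr0_ts = false;                  /* CLTS, done first in this model */
    if (k != FP_UNOWNED) {
        /* Owned by someone: the owner's live state must be saved. */
        c->npxthread->fp_saved_regs = c->fp_regs;
        c->npxthread->fp_saved_valid = true;
        c->full_saves++;
    }
    c->fp_exc_pending = false;          /* FNCLEX */
    c->npxthread = cur;
    return k;
}

/*
 * Teardown.  'restore' only matters for FP_CURTHREAD: either put the saved
 * state back and keep ownership, or drop ownership and let a later T_DNA
 * reload it.  The other cases never restore anybody's state.
 */
static void
kern_fp_end(struct cpu *c, struct thread *cur, enum fp_case k, bool restore)
{
    if (k == FP_CURTHREAD && restore) {
        c->fp_regs = cur->fp_saved_regs;
        cur->fp_saved_valid = false;
        return;                         /* NPXTHREAD stays curthread */
    }
    c->npxthread = NULL;
    c->cr0_ts = true;                   /* next FP use traps again */
}
```

    In this model only the unowned case gets through without bumping
    full_saves, which is the whole point of the PLUSES/ISSUES breakdown.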

    It is possible to abstract it even more... for example, one can set
    CR0_TS when going from userland to the kernel and completely abstract
    out the kernel's use of the FP unit at the cost of a higher entrance
    fee to get in and out of the kernel.  I decided NOT to do this in
    DragonFly.  If the DragonFly kernel wants to use the FP it has to 
    check and adjust the NPXTHREAD state.

    But, to be absolutely clear here, it costs virtually *nothing* to use
    the FP in the kernel for non-FP media instructions (i.e. movdqa/movdqu
    and friends) if userland has not used the FP recently.  You push a
    temporary save area, set NPXTHREAD, FNCLEX, CLTS, use the FP, then
    pop the save area pointer, set NPXTHREAD to NULL, and set CR0_TS, and
    that's it.  It may seem like a lot of steps, but those are all
    very fast instructions versus having to actually save and restore the
    512-byte FP state.  The biggest overhead would actually be the critical
    section and cpu pinning required to properly transition the NPXTHREAD
    state.
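
    A rough userland sketch of that cheap path, for the NPXTHREAD = NULL
    case only.  Again, every name here (fp_area, fp_token, kern_fp_push(),
    kern_fp_pop(), crit_enter(), crit_exit()) is invented for illustration
    and the critical section is just a nesting counter.  The assumption is
    that the temporary save area exists so that, if the thread is switched
    out mid-use, its in-kernel FP state lands there rather than in the
    area holding userland's state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userland model only; all names are hypothetical. */
struct fp_area { unsigned char bytes[512]; };   /* 512-byte FP state */

struct thread {
    struct fp_area *fp_save;  /* where a switch-out would save FP state */
    int             crit;     /* critical-section nesting count */
};

struct cpu {
    struct thread *npxthread; /* models NPXTHREAD */
    bool           cr0_ts;    /* models CR0_TS */
};

/* Stand-ins for "pin the cpu and enter a critical section". */
static void crit_enter(struct thread *td) { td->crit++; }
static void crit_exit(struct thread *td)  { assert(td->crit > 0); td->crit--; }

struct fp_token { struct fp_area *prev; };      /* remembers what to pop */

/* Setup: a handful of cheap steps, no 512-byte save. */
static struct fp_token
kern_fp_push(struct cpu *c, struct thread *td, struct fp_area *tmp)
{
    struct fp_token tok;

    crit_enter(td);
    assert(c->npxthread == NULL);   /* cheap path: nobody owns the FP */
    tok.prev = td->fp_save;
    td->fp_save = tmp;              /* push the temporary save area */
    c->npxthread = td;              /* set NPXTHREAD */
    c->cr0_ts = false;              /* CLTS (FNCLEX has no state to model) */
    crit_exit(td);
    return tok;                     /* FP use happens outside the section */
}

/* Teardown: undo the sequence, no 512-byte restore. */
static void
kern_fp_pop(struct cpu *c, struct thread *td, struct fp_token tok)
{
    crit_enter(td);
    td->fp_save = tok.prev;         /* pop the save area pointer */
    c->npxthread = NULL;
    c->cr0_ts = true;               /* set CR0_TS again */
    crit_exit(td);
}
```

    Note that only the two transitions sit inside the critical section;
    the FP work between kern_fp_push() and kern_fp_pop() runs with
    td->crit back at zero.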

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>