From owner-freebsd-hackers@FreeBSD.ORG  Fri Apr  1 18:18:54 2005
Date: Fri, 1 Apr 2005 10:18:10 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200504011818.j31IIAM3059471@apollo.backplane.com>
To: Bruce Evans <bde@zeta.org.au>
References: <423C15C5.6040902@fsn.hu>
	<20050327133059.3d68a78c@Magellan.Leidinger.net>
	<5bbfe7d405032823232103d537@mail.gmail.com>
	<424A23A8.5040109@ec.rr.com><20050330130051.GA4416@VARK.MIT.EDU>
	<200504010315.j313FGLn056122@apollo.backplane.com>
	<20050401215011.R24396@delplex.bde.org>
cc: Peter Jeremy <PeterJeremy@optushome.com.au>
cc: David Schultz <das@freebsd.org>
cc: hackers@freebsd.org
cc: jason henson <jason@ec.rr.com>
cc: bde@freebsd.org
Subject: Re: Fwd: 5-STABLE kernel build with icc broken

    Here is the core of the FPU setup and restoration code for the kernel
    bcopy in DragonFly, from i386/bcopy.s.

    DragonFly uses the TD_SAVEFPU-is-a-pointer method that was outlined in
    the original comment in the FreeBSD code.  I further enhance the
    algorithm to guarantee that the FPU is in a sane state (does not
    require any further initialization other than a clts) if userland has
    NOT used it.  However, there are definitely some race cases that
    must be considered (see the comments).
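
    To make the pointer method concrete, here is a minimal user-space
    sketch in C.  The structure layouts and the two helper names
    (savefpu_redirect, savefpu_restore) are assumptions made up for
    illustration; they are not the DragonFly definitions.  The only point
    is that the save area is reached through a pointer, so the kernel can
    aim it at a temporary area and put it back afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (illustration only). */
struct fpu_area {
    unsigned char bytes[512];       /* an fxsave image is 512 bytes */
};

struct thread {
    struct fpu_area *td_savefpu;    /* where FP state for this thread is saved */
    struct fpu_area  td_default;    /* the thread's normal save area */
};

/*
 * Point the thread's save pointer at a temporary area and hand back the
 * previous pointer so the caller can reinstall it later.
 */
static struct fpu_area *
savefpu_redirect(struct thread *td, struct fpu_area *tmp)
{
    struct fpu_area *old;

    assert(td != NULL && tmp != NULL);
    old = td->td_savefpu;
    td->td_savefpu = tmp;
    return old;
}

/* Reinstall the pointer returned by savefpu_redirect(). */
static void
savefpu_restore(struct thread *td, struct fpu_area *old)
{
    assert(td != NULL && old != NULL);
    td->td_savefpu = old;
}
```

    A caller would keep the returned pointer for the duration of the copy
    (the real macro keeps it in %ebx) and pass it to the restore helper
    when done.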

    The on-fault handling in DragonFly is stackable (which further simplifies
    the whole mess of on-fault vs non-on-fault copying code) and the DFly
    bcopy just sets up the frame for it whether or not the onfault handling 
    is actually needed.

    This could be further optimized, but I had already spent at least a month
    on it and had to move on to other things.  In particular, the setting
    of CR0_TS and the restoration of TD_SAVEFPU could be moved to the
    syscall-return code, so multiple in-kernel bcopy operations could be
    issued without any further FPU setup or teardown.
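
    A rough sketch of that deferred-teardown idea, as a user-space model
    in C.  Nothing like this exists in the code below; the structure and
    function names (cpu_model, bcopy_fpu_enter, syscall_return_hook) are
    hypothetical, and the counters merely stand in for the real setup and
    teardown work.  It also ignores preemption entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-cpu bookkeeping for the deferred-teardown idea. */
struct cpu_model {
    bool kernel_fpu_active;   /* kernel save area currently installed */
    int  setups;              /* times the full FPU setup was performed */
    int  teardowns;           /* times the full teardown was performed */
};

/* Called at the start of each in-kernel bcopy that wants the FPU. */
static void
bcopy_fpu_enter(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (!cpu->kernel_fpu_active) {
        cpu->setups++;                    /* stands in for the save block */
        cpu->kernel_fpu_active = true;
    }
    /* otherwise the FPU is already set up; nothing to do */
}

/* Called once on the way back to userland. */
static void
syscall_return_hook(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    if (cpu->kernel_fpu_active) {
        cpu->teardowns++;                 /* stands in for setting CR0_TS and
                                             restoring TD_SAVEFPU */
        cpu->kernel_fpu_active = false;
    }
}
```

    In this model several copies inside one syscall pay for one setup and
    one teardown instead of one pair per copy.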

						-Matt

	/*
         * RACES/ALGORITHM:
         *
         *      If gd_npxthread is not NULL we must save the application's
         *      current FP state to the current save area and then NULL
         *      out gd_npxthread to interlock against new interruptions
         *      changing the FP state further.
         *
         *      If gd_npxthread is NULL the FP unit is in a known 'safe'
         *      state and may be used once the new save area is installed.
         *
         *      race(1): If an interrupt occurs just prior to calling fxsave
         *      all that happens is that fxsave gets a npxdna trap, restores
         *      the app's environment, and immediately traps, restores,
         *      and saves it again.
         *
         *      race(2): No interrupt can safely occur after we NULL-out
         *      npxthread until we fninit, because the kernel assumes that
         *      the FP unit is in a safe state when npxthread is NULL.  It's
         *      more convenient to use a cli sequence here (it is not
         *      considered to be in the critical path), but a critical
         *      section would also work.
         *
         *      race(3): The FP unit is in a known state (because npxthread
         *      was either previously NULL or we saved and init'd and made
         *      it NULL).  This is true even if we are preempted and the
         *      preempting thread uses the FP unit, because it will be
         *      fninit'd again on return.  ANY STATE WE SAVE TO THE FPU MAY
         *      BE DESTROYED BY PREEMPTION WHILE NPXTHREAD IS NULL!  However,
         *      an interrupt occurring in between clts and the setting of
         *      gd_npxthread may set the TS bit again and cause the next
         *      npxdna() to panic when it sees a non-NULL gd_npxthread.
         *
         *      We can safely set TD_SAVEFPU to point to a new uninitialized
         *      save area and then set GD_NPXTHREAD to non-NULL.  If an
         *      interrupt occurs after we set GD_NPXTHREAD, all that happens
         *      is that the safe FP state gets saved and restored.  We do not
         *      need to fninit again.
         *
         *      We can safely clts after setting up the new save-area, before
         *      installing gd_npxthread, even if we get preempted just after
         *      calling clts.  This is because the FP unit will be in a safe
         *      state while gd_npxthread is NULL.  Setting gd_npxthread will
         *      simply lock-in that safe-state.  Calling clts saves
         *      unnecessary trap overhead since we are about to use the FP
         *      unit anyway and don't need to 'restore' any state prior to
         *      that first use.
	 */
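
    The ordering described in the comment above can be followed more
    easily in a plain C model.  This is only a sketch: the structures,
    the TDPRI_CRIT value, and the modeled fpu_regs and cr0_ts fields are
    all assumptions for illustration, the lock test is not atomic here
    (the real code uses btsl), and the pending-request check and on-fault
    frame setup are left out.

```c
#include <assert.h>
#include <stddef.h>

#define TDPRI_CRIT 100          /* placeholder value, not DragonFly's */
#define FPU_CLEAN  0            /* modeled register state after fninit */

struct fpu_area { int image; }; /* modeled save image */

struct thread {
    struct fpu_area *td_savefpu;
    int td_pri;
};

struct globaldata {
    struct thread *gd_npxthread;    /* owner of the live FP state, or NULL */
    struct thread *gd_curthread;
    int gd_fpu_lock;
    struct fpu_area gd_savefpu;     /* per-cpu area used during the copy */
    int fpu_regs;                   /* modeled FP register contents */
    int cr0_ts;                     /* modeled CR0_TS bit */
};

/*
 * Returns 1 if the FP unit was acquired, 0 if the caller should fall back
 * to the non-MMX copy.  *app_area receives the application's save pointer.
 */
static int
mmx_save_block_model(struct globaldata *gd, size_t len,
                     struct fpu_area **app_area)
{
    struct thread *td;

    assert(gd != NULL && gd->gd_curthread != NULL && app_area != NULL);
    if (len < 2048)
        return 0;                       /* too small to be worth it */
    if (gd->gd_fpu_lock)
        return 0;                       /* FP unit already claimed */
    gd->gd_fpu_lock = 1;

    td = gd->gd_curthread;
    *app_area = td->td_savefpu;         /* remember app save area */
    td->td_pri += TDPRI_CRIT;           /* enter critical section */
    if (gd->gd_npxthread != NULL) {
        (*app_area)->image = gd->fpu_regs;  /* fxsave */
        gd->gd_npxthread = NULL;            /* interlock */
        gd->cr0_ts = 0;                     /* clts */
        gd->fpu_regs = FPU_CLEAN;           /* fninit */
    }
    td->td_savefpu = &gd->gd_savefpu;   /* install new save area */
    gd->cr0_ts = 0;                     /* clts */
    gd->gd_npxthread = td;              /* lock in the safe state */
    td->td_pri -= TDPRI_CRIT;           /* leave critical section */
    return 1;
}
```

    Note that when gd_npxthread starts out NULL the model touches neither
    the application's save image nor the modeled registers, which is the
    "known safe state" case from the comment.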

#define MMX_SAVE_BLOCK(missfunc)                                        \
        cmpl    $2048,%ecx ;                                            \
        jb      missfunc ;                                              \
        movl    MYCPU,%eax ;                    /* EAX = MYCPU */       \
        btsl    $1,GD_FPU_LOCK(%eax) ;                                  \
        jc      missfunc ;                                              \
        pushl   %ebx ;                                                  \
        pushl   %ecx ;                                                  \
        movl    GD_CURTHREAD(%eax),%edx ;       /* EDX = CURTHREAD */   \
        movl    TD_SAVEFPU(%edx),%ebx ;         /* save app save area */\
        addl    $TDPRI_CRIT,TD_PRI(%edx) ;                              \
        cmpl    $0,GD_NPXTHREAD(%eax) ;                                 \
        je      100f ;                                                  \
        fxsave  0(%ebx) ;                       /* race(1) */           \
        movl    $0,GD_NPXTHREAD(%eax) ;         /* interlock intr */    \
        clts ;                                                          \
        fninit ;                                /* race(2) */           \
100: ;                                                                  \
        leal    GD_SAVEFPU(%eax),%ecx ;                                 \
        movl    %ecx,TD_SAVEFPU(%edx) ;                                 \
        clts ;                                                          \
        movl    %edx,GD_NPXTHREAD(%eax) ;       /* race(3) */           \
        subl    $TDPRI_CRIT,TD_PRI(%edx) ;      /* crit_exit() */       \
        cmpl    $0,GD_REQFLAGS(%eax) ;                                  \
        je      101f ;                                                  \
        cmpl    $TDPRI_CRIT,TD_PRI(%edx) ;                              \
        jge     101f ;                                                  \
        call    lwkt_yield_quick ;                                      \
        /* note: eax,ecx,edx destroyed */                               \
101: ;                                                                  \
        movl    (%esp),%ecx ;                                           \
        movl    $mmx_onfault,(%esp) ;

        /*
         * When restoring the application's FP state we must first clear
         * npxthread to prevent further saves, then restore the pointer
         * to the app's save area.  We do not have to (and should not)
         * restore the app's FP state now.  Note that we do not have to
         * call fninit because our use of the FP guarantees that it is in
         * a 'safe' state (at least for kernel use).
         *
         * NOTE: it is not usually safe to mess with CR0 outside of a
         * critical section, because TS may get set by a preemptive
         * interrupt.  However, we *can* race a load/set-ts/store against
         * an interrupt doing the same thing.
         */

#define MMX_RESTORE_BLOCK                       \
        addl    $4,%esp ;                       \
        MMX_RESTORE_BLOCK2

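#define MMX_RESTORE_BLOCK2                                              \
        movl    MYCPU,%ecx ;                    /* ECX = MYCPU */       \
        movl    GD_CURTHREAD(%ecx),%edx ;       /* EDX = CURTHREAD */   \
        movl    $0,GD_NPXTHREAD(%ecx) ;         /* prevent more saves */\
        movl    %ebx,TD_SAVEFPU(%edx) ;         /* app save area ptr */ \
        smsw    %ax ;                           /* read CR0 low word */ \
        popl    %ebx ;                          /* caller's EBX */      \
        orb     $CR0_TS,%al ;                   /* set TS */            \
        lmsw    %ax ;                           /* write it back */     \
        movl    $0,GD_FPU_LOCK(%ecx)            /* release FPU lock */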
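
    For comparison, the restore sequence looks like this as a user-space
    C model.  Again this is a sketch under assumptions: the structures
    are made up, msw stands in for the low 16 bits of CR0, and the
    function name is hypothetical.  CR0_TS is bit 3 of CR0.

```c
#include <assert.h>
#include <stddef.h>

#define CR0_TS 0x0008           /* task-switched flag, bit 3 of CR0 */

struct fpu_area { int image; };

struct thread {
    struct fpu_area *td_savefpu;
};

struct globaldata {
    struct thread *gd_curthread;
    struct thread *gd_npxthread;
    int gd_fpu_lock;
    unsigned short msw;         /* modeled machine status word */
};

/*
 * Undo the setup: app_area is the pointer that was stashed when the
 * FP unit was acquired (the macro keeps it in %ebx).
 */
static void
mmx_restore_block_model(struct globaldata *gd, struct fpu_area *app_area)
{
    struct thread *td;

    assert(gd != NULL && gd->gd_curthread != NULL);
    td = gd->gd_curthread;
    gd->gd_npxthread = NULL;        /* no further saves to the temp area */
    td->td_savefpu = app_area;      /* point back at the app's save area */
    gd->msw |= CR0_TS;              /* smsw / orb / lmsw: next FP use traps */
    gd->gd_fpu_lock = 0;            /* let the next bcopy claim the FPU */
}
```

    The application's FP state is not reloaded here.  With TS set, the
    next FP instruction traps and the state is restored at that point.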