Date: Fri, 1 Apr 2005 10:18:10 -0800 (PST)
From: Matthew Dillon
Message-Id: <200504011818.j31IIAM3059471@apollo.backplane.com>
To: Bruce Evans
cc: Peter Jeremy
cc: David Schultz
cc: hackers@freebsd.org
cc: jason henson
cc: bde@freebsd.org
Subject: Re: Fwd: 5-STABLE kernel build with icc broken
List-Id: Technical Discussions relating to FreeBSD

Here is the core of the FPU setup and restoration code for the kernel
bcopy in DragonFly, from i386/bcopy.s. DragonFly uses the
TD_SAVEFPU-is-a-pointer method that was outlined in the original
comment in the FreeBSD code.
I further enhance the algorithm to guarantee that the FPU is in a sane
state (does not require any further initialization other than a clts)
if userland has NOT used it. However, there are definitely some race
cases that must be considered (see the comments).

The on-fault handling in DragonFly is stackable (which further
simplifies the whole mess of on-fault vs non-on-fault copying code),
and the DFly bcopy just sets up the frame for it whether or not the
on-fault handling is actually needed.

This could be further optimized, but I had already spent at least a
month on it and had to move on to other things. In particular, the
setting of CR0_TS and the restoration of TD_SAVEFPU could be moved to
the syscall-return code, so multiple in-kernel bcopy operations could
be issued without any further FPU setup or teardown.

                                        -Matt

/*
 * RACES/ALGORITHM:
 *
 *      If gd_npxthread is not NULL we must save the application's
 *      current FP state to the current save area and then NULL
 *      out gd_npxthread to interlock against new interruptions
 *      changing the FP state further.
 *
 *      If gd_npxthread is NULL the FP unit is in a known 'safe'
 *      state and may be used once the new save area is installed.
 *
 *      race(1): If an interrupt occurs just prior to calling fxsave
 *      all that happens is that fxsave gets a npxdna trap, restores
 *      the app's environment, and immediately traps, restores,
 *      and saves it again.
 *
 *      race(2): No interrupt can safely occur after we NULL-out
 *      npxthread until we fninit, because the kernel assumes that
 *      the FP unit is in a safe state when npxthread is NULL.  It's
 *      more convenient to use a cli sequence here (it is not
 *      considered to be in the critical path), but a critical
 *      section would also work.
 *
 *      race(3): The FP unit is in a known state (because npxthread
 *      was either previously NULL or we saved and init'd and made
 *      it NULL).  This is true even if we are preempted and the
 *      preempting thread uses the FP unit, because it will be
 *      fninit'd again on return.
 *      ANY STATE WE SAVE TO THE FPU MAY
 *      BE DESTROYED BY PREEMPTION WHILE NPXTHREAD IS NULL!  However,
 *      an interrupt occurring in between clts and the setting of
 *      gd_npxthread may set the TS bit again and cause the next
 *      npxdna() to panic when it sees a non-NULL gd_npxthread.
 *
 *      We can safely set TD_SAVEFPU to point to a new uninitialized
 *      save area and then set GD_NPXTHREAD to non-NULL.  If an
 *      interrupt occurs after we set GD_NPXTHREAD, all that happens
 *      is that the safe FP state gets saved and restored.  We do not
 *      need to fninit again.
 *
 *      We can safely clts after setting up the new save-area, before
 *      installing gd_npxthread, even if we get preempted just after
 *      calling clts.  This is because the FP unit will be in a safe
 *      state while gd_npxthread is NULL.  Setting gd_npxthread will
 *      simply lock-in that safe-state.  Calling clts saves
 *      unnecessary trap overhead since we are about to use the FP
 *      unit anyway and don't need to 'restore' any state prior to
 *      that first use.
 */

#define MMX_SAVE_BLOCK(missfunc)                                        \
        cmpl    $2048,%ecx ;                                            \
        jb      missfunc ;                                              \
        movl    MYCPU,%eax ;                    /* EAX = MYCPU */       \
        btsl    $1,GD_FPU_LOCK(%eax) ;                                  \
        jc      missfunc ;                                              \
        pushl   %ebx ;                                                  \
        pushl   %ecx ;                                                  \
        movl    GD_CURTHREAD(%eax),%edx ;       /* EDX = CURTHREAD */   \
        movl    TD_SAVEFPU(%edx),%ebx ;         /* save app save area */\
        addl    $TDPRI_CRIT,TD_PRI(%edx) ;                              \
        cmpl    $0,GD_NPXTHREAD(%eax) ;                                 \
        je      100f ;                                                  \
        fxsave  0(%ebx) ;                       /* race(1) */           \
        movl    $0,GD_NPXTHREAD(%eax) ;         /* interlock intr */    \
        clts ;                                                          \
        fninit ;                                /* race(2) */           \
100: ;                                                                  \
        leal    GD_SAVEFPU(%eax),%ecx ;                                 \
        movl    %ecx,TD_SAVEFPU(%edx) ;                                 \
        clts ;                                                          \
        movl    %edx,GD_NPXTHREAD(%eax) ;       /* race(3) */           \
        subl    $TDPRI_CRIT,TD_PRI(%edx) ;      /* crit_exit() */       \
        cmpl    $0,GD_REQFLAGS(%eax) ;                                  \
        je      101f ;                                                  \
        cmpl    $TDPRI_CRIT,TD_PRI(%edx) ;                              \
        jge     101f ;                                                  \
        call    lwkt_yield_quick ;                                      \
        /* note: eax,ecx,edx destroyed */                               \
101: ;                                                                  \
        movl    (%esp),%ecx ;                                           \
        movl    $mmx_onfault,(%esp) ;

/*
 * When restoring the application's FP state we must first clear
 * npxthread to prevent further saves, then restore the pointer
 * to the app's save area.  We do not have to (and should not)
 * restore the app's FP state now.  Note that we do not have to
 * call fninit because our use of the FP guarantees that it is in
 * a 'safe' state (at least for kernel use).
 *
 * NOTE: it is not usually safe to mess with CR0 outside of a
 * critical section, because TS may get set by a preemptive
 * interrupt.  However, we *can* race a load/set-ts/store against
 * an interrupt doing the same thing.
 */

#define MMX_RESTORE_BLOCK                       \
        addl    $4,%esp ;                       \
        MMX_RESTORE_BLOCK2

#define MMX_RESTORE_BLOCK2                      \
        movl    MYCPU,%ecx ;                    \
        movl    GD_CURTHREAD(%ecx),%edx ;       \
        movl    $0,GD_NPXTHREAD(%ecx) ;         \
        movl    %ebx,TD_SAVEFPU(%edx) ;         \
        smsw    %ax ;                           \
        popl    %ebx ;                          \
        orb     $CR0_TS,%al ;                   \
        lmsw    %ax ;                           \
        movl    $0,GD_FPU_LOCK(%ecx)
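
For anyone who finds the macros hard to follow, here is a rough
user-space C model of the save/restore sequence. It is only a sketch
and is not DragonFly code. The struct layouts, the fp_area type, the
cr0_ts and fpu fields, and the two function names are made up for
illustration. Interrupts, preemption, the critical section, and the
on-fault frame are not modeled, and it assumes gd_npxthread is either
NULL or the current thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MMX_MIN_BYTES 2048      /* mirrors the cmpl $2048,%ecx test */
#define FP_AREA_SIZE  512       /* size of an fxsave image */

/* Hypothetical stand-in for an FP save area. */
struct fp_area { unsigned char bytes[FP_AREA_SIZE]; };

struct thread {
    struct fp_area *td_savefpu; /* where this thread's FP state gets saved */
    struct fp_area  td_area;    /* the thread's own (application) save area */
};

struct globaldata {
    struct thread *gd_curthread;
    struct thread *gd_npxthread; /* owner of the live FP state, or NULL */
    struct fp_area gd_savefpu;   /* per-CPU save area used during bcopy */
    bool           gd_fpu_lock;
    bool           cr0_ts;       /* models the CR0 TS bit */
    struct fp_area fpu;          /* models the live FP registers */
};

/*
 * Returns false when the caller should take the non-MMX path
 * (the "missfunc" branch).  On success *saved holds the application's
 * save-area pointer, which the assembly keeps in %ebx.
 */
bool mmx_save_block(struct globaldata *gd, size_t nbytes,
                    struct fp_area **saved)
{
    struct thread *td;

    if (nbytes < MMX_MIN_BYTES)
        return false;
    if (gd->gd_fpu_lock)                /* btsl / jc missfunc */
        return false;
    gd->gd_fpu_lock = true;

    td = gd->gd_curthread;
    *saved = td->td_savefpu;            /* save app save area */

    if (gd->gd_npxthread != NULL) {
        *td->td_savefpu = gd->fpu;      /* fxsave 0(%ebx) */
        gd->gd_npxthread = NULL;        /* interlock */
        gd->cr0_ts = false;             /* clts */
        memset(&gd->fpu, 0, sizeof gd->fpu);    /* fninit */
    }

    td->td_savefpu = &gd->gd_savefpu;   /* install the new save area */
    gd->cr0_ts = false;                 /* clts */
    gd->gd_npxthread = td;              /* lock in the safe state */
    return true;
}

void mmx_restore_block(struct globaldata *gd, struct fp_area *saved)
{
    assert(gd->gd_fpu_lock);            /* must pair with a successful save */
    gd->gd_npxthread = NULL;            /* prevent further saves */
    gd->gd_curthread->td_savefpu = saved;   /* point back at the app area */
    gd->cr0_ts = true;                  /* smsw / orb $CR0_TS / lmsw */
    gd->gd_fpu_lock = false;
}
```

Note that the restore side only swaps the pointer back and sets TS.
The application's FP state is reloaded lazily on its next FP use,
which is why nothing is copied back into the fpu field here.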