From owner-freebsd-emulation@FreeBSD.ORG Sat May 10 12:29:08 2008
Date: Sat, 10 May 2008 22:28:53 +1000 (EST)
From: Bruce Evans
To: Juergen Lock
Cc: freebsd-emulation@freebsd.org
Subject: Re: seems I finally found what upset kqemu on amd64 SMP... shared gdt! (please test patch :)
In-Reply-To: <20080509220922.GA13480@saturn.kn-bremen.de>
Message-ID: <20080510213519.P3083@besplex.bde.org>
References: <20080507162713.73A3A5B47@mail.bitblocks.com> <20080508195843.G17500@delplex.bde.org> <20080509220922.GA13480@saturn.kn-bremen.de>

On Sat, 10 May 2008, Juergen Lock wrote:

> On Thu, May 08, 2008 at 09:59:57PM +1000, Bruce Evans wrote:
>> The message in npx.c is actually about violation of an even more
>> fundamental invariant -- the invariant that owning the FPU includes
>> having the TS flag clear so that DNA traps cannot occur.
>> The bug in kqemu seems to be mismanagement of the TS flag related to
>> this.  I forget if it is the host or the target TS flag that seems to
>> be mismanaged.  For the target, it would take a bug in the
>> virtualization of the TS flag to break this invariant (assuming no
>> related bugs in the target kernel).
>>
> Well the `fpcurthread == curthread' bug has been fixed quite a while
> ago already, or do you mean another one?

I didn't know what is already fixed.

>> The message in amd64/machdep.c is about violation of the invariant
>> that the kernel cannot cause DNA traps.  Spurious DNA traps in the
>> ...
>>
> Okay I _think_ I know a little more about this now...  kqemu itself
> doesn't use the fpu, but the guest code it runs can, and in that case the
> DNA trap is just used for (host) lazy fpu context switching like as if the
> code was running in userland regularly.  And I just tested the following
> patch that should get rid of the message by calling fpudna/npxdna directly
> (files/patch-fpucontext is the interesting part:)

This seems reasonable.  Is the following summary of my understanding of
kqemu's implementation of this and your change correct?:
- kqemu runs in kernel mode on the host and needs to have exactly the
  same effect as a DNA exception on the target.
- having exactly the same effect requires calling the host DNA exception
  handler.
- now it uses a software int $7 (dna) to implement the above, but this
  is not permitted in kernel mode (although the software int could be
  permitted, it is hard to distinguish from a hardware exception for
  unintentional use).
- your change makes it call the DNA trap handler directly.  This gives
  the same effect as a permitted software int $7.  It is also faster.
  It would be better to use an official API for this, but none exists.

> ...
> +Index: kqemu-freebsd.c
> +@@ -33,6 +33,11 @@
> +
> + #include
> + #include
> ++#ifdef __x86_64__
> ++#include
> ++#else
> ++#include
> ++#endif
> +
> + #include "kqemu-kernel.h"
> +
> +@@ -172,6 +177,15 @@
> + {
> + }
> +
> ++void CDECL kqemu_loadfpucontext(unsigned long cpl)
> ++{
> ++#ifdef __x86_64__
> ++	fpudna();
> ++#else
> ++	npxdna();
> ++#endif
> ++}

Just be sure that the system state is not too different from that of
trap() (directly below a syscall or trap from userland) when this is
called.  Better not have any interrupts disabled or locks held, though
I think npxdna() doesn't care.  The FPU must not be owned already at this
point.

> ++
> + #if __FreeBSD_version < 500000
> + static int
> + curpriority_cmp(struct proc *p)

I guess kqemu duplicates this old mistake instead of calling it because
it is static.  npxdna() is already public so it can be abused easily :-),

Bruce
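
As a footnote to the summary above, here is a minimal userland sketch of
lazy FPU context switching and of the difference between trapping into the
DNA handler and calling it directly.  It is only a toy model: every name in
it (model_dna, model_switch_to, model_loadfpucontext, the ts_flag variable
and the struct thread layout) is hypothetical and chosen for illustration,
and it assumes a simplified scheme in which the outgoing owner's state is
saved at switch time and restored on the first FPU use.  It is not the code
in npx.c or fpu.c, and it ignores interrupts and locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel thread; not FreeBSD's struct thread. */
struct thread {
    int id;
    int saved_fpu;              /* models the thread's saved FPU state */
};

static bool ts_flag = true;     /* models CR0.TS: set means FPU use traps */
static struct thread *fpcurthread = NULL;   /* models the current FPU owner */
static struct thread *curthread = NULL;     /* models the running thread */
static int hw_fpu;              /* models the physical FPU registers */
static int dna_count;           /* how often the model handler ran */

/* Model of a context switch: save the owner's state, then arm the trap. */
static void model_switch_to(struct thread *next)
{
    if (fpcurthread != NULL) {
        fpcurthread->saved_fpu = hw_fpu;
        fpcurthread = NULL;
    }
    ts_flag = true;
    curthread = next;
}

/* Model of the DNA handler (the role played by fpudna()/npxdna()). */
static void model_dna(void)
{
    assert(curthread != NULL);
    assert(fpcurthread == NULL);    /* the FPU must not be owned already */
    ts_flag = false;                /* owning the FPU includes TS clear */
    fpcurthread = curthread;
    hw_fpu = curthread->saved_fpu;  /* lazily restore the new owner's state */
    dna_count++;
}

/* Model of FPU instructions: with TS set, hardware would raise DNA first. */
static void model_fpu_store(int value)
{
    if (ts_flag)
        model_dna();
    hw_fpu = value;
}

static int model_fpu_load(void)
{
    if (ts_flag)
        model_dna();
    return hw_fpu;
}

/* Model of the patched hook: call the handler directly, no software int. */
static void model_loadfpucontext(void)
{
    model_dna();
}

/* Two threads share the FPU; the last acquisition uses the direct call. */
static int demo(void)
{
    struct thread a = { 1, 0 }, b = { 2, 0 };

    model_switch_to(&a);
    model_fpu_store(11);        /* first use traps; a becomes the owner */
    model_switch_to(&b);
    model_fpu_store(22);        /* traps again; b becomes the owner */
    model_switch_to(&a);
    model_loadfpucontext();     /* direct call restores a's state */
    return model_fpu_load();    /* TS is clear here, so no further trap */
}
```

In this model the direct call leaves the same state behind as a trap would
have (owner set, TS clear, state restored), and the assertion in model_dna()
corresponds to the requirement that the FPU is not already owned when the
handler is entered.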