From: Juergen Lock
Date: Tue, 6 May 2008 20:19:31 +0200
To: John Baldwin
Cc: freebsd-emulation@freebsd.org, freebsd-amd64@freebsd.org
Subject: Re: seems I finally found what upset kqemu on amd64 SMP... shared gdt! (please test patch :)
Message-ID: <20080506181931.GA22856@saturn.kn-bremen.de>
In-Reply-To: <200805050950.57484.jhb@freebsd.org>
References: <20080429222458.GA20855@saturn.kn-bremen.de> <200805011335.06415.jhb@freebsd.org> <20080503131139.GA37131@saturn.kn-bremen.de> <200805050950.57484.jhb@freebsd.org>
List-Id: Porting FreeBSD to the AMD64 platform

On Mon, May 05, 2008 at 09:50:57AM -0400, John Baldwin wrote:
> On Saturday 03 May 2008 09:11:39 am Juergen Lock wrote:
> > On Thu, May 01, 2008 at 01:35:06PM -0400, John Baldwin wrote:
> > > On Thursday 01 May 2008 11:53:04 am Juergen Lock wrote:
> > > > On Thu, May 01, 2008 at 10:11:06AM -0400, John Baldwin wrote:
> > > > > On Thursday 01 May 2008 06:19:51 am Juergen Lock wrote:
> > > > > > On Wed, Apr 30, 2008 at 12:24:58AM +0200, Juergen Lock wrote:
> > > > > > > Yeah, the amd64 kernel reuses the same gdt to setup all cpus,
> > > > > > > causing kqemu to end up restoring the interrupt stackpointer
> > > > > > > (after running guest code using its own cpu state) from the tss
> > > > > > > of the last cpu, regardless which cpu it happened to run on. And
> > > > > > > that then causes the last cpu's (usually) idle thread's stack to
> > > > > > > get smashed and the host doing multiple panics... (Which also
> > > > > > > explains why pinning qemu onto cpu 1 worked on a 2-way host.)
> > > > > >
> > > > > > Hmm maybe the following is a little more clear: kqemu sets up its
> > > > > > own cpu state and has to save and restore the original state
> > > > > > because of that, so among other things it does an str insn (store
> > > > > > task register), and later an ltr insn (load task register) using
> > > > > > the value it got from the first str insn.
> > > > > > That ltr insn loads the selector for the tss which is stored
> > > > > > in the gdt, and that entry in the gdt is different for each
> > > > > > cpu, but since a single gdt was reused to setup the cpus at
> > > > > > boot (in init_secondary() in /sys/amd64/amd64/mp_machdep.c),
> > > > > > it still points to the tss for the last cpu, instead of to
> > > > > > the right one for the cpu the ltr insn gets executed on.
> > > > > > That is what the kqemu_tss_workaround() in the patch `fixes'...
> > > > >
> > > > > Perhaps kqemu shouldn't be doing str/ltr on amd64 instead? The
> > > > > things i386 uses a separate tss for in the kernel (separate stack
> > > > > for double faults) are handled differently on amd64 (on amd64 we
> > > > > make the double fault handler use one of the IST stacks).
> > > >
> > > > Well, kqemu uses its own gdt, tss and everything while running guest
> > > > code in its monitor, so it kinda has to do the str/ltr.s to setup its
> > > > stuff, run guest code, and then restore the original state of things.
> > > > (And `restore original state of things' is what failed here.)
> > > >
> > > > Oh and also the tss does seem to be used for the interrupt stack on
> > > > amd64 too, at least that's the one that ended up wrong and caused the
> > > > panics I saw...
> > >
> > > The single TSS holds the IST pointers. On i386 we use a separate TSS for
> > > double faults, but on amd64 a double fault uses the same TSS but uses the
> > > IST pointers from that same TSS. The TSS also holds the ring stack
> > > pointer for when syscalls, interrupts, and traps from userland cross from
> > > ring 3 to ring 0 which is probably why you got a panic.
> >
> > Yeah that's where it happened.
> >
> > > Because of the fact that amd64 in normal operation never changes the task
> > > register (and that the gdt isn't used quite the same either, all the
> > > per-cpu stuff is via FSBASE and GSBASE) I don't expect the kernel to
> > > change to use a per-cpu gdt or the like. I think you will need to use
> > > the current approach of patching kqemu to fixup the tss/gdt when
> > > reloading the task register. You might want to make it a regular part of
> > > the code rather than a workaround as a result.
> >
> > Hmm okay, how would you call it then, kqemu_tss_fixup?
>
> Sure.

Okay, I shall rename it at the next kqemu commit.

Thanx,
Juergen
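
For readers of the archive, here is a minimal user-space sketch of what "fixup the tss/gdt when reloading the task register" could look like at the descriptor level. It is not the kqemu patch. The names tss_desc64, tss_desc_get_base and tss_desc_point_at are hypothetical, and the sketch assumes the fixup amounts to rewriting the base of the shared gdt's TSS descriptor so that it points at the current cpu's tss, and resetting the descriptor type from busy to available (ltr faults on a busy descriptor). Real code would run in ring 0 and follow this with the actual ltr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type field values for a 64-bit TSS descriptor. */
#define TSS_TYPE_AVAIL 0x9u /* available TSS: ltr accepts it */
#define TSS_TYPE_BUSY  0xBu /* busy TSS: ltr would raise #GP */

/* 16-byte TSS descriptor as it sits in the gdt in long mode.
 * (Hypothetical struct name; the field layout follows the x86-64 format.) */
struct tss_desc64 {
    uint16_t limit_lo;        /* limit bits 15:0 */
    uint16_t base_lo;         /* base bits 15:0 */
    uint8_t  base_mid;        /* base bits 23:16 */
    uint8_t  type_attr;       /* type (low nibble), DPL, present bit */
    uint8_t  limit_hi_flags;  /* limit bits 19:16 plus flags */
    uint8_t  base_hi;         /* base bits 31:24 */
    uint32_t base_upper;      /* base bits 63:32 */
    uint32_t reserved;
};

/* Reassemble the 64-bit tss address scattered across the descriptor. */
static inline uint64_t tss_desc_get_base(const struct tss_desc64 *d)
{
    return (uint64_t)d->base_lo |
           ((uint64_t)d->base_mid << 16) |
           ((uint64_t)d->base_hi << 24) |
           ((uint64_t)d->base_upper << 32);
}

/* Hypothetical fixup step: make the shared slot describe this cpu's tss
 * and mark it available again, so that a following ltr would be legal. */
static inline void tss_desc_point_at(struct tss_desc64 *d, uint64_t tss_base)
{
    assert(d != NULL);
    d->base_lo    = (uint16_t)(tss_base & 0xffffu);
    d->base_mid   = (uint8_t)((tss_base >> 16) & 0xffu);
    d->base_hi    = (uint8_t)((tss_base >> 24) & 0xffu);
    d->base_upper = (uint32_t)(tss_base >> 32);
    d->type_attr  = (uint8_t)((d->type_attr & 0xf0u) | TSS_TYPE_AVAIL);
}
```

In this model, writing each cpu's tss address into the same slot in turn leaves the last cpu's address behind, which is the stale state described in the thread. Calling tss_desc_point_at() with the current cpu's tss address just before the reload corrects it.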