From: Juergen Lock
Date: Thu, 1 May 2008 12:19:51 +0200
To: freebsd-emulation@FreeBSD.org, freebsd-amd64@FreeBSD.org
Subject: Re: seems I finally found what upset kqemu on amd64 SMP... shared gdt! (please test patch :)
Message-ID: <20080501101951.GA30274@saturn.kn-bremen.de>
In-Reply-To: <20080429222458.GA20855@saturn.kn-bremen.de>

On Wed, Apr 30, 2008 at 12:24:58AM +0200, Juergen Lock wrote:
> Yeah, the amd64 kernel reuses the same gdt to set up all cpus, causing
> kqemu to end up restoring the interrupt stack pointer (after running
> guest code using its own cpu state) from the tss of the last cpu,
> regardless of which cpu it happened to run on.  And that then causes
> the last cpu's (usually) idle thread's stack to get smashed and the
> host doing multiple panics...  (Which also explains why pinning qemu
> onto cpu 1 worked on a 2-way host.)
>

Hmm, maybe the following is a little clearer: kqemu sets up its own cpu
state, so it has to save and restore the original state.  Among other
things it does an str insn (store task register), and later an ltr insn
(load task register) using the value it got from that str insn.  The
ltr insn loads the selector for the tss, whose descriptor is stored in
the gdt.  That gdt entry needs to be different for each cpu, but since a
single gdt was reused to set up the cpus at boot (in init_secondary() in
/sys/amd64/amd64/mp_machdep.c), it still points to the tss for the last
cpu instead of to the right one for the cpu the ltr insn gets executed
on.  That is what the kqemu_tss_workaround() in the patch `fixes'...

> Here's the patch I just tested, of course you'd want to disable this
> once the gdt is no longer shared, so assuming someone wants to fix this,
> please also do an OSVERSION bump...

The patch applied with offsets (I still had debug code in when I made
it), here is a rebased version:

Index: kqemu-freebsd.c
@@ -33,6 +33,11 @@
 
 #include
 #include
+#ifdef __x86_64__
+#include
+#include
+#include
+#endif
 
 #include "kqemu-kernel.h"
 
@@ -234,6 +239,19 @@
     va_end(ap);
 }
 
+#ifdef __x86_64__
+/* called with interrupts disabled */
+void CDECL kqemu_tss_workaround(void)
+{
+    int gsel_tss = GSEL(GPROC0_SEL, SEL_KPL);
+
+    gdt_segs[GPROC0_SEL].ssd_base = (long) &common_tss[PCPU_GET(cpuid)];
+    ssdtosyssd(&gdt_segs[GPROC0_SEL],
+       (struct system_segment_descriptor *)&gdt[GPROC0_SEL]);
+    ltr(gsel_tss);
+}
+#endif
+
 struct kqemu_instance {
 #if __FreeBSD_version >= 500000
     TAILQ_ENTRY(kqemu_instance) kqemu_ent;
Index: common/kernel.c
@@ -1025,6 +1025,9 @@
 #ifdef __x86_64__
     uint16_t saved_ds, saved_es;
     unsigned long fs_base, gs_base;
+#ifdef __FreeBSD__
+    struct kqemu_global_state *g = s->global_state;
+#endif
 #endif
 
 #ifdef PROFILE
@@ -1188,6 +1191,13 @@
         apic_restore_nmi(s, apic_nmi_mask);
     }
     profile_record(s);
+#ifdef __FreeBSD__
+#ifdef __x86_64__
+    spin_lock(&g->lock);
+    kqemu_tss_workaround();
+    spin_unlock(&g->lock);
+#endif
+#endif
 
     if (s->mon_req == MON_REQ_IRQ) {
         struct kqemu_exception_regs *r;
Index: kqemu-kernel.h
@@ -44,4 +44,10 @@
 
 void CDECL kqemu_log(const char *fmt, ...);
 
+#ifdef __FreeBSD__
+#ifdef __x86_64__
+void CDECL kqemu_tss_workaround(void);
+#endif
+#endif
+
 #endif /* KQEMU_KERNEL_H */