From owner-freebsd-amd64@FreeBSD.ORG  Mon May  5 13:56:20 2008
Return-Path: <owner-freebsd-amd64@FreeBSD.ORG>
Delivered-To: freebsd-amd64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 76F821065676;
	Mon,  5 May 2008 13:56:20 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
	by mx1.freebsd.org (Postfix) with ESMTP id 5430B8FC21;
	Mon,  5 May 2008 13:56:20 +0000 (UTC) (envelope-from jhb@freebsd.org)
Received: from zion.baldwin.cx (unknown [208.65.91.234])
	by elvis.mu.org (Postfix) with ESMTP id 0B48F1A4D84;
	Mon,  5 May 2008 06:56:19 -0700 (PDT)
From: John Baldwin <jhb@freebsd.org>
To: Juergen Lock <nox@jelal.kn-bremen.de>
Date: Mon, 5 May 2008 09:50:57 -0400
User-Agent: KMail/1.9.7
References: <20080429222458.GA20855@saturn.kn-bremen.de>
	<200805011335.06415.jhb@freebsd.org>
	<20080503131139.GA37131@saturn.kn-bremen.de>
In-Reply-To: <20080503131139.GA37131@saturn.kn-bremen.de>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200805050950.57484.jhb@freebsd.org>
Cc: freebsd-emulation@freebsd.org, freebsd-amd64@freebsd.org
Subject: Re: seems I finally found what upset kqemu on amd64 SMP... shared
	gdt! (please test patch :)
X-BeenThere: freebsd-amd64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the AMD64 platform <freebsd-amd64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-amd64>
List-Post: <mailto:freebsd-amd64@freebsd.org>
List-Help: <mailto:freebsd-amd64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-amd64>,
	<mailto:freebsd-amd64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 05 May 2008 13:56:20 -0000

On Saturday 03 May 2008 09:11:39 am Juergen Lock wrote:
> On Thu, May 01, 2008 at 01:35:06PM -0400, John Baldwin wrote:
> > On Thursday 01 May 2008 11:53:04 am Juergen Lock wrote:
> > > On Thu, May 01, 2008 at 10:11:06AM -0400, John Baldwin wrote:
> > > > On Thursday 01 May 2008 06:19:51 am Juergen Lock wrote:
> > > > > On Wed, Apr 30, 2008 at 12:24:58AM +0200, Juergen Lock wrote:
> > > > > > Yeah, the amd64 kernel reuses the same gdt to setup all cpus,
> > > > > > causing kqemu to end up restoring the interrupt stackpointer
> > > > > > (after running guest code using its own cpu state) from the tss
> > > > > > of the last cpu, regardless which cpu it happened to run on.  And
> > > > > > that then causes the last cpu's (usually) idle thread's stack to
> > > > > > get smashed and the host doing multiple panics...  (Which also
> > > > > > > explains why pinning qemu onto cpu 1 worked on a 2-way host.)
> > > > >
> > > > > Hmm maybe the following is a little more clear:  kqemu sets up its
> > > > > own cpu state and has to save and restore the original state
> > > > > > because of that, so among other things it does an str insn (store
> > > > > > task register), and later an ltr insn (load task register) using
> > > > > > the value it got from the first str insn.  That ltr insn loads the
> > > > > > selector for the tss which is stored in the gdt, and that entry in
> > > > > > the gdt is different for each cpu, but since a single gdt was
> > > > > > reused to setup the cpus at boot (in init_secondary() in
> > > > > > /sys/amd64/amd64/mp_machdep.c), it still points to the tss for the
> > > > > > last cpu, instead of to the right one for the cpu the ltr insn
> > > > > > gets executed on.
> > > > > That is what the kqemu_tss_workaround() in the patch `fixes'...
> > > >
> > > > Perhaps kqemu shouldn't be doing str/ltr on amd64 instead?  The
> > > > > things i386 uses a separate tss for in the kernel (separate stack
> > > > > for double faults) is handled differently on amd64 (on amd64 we make
> > > > > the double fault handler use one of the IST stacks).
> > >
> > > Well, kqemu uses its own gdt, tss and everything while running guest
> > > code in its monitor, so it kinda has to do the str/ltr.s to setup its
> > > stuff, run guest code, and then restore the original state of things. 
> > > (And `restore original state of things' is what failed here.)
> > >
> > >  Oh and also the tss does seem to be used for the interrupt stack on
> > > amd64 too, at least that's the one that ended up wrong and caused the
> > > panics I saw...
> >
> > The single TSS holds the IST pointers.  On i386 we use a separate TSS for
> > double faults, but on amd64 a double fault stays on the same TSS and takes
> > its stack from that TSS's IST pointers.  The TSS also holds the ring 0
> > stack pointer used when syscalls, interrupts, and traps from userland
> > cross from ring 3 to ring 0, which is probably why you got a panic.
>
> Yeah, that's where it happened.
>
> > Because amd64 in normal operation never changes the task register (and the
> > gdt isn't used quite the same way either; all the per-cpu stuff is via
> > FSBASE and GSBASE), I don't expect the kernel to change to use a per-cpu
> > gdt or the like.  I think you will need to keep the current approach of
> > patching kqemu to fix up the tss/gdt when reloading the task register.
> > Since that makes it permanent, you might want to make it a regular part
> > of the code rather than a workaround.
>
>  Hmm okay, how would you call it then, kqemu_tss_fixup?

Sure.
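
For illustration only, here is a userland C sketch of what such a fixup
amounts to.  Everything in it is a made-up stand-in: toy_tss, toy_gdt,
toy_boot, toy_ltr, toy_tss_fixup and toy_restore are hypothetical names,
not the kqemu patch and not the kernel's real structures.  It only models
a single gdt slot that every cpu overwrote at boot, and repointing that
slot at the current cpu's tss before the task register is reloaded.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 2

/* Toy stand-ins (assumptions), not the real amd64 tss/gdt layouts. */
struct toy_tss {
	uint64_t rsp0;			/* ring 0 stack pointer */
};

struct toy_gdt {
	const struct toy_tss *tss_slot;	/* the one shared tss descriptor */
};

static struct toy_tss cpu_tss[NCPU];
static struct toy_gdt shared_gdt;

/*
 * Boot: each cpu in turn points the shared slot at its own tss and
 * loads its task register.  The slot is left describing the last cpu.
 */
static void
toy_boot(void)
{
	for (int cpu = 0; cpu < NCPU; cpu++) {
		cpu_tss[cpu].rsp0 = 0x1000u * (uint64_t)(cpu + 1);
		shared_gdt.tss_slot = &cpu_tss[cpu];
	}
}

/* Model of ltr: the cpu picks up whatever the gdt slot describes now. */
static const struct toy_tss *
toy_ltr(const struct toy_gdt *gdt)
{
	return (gdt->tss_slot);
}

/* The fixup: make the slot describe the tss of the cpu we are on. */
static void
toy_tss_fixup(struct toy_gdt *gdt, int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	gdt->tss_slot = &cpu_tss[cpu];
}

/* Restore host state on `cpu', with or without the fixup. */
static const struct toy_tss *
toy_restore(int cpu, int do_fixup)
{
	if (do_fixup)
		toy_tss_fixup(&shared_gdt, cpu);
	return (toy_ltr(&shared_gdt));
}
```

In this model, after toy_boot() a toy_restore(0, 0) hands cpu 0 the tss of
cpu 1 (the stack smash described above), while toy_restore(0, 1) gives it
its own tss back.  The real code would additionally have to build a proper
descriptor from the per-cpu tss address, which the sketch leaves out.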

-- 
John Baldwin