In-Reply-To: <980330D2-1890-48DC-8D7C-5D831F18E42D@bsdimp.com>
References: <CACVs6=8z4BYcpQ=jVKLLb7v2LmSD-MRxXQdYRrOj-hG1j572Cg@mail.gmail.com>
	<E580AB5B-AD2A-4E04-A040-8E9E5D667040@bsdimp.com>
	<CACVs6=_FXqM2vjx1B4C819kuapaCXek-RfVwo0YPsk74r4+gkA@mail.gmail.com>
	<980330D2-1890-48DC-8D7C-5D831F18E42D@bsdimp.com>
From: Juli Mallett <juli@clockworksquid.com>
Date: Sat, 17 Mar 2012 01:23:15 -0700
Message-ID: <CACVs6=-KVUuc0Vqz1cAQTsDSvB2j=KKj=bhmp3Ju5qVfBh0-Ng@mail.gmail.com>
To: Warner Losh <imp@bsdimp.com>
Cc: "freebsd-mips@FreeBSD.org" <freebsd-mips@freebsd.org>
Subject: Re: Unbreaking ports with n64 MIPS.

On Fri, Mar 16, 2012 at 23:05, Warner Losh <imp@bsdimp.com> wrote:
> The argument for adding the aliases is transition from older releases.

Indeed.  What do you think about Makefile.inc1 giving a helpful error
if TARGET_ARCH is set to mips(n32|64)?eb so that it's somewhat more
guided (i.e. it's not that the build breaks at some point due to wrong
TARGET_CPUARCH) but still not the baggage of an alias?  Is an alias
much of a bigger win?
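
A minimal sketch of the kind of check meant here, written in Python rather than make purely for illustration.  The function name, the message text, and the suggestion of simply dropping the "eb" suffix are assumptions, not anything Makefile.inc1 does today:

```python
import re

# Old-style names matched by the pattern from the message: mips(n32|64)?eb
OLD_STYLE = re.compile(r"mips(n32|64)?eb")


def check_target_arch(target_arch):
    """Return a helpful error string for an old-style name, else None."""
    if OLD_STYLE.fullmatch(target_arch) is None:
        return None
    # Assumption: the replacement name is the old one minus the "eb" suffix.
    suggestion = target_arch[:-2]
    return (f"TARGET_ARCH={target_arch} is no longer supported; "
            f"did you mean {suggestion}?")


for name in ("mipseb", "mipsn32eb", "mips64eb", "mips64", "mipsel"):
    print(name, "->", check_target_arch(name))
```

The point is only that the failure happens up front with a pointer to the right name, instead of somewhere deep in the build.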

> This is a bigger discussion.  Several issues:
>
> (1) Multilib.  If we had multilib, then we can build one or more of
> {o32, n32, n64}.  Then the ABI decision would be what to build for the
> entire system.  SGI used n64 for everything.  Other systems have a
> default ABI that we build.

SGI used n32 plenty :)  I still have several IRIX systems that in SGI
parlance were "32-bit systems" because they were lowish-end, but they
were mips3-based (the R4K and R4400 went into systems like this) and
after IRIX 6.2 (IIRC) were using n32 and not o32.

> (2) What's the default ABI that tools produce?  Is this tied to
> MACHINE_ARCH?  We spent a lot of time making sure that we have the
> right default tools so we build everything correctly.

The default has to be right not just for MACHINE_ARCH but also for
CPUTYPE/TARGET_CPUTYPE.  I've complained about the binutils
shortcoming that necessitates this before, but I'm still not happy
about it.  Perhaps our MACHINE_ARCH should be more like an ISA if we
have a more mature notion of ABI, and then much of the need for
TARGET_CPUTYPE goes away.

> (3) Do we support building other ABIs as part of the build system.  We
> had that before, but TARGET_BIG_ENDIAN removal killed that.  There's
> pros and cons of adding support here.  Multiple ABIs does junk up a lot
> of places in very machine specific ways.  Lots of places need tweaking
> if we go back to this.

Did it kill that?  -mabi={32,n32,64} works with my n64 base system.
-mabi=o64 fails wrongly, but o64 has never been quite right with our
binutils.

> MACHINE_ABI is what we need.  But do we really need it?  If we want to
> support building different ABIs for the same MACHINE_ARCH, then we'll
> need some way to persist this so we can be self-hosting.  Right now the
> 'make this the default ABI' method for gcc/binutils persists this
> information and makes things work nicely.  sysctl likely is the way to
> go here.

I think we are trying to persist too many parameters, really.  ISA and
ABI (including endianness) are really what we have to persist, but
we're doing it piecemeal in slightly-contradictory ways in several
places, and are talking about adding more.  It sucks.  I'm also not
sure how we solve it well.  I think moving FreeBSD to a triple-like
model in which it's all in one place and easy to parse out would be
nice.  ISA, ISA variant, ABI, Endianness.  Kernels inherently have
each of those, too, but can support variations of at least the last
two in userland, and so the possibility of persisting these things
through a sysctl is I think problematic.  I have an n64 kernel and an
o32 world, why should self-hosting (without overrides) mean I end up
with an o32 kernel?

> I'm sure that this has decayed into dust.  I tried to get gcc to
> generate -msoft-float on x86, and it just didn't work.  Today, I think
> we burn this into the default settings of the toolchain we use to
> bootstrap the system.  We can have a knob for it, but it is purely a
> userland concept: there's no floating point in the kernel to speak of.
> MACHINE_FLOAT={hard,soft} might not be a bad idea, with the value
> exported via sysctl.  Not sure if make needs to grow support for this
> and MACHINE_ABI, or if it would suffice for the necessary Makefiles
> and/or .mk files to query the sysctl value.

I'd argue that floatness is a part of the ISA variant sort of field
above.  mips64r2-octeon-n64-big has soft float, for example.
mips64r2-softfloat-n64-big does, too.  If one compiles for a specific
CPU family as the ISA variant, then floating point is usually
consistent.  Otherwise, are there other variations on the ISA that one
cares about other than floating point?  Perhaps soon: hypervisor.  I
want this all in one place, though.  Old BSD/Mach-style plus-and-minus
config strings?  mips:mips64r2+hypervisor-fp:n64:big?
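
To make the "all in one place and easy to parse out" idea concrete, here is a rough Python sketch that splits the example string above into its fields.  The field order (family:variant:ABI:endianness), the MachineSpec type, and the reading of "+" and "-" as enabled and disabled features are assumptions drawn from that single example, not an existing FreeBSD format:

```python
import re
from typing import NamedTuple, Tuple


class MachineSpec(NamedTuple):
    family: str
    isa: str
    enabled: Tuple[str, ...]
    disabled: Tuple[str, ...]
    abi: str
    endian: str


def parse_spec(spec):
    """Parse a hypothetical 'family:isa[+feat][-feat]:abi:endian' string."""
    family, variant, abi, endian = spec.split(":")
    # Split the variant into the ISA name followed by +/- feature tokens.
    tokens = re.findall(r"[+-]?[^+-]+", variant)
    isa = tokens[0]
    enabled = tuple(t[1:] for t in tokens[1:] if t[0] == "+")
    disabled = tuple(t[1:] for t in tokens[1:] if t[0] == "-")
    return MachineSpec(family, isa, enabled, disabled, abi, endian)


print(parse_spec("mips:mips64r2+hypervisor-fp:n64:big"))
```

Note that the dash-separated form (mips64r2-octeon-n64-big) could not be parsed this way, since "-" would be ambiguous between a separator and a disabled feature.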

>> We need to be thinking about superpages.  This is non-trivial even
>> though MIPS is just about ideal for superpages.  For one thing, it'd
>> be really nice if we did not split TLB entries as we currently do, so
>> the default PAGE_SIZE would be 8K, and then we wouldn't have to deal
>> with TLB behavior where superpages are involved.  Does the TLB always
>> use the nearest match?  How does it impact performance to have two TLB
>> entries covering the same range of addresses?  It depends on how the
>> hardware implements TLB lookups, yes?  Wouldn't it be nice to not have
>> to split the TLB?  Wouldn't it?  I know I bring this up a lot, but it
>> seems like it really would make superpages just slightly less ugly.  I
>> mean, you do tlbp and you find that your VA is covered by the TLB, but
>> the entry it's in is split, and your VA isn't covered by a superpage,
>> but the one in the TLB is, so you have to add a more specific entry,
>> and suddenly all of your functions using the TLB have gotten
>> non-trivially complex.
>
> Doesn't cache aliasing occur when you have multiple TLBs pointing to
> the same physical page, which is a MIPSy no-no?

I don't mean to suggest doing that.  I mean that TLB Lo0 and TLB Lo1
point to successive physical pages coming from a single PTE.  So you
have 8K pages in the VM system which are automatically translated into
two 4K pages in a single TLB entry.  Otherwise, when you have
superpages, and you have a 256MB region followed by a 4K page, the
superpage TLB entry is going to have a valid Lo0 and an invalid Lo1
and then you have to have a separate 4K-page TLB entry for the VA of
the 4K page.
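
As an illustration of the 8K-page arrangement, here is a small Python model of one TLB entry whose Lo0 and Lo1 halves are both derived from a single 8K mapping.  The function names and the tuple representation are made up for the sketch; real EntryLo registers hold encoded PFNs and flag bits, which are left out here:

```python
HW_PAGE_SHIFT = 12                 # 4K page in each half of a TLB entry
HW_PAGE_SIZE = 1 << HW_PAGE_SHIFT
VM_PAGE_SIZE = 2 * HW_PAGE_SIZE    # 8K page as seen by the VM system


def tlb_pair_for(va, pa):
    """Return (va_base, lo0_pa, lo1_pa) for the 8K VM page containing va.

    pa is the physical address of that 8K page and must be 8K-aligned.
    """
    if pa % VM_PAGE_SIZE != 0:
        raise ValueError("physical page must be 8K-aligned")
    va_base = va & ~(VM_PAGE_SIZE - 1)
    # One PTE, two successive 4K physical pages.
    return va_base, pa, pa + HW_PAGE_SIZE


def translate(va, entry):
    """Translate va through a single paired entry."""
    va_base, lo0, lo1 = entry
    if not va_base <= va < va_base + VM_PAGE_SIZE:
        raise LookupError("TLB miss")
    # Bit 12 of the VA selects the even (Lo0) or odd (Lo1) half.
    half = lo1 if va & HW_PAGE_SIZE else lo0
    return half + (va & (HW_PAGE_SIZE - 1))


entry = tlb_pair_for(0x12345678, 0x80000)
print(hex(translate(0x12345678, entry)))
```

Because both halves always come from the same mapping, there is never a case where one half of an entry belongs to a superpage and the other to an unrelated 4K page.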

So at least for superpages, you can never share the TLB entry; you
just have TLB Lo0 and Lo1 point to successive physical superpages.
(If the VM system were aware of this page-splitting, you could use a
512MB superpage with two non-contiguous 256MB regions, but I don't
think anyone wants to try to make that work.)  You can still share TLB
entries for 4K pages, but at that point I'd rather not, y'know?

Also, which superpage sizes do we support?  For quick lookups,
remembering that we have software page tables, we'd probably only want
to support those that align with the levels of our page tables, yes?
So that you just check the low bits of the address when walking the
page tables and you can quickly tell that you've hit a superpage and
don't actually need to load the next level.  That sucks, because MIPS
supports much better granularity, but otherwise TLB refills are a
nightmare.  If we need to double the size of the kernel stack or the
PCPU or some other wired region, we can use a smaller superpage, but
do we have a good way to handle things in the page table?
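
A back-of-the-envelope sketch in Python of which superpage sizes would line up with page-table levels.  Everything numeric here is an assumption for illustration: 4K base pages, 512 entries per table page (a 4K table of 8-byte entries), three levels, and the architecturally defined MIPS page sizes of 4K times powers of four up to 256MB, which a given CPU may only partly implement:

```python
BASE_SHIFT = 12          # assumed 4K base page
PTES_PER_TABLE = 512     # assumed: 4K table page / 8-byte entries
LEVELS = 3               # assumed page-table depth


def level_spans():
    """Address span covered by one entry at each level above the leaf."""
    base = 1 << BASE_SHIFT
    return [base * PTES_PER_TABLE ** n for n in range(1, LEVELS)]


def mips_page_sizes(max_size=256 << 20):
    """Per-half TLB page sizes: 4K, 16K, 64K, ... up to max_size."""
    sizes = []
    size = 1 << BASE_SHIFT
    while size <= max_size:
        sizes.append(size)
        size *= 4
    return sizes


def aligned_superpages():
    """Per-half sizes whose Lo0/Lo1 pair covers exactly one upper-level entry."""
    spans = set(level_spans())
    return [s for s in mips_page_sizes() if 2 * s in spans]


print([hex(s) for s in level_spans()])
print([hex(s) for s in aligned_superpages()])
```

Under these particular assumptions only the 1MB size (a 2MB Lo0/Lo1 pair) coincides with a table level, which shows how much of the hardware's granularity is given up.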

With 64-bit PTEs we have a lot more software-usable bits, so we can
just copy the PTE into all the PTE slots covered by the superpage
mapping, and that opens up most of our superpage sizes, but at the
cost of bigger page tables.  If you do it that way, it's easy to
design and easy to implement, modulo the need to ensure that your
superpages are actually twice the size of half the TLB entry :)
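
The "copy the PTE into every slot" approach can be sketched as below, in Python, with a flat dict standing in for the real multi-level page table.  The function names and the (physical base, size) tuple used as a PTE are hypothetical; the size field stands in for the software-usable bits mentioned above:

```python
PAGE_SHIFT = 12   # assumed 4K base page


def map_superpage(page_table, va, pte, size):
    """Store the same PTE in every base-page slot the superpage covers.

    Returns the number of slots written, i.e. the page-table cost.
    """
    if size % (1 << PAGE_SHIFT) != 0 or va % size != 0:
        raise ValueError("superpage must be size-aligned")
    first = va >> PAGE_SHIFT
    count = size >> PAGE_SHIFT
    for vpn in range(first, first + count):
        page_table[vpn] = pte
    return count


def lookup(page_table, va):
    """A refill is a plain lookup; any VA in the range finds the same PTE."""
    return page_table.get(va >> PAGE_SHIFT)


pt = {}
pte = (0x20000000, 1 << 20)   # (physical base, size) with size in software bits
print(map_superpage(pt, 0x40000000, pte, 1 << 20))
print(lookup(pt, 0x40012345))
```

The refill path stays trivial for any superpage size, and the price is visible in the slot count: a 1MB mapping occupies 256 entries here where a level-aligned scheme would need one.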

> I haven't thought about this in ages.  I believe that it is complex to
> design, but relatively simple to implement.  I did some preliminary
> looking into this a couple of years ago, but never made it out of the
> early explorer stage for want of time.

Implementing superpages is easy.  Not sharing TLB entries is easy.
Flipping the switch is not.  When I last did this with FreeBSD as-is,
extant binaries simply broke, since our image activator couldn't
handle semipage-aligned executables.