Date: Wed, 1 Dec 2004 10:37:59 -0800
From: "David O'Brien" <obrien@freebsd.org>
To: "Ketrien I. Saihr-Kenchedra" <ketrien@error404.nls.net>
Cc: freebsd-amd64@freebsd.org
Subject: Re: kernel panic with greater that 8 GB of memory
Message-ID: <20041201183759.GB48294@dragon.nuxi.com>
In-Reply-To: <20041201042900.A88450@bahre.achedra.org>
References: <20041129211341.GA26548@troutmask.apl.washington.edu>
 <41AC6FF8.40501@freebsd.org>
 <20041130183555.GA32237@troutmask.apl.washington.edu>
 <20041130142652.A59122@bahre.achedra.org>
 <20041201084537.GA1621@dragon.nuxi.com>
 <20041201042900.A88450@bahre.achedra.org>
On Wed, Dec 01, 2004 at 04:47:51AM -0500, Ketrien I. Saihr-Kenchedra wrote:
> On Wed, 1 Dec 2004, David O'Brien wrote:
>
> >There is no 'ccNUMA' setting in the BIOS. A multi-processor Opteron is a
> >NUMA architecture machine regardless of any BIOS settings. I.E. there is
> >no way to disable that a MP Opteron is a NUMA machine.
> >The setting of interest is should the BIOS round-robin interleave
> >physical addresses across the NUMA nodes[*]. The reference AMI BIOS
> >refers to this as "Node Interleaving". It should be "DISABLED". Or if
> >the BIOS speaks of the SRAT table, it should be "ENABLED". While FreeBSD
> >doesn't use the SRAT table (and cannot until ACPI 3.0 BIOS's); turning on
> >the SRAT turns off node interleaving.
>
> David, trying very hard to be nice, the S2882's a 'Special Case.'
> Where special should be taken as 'short bus.' VERY short bus.
>
> On the S2882/S2885, and even the S4882, the BIOS specifically says, and I
> quote: 'ccNUMA Support.' No joke. The beauty is that ccNUMA is, you got
> it, SRAT Table Control, which _disables_ interleave completely. Beautiful,
> huh? The only way to enable interleave on the S2882/S2885 is to
> specifically turn 'ccNUMA' off; otherwise it's SRAT only, with no
> interleave.

I've seen S2882 BIOS's that don't say "ccNUMA" and my S2885 K8W Thunder
certainly doesn't. Its BIOS knobs and explanation are:

    Interleaving allows memory accesses to be spread out over BANKS on
    the same node, or across NODES, decreasing access contention.
        o "Bank Interleaving"
        o "Node Interleaving"

On every Opteron BIOS (AMI or Phoenix), the SRAT enabling and Node
Interleaving are related -- they are mutually exclusive by design, as
Node Interleaving lets the BIOS set up the physical memory topology,
whereas enabling the SRAT puts it in the OS's court. Enabling the SRAT
doesn't need to and shouldn't disable Bank Interleaving.

> And no, it's not mentioned anywhere on the S2882s.

Nor in the AMI reference BIOS :-(, which is a shame, as more information
might better clear things up.

> Only on
> the S4882, which uses a Phoenix 6.0 reference. (Which Tyan, predictably,
> did not remove the reference lines _or_ fix the typos on.) The S2882s
> also use a Phoenix, I believe. So in order to enable Interleave, your
> only method is that switch.

Actually the S2882 uses an AMI BIOS, as do all of Tyan's 2P Opteron
boards, except the S2885 variant Tyan makes for the Fujitsu Celsius V810.
I like the Fujitsu Celsius V810 Phoenix BIOS better; I wish it were used
on all S2885's.

> What got me peering curiously here is the placement of the hole; there
> appears to be a 512MB hole, starting at 4024MB. It certainly is similar
> to the behavior of 2882 I've looked at; hole is about 1/3rd through
> memory. On the 2.03 BIOS, still no way to control the PCI hole; that
> feature is reserved for the S4882 apparently, and even then only size.

How do you want to control the "memory hole"? It is mandatory in order
to map in memory mapped I/O devices.

> If the hole _is_ at 4024MB, that would put it past 4096MB and the 4GB
> limit; on the S2882 the bulk of the peripherals are on the PCI32 bus, not
> a PCI64 bus. Specifically, disk, fxp(4), bge(4), ACPI, and USB.
> Presuming I am reading the hole correctly, and bearing in mind that I do
> not have access to my copy of the PCI2.3 spec, ISTR that the entirety of
> the PCI memory hole must be within the first 4GB of system memory. (I'd
> appreciate it if someone could sanity check that.)

Correct, the "memory hole" must be below the 4GB mark.
http://www.amd.com/us-en/assets/content_type/DownloadableAssets/RichBrunnerClusterWorldpresFINAL.pdf
slide 25 is AMD's picture for this. The "hole" is of course an area
where memory mapped PCI I/O, APIC, AGP GART, etc. are mapped into the
memory address space. Thus RAM cannot be addressed using (mapped into)
those memory locations.

-- 
-- David (obrien@FreeBSD.org)
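
To make the arithmetic in the memory hole discussion above concrete,
here is a minimal sketch that is not taken from the thread. It checks
whether a hole of a given start and size fits below the 4GB mark, and
shows where RAM would land around it. The function names are
hypothetical, and the assumption that RAM displaced by the hole simply
continues right after the hole ends is a simplification for
illustration, not a description of what any particular BIOS does.

```python
MB = 1 << 20
FOUR_GB = 4096 * MB


def hole_below_4gb(hole_start, hole_size):
    """Return True if [hole_start, hole_start + hole_size) ends at or below 4GB."""
    return hole_start >= 0 and hole_size > 0 and hole_start + hole_size <= FOUR_GB


def ram_physical_address(ram_offset, hole_start, hole_size):
    """Map an offset into installed RAM to a physical address.

    ASSUMPTION (illustrative only): RAM below the hole is identity-mapped,
    and RAM that would have overlapped the hole resumes immediately after it.
    """
    if ram_offset < hole_start:
        return ram_offset
    return ram_offset + hole_size


if __name__ == "__main__":
    # Figures quoted in the thread: a 512MB hole starting at 4024MB.
    start, size = 4024 * MB, 512 * MB
    print("hole ends at", (start + size) // MB, "MB")
    print("fits below 4GB:", hole_below_4gb(start, size))

    # A 512MB hole that ends exactly at 4GB would have to start at 3584MB.
    start = FOUR_GB - size
    print("fits below 4GB:", hole_below_4gb(start, size))
    print("RAM offset 3584MB maps to",
          ram_physical_address(3584 * MB, start, size) // MB, "MB")
```

With the quoted figures the hole would end at 4536MB, which is why a
hole that really started at 4024MB could not satisfy the below-4GB
requirement.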