Date:      Fri, 06 Dec 1996 08:30:17 -0800
From:      Erich Boleyn <erich@uruk.org>
To:        Steve Passe <smp@csn.net>, Peter Wemm <peter@spinner.dialix.com>
Cc:        smp@freebsd.org
Subject:   P6 and FreeBSD/SMP (was -> Re: last major problem)
Message-ID:  <E0vW3AH-0007kq-00@uruk.org>
In-Reply-To: Your message of "Thu, 05 Dec 1996 23:47:35 MST." <199612060647.XAA18003@clem.systemsix.com> 



Steve Passe <smp@csn.net> writes:

> > Steve Passe wrote:
> > > Hi,
> > > 
> > > so all 3 failing systems are P6.  is there anyone now successfully running
> > > APIC_IO on a P6 system?
> > 
> > Wait a sec, I thought this was a problem caused by the SMP_INVLTLB code?
> > Or is it a generic "P6 dislikes APIC_IO in general" problem?
> 
> thats what I'm wondering, I've been so busy fighting the other fires that
> I haven't kept up on the details of this and may misunderstand...
> 
> so additionally, has ANYONE EVER run a 'solid' APIC_IO kernel on a P6?

Hmmm...  I think there are several issues afoot here.

 (1) Does the P6 run an APIC_IO kernel OK before activating the other CPUs?
 (2) Does the P6 run an APIC_IO kernel OK after activating the other CPUs?

...and seen in another message...

 (3) [the thing about relying on the short prefetch queue of the Pentium, and
      the P6 might break it]

I ran a bunch of tests last night and this morning with the kernel tree
from yesterday afternoon/evening.  My results were (using APIC_IO +
SMP_INVLTLB) :

  --  After I activate the other 3 CPUs (via "sysctl -w kern.smp_active=4"),
      standard (simple) operations seem OK, but when I start compiling a
      kernel, anywhere from 1/3 to 1/2 the way through it dies with a
      "kernel trap 12: supervisor read/write, page not present" break to
      the kernel debugger, going into pmap_enter.  It is always that error
      (I think I've seen a "write" once, with it saying a "read" trap the
      other times).  This implies that the answer to #2 is "no" (at least
      on my test box).  I tried this about a dozen times, with the same
      results each time.

  --  If I don't activate the other CPUs, I can do a dozen builds in a row
      with no problems.  This implies the answer to #1 is "yes".

This leads me to believe that the problem is in the MP handling, not the
base APIC_IO stuff.  Maybe this is a good time to tell some of us what
is different when activating the other CPUs as far as APIC control?

I've been digging through the source, and it seems that the "smp_invltlb"
is separate from the normal "invltlb" function?  There are *many* places
where "invltlb_1pg" (I think that was it) or other variants are called and
no SMP invalidates are propagated.  This strikes me as a situation
fraught with potential (and currently real) problems.  What is the
design goal here?
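
To make the concern concrete, here is a toy user-space model of the
difference between a local-only invalidate and one that is propagated to
every CPU.  It is not FreeBSD code: the names (toy_invltlb_1pg,
toy_smp_invltlb_1pg, and so on) and the array standing in for the TLBs
are invented for illustration, and a real kernel would send an IPI and
wait for acknowledgement where the sketch just loops.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU   4   /* same CPU count as the test box described above */
#define NPAGES 8   /* size of the toy address space (assumption) */

/* Toy model: tlb[cpu][page] is true while that CPU caches a translation. */
static bool tlb[NCPU][NPAGES];

/* Local-only invalidate: drops the entry on one CPU and nowhere else. */
static void toy_invltlb_1pg(int cpu, int page)
{
    assert(cpu >= 0 && cpu < NCPU && page >= 0 && page < NPAGES);
    tlb[cpu][page] = false;
}

/* Propagated invalidate: drops the entry on every CPU.  In a real kernel
 * each remote iteration would be an IPI plus a wait for completion. */
static void toy_smp_invltlb_1pg(int page)
{
    for (int cpu = 0; cpu < NCPU; cpu++)
        toy_invltlb_1pg(cpu, page);
}

/* Pretend every CPU has touched the page and cached its translation. */
static void toy_fill(int page)
{
    for (int cpu = 0; cpu < NCPU; cpu++)
        tlb[cpu][page] = true;
}

/* Number of CPUs still holding a translation for the page. */
static int toy_stale_count(int page)
{
    int n = 0;
    for (int cpu = 0; cpu < NCPU; cpu++)
        n += tlb[cpu][page] ? 1 : 0;
    return n;
}
```

In this model, calling toy_fill(3) and then toy_invltlb_1pg(0, 3) leaves
the other three CPUs holding a stale entry for page 3, while
toy_smp_invltlb_1pg(3) leaves none.  A stale entry on another CPU is the
kind of thing that could plausibly surface as a page-not-present trap
only once the other CPUs are active.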

On a side note (issue #3 I saw a comment on), first of all, never, *ever*
rely on short prefetch queues.  A proper sequence which flushes the queues
in the appropriate places should be used.  The Intel manuals have many
examples of this.  I'll take a look at the code sequence and see what's
up with it.  It could be that the comment that was made about it was
bogus.  (If not, then that could very well be a fatal problem.  Certainly
the P6, and what I've heard of the next project, will become more and more
unpredictable unless the proper methods to serialize control register
changes are used.)
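
As a rough sketch of "serialize explicitly instead of trusting the
prefetch queue": CPUID is one of the instructions Intel documents as
serializing, and it can be executed from user space.  The helper names
below (toy_serialize, toy_write_cr_serialized) and the toy_cr variable are
invented for illustration.  toy_cr is an ordinary variable standing in
for a control register, since a real control register write is
privileged, and this is not the code sequence from the kernel tree under
discussion.  GCC-style inline assembly on x86 is assumed; on anything
else the helper does nothing.

```c
#include <stdint.h>

/* Execute a serializing instruction (CPUID, leaf 0) where available.
 * Returns the highest basic CPUID leaf on x86, 0 elsewhere. */
static inline uint32_t toy_serialize(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                     :
                     : "memory");
    (void)ebx;
    (void)edx;
    return eax;
#else
    return 0;   /* no portable equivalent; placeholder only */
#endif
}

/* Hypothetical stand-in for a control register (NOT a real one). */
static volatile uint32_t toy_cr;

/* Pattern: make the change, then serialize before running any code that
 * depends on it, rather than assuming how deep the prefetch queue is. */
static void toy_write_cr_serialized(uint32_t value)
{
    toy_cr = value;          /* a kernel would do a privileged mov to CRn */
    (void)toy_serialize();   /* explicit serialization point */
}
```

Which serializing step is right for a given control register change is
spelled out in the Intel manuals; the only point of the sketch is that
the serialization is written down in the code rather than left to the
behaviour of one particular CPU.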

--
  Erich Stefan Boleyn                 \_ E-mail (preferred):  <erich@uruk.org>
Mad Genius wanna-be, CyberMuffin        \__      (finger me for other stats)
Web:  http://www.uruk.org/~erich/     Motto: "I'll live forever or die trying"


