Date:      Sun, 01 Dec 1996 11:10:02 +0800
From:      Peter Wemm <peter@spinner.dialix.com>
To:        Steve Passe <smp@csn.net>
Cc:        "J.M. Chuang" <smp@bluenose.na.tuns.ca>, smp@freebsd.org
Subject:   Re: New smp kernel 
Message-ID:  <199612010310.LAA02585@spinner.DIALix.COM>
In-Reply-To: Your message of "Sat, 30 Nov 1996 15:31:10 MST." <199611302231.PAA08844@clem.systemsix.com> 

Steve Passe wrote:
> > I managed to get a SCSI HD with an Adaptec 2940A controller this morning.
> > With the same smp kernel, the system booted up from the SCSI HD without any
> > problem, and the system recognizes the IDE drive, which can be accessed
> > (mount).
> > The only glitches with the current smp kernel are the coredumps that show
> > up once in a while when I compile the kernel. It is kind of strange that
> > after the coredump, if I keep on typing `make', the compilation of the
> > kernel can be finished. Is it due to a synchronization (??) problem between
> > the two CPUs?
> 
> It is believed that it's caused by our failure to do tlb flushing during page
> stealing. Peter's working on an implementation of IPIs to trigger tlb
> flushes, which should fix that problem.

Just as a BTW, I have an early implementation of this that I think is
working, apart from the fact that the system wedges shortly after going
smp.  I'm not yet sure whether this is the syscons problem, interrupt
masking problems, tlb sync problems, IPI problems, or something else
entirely.  I'm in the middle of resurrecting the serial port debugging trace
code so that I can see how far it's getting.

And for what it's worth, I should mention that these problems are nasty.  I
strongly recommend *not* running with the kernel in this state for very
long, and avoiding lots of disk writes.  I got bitten by this yesterday when
a cron job, which runs a mess of processes in sequence with each writing data
to different files, accidentally ran while the system was struggling to
compile a kernel.  The results were... "interesting".  In several places in
the files, pages that were meant for one file actually ended up in another.
I think this is best explained by one cpu "stealing" pages without
notifying the other cpu that it had done so.
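
A toy model of that theory, written in Python rather than kernel C, may make
the failure mode easier to picture.  It is only a sketch: the Cpu class, the
run() helper, the frame numbers, and the "file A"/"file B" labels are all
made up for illustration, and the remapping step is a simplification that is
not how the FreeBSD VM system actually steals a page.

```python
# Toy model only: every name here is hypothetical and does not correspond
# to any FreeBSD kernel interface.

class Cpu:
    """A CPU with a private TLB caching virtual page -> physical frame."""

    def __init__(self, name):
        self.name = name
        self.tlb = {}

    def translate(self, page_table, vpage):
        # A TLB hit never consults the page table, which is why a stale
        # entry keeps pointing at the old frame.
        if vpage not in self.tlb:
            self.tlb[vpage] = page_table[vpage]
        return self.tlb[vpage]

    def flush(self, vpage):
        # Drop the cached translation, if any.
        self.tlb.pop(vpage, None)


def run(send_ipi):
    """Return the owner of the frame cpu1 ends up writing to.

    cpu0 steals frame 100 (backing virtual page 7 of file A) for file B.
    With send_ipi=True, cpu1 is told to flush its TLB entry, standing in
    for an inter-processor interrupt.
    """
    page_table = {7: 100}
    frame_owner = {100: "file A", 200: "free"}
    cpu0, cpu1 = Cpu("cpu0"), Cpu("cpu1")

    cpu1.translate(page_table, 7)      # cpu1 now caches 7 -> 100

    # cpu0 steals frame 100 and moves virtual page 7 to frame 200.
    page_table[7] = 200
    frame_owner[100] = "file B"
    frame_owner[200] = "file A"
    cpu0.flush(7)                      # the stealing cpu fixes its own TLB
    if send_ipi:
        cpu1.flush(7)                  # ... and notifies the other cpu

    frame = cpu1.translate(page_table, 7)
    return frame_owner[frame]
```

In this model run(False) leaves cpu1 writing through its stale entry into a
frame that now belongs to the other file, while run(True) sends the write to
the right place.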

I have another theory as to why this has suddenly started being a problem.
The -current merge that happened right before these problems brought in quite
a bit of VM work done since our previous merge from -current a few
months back.  One of the features was 'page colouring', where the VM system
attempts to use physical memory more efficiently to get the best effect
from a direct-mapped cache.  I'm not 100% sure exactly what the impact
of this is on the page management policies, but it's bound to be something.
It might explain why page reclaiming appears to have become more common
even though there is free memory at the time.
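
For readers unfamiliar with the idea, here is a minimal sketch of page
colouring in general, not of the FreeBSD implementation.  The 4 KB page size,
the 256 KB direct-mapped cache, and the colour() and pick_frame() helpers are
assumptions chosen for the example.  Two pages with the same colour map onto
the same cache lines, so an allocator can try to hand out a frame whose
colour matches that of the virtual page.

```python
# Illustrative sketch; the sizes and helper names are assumptions.

PAGE_SIZE = 4096                        # assumed 4 KB pages
CACHE_SIZE = 256 * 1024                 # assumed 256 KB direct-mapped cache
NUM_COLOURS = CACHE_SIZE // PAGE_SIZE   # pages that fit in the cache


def colour(page_number):
    """Pages with equal colour compete for the same cache lines."""
    return page_number % NUM_COLOURS


def pick_frame(free_frames, vpage):
    """Prefer a free frame whose colour matches the virtual page's colour.

    Falls back to the first free frame, or None if the list is empty.
    """
    wanted = colour(vpage)
    for frame in free_frames:
        if colour(frame) == wanted:
            free_frames.remove(frame)
            return frame
    return free_frames.pop(0) if free_frames else None
```

With these numbers there are 64 colours, so frames 0 and 64 collide in the
cache, and pick_frame([65, 130, 3], 2) chooses frame 130 because 130 % 64 == 2.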

Cheers,
-Peter


