Date:      Fri, 31 Jan 1997 11:35:50 -0700
From:      Steve Passe <smp@csn.net>
To:        bag@sinbin.demos.su (Alex G. Bulushev)
Cc:        mishania@demos.su, freebsd-smp@freebsd.org
Subject:   Re: troubles with smp kernel 
Message-ID:  <199701311835.LAA24610@clem.systemsix.com>
In-Reply-To: Your message of "Fri, 31 Jan 1997 20:45:05 +0300." <199701311745.UAA01408@sinbin.demos.su>

Hi,

> > is causing the reboots.  when you say "when runing with one CPU it works"
> > do you mean with 1 physical CPU installed, or do you mean with 2 CPUs installed
> > but without starting up the 2nd one via sysctl?
> > 
> 
> i mean "with 2 physical CPU installed, but with not SMP kernel"
> with single phisical CPU it stable too ...
> 
> now i remove 2nd 3940 and run SMP kernel with kern.smp_active=0
> after 2 hours (without reboots !) i write:
> sysctl -w kern.smp_active=1
> 			  ^ why? don't know.
> 
> sysctl -a show that kern.smp_active=2 and message on console
> say that two CPU's runing ...
> 
> now it works without reboots ... 3:13 ... 1 h with two CPU's

you have found a minor bug in the way I start up the 2nd CPU.  We
have tried to make this completely automatic, but those attempts
were never successful, so I threw in a quick hack to allow the
manual startup of the 2nd CPU without affecting the autostart
code any more than necessary.  This hack basically just waits for
smp_active to change from 0 to some non-0 value, then it sets
smp_active = <the number of CPUs in the system>, so when you do:

sysctl -w kern.smp_active=1

it appears to have EXACTLY the same effect as setting it to 2.

so, for tagging the problem in the mail archive:

----------------------------------------------------------------
SMP_PROBLEM:

initial startup of APs via "sysctl -w kern.smp_active=x" ignores
the actual value of 'x' and starts all the APs.

solution:

not really a problem so much as unexpected behaviour.
should be a simple fix when I can find the time...
----------------------------------------------------------------
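
to make the behaviour described above concrete, here is a minimal
sketch in C.  It is only a model of the hack as described in this
message, not the actual kernel code: the function name
smp_active_after_write() and its parameters are hypothetical, and the
handling of any transition other than 0 -> non-0 is an assumption.

```c
#include <assert.h>

/*
 * Hypothetical model of the manual-startup hack (NOT the kernel source).
 *
 *   current   - value of smp_active before the sysctl write
 *   requested - value the user wrote with sysctl -w
 *   ncpus     - number of CPUs in the system
 *
 * Returns the value smp_active holds after the write.
 */
static int
smp_active_after_write(int current, int requested, int ncpus)
{
	assert(ncpus >= 1);

	/*
	 * The hack only watches for a change from 0 to some non-0 value.
	 * When it sees one, the requested value is ignored and smp_active
	 * becomes the number of CPUs, i.e. all the APs are started.
	 */
	if (current == 0 && requested != 0)
		return ncpus;

	/*
	 * ASSUMPTION: the message does not describe any other transition,
	 * so this model simply stores the value as written.
	 */
	return requested;
}
```

with this model, writing 1 or 2 on a 2-CPU system while smp_active
is 0 gives the same result, which matches the sysctl -a output
reported above.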

assuming you continue to run without trouble, this points to the
2nd 3940 (or possibly any other hardware you have also removed).
I don't think it is INT sharing in itself: we now have the INTs
properly assigned, and many users have systems with shared PCI INTs
working without problem, including several with 1 3940 sharing INTs with
network cards.  You are the first to try 2 3940s that I know of,
but that should not be a problem for any reason I can think of.
Perhaps it's the 2nd 3940 itself?  After running without rebooting for a
day or so I would suggest that you swap the 2 3940s, i.e. replace the working
one with the currently removed one.  Then run with it alone (i.e. leave the
known good one out) and see if this 2nd card also runs by itself without
problem.


--
Steve Passe	| powered by
smp@csn.net	|            FreeBSD

-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: 2.6.2

mQCNAzHe7tEAAAEEAM274wAEEdP+grIrV6UtBt54FB5ufifFRA5ujzflrvlF8aoE
04it5BsUPFi3jJLfvOQeydbegexspPXL6kUejYt2OeptHuroIVW5+y2M2naTwqtX
WVGeBP6s2q/fPPAS+g+sNZCpVBTbuinKa/C4Q6HJ++M9AyzIq5EuvO0a8Rr9AAUR
tBlTdGV2ZSBQYXNzZSA8c21wQGNzbi5uZXQ+
=ds99
-----END PGP PUBLIC KEY BLOCK-----



