Date:      Tue, 20 Apr 2010 16:19:42 +0530
From:      "C. Jayachandran" <c.jayachandran@gmail.com>
To:        Rui Paulo <rpaulo@freebsd.org>
Cc:        freebsd-mips@freebsd.org
Subject:   Re: SMP support for XLR processors.
Message-ID:  <o2n98a59be81004200349yefc11499n4497544d6dbd9d0b@mail.gmail.com>
In-Reply-To: <BC57A6F0-4F2E-47F4-92BF-849AD18FC004@freebsd.org>
References:  <w2z98a59be81004171540t2f0d5193nca2ec9e2540502e2@mail.gmail.com> <A1FC32B9-1105-43C5-91C1-C4A81F78066B@lakerest.net> <3BCD65EB-B997-449D-864C-CA24C7B19026@freebsd.org> <CFE92A18-C834-45C5-B18C-7F62437D1A2B@lakerest.net> <z2z98a59be81004190411hd4bee7e4t6e5eed3d3789180a@mail.gmail.com> <6BDB3874-D779-45A6-ABAE-4C331D78A189@lakerest.net> <y2m98a59be81004190657kce2488b0p86a725b1175cb14b@mail.gmail.com> <l2n98a59be81004200252lf1d0a372pfae8ac5f55440e58@mail.gmail.com> <7BEFA3F5-97AE-477C-9DD3-EF1C4B7DCEB0@freebsd.org> <BC57A6F0-4F2E-47F4-92BF-849AD18FC004@freebsd.org>

On Tue, Apr 20, 2010 at 4:03 PM, Rui Paulo <rpaulo@freebsd.org> wrote:
> On 20 Apr 2010, at 11:05, Rui Paulo wrote:
>
>> On 20 Apr 2010, at 10:52, C. Jayachandran wrote:
>>
>>> On Mon, Apr 19, 2010 at 7:27 PM, C. Jayachandran
>>> <c.jayachandran@gmail.com> wrote:
>>>> I have a possible cause for the panic with invariants - we should not
>>>> schedule the msgring threads unless the smp is completely up. I guess
> >>>> we start getting message ring interrupts before the message ring
> >>>> threads can be scheduled.  I am trying out some changes for this -
>>>> will send you a patch if this fixes it.
>>>
>>> I've attached a patch that should fix the issue. The cause was the way
>>> message ring threads are started on individual cores and the way
> >>> interrupts are enabled in the core.  I've moved starting message ring
> >>> threads on other cpus to be a SYSINIT after SMP is started.  I'd
>>> thought originally that it was due to some clash with the changes in
>>> HEAD - but looks like I was completely off-track there.
>>>
>>> Please let me know if you don't get multi-user with 32 cpus with this
>>> patch. There is still the original hang in buildworld, but that should
> >>> be a bug elsewhere.
>>>
>>> I have a copy at http://sites.google.com/site/cjayachandran/files too
>>
>> This works perfectly, thanks!
>
> On further inspection, I noticed that the load avg is now 7.
>
> last pid:  1613;  load averages:  6.99,  6.97,  6.08    up 0+00:30:11  10:32:48
> 108 processes: 40 running, 24 sleeping, 44 waiting
> CPU:  0.0% user,  0.0% nice, 21.9% system,  0.0% interrupt, 78.1% idle
> Mem: 8444K Active, 6028K Inact, 37M Wired, 308K Cache, 6800K Buf, 3190M Free
> Swap:
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME     WCPU COMMAND
>    10 root       32 171 ki31     0G     0G CPU0    0 263:26 2500.00% idle
>    17 root        1 -16    -     0K     0G CPU12   2   0:00  100.00% msg_intr12
>    15 root        1 -16    -     0K     0G CPU4    2   0:00  100.00% msg_intr4
>    16 root        1 -16    -     0K     0G CPU8    2   0:00  100.00% msg_intr8
>    20 root        1 -16    -     0K     0G CPU24   1   0:00  100.00% msg_intr24
>    19 root        1 -16    -     0K     0G CPU20   1   0:00  100.00% msg_intr20
>    21 root        1 -16    -     0K     0G CPU28   1   0:00  100.00% msg_intr28
>    18 root        1 -16    -     0K     0G CPU16   1   0:00  100.00% msg_intr16
>
> What are these msg_intrXX kprocs doing?

They should really be sleeping unless there is a lot of network
traffic :)  The msg_intr threads are interrupt handlers, which we run
one per core, on the first hardware thread of each core.  They were
modelled after interrupt threads (in FreeBSD 6).  Each one should sleep
until there is a message ring interrupt (which tells us that an I/O
device has sent data to our core over the message ring).
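
The top output quoted above lists seven msg_intr threads, each at 100%
WCPU, which matches the load average of about 7 and suggests they are
runnable all the time instead of sleeping.  As a rough userland
illustration of the intended behaviour (not the actual XLR kernel code),
the sketch below uses POSIX threads: a handler thread blocks on a
condition variable and only runs when a simulated message ring interrupt
posts work.  The names msg_handler, handler_start, handler_post and
handler_stop are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for one per-core message ring handler. */
struct msg_handler {
	pthread_mutex_t lock;
	pthread_cond_t  wake;     /* signalled by the simulated interrupt */
	int             pending;  /* messages posted but not yet drained */
	int             handled;  /* messages drained so far */
	bool            stop;     /* set to ask the thread to exit */
	pthread_t       thread;
};

static void *
handler_loop(void *arg)
{
	struct msg_handler *h = arg;

	pthread_mutex_lock(&h->lock);
	for (;;) {
		/* Block without using CPU until there is work or a stop request. */
		while (h->pending == 0 && !h->stop)
			pthread_cond_wait(&h->wake, &h->lock);
		if (h->pending == 0)
			break;                /* stop requested and nothing left */
		/* Drain everything that arrived while we were asleep. */
		h->handled += h->pending;
		h->pending = 0;
	}
	pthread_mutex_unlock(&h->lock);
	return (NULL);
}

/* Start the handler thread; returns 0 on success or an errno value. */
static int
handler_start(struct msg_handler *h)
{
	pthread_mutex_init(&h->lock, NULL);
	pthread_cond_init(&h->wake, NULL);
	h->pending = 0;
	h->handled = 0;
	h->stop = false;
	return (pthread_create(&h->thread, NULL, handler_loop, h));
}

/* Simulated message ring interrupt: record one message and wake the thread. */
static void
handler_post(struct msg_handler *h)
{
	pthread_mutex_lock(&h->lock);
	h->pending++;
	pthread_cond_signal(&h->wake);
	pthread_mutex_unlock(&h->lock);
}

/* Ask the thread to exit, wait for it, and return the messages handled. */
static int
handler_stop(struct msg_handler *h)
{
	int rc;

	pthread_mutex_lock(&h->lock);
	h->stop = true;
	pthread_cond_signal(&h->wake);
	pthread_mutex_unlock(&h->lock);
	rc = pthread_join(h->thread, NULL);
	assert(rc == 0);
	(void)rc;
	return (h->handled);
}
```

The point of the sketch is the inner while loop: the wait condition is
rechecked under the lock, so a thread with nothing pending goes back to
sleep rather than spinning.  A handler that skipped that wait, or whose
wait condition was always false, would show up as permanently runnable,
much like the threads in the top output.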

Thanks for the report - I will look at the sleep logic.

JC.
