Date: Sat, 21 Apr 2018 23:30:55 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>, "George Mitchell" <george+freebsd@m5p.com>, Peter <pmc@citylink.dinoex.sub.org>
Subject: Re: SCHED_ULE makes 256Mbyte i386 unusable
Message-ID: <YQBPR0101MB10421529BB346952BCE7F20EDD8B0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <20180421201128.GO6887@kib.kiev.ua>
References: <YQBPR0101MB1042F252A539E8D55EB44585DD8B0@YQBPR0101MB1042.CANPRD01.PROD.OUTLOOK.COM>, <20180421201128.GO6887@kib.kiev.ua>
Konstantin Belousov wrote:
>On Sat, Apr 21, 2018 at 07:21:58PM +0000, Rick Macklem wrote:
>> I decided to start a new thread on current related to SCHED_ULE, since I see
>> more than just performance degradation on a recent current kernel.
>> (I cc'd a couple of the people discussing performance problems in freebsd-stable
>> recently under the subject line "Re: kern.sched.quantum: Creepy, sadistic scheduler".)
>>
>> When testing a pNFS server on a single core i386 with 256Mbytes using a Dec. 2017
>> current/head kernel, I would see about a 30% performance degradation (elapsed
>> run time for a kernel build over NFSv4.1) when the server kernel was built with
>> options SCHED_ULE
>> instead of
>> options SCHED_4BSD
>>
>> Now, with a kernel from a couple of days ago, the
>> options SCHED_ULE
>> kernel becomes unusable shortly after starting testing.
>> I have seen two variants of this:
>> - Became essentially hung. All I could do was ping the machine from the network.
>> - Reported "vm_thread_new: kstack allocation failed" and then any attempt to
>>   do anything gets "No more processes".
>This is strange. It usually means that you get KVA either exhausted or
>severely fragmented.
Yes. I reduced the number of nfsd threads from 256->32 and the SCHED_ULE kernel
is working ok now. I haven't done enough to compare performance yet.
Maybe I'll post again when I have some numbers.

>Enter ddb, it should be operational since pings are replied. Try to see
>where the threads are stuck.
I didn't do this, since reducing the number of kernel threads seems to have fixed
the problem. For the pNFS server, the nfsd threads will spawn additional kernel
threads to do proxies to the mirrored DS servers.

>> with the only difference being a kernel built with
>> options SCHED_4BSD
>> everything works and performs the same as the Dec 2017 kernel.
>>
>> I can try rolling back through the revisions, but it would be nice if someone
>> could suggest where to start, because it takes a couple of hours to build a
>> kernel on this system.
>>
>> So, something has made things worse for a head/current kernel this winter, rick
>
>There are at least two potentially relevant changes.
>
>First is r326758 Dec 11 which bumped KSTACK_PAGES on i386 to 4.
I've been running this machine with KSTACK_PAGES=4 for some time, so no change.

>Second is r332489 Apr 13, which introduced 4/4G KVA/UVA split.
Could this change have resulted in the system being able to allocate fewer
kernel threads/stacks for some reason?

>Consequences of the first one are obvious, it is much harder to find
>the place to map the stack. Second change, on the other hand, provides
>almost full 4G for KVA and should have mostly compensated for the negative
>effects of the first.
>
>And, I cannot see how changing the scheduler would fix or even affect that
>behaviour.
My hunch is that the system was running near its limit for kernel threads/stacks.
Then, somehow, the timing under SCHED_ULE resulted in the nfsd trying to get to
a higher peak number of threads and hitting the limit. SCHED_4BSD happened to
result in timing such that it stayed just below the limit and worked.
I can think of a couple of things that might affect this:
1 - If SCHED_ULE doesn't terminate kernel threads as quickly, then they wouldn't
    terminate and release their resources before more new ones are spawned.
2 - If SCHED_ULE handles the nfsd threads in a more "bursty" way, then the burst
    could try to spawn more mirror DS worker threads at about the same time.

Anyhow, thanks for the help, rick
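For reference, the kernel configuration difference discussed above boils down to
a fragment along these lines (the config name PNFS_TEST is made up, not taken
from the thread; KSTACK_PAGES=4 is the value Rick was already running with, and
is also what r326758 made the i386 default):

    include         GENERIC
    ident           PNFS_TEST
    nooptions       SCHED_ULE       # the default scheduler in GENERIC
    options         SCHED_4BSD      # the scheduler that stayed below the kstack limit here
    options         KSTACK_PAGES=4  # per-thread kernel stack size, in pages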
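Likewise, the nfsd thread count that was cut from 256 to 32 is normally set with
nfsd's -n flag, typically via rc.conf; a minimal sketch, not Rick's actual
configuration:

    # /etc/rc.conf -- cap the NFS server at 32 kernel threads
    nfs_server_enable="YES"
    nfsv4_server_enable="YES"
    nfs_server_flags="-u -t -n 32"

On a running system the same limit can usually also be adjusted through the
vfs.nfsd.minthreads and vfs.nfsd.maxthreads sysctls.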