Date:      Sat, 03 Mar 2012 14:54:17 +0200
From:      Alexander Motin <mav@FreeBSD.org>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, George Mitchell <george+freebsd@m5p.com>
Subject:   Re: [RFT][patch] Scheduling for HTT and not only
Message-ID:  <4F521479.30704@FreeBSD.org>
In-Reply-To: <4F51E07C.4020706@FreeBSD.org>
References:  <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <alpine.BSF.2.00.1202131012270.2020@desktop> <4F3978BC.6090608@FreeBSD.org> <alpine.BSF.2.00.1202131108460.2020@desktop> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <alpine.BSF.2.00.1202150949480.2020@desktop> <4F3E807A.60103@FreeBSD.org> <CACqU3MWEC4YYguPQF_d+_i_CwTc=86hG+PbxFgJQiUS-=AHiRw@mail.gmail.com> <4F3E8858.4000001@FreeBSD.org> <4F4ACF2C.50300@m5p.com> <CABzXLYNhYmCgM7rhJa8g_7PYey-rVirDjo5FqRaEMi7m43y-0g@mail.gmail.com> <4F4B67AB.40907@m5p.com> <CABzXLYMthn-kh05Cu22=U_W4vV98YbQtuvEq7yrsYtKC3iHRUw@mail.gmail.com> <4F4C17E2.2040101@m5p.com> <CAJ-VmokGdDHCwNa3pzsL2a6pWRNA_+_-oPqwwMkDo=_9iyvVXg@mail.gmail.com> <4F516281.30603@m5p.com> <CAJ-VmomH38u3a+LtRPQkPtwkBbREQX8vyqxiROzuGb4o5eBa4Q@mail.gmail.com> <4F51CAE9.20905@FreeBSD.org> <CAJ-VmongT+UVbLDC=k4vV5gs4OZCjWy3E=tJcnEXN63MgEAu7A@mail.gmail.com> <4F51E07C.4020706@FreeBSD.org>

On 03/03/12 11:12, Alexander Motin wrote:
> On 03/03/12 10:59, Adrian Chadd wrote:
>> Right. Is this written up in a PR somewhere explaining the problem in
>> as much depth as you just have?
>
> I have no idea. I am new to this area and haven't looked at the PRs yet.
>
>> And thanks for this, it's great to see some further explanation of the
>> current issues the scheduler faces.
>
> By the way, I've just reproduced the problem with compilation. On a
> dual-core system, a single-stream net/mpd5 compilation takes 17 seconds.
> But with two low-priority, non-interactive CPU-burning threads running,
> it takes 127 seconds. I'll try to analyze it further now. I have a
> feeling that there could be more factors causing the priority violation
> than the ones I've described below.

On closer look my test turned out to be not so clean, but much more
interesting. Because of the NFS use there are not just context switches
between make, cc, and as, which are possibly optimized a bit now, but
also many short sleeps, during which a background process gets running.
As a result, at some moments I see wonderful traces like this for cc:

wait on runq for 81ms,
run for 37us,
wait NFS for 202us,
wait on runq for 92ms,
run for 30us,
wait NFS for 245us,
wait on runq for 53ms,
run for 142us, ...

That is about 0.05% of CPU time for a process that is supposed to be
CPU-bound. And while with such a small run/sleep time ratio the process
could be nominated for interactivity, with such small absolute sleep
times it will take ages to compensate for the 5 seconds of "batch" run
history recorded before.
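
To illustrate the scale of the problem, here is a toy model of the
interactivity score (not the kernel code: the formula and the
MAX=100/HALF=50/threshold=30 constants follow my reading of
sched_interact_score() and the default sched_interact, and the model
deliberately ignores the history decay done by sched_interact_update(),
which would change the exact numbers):

#include <stdio.h>

#define INTERACT_HALF	50
#define INTERACT_THRESH	30	/* default sched_interact */

/*
 * Simplified ULE-style interactivity score, 0 (interactive) to
 * 100 (batch), from accumulated run and sleep times.
 */
static int
score(long run, long sleep)
{
	long div;

	if (run > sleep) {
		div = run / INTERACT_HALF;
		if (div < 1)
			div = 1;
		return (INTERACT_HALF + (INTERACT_HALF - sleep / div));
	}
	if (sleep > run) {
		div = sleep / INTERACT_HALF;
		if (div < 1)
			div = 1;
		return (run / div);
	}
	return (run ? INTERACT_HALF : 0);
}

int
main(void)
{
	long run = 5000000, sleep = 0;	/* 5s of "batch" history, in us */
	long n;

	/*
	 * Each cycle from the trace above: ~35us of run and ~220us of
	 * NFS sleep; the ~90ms runq wait counts as neither.
	 */
	for (n = 0; score(run, sleep) > INTERACT_THRESH; n++) {
		run += 35;
		sleep += 220;
	}
	printf("cycles to become interactive: %ld (~%.0f minutes at "
	    "~90ms of runq wait per cycle)\n", n, n * 0.090 / 60.0);
	return (0);
}

With the trace numbers above this prints on the order of 50000 cycles,
i.e. more than an hour of wall time, before the thread would be scored
interactive again, which is what "ages" means here.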

>> On 2 March 2012 23:40, Alexander Motin<mav@freebsd.org> wrote:
>>> On 03/03/12 05:24, Adrian Chadd wrote:
>>>>
>>>> mav@, can you please take a look at George's traces and see if there's
>>>> anything obviously silly going on?
>>>> He's reporting that your ULE work hasn't improved his (very) degenerate
>>>> case.
>>>
>>>
>>> As far as I can see, my patch has nothing to do with the problem. My
>>> patch improves SMP load balancing, while in this case the problem is
>>> different. In some cases, when not all CPUs are busy, my patch could
>>> mask the problem by using more CPUs, but not in this case, where dnets
>>> consumes all available CPUs.
>>>
>>> I still don't feel very comfortable with the ULE math, but as I
>>> understand it, in both illustrated cases there is a conflict between
>>> the clearly CPU-bound dnets threads, which consume all available CPU
>>> and never do voluntary context switches, and the more or less
>>> interactive other threads. If the other threads were detected as
>>> "interactive" in ULE terms, they would preempt the dnets threads and
>>> everything would be fine. But "batch" (in ULE terms) threads never
>>> preempt each other, switching context only about 10 times per second,
>>> as hardcoded in the sched_slice variable. A kernel build by definition
>>> consumes too much CPU time to be marked "interactive". The
>>> exo-helper-1 thread in interact.out could potentially be marked
>>> "interactive", but possibly, once it has consumed some CPU and become
>>> "batch", it is difficult for it to get back: waiting on a runq is not
>>> counted as sleep, and each time it gets to run it has some new work to
>>> do, so it remains "batch". Maybe if CPU time accounting were more
>>> precise it would work better (by accounting for those short periods
>>> when a thread really sleeps voluntarily), but not with the present
>>> sampled logic with its 1ms granularity. As a result, while the dnets
>>> threads each time consume their full 100ms time slices, the other
>>> threads are starving, getting to run only 10 times per second and
>>> voluntarily switching out after just a few milliseconds.
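
(To put rough numbers on the last point: a back-of-the-envelope model,
assuming one CPU-bound dnets thread per CPU and the ~100ms batch slice,
both per the description above.)

#include <stdio.h>

#define SLICE_MS 100	/* assumed batch time slice from sched_slice */

int
main(void)
{
	int nbatch = 1;	/* CPU-bound dnets threads per CPU */

	/*
	 * A "batch" thread that wakes up is queued behind the batch
	 * threads already on the runq and, since batch threads do not
	 * preempt each other, waits out their full slices first.
	 */
	printf("runq wait per wakeup: up to %d ms\n", nbatch * SLICE_MS);
	printf("runs per second: about %d\n", 1000 / (nbatch * SLICE_MS));
	return (0);
}

That gives waits of up to ~100ms per wakeup and about 10 runs per
second, matching both the 81-92ms runq waits in the cc trace above and
the context switch rate described.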
>>>
>>>
>>>> On 2 March 2012 16:14, George Mitchell<george+freebsd@m5p.com> wrote:
>>>>>
>>>>> On 03/02/12 18:06, Adrian Chadd wrote:
>>>>>>
>>>>>>
>>>>>> Hi George,
>>>>>>
>>>>>> Have you thought about providing schedgraph traces with your
>>>>>> particular workload?
>>>>>>
>>>>>> I'm sure that'll help out the scheduler hackers quite a bit.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>> Adrian
>>>>>>
>>>>>
>>>>> I posted a couple back in December but I haven't created any more
>>>>> recently:
>>>>>
>>>>> http://www.m5p.com/~george/ktr-ule-problem.out
>>>>> http://www.m5p.com/~george/ktr-ule-interact.out
>>>>>
>>>>> To the best of my knowledge, no one ever examined them. -- George
>>>
>>> --
>>> Alexander Motin


-- 
Alexander Motin


