Date:      Sat, 03 Mar 2012 11:12:28 +0200
From:      Alexander Motin <mav@FreeBSD.org>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, George Mitchell <george+freebsd@m5p.com>
Subject:   Re: [RFT][patch] Scheduling for HTT and not only
Message-ID:  <4F51E07C.4020706@FreeBSD.org>
In-Reply-To: <CAJ-VmongT+UVbLDC=k4vV5gs4OZCjWy3E=tJcnEXN63MgEAu7A@mail.gmail.com>
References:  <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <alpine.BSF.2.00.1202131012270.2020@desktop> <4F3978BC.6090608@FreeBSD.org> <alpine.BSF.2.00.1202131108460.2020@desktop> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <alpine.BSF.2.00.1202150949480.2020@desktop> <4F3E807A.60103@FreeBSD.org> <CACqU3MWEC4YYguPQF_d+_i_CwTc=86hG+PbxFgJQiUS-=AHiRw@mail.gmail.com> <4F3E8858.4000001@FreeBSD.org> <4F4ACF2C.50300@m5p.com> <CABzXLYNhYmCgM7rhJa8g_7PYey-rVirDjo5FqRaEMi7m43y-0g@mail.gmail.com> <4F4B67AB.40907@m5p.com> <CABzXLYMthn-kh05Cu22=U_W4vV98YbQtuvEq7yrsYtKC3iHRUw@mail.gmail.com> <4F4C17E2.2040101@m5p.com> <CAJ-VmokGdDHCwNa3pzsL2a6pWRNA_+_-oPqwwMkDo=_9iyvVXg@mail.gmail.com> <4F516281.30603@m5p.com> <CAJ-VmomH38u3a+LtRPQkPtwkBbREQX8vyqxiROzuGb4o5eBa4Q@mail.gmail.com> <4F51CAE9.20905@FreeBSD.org> <CAJ-VmongT+UVbLDC=k4vV5gs4OZCjWy3E=tJcnEXN63MgEAu7A@mail.gmail.com>

On 03/03/12 10:59, Adrian Chadd wrote:
> Right. Is this written up in a PR somewhere explaining the problem in
> as much depth as you just have?

I have no idea. I am new to this area and haven't looked at the PRs yet.

> And thanks for this, it's great to see some further explanation of the
> current issues the scheduler faces.

By the way, I've just reproduced the problem with compilation. On a 
dual-core system, a single-stream net/mpd5 compilation takes 17 seconds, 
but with two low-priority, non-interactive CPU-burning threads running it 
takes 127 seconds. I'll try to analyze it further now. I have a feeling 
that there may be more factors causing the priority violation than the 
ones I've described below.

> On 2 March 2012 23:40, Alexander Motin<mav@freebsd.org>  wrote:
>> On 03/03/12 05:24, Adrian Chadd wrote:
>>>
>>> mav@, can you please take a look at George's traces and see if there's
>>> anything obviously silly going on?
>>> He's reporting that your ULE work hasn't improved his (very) degenerate
>>> case.
>>
>>
>> As far as I can see, my patch has nothing to do with the problem. My patch
>> improves SMP load balancing, while the problem in this case is different.
>> In some cases, when not all CPUs are busy, my patch could mask the problem
>> by using more CPUs, but not in this case, where dnets consumes all
>> available CPUs.
>>
>> I still don't feel very comfortable with the ULE math, but as I understand
>> it, in both illustrated cases there is a conflict between the clearly
>> CPU-bound dnets threads, which consume all available CPU and never do
>> voluntary context switches, and the other, more or less interactive
>> threads. If the other threads were detected as "interactive" in ULE terms,
>> they would preempt the dnets threads and everything would be fine. But
>> "batch" (in ULE terms) threads never preempt each other, switching context
>> only about 10 times per second, as hardcoded via the sched_slice variable.
>> A kernel build by definition consumes too much CPU time to be marked
>> "interactive". The exo-helper-1 thread in interact.out could potentially
>> be marked "interactive", but once it has consumed enough CPU to become
>> "batch", it is difficult for it to get back: waiting on a runq is not
>> counted as sleep, and each time it gets to run it has new work to do, so
>> it remains "batch". Maybe if CPU time accounting were more precise it
>> would work better (by accounting for the short periods when threads really
>> do sleep voluntarily), but not with the present sampled logic and its 1ms
>> granularity. As a result, while the dnets threads each consume their full
>> 100ms time slices, the other threads are starving, getting to run only 10
>> times per second and voluntarily switching out after just a few
>> milliseconds.
>>
>>
>>> On 2 March 2012 16:14, George Mitchell<george+freebsd@m5p.com>    wrote:
>>>>
>>>> On 03/02/12 18:06, Adrian Chadd wrote:
>>>>>
>>>>>
>>>>> Hi George,
>>>>>
>>>>> Have you thought about providing schedgraph traces with your
>>>>> particular workload?
>>>>>
>>>>> I'm sure that'll help out the scheduler hackers quite a bit.
>>>>>
>>>>> THanks,
>>>>>
>>>>>
>>>>> Adrian
>>>>>
>>>>
>>>> I posted a couple back in December but I haven't created any more
>>>> recently:
>>>>
>>>> http://www.m5p.com/~george/ktr-ule-problem.out
>>>> http://www.m5p.com/~george/ktr-ule-interact.out
>>>>
>>>> To the best of my knowledge, no one ever examined them.   -- George
>>
>>
>>
>> --
>> Alexander Motin
>>
>> _______________________________________________
>> freebsd-hackers@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
>> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"


-- 
Alexander Motin


