Date:      Sat, 3 Mar 2012 17:26:01 +0200
From:      Ivan Klymenko <fidaj@ukr.net>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        freebsd-hackers@freebsd.org, Adrian Chadd <adrian@freebsd.org>, George, Mitchell <george+freebsd@m5p.com>
Subject:   Re: [RFT][patch] Scheduling for HTT and not only
Message-ID:  <20120303172601.07c9c2b5@nonamehost.>
In-Reply-To: <4F521479.30704@FreeBSD.org>
References:  <4F2F7B7F.40508@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <alpine.BSF.2.00.1202131012270.2020@desktop> <4F3978BC.6090608@FreeBSD.org> <alpine.BSF.2.00.1202131108460.2020@desktop> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org> <alpine.BSF.2.00.1202150949480.2020@desktop> <4F3E807A.60103@FreeBSD.org> <CACqU3MWEC4YYguPQF_d%2B_i_CwTc=86hG%2BPbxFgJQiUS-=AHiRw@mail.gmail.com> <4F3E8858.4000001@FreeBSD.org> <4F4ACF2C.50300@m5p.com> <CABzXLYNhYmCgM7rhJa8g_7PYey-rVirDjo5FqRaEMi7m43y-0g@mail.gmail.com> <4F4B67AB.40907@m5p.com> <CABzXLYMthn-kh05Cu22=U_W4vV98YbQtuvEq7yrsYtKC3iHRUw@mail.gmail.com> <4F4C17E2.2040101@m5p.com> <CAJ-VmokGdDHCwNa3pzsL2a6pWRNA_%2B_-oPqwwMkDo=_9iyvVXg@mail.gmail.com> <4F516281.30603@m5p.com> <CAJ-VmomH38u3a%2BLtRPQkPtwkBbREQX8vyqxiROzuGb4o5eBa4Q@mail.gmail.com> <4F51CAE9.20905@FreeBSD.org> <CAJ-VmongT%2BUVbLDC=k4vV5gs4OZCjWy3E=tJcnEXN63MgEAu7A@mail.gmail.com> <4F51E07C.4020706@FreeBSD.org> <4F521479.30704@FreeBSD.org>

On Sat, 03 Mar 2012 14:54:17 +0200
Alexander Motin <mav@FreeBSD.org> wrote:

> On 03/03/12 11:12, Alexander Motin wrote:
> > On 03/03/12 10:59, Adrian Chadd wrote:
> >> Right. Is this written up in a PR somewhere explaining the problem
> >> in as much depth as you just have?
> >
> > Have no idea. I am new at this area and haven't looked on PRs yet.
> >
> >> And thanks for this, it's great to see some further explanation of
> >> the current issues the scheduler faces.
> >
> > By the way I've just reproduced the problem with compilation. On
> > dual-core system net/mpd5 compilation in one stream takes 17
> > seconds. But with two low-priority non-interactive CPU-burning
> > threads running it takes 127 seconds. I'll try to analyze it more
> > now. I have feeling that there could be more factors causing
> > priority violation than I've described below.
>
> On closer look my test appeared not so clean, but instead much more
> interesting. Because of NFS use, there is not just context switches
> between make, cc and as, that are possibly optimized a bit now, but
> many short sleeps when background process gets running. As result, in
> some moments I see such wonderful traces for cc:
>
> wait on runq for 81ms,
> run for 37us,
> wait NFS for 202us,
> wait on runq for 92ms,
> run for 30us,
> wait NFS for 245us,
> wait on runq for 53ms,
> run for 142us,
>
> About 0.05% CPU time use for process that supposed to be CPU-bound.
> And while for small run/sleep times ratio process could be nominated
> on interactivity, with so small absolute sleep times it will need
> ages to compensate 5 seconds of "batch" run history, recorded before.
>
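
The figure quoted above can be checked against the trace excerpt with a few lines of arithmetic. This is only a sketch over the eight samples shown, with the state labels ("runq", "run", "nfs") chosen here for illustration; the full trace was presumably longer, so the result should only be expected to match the quoted estimate in order of magnitude.

```python
# Durations copied from the quoted trace excerpt, in microseconds.
# The state names are labels made up for this sketch.
trace = [
    ("runq", 81_000), ("run", 37), ("nfs", 202),
    ("runq", 92_000), ("run", 30), ("nfs", 245),
    ("runq", 53_000), ("run", 142),
]


def totals(events):
    """Sum the time spent in each state."""
    sums = {}
    for state, usec in events:
        sums[state] = sums.get(state, 0) + usec
    return sums


sums = totals(trace)
elapsed = sum(sums.values())
cpu_share = sums["run"] / elapsed

print(f"running {sums['run']} us out of {elapsed} us "
      f"({cpu_share:.3%} of wall time)")
print(f"voluntary sleep (NFS): {sums['nfs']} us")
```

On this excerpt alone cc runs for 209 us out of about 227 ms, which is under 0.1% of wall time, and sleeps voluntarily for only 447 us. Nearly all of the elapsed time is spent waiting on the run queue.
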
> >> On 2 March 2012 23:40, Alexander Motin<mav@freebsd.org> wrote:
> >>> On 03/03/12 05:24, Adrian Chadd wrote:
> >>>>
> >>>> mav@, can you please take a look at George's traces and see if
> >>>> there's anything obviously silly going on?
> >>>> He's reporting that your ULE work hasn't improved his (very)
> >>>> degenerate case.
> >>>
> >>>
> >>> As I can see, my patch has nothing to do with the problem. My
> >>> patch improves SMP load balancing, while in this case problem is
> >>> different. In some cases, when not all CPUs are busy, my patch
> >>> could mask the problem by using more CPUs, but not in this case
> >>> when dnets consumes all available CPUs.
> >>>
> >>> I still not feel very comfortable with ULE math, but as I
> >>> understand, in both illustrated cases there is a conflict between
> >>> clearly CPU-bound dnets threads, that consume all available CPU
> >>> and never do voluntary context switches, and more or less
> >>> interactive other threads. If other threads detected to be
> >>> "interactive" in ULE terms, they should preempt dnets threads and
> >>> everything will be fine. But "batch" (in ULE terms) threads never
> >>> preempt each other, switching context only about 10 times per
> >>> second, as hardcoded in sched_slice variable. Kernel build by
> >>> definition consumes too much CPU time to be marked "interactive".
> >>> exo-helper-1 thread in interact.out could potentially be marked
> >>> "interactive", but possibly once it consumed some CPU to become
> >>> "batch", it is difficult for it to get back, as waiting in a runq
> >>> is not counted as sleep and each time it is getting running, it
> >>> has some new work to do, so it remains "batch". May be if CPU time
> >>> accounting was more precise it would work better (by accounting
> >>> those short periods when threads really sleeps voluntary), but not
> >>> with present sampled logic with 1ms granularity. As result, while
> >>> dnets threads each time consume full 100ms time slices, other
> >>> threads are starving, getting running only 10 times per second to
> >>> voluntary switch out in just a few milliseconds.
> >>>
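
To make the "interactive" versus "batch" distinction above concrete, here is a simplified model of a ULE-style interactivity score. It is an assumption-laden sketch: the constants (maximum 100, threshold 30) and the formula are recalled from sched_ule.c rather than taken from this thread, the kernel works in scaled ticks and has shortcuts that are left out here, and the decay of run/sleep history is ignored entirely.

```python
# Simplified model of a ULE-style interactivity score (0..100).
# ASSUMPTION: constants and formula are recalled from sched_ule.c and
# are not stated in this thread. History decay is not modelled.
INTERACT_MAX = 100
INTERACT_HALF = INTERACT_MAX // 2
INTERACT_THRESH = 30  # scores below this are treated as "interactive"


def interact_score(runtime, slptime):
    """Score from accumulated run and voluntary sleep time (same units)."""
    if runtime > slptime:
        div = max(1, runtime // INTERACT_HALF)
        return INTERACT_HALF + (INTERACT_HALF - slptime // div)
    if slptime > runtime:
        div = max(1, slptime // INTERACT_HALF)
        return runtime // div
    return INTERACT_HALF if runtime else 0


def is_interactive(runtime, slptime):
    return interact_score(runtime, slptime) < INTERACT_THRESH


# Hypothetical thread with 5 s of run history (in microseconds) and
# only a few hundred microseconds of voluntary sleep.
history_run = 5_000_000
print(interact_score(history_run, 447))           # stays at the top of the scale
print(is_interactive(history_run, 447))           # False: still "batch"
print(is_interactive(history_run, 9_000_000))     # True: sleep clearly exceeds run
```

In this model a thread only drops below the threshold once its accumulated voluntary sleep is well above its accumulated run time (roughly five thirds of it). Time spent waiting on the run queue adds nothing to the sleep total, which is consistent with the observation above that a thread pushed into "batch" has difficulty getting back.
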
> >>>
> >>>> On 2 March 2012 16:14, George Mitchell<george+freebsd@m5p.com>
> >>>> wrote:
> >>>>>
> >>>>> On 03/02/12 18:06, Adrian Chadd wrote:
> >>>>>>
> >>>>>>
> >>>>>> Hi George,
> >>>>>>
> >>>>>> Have you thought about providing schedgraph traces with your
> >>>>>> particular workload?
> >>>>>>
> >>>>>> I'm sure that'll help out the scheduler hackers quite a bit.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>
> >>>>>> Adrian
> >>>>>>
> >>>>>
> >>>>> I posted a couple back in December but I haven't created any
> >>>>> more recently:
> >>>>>
> >>>>> http://www.m5p.com/~george/ktr-ule-problem.out
> >>>>> http://www.m5p.com/~george/ktr-ule-interact.out
> >>>>>
> >>>>> To the best of my knowledge, no one ever examined them. --
> >>>>> George
> >>>
> >>> --
> >>> Alexander Motin
>
>

I am running FreeBSD 10.0-CURRENT #0 r232253M.
The patch in r232454 broke my DRM.
My system is patched with http://people.freebsd.org/~kib/drm/all.13.5.patch
After building a kernel with only the r232454 patch, the Xorg log contains:
...
[   504.865] [drm] failed to load kernel module "i915"
[   504.865] (EE) intel(0): [drm] Failed to open DRM device for pci:0000:00:02.0: File exists
[   504.865] (EE) intel(0): Failed to become DRM master.
[   504.865] (**) intel(0): Depth 24, (--) framebuffer bpp 32
[   504.865] (==) intel(0): RGB weight 888
[   504.865] (==) intel(0): Default visual is TrueColor
[   504.865] (**) intel(0): Option "DRI" "True"
[   504.865] (**) intel(0): Option "TripleBuffer" "True"
[   504.865] (II) intel(0): Integrated Graphics Chipset: Intel(R) Sandybridge Mobile (GT2)
[   504.865] (--) intel(0): Chipset: "Sandybridge Mobile (GT2)"
and the screen stays black...

I do not even know why it happened ... :(


