Date: Wed, 14 Dec 2011 01:39:06 +0200
From: Ivan Klymenko <fidaj@ukr.net>
To: Jilles Tjoelker <jilles@stack.nl>
Cc: "O. Hartmann" <ohartman@mail.zedat.fu-berlin.de>, Doug Barton <dougb@FreeBSD.org>, freebsd-stable@freebsd.org, freebsd-performance@freebsd.org, Current FreeBSD <freebsd-current@freebsd.org>
Subject: Re: SCHED_ULE should not be the default
Message-ID: <20111214013906.068f69df@nonamehost.>
In-Reply-To: <20111213230441.GB42285@stack.nl>
References: <4EE1EAFE.3070408@m5p.com> <4EE22421.9060707@gmail.com> <4EE6060D.5060201@mail.zedat.fu-berlin.de> <4EE69C5A.3090005@FreeBSD.org> <20111213104048.40f3e3de@nonamehost.> <20111213230441.GB42285@stack.nl>

On Wed, 14 Dec 2011 00:04:42 +0100
Jilles Tjoelker <jilles@stack.nl> wrote:

> On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
> > If the algorithm ULE does not contain problems - it means the
> > problem has Core2Duo, or in a piece of code that uses the ULE
> > scheduler. I already wrote in a mailing list that specifically in
> > my case (Core2Duo) partially helps the following patch:
> > --- sched_ule.c.orig    2011-11-24 18:11:48.000000000 +0200
> > +++ sched_ule.c 2011-12-10 22:47:08.000000000 +0200
> > @@ -794,7 +794,8 @@
> >          * 1.5 * balance_interval.
> >          */
> >         balance_ticks = max(balance_interval / 2, 1);
> > -       balance_ticks += random() % balance_interval;
> > +//     balance_ticks += random() % balance_interval;
> > +       balance_ticks += ((int)random()) % balance_interval;
> >         if (smp_started == 0 || rebalance == 0)
> >                 return;
> >         tdq = TDQ_SELF();
>
> This avoids a 64-bit division on 64-bit platforms but seems to have no
> effect otherwise. Because this function is not called very often, the
> change seems unlikely to help.

Yes, this hunk is not related to the problem :)
I simply posted the latest patch, which is the one I am using now...

>
> > @@ -2118,13 +2119,21 @@
> >         struct td_sched *ts;
> >
> >         THREAD_LOCK_ASSERT(td, MA_OWNED);
> > +       if (td->td_pri_class & PRI_FIFO_BIT)
> > +               return;
> > +       ts = td->td_sched;
> > +       /*
> > +        * We used up one time slice.
> > +        */
> > +       if (--ts->ts_slice > 0)
> > +               return;
>
> This skips most of the periodic functionality (long term load
> balancer, saving switch count (?), insert index (?), interactivity
> score update for long running thread) if the thread is not going to
> be rescheduled right now.
>
> It looks wrong but it is a data point if it helps your workload.

Yes, I did that in order to delay, for as long as possible, the
execution of the code in this section:

...
#ifdef SMP
        /*
         * We run the long term load balancer infrequently on the first cpu.
         */
        if (balance_tdq == tdq) {
                if (balance_ticks && --balance_ticks == 0)
                        sched_balance();
        }
#endif
...

>
> >         tdq = TDQ_SELF();
> >  #ifdef SMP
> >         /*
> >          * We run the long term load balancer infrequently on the first cpu.
> >          */
> > -       if (balance_tdq == tdq) {
> > -               if (balance_ticks && --balance_ticks == 0)
> > +       if (balance_ticks && --balance_ticks == 0) {
> > +               if (balance_tdq == tdq)
> >                         sched_balance();
> >         }
> >  #endif
>
> The main effect of this appears to be to disable the long term load
> balancer completely after some time. At some point, a CPU other than
> the first CPU (which uses balance_tdq) will set balance_ticks = 0, and
> sched_balance() will never be called again.
>

That change was made for the same reason as described above...

> It also introduces a hypothetical race condition because the access to
> balance_ticks is no longer restricted to one CPU under a spinlock.
>
> If the long term load balancer may be causing trouble, try setting
> kern.sched.balance_interval to a higher value with unpatched code.

I tried that first, but it did not fix the situation...
My impression is that the rebalancing itself is malfunctioning...
It seems that the thread is handed to the same core that is already
loaded, and so on...
Perhaps this is a consequence of an incorrect definition of the CPU
topology?

>
> > @@ -2144,9 +2153,6 @@
> >                 if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
> >                         tdq->tdq_ridx = tdq->tdq_idx;
> >         }
> > -       ts = td->td_sched;
> > -       if (td->td_pri_class & PRI_FIFO_BIT)
> > -               return;
> >         if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
> >                 /*
> >                  * We used a tick; charge it to the thread so
> > @@ -2157,11 +2163,6 @@
> >                 sched_priority(td);
> >         }
> >         /*
> > -        * We used up one time slice.
> > -        */
> > -       if (--ts->ts_slice > 0)
> > -               return;
> > -       /*
> >          * We're out of time, force a requeue at userret().
> >          */
> >         ts->ts_slice = sched_slice;
>
> > and refusal to use options FULL_PREEMPTION
> > But no one has unsubscribed to my letter, my patch helps or not in
> > the case of Core2Duo...
> > There is a suspicion that the problems stem from the sections of
> > code associated with the SMP...
> > Maybe I'm in something wrong, but I want to help in solving this
> > problem ...
>