Date: Wed, 21 Jan 2015 10:16:06 +0200 From: Konstantin Belousov <kostikbel@gmail.com> To: "K. Macy" <kmacy@freebsd.org> Cc: Hans Petter Selasky <hps@selasky.org>, Adrian Chadd <adrian@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, Jason Wolfe <nitroboost@gmail.com>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, Sean Bruno <sbruno@freebsd.org>, Gleb Smirnoff <glebius@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org> Subject: Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys Message-ID: <20150121081606.GN42409@kib.kiev.ua> In-Reply-To: <CAHM0Q_MQPk6Gz%2Bon2fDKSMbzJzi4g-QF23pNP_-v8Y_jV7hhTA@mail.gmail.com> References: <20150120211137.GY15484@FreeBSD.org> <54BED6FB.8060401@selasky.org> <54BEE62D.2060703@ignoranthack.me> <CAHM0Q_MDJN_8sTvTDXfqA7UtJVO3Y8S8%2BNRCs_=6Nj4dkTzjOA@mail.gmail.com> <54BEE8E6.3080009@ignoranthack.me> <CAHM0Q_N_53BM-6RvXu8UpjfDzQHEn5oXZo1Nn8RO0cuOUhe8tg@mail.gmail.com> <54BEEA7F.1070301@ignoranthack.me> <CAHM0Q_PtJ7JHFTiu9_dmi_Ce=rmu1j72z2OYQ2CD3%2BEbcoEGsA@mail.gmail.com> <54BEF154.3030606@ignoranthack.me> <CAHM0Q_MQPk6Gz%2Bon2fDKSMbzJzi4g-QF23pNP_-v8Y_jV7hhTA@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, Jan 20, 2015 at 04:37:44PM -0800, K. Macy wrote: > I would pick stability over performance any day. However, it _seems_ > to me, and maybe I simply don't understand some key details, that the > fix consisted of largely single-threading the callout system. And as I > say I may simply not understand the specifics, but this sort of large > scale disabling does not constitute a fix but is a workaround. It's > more like disabling preemption because it fixes a panic. Yes, it might > "fix" a whole array of bugs that crop up but it could not be seen as a > fix to an otherwise working system. Well, this is not a complete truth. Let me try to explain my understanding, I spent some time actually looking at the new code. Would be nice if corrections to my understanding is posted, if needed. Now, when callout_reset() is directed to change cpu, the change only happens when the callout is associated with a lock, and that lock is owned by the caller of callout_reset(). There are two consequences. One, which is good, is significant simplification of the migration code, together with the drain. This is exactly the place where there bugs which make my head hurt, see e.g. r234952 and r243901. Note that some callouts follow this pattern already, e.g. process timers after r243869. Another consequence, which is very visible and which causes the roar, is that all lockless callers of callout_reset_on(9) now silently changed the behaviour to callout_reset(9). There is no audit performed to identify such callers, and there is no even a plan to fix them. The answer to the complains seems to be 'if you think that the fix is needed, go and fix'. My impression is that some set of vocal active developers consider the second consequence unacceptable, myself included. IMO, committing the change, however good the consequence one is, without fixing the consequence two, is inappropriate. And the onus of doing this is on the person doing the change. Yes, it is very interesting is the change actually good WRT fixing migration, since indeed there is serious reduction in the migration amount due to locked callout_reset() being not universally used. It is possible that the bugs are only covered.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150121081606.GN42409>