Date:      Wed, 21 Jan 2015 10:07:36 +0100
From:      Hans Petter Selasky <hps@selasky.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        Adrian Chadd <adrian@freebsd.org>, "src-committers@freebsd.org" <src-committers@freebsd.org>, "K. Macy" <kmacy@freebsd.org>, Jason Wolfe <nitroboost@gmail.com>, "svn-src-all@freebsd.org" <svn-src-all@freebsd.org>, sbruno@freebsd.org, Gleb Smirnoff <glebius@freebsd.org>, "svn-src-head@freebsd.org" <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r277213 - in head: share/man/man9 sys/kern sys/ofed/include/linux sys/sys
Message-ID:  <54BF6C58.7010500@selasky.org>
In-Reply-To: <20150121085100.GQ42409@kib.kiev.ua>
References:  <54BDD9E1.6090505@selasky.org> <20150120075126.GA42409@kib.kiev.ua> <20150120211137.GY15484@FreeBSD.org> <54BED6FB.8060401@selasky.org> <54BEE62D.2060703@ignoranthack.me> <CAHM0Q_MDJN_8sTvTDXfqA7UtJVO3Y8S8%2BNRCs_=6Nj4dkTzjOA@mail.gmail.com> <54BEE8E6.3080009@ignoranthack.me> <CAHM0Q_N_53BM-6RvXu8UpjfDzQHEn5oXZo1Nn8RO0cuOUhe8tg@mail.gmail.com> <54BEEA7F.1070301@ignoranthack.me> <54BF640B.6000700@selasky.org> <20150121085100.GQ42409@kib.kiev.ua>

On 01/21/15 09:51, Konstantin Belousov wrote:
> On Wed, Jan 21, 2015 at 09:32:11AM +0100, Hans Petter Selasky wrote:
>> On 01/21/15 00:53, Sean Bruno wrote:
>>> Unknown to me.  Nor am I aware of anyone else who ever hit our panics
>>> either.  Our environment, and the failure, was only seen in the Intel
>>> 10GE space (ixgbe).  This is an artifact of our use cases, and hasn't
>>> been expanded nor tested in our environment with other vendor interfaces.
>>>
>>> sean
>>

Hi,

>>
>> I've seen this with Mellanox hardware when running some special tests,
>> but not during regular use yet. That was the reason for going into the
>> callout subsystem in the first place. 40GE.
>>
>> Also I would like to mention during the heat of this discussion, that
>> during X-mas this year, I had a very heavy discussion with Attilio and a
>> few other FreeBSD developers, whose names were on a patch (r220456) that
>> changed how the return value of "callout_active()" works.
>> "callout_active()" is heavily used inside the TCP stack and what was
>> found is there is a potential race related to migrating the callout from
>> one CPU to the other, which in turn might give other symptoms than a
>> spinlock hang.
>>
>> FYI:
>>
>> https://svnweb.freebsd.org/base?view=revision&revision=225057
>>
>> Cite: "If the newly scheduled thread wants to acquire the old queue it
>> will just spin forever."
>>
>> This description reminds me very much of what "Jason Wolfe", others and
>> myself have seen.
>>
>> Konstantin, you're responsible for r220456 (Approved by: kib). I would
> I definitely do not see anything related to my freefall login in the
> log message for r220456, nor did I participate in any way in the work
> which led to that revision.
>
> If you mean r225057, note that approval by re != review.

Yes, I meant r225057.

>> like to ask what investigation you did to ensure that you solved the
>> problem as described in the commit message and didn't introduce a new one?
>>
>> In r220456 the "callout_reset_on()" function was changed in a way that
>> directly conflicts with how the TCP stack works, by not always ensuring
>> that "callout_active()" returns non-zero after a callout is restarted!
>> See return at line 821:
>>
>>> https://svnweb.freebsd.org/base/head/sys/kern/kern_timeout.c?revision=225057&view=markup&pathrev=225057#l821
>>
>> Kib: Any comments?
>
> With the re hat on, the explanation for the proposed commit looked reasonable,
> and the committer provided enough evidence that the change got adequate testing.
> Since the change fixed a bug, and this is exactly what re wants to see
> during a release cycle, I see no reason why the commit should be denied.

The problem is Attilio is no longer an active committer and he has not
been very willing to do more work in this area. When the people who wrote
code in an area no longer respond - what should we do?

--HPS


