Date:      Mon, 9 Jul 2012 16:25:32 -0400
From:      Ryan Stone <rysto32@gmail.com>
To:        Gleb Smirnoff <glebius@freebsd.org>
Cc:        bzeeb-lists@lists.zabbadoz.net, Przemyslaw Frasunek <venglin@freebsd.lublin.pl>, Mike Tancsa <mike@sentex.net>, Eugene Grosbein <egrosbein@rdtc.ru>, freebsd-net@freebsd.org
Subject:   Re: mpd5/Netgraph issues after upgrading to 7.4
Message-ID:  <CAFMmRNydo=Bci=WM5mODS0vhiv8%2B2Oj39XKTXkEymYOJunhPxg@mail.gmail.com>
In-Reply-To: <20120709081225.GJ21957@glebius.int.ru>
References:  <20110513162311.GK95084@glebius.int.ru> <4DD298AD.2060905@frasunek.com> <20110517184613.GN74366@glebius.int.ru> <4FDB1D71.6050908@freebsd.lublin.pl> <20120615203142.GW28613@glebius.int.ru> <4FDBAFD7.9020606@freebsd.lublin.pl> <4FDF2F81.6030307@sentex.net> <4FDF3097.6080701@freebsd.lublin.pl> <4FE0EE62.5070905@freebsd.lublin.pl> <4FF7F2C6.5070401@freebsd.lublin.pl> <20120709081225.GJ21957@glebius.int.ru>

On Mon, Jul 9, 2012 at 4:12 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
> This looks very much related to a known race in ARP code.
>
> See this email and related thread:
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031865.html
>
> Ryan didn't check in any patches since, and I failed to follow on this
> problem due to ENOTIME.
>
> I've added Ryan to Cc. Ryan, what's the status of the problem at your
> side? Did you come to any solution?

Unfortunately I was never able to come to a satisfactory solution.  As
I recall, in the end I ran headlong into problems with making the
locking sane.  The big problem was with arpresolve.  At one point it
calls callout_reset to schedule the LLE's la_timer.  In my patch this
would have to be done with a write lock held on the afdata lock.
However, the afdata lock would have to be acquired before taking the
LLE_LOCK to prevent a lock order reversal (LOR), so in the end you conclude that you have
to take a write lock on the ifnet's afdata lock for every packet that
goes through arpresolve, which was a non-starter.  That's the point
that I reached before I got distracted by other things at $WORK.

As I recall, the in6 case was even worse, as the in6 equivalent of
arptimer is significantly more complicated and likes to do crazy
things like dropping locks.


