Date:      Mon, 9 Jul 2012 16:25:32 -0400
From:      Ryan Stone <rysto32@gmail.com>
To:        Gleb Smirnoff <glebius@freebsd.org>
Cc:        bzeeb-lists@lists.zabbadoz.net, Przemyslaw Frasunek <venglin@freebsd.lublin.pl>, Mike Tancsa <mike@sentex.net>, Eugene Grosbein <egrosbein@rdtc.ru>, freebsd-net@freebsd.org
Subject:   Re: mpd5/Netgraph issues after upgrading to 7.4
Message-ID:  <CAFMmRNydo=Bci=WM5mODS0vhiv8%2B2Oj39XKTXkEymYOJunhPxg@mail.gmail.com>
In-Reply-To: <20120709081225.GJ21957@glebius.int.ru>
References:  <20110513162311.GK95084@glebius.int.ru> <4DD298AD.2060905@frasunek.com> <20110517184613.GN74366@glebius.int.ru> <4FDB1D71.6050908@freebsd.lublin.pl> <20120615203142.GW28613@glebius.int.ru> <4FDBAFD7.9020606@freebsd.lublin.pl> <4FDF2F81.6030307@sentex.net> <4FDF3097.6080701@freebsd.lublin.pl> <4FE0EE62.5070905@freebsd.lublin.pl> <4FF7F2C6.5070401@freebsd.lublin.pl> <20120709081225.GJ21957@glebius.int.ru>

On Mon, Jul 9, 2012 at 4:12 AM, Gleb Smirnoff <glebius@freebsd.org> wrote:
> This looks very much related to a known race in ARP code.
>
> See this email and related thread:
>
> http://lists.freebsd.org/pipermail/freebsd-net/2012-March/031865.html
>
> Ryan didn't check in any patches since, and I failed to follow on this
> problem due to ENOTIME.
>
> I've added Ryan to Cc. Ryan, what's the status of the problem at your
> side? Did you come to any solution?

Unfortunately I was never able to come to a satisfactory solution.  As
I recall, in the end I ran headlong into problems with making the
locking sane.  The big problem was with arpresolve.  At one point it
calls callout_reset to schedule the LLE's la_timer.  In my patch this
would have to be done with a write lock held on the afdata lock.
However, this acquisition would have to be done before taking the
LLE_LOCK to prevent a LOR, and in the end you conclude that you have
to take a write lock on the ifnet's afdata lock for every packet that
goes through arpresolve, which was a non-starter.  That's the point
that I reached before I got distracted by other things at $WORK.

As I recall, the in6 case was even worse, as the in6 equivalent of
arptimer is significantly more complicated and likes to do crazy
things like dropping locks.
