Date: Sat, 1 Oct 2011 14:15:45 +0800 From: dave jones <s.dave.jones@gmail.com> To: Robert Watson <rwatson@freebsd.org> Cc: Mikolaj Golub <trociny@freebsd.org>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, Adrian Chadd <adrian@freebsd.org>, "K. Macy" <kmacy@freebsd.org>, Arnaud Lacombe <lacombar@gmail.com> Subject: Re: Kernel panic on FreeBSD 9.0-beta2 Message-ID: <CANf5e8ab=mUw-AJuRXZy1T6%2BZcryxjKfuCOsakPPfqatuA3HdA@mail.gmail.com> In-Reply-To: <alpine.BSF.2.00.1109301432570.65269@fledge.watson.org> References: <CANf5e8aG4go4M_vsRExUsJB_sjaN5x-QK-TCDAhSH64JSo0mdQ@mail.gmail.com> <CACqU3MXStMMEoppvDtZS6hV4WGttbdJiF8E-ORwJ%2BQSmnTy-Yg@mail.gmail.com> <CACqU3MV-t4Va6VWUoXy1Y9FYnNJTUw1X%2BE7ik-2%2BtMVuVOV3RA@mail.gmail.com> <CAJ-Vmom-177OkdUXjz%2BZLqbaqn=p%2BuTGypiVuMqdeXgdOgb4hQ@mail.gmail.com> <CAHM0Q_Mmn3z1V6AtZHQMpgbdY7oQqOChiNt=8NJrZQDnravb7A@mail.gmail.com> <CACqU3MU9ZZtOsdBOa%2BF3SqUaYgO%2BEo0v1ACjY0S4rY4fRQyv5Q@mail.gmail.com> <CAHM0Q_PZD9_0ZkELZ5XL8Ebh8eD-uFuSjXWKKVpGDeM_JDaqMA@mail.gmail.com> <8662kcigif.fsf@kopusha.home.net> <alpine.BSF.2.00.1109301432570.65269@fledge.watson.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, Sep 30, 2011 at 9:41 PM, Robert Watson wrote: > > On Wed, 28 Sep 2011, Mikolaj Golub wrote: > >> On Mon, 26 Sep 2011 16:12:55 +0200 K. Macy wrote: >> >> KM> Sorry, didn't look at the images (limited bw), I've seen something K= M> >> like this before in timewait. This "can't happen" with UDP so will be KM= > >> interested in learning more about the bug. >> >> The panic can be easily triggered by this: > > Hi: > > Just catching up on this thread. =A0I think the analysis here is generall= y > right: in 9.0, you're much more likely to see an inpcb with its in_socket > pointer cleared in the hash list than in prior releases, and > in_pcbbind_setup() trips over this. > > However, at least on first glance (and from the perspective of invariants > here), I think the bug is actualy that in_pcbbind_setup() is asking > in_pcblookup_local() for an inpcb and then access the returned inpcb's > in_socket pointer without acquiring a lock on the inpcb. =A0Structurally,= it > can't acquire this lock for lock order reasons -- it already holds the lo= ck > on its own inpcb. =A0Therefore, we should only access fields that are saf= e to > follow in an inpcb when you hold a reference via the hash lock and not a > lock on the inpcb itself, which appears generally OK (+/-) for all the > fields in that clause but the t->inp_socket->so_options dereference. > > A preferred fix would cache the SO_REUSEPORT flag in an inpcb-layer field= , > such as inp_flags2, giving us access to its value without having to walk > into the attached (or not) socket. > > This raises another structural question, which is whether we need a new > inp_foo flags field that is protected explicitly by the hash lock, and no= t > by the inpcb lock, which could hold fields relevant to address binding. = =A0I > don't think we need to solve that problem in this context, as a slightly > race on SO_REUSEPORT is likely acceptable. > > The suggested fix does perform the desired function of explicitly detachi= ng > the inpcb from the hash list before the socket is disconnected from the > inpcb. However, it's incomplete in that the invariant that's being broken= is > also relied on for other protocols (such as raw sockets). =A0The correct > invariant is that inp_socket is safe to follow unconditionally if an inpc= b > is locked and INP_DROPPED isn't set -- the bug is in "locked" not in > "INP_DROPPED", which is why I think this is the wrong fix, even though it > prevents a panic :-). Hello Robert, Thank you for taking your valuable time to find out the problem. Since I don't have idea about network internals, would you have a patch about this? I'd be glad to test it, thanks again. > Robert Best regards, Dave.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANf5e8ab=mUw-AJuRXZy1T6%2BZcryxjKfuCOsakPPfqatuA3HdA>