Date: Mon, 21 Sep 2009 15:52:55 +0100
From: Andrew Brampton <brampton+freebsd-net@gmail.com>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-net@freebsd.org
Subject: Re: Is this a race in mbuf's refcounting?
Message-ID: <d41814900909210752t23309836y4b8a447e811db6d2@mail.gmail.com>
In-Reply-To: <20090921235604.U12163@delplex.bde.org>
References: <d41814900909210543p46894d83u6d814353ea1ee130@mail.gmail.com> <20090921235604.U12163@delplex.bde.org>
2009/9/21 Bruce Evans <brde@optusnet.com.au>:
> On Mon, 21 Sep 2009, Andrew Brampton wrote:
>
>> I've been reading the FreeBSD source code to understand how mbufs are
>> reference counted. However, there are a few bits of code that I'm
>> wondering if they would fail under the exactly right timing. Take for
>> example in uipc_mbuf.c:
>>
>> 286 static void
>> 287 mb_dupcl(struct mbuf *n, struct mbuf *m)
>> 288 {
>> ...
>> 293         if (*(m->m_ext.ref_cnt) == 1)
>> 294                 *(m->m_ext.ref_cnt) += 1;
>> 295         else
>> 296                 atomic_add_int(m->m_ext.ref_cnt, 1);
>> ...
>> 305 }
>>
>> Now, the way I understand this code is, if ref_cnt is 1, then it is
>> not shared. In that case non-atomically increment ref_cnt. However, if
>> ref_cnt was something else, then it is shared so update the value in
>> an atomic way. This seems valid, however what happens if two threads
>> call mb_dupcl at the same time with a non-shared m. Could they both
>> evaluate the if on line 293 at the same time, and then both
>> non-atomically increment ref_cnt?
>>
>> If this could happen then we have a lost update and our reference
>> counting is broken. I've also noticed that in other places similar
>> optimisations are made to avoid the atomic operation.
>>
>> So is this a problem?
>
> I don't see how it can work.
>
> Also, if the count was 1, then it should become 2, but there is nothing to
> flush the store to memory.  This seems to mainly enlarge the race window
> for the previous problem.
>
> Bruce
>

Sorry, are you agreeing or disagreeing with my original post? If you
are disagreeing I would appreciate it if you could explain the error in
my ways.
I see the following happening:

Thread 1: Reads *(m->m_ext.ref_cnt), determines it is 1, and enters
          the true branch of the if
Thread 1: Then reads *(m->m_ext.ref_cnt) again (since it is volatile)
Thread 2: Interrupts, reads *(m->m_ext.ref_cnt), determines it is 1,
          and enters the true branch of the if
Thread 2: Then reads *(m->m_ext.ref_cnt), adds one to it and stores
          the result (i.e. 2)
Thread 1: Resumes with the value it had (i.e. 1), adds one to it, and
          stores the result (i.e. 2)

Due to this sequence we have lost an update, since the value of
*(m->m_ext.ref_cnt) should be 3. If the if on line 293 weren't there
and atomic_add_int were always used, the result would be 3.

If you find a flaw in my logic please point it out.

thanks
Andrew
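
To make the sequence above concrete, here is a minimal user-space sketch
that replays it deterministically in a single thread. It is not the
kernel code: the names rmw_load, rmw_store, replay_lost_update and
replay_atomic are hypothetical, the "+= 1" is assumed to compile to a
separate load and store, and C11's atomic_fetch_add stands in for
FreeBSD's atomic_add_int.

```c
#include <stdatomic.h>

/* Hypothetical model of one thread's private copy during "+= 1". */
struct rmw {
	unsigned int loaded;	/* value this thread read from the counter */
};

/* First half of a non-atomic "+= 1": read the shared counter. */
static void
rmw_load(struct rmw *t, const volatile unsigned int *cnt)
{
	t->loaded = *cnt;
}

/* Second half: write back the (possibly stale) private copy plus one. */
static void
rmw_store(const struct rmw *t, volatile unsigned int *cnt)
{
	*cnt = t->loaded + 1;
}

/*
 * Replays the interleaving from the message with both "threads" taking
 * the non-atomic branch. Returns the final reference count.
 */
unsigned int
replay_lost_update(void)
{
	volatile unsigned int ref_cnt = 1;
	struct rmw t1, t2;

	if (ref_cnt != 1)		/* thread 1: the test passes */
		return (0);
	rmw_load(&t1, &ref_cnt);	/* thread 1: reads 1 */

	if (ref_cnt != 1)		/* thread 2 interrupts: test passes */
		return (0);
	rmw_load(&t2, &ref_cnt);	/* thread 2: reads 1 */
	rmw_store(&t2, &ref_cnt);	/* thread 2: stores 2 */

	rmw_store(&t1, &ref_cnt);	/* thread 1 resumes: stores 2 again */
	return (ref_cnt);
}

/*
 * The same two increments done as atomic read-modify-write operations.
 * No interleaving can separate the load from the store, so neither
 * update is lost.
 */
unsigned int
replay_atomic(void)
{
	atomic_uint ref_cnt;

	atomic_init(&ref_cnt, 1);
	atomic_fetch_add(&ref_cnt, 1);	/* thread 1 */
	atomic_fetch_add(&ref_cnt, 1);	/* thread 2 */
	return (atomic_load(&ref_cnt));
}
```

Under these assumptions replay_lost_update() ends with a count of 2
although three references exist, while replay_atomic() ends with 3. The
sketch only shows that the interleaving loses an update; it says nothing
about whether callers of mb_dupcl can actually run concurrently on the
same mbuf.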