Date: Tue, 1 Mar 2005 18:10:55 -0500 From: Bosko Milekic <bosko.milekic@gmail.com> To: John Baldwin <jhb@freebsd.org> Cc: freebsd-sparc64@freebsd.org Subject: Re: Race condition in mb_free_ext()? Message-ID: <bbebbd3d05030115104ac8f86e@mail.gmail.com> In-Reply-To: <bbebbd3d0503011504560a94b4@mail.gmail.com> References: <20050301000436.GA33346@xor.obsecurity.org> <200503011340.18162.jhb@FreeBSD.org> <bbebbd3d0503011504560a94b4@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
P.S.: Not sure if this is related: http://www.mail-archive.com/bk-commits-24@vger.kernel.org/msg00136.html On Tue, 1 Mar 2005 18:04:27 -0500, Bosko Milekic <bosko.milekic@gmail.com> wrote: > On Tue, 1 Mar 2005 13:40:18 -0500, John Baldwin <jhb@freebsd.org> wrote: > > On Monday 28 February 2005 07:04 pm, Kris Kennaway wrote: > > > I'm seeing an easily-provoked livelock on quad-CPU sparc64 machines > > > running RELENG_5. It's hard to get a good trace because the processes > > > running on other CPUs cannot be traced from DDB, but I've been lucky a > > > few times: > > > > > > db> show alllocks > > > Process 15 (swi1: net) thread 0xfffff8001fb07480 (100008) > > > exclusive sleep mutex so_snd r = 0 (0xfffff800178432a8) locked @ > > > netinet/tcp_input.c:2189 exclusive sleep mutex inp (tcpinp) r = 0 > > > (0xfffff800155c3b08) locked @ netinet/tcp_input.c:744 exclusive sleep mutex > > > tcp r = 0 (0xc0bdf788) locked @ netinet/tcp_input.c:617 db> wh 15 > > > Tracing pid 15 tid 100008 td 0xfffff8001fb07480 > > > sab_intr() at sab_intr+0x40 > > > psycho_intr_stub() at psycho_intr_stub+0x8 > > > intr_fast() at intr_fast+0x88 > > > -- interrupt level=0xd pil=0 %o7=0xc01a0040 -- > > > mb_free_ext() at mb_free_ext+0x28 > > > sbdrop_locked() at sbdrop_locked+0x19c > > > tcp_input() at tcp_input+0x2aa0 > > > ip_input() at ip_input+0x964 > > > netisr_processqueue() at netisr_processqueue+0x7c > > > swi_net() at swi_net+0x120 > > > ithread_loop() at ithread_loop+0x24c > > > fork_exit() at fork_exit+0xd4 > > > fork_trampoline() at fork_trampoline+0x8 > > > db> > > > > > > That code is here in mb_free_ext(): > > > > > > /* > > > * This is tricky. We need to make sure to decrement the > > > * refcount in a safe way but to also clean up if we're the > > > * last reference. This method seems to do it without race. > > > */ > > > while (dofree == 0) { > > > cnt = *(m->m_ext.ref_cnt); > > > if (atomic_cmpset_int(m->m_ext.ref_cnt, cnt, cnt - 1)) { > > > if (cnt == 1) > > > dofree = 1; > > > break; > > > } > > > } > > > > Well, this is obtuse at least. A simpler version would be: > > > > do { > > cnt = *m->m_ext.ref_cnt; > > } while (atomic_cmpset_int(m->m_ext.ref_cnt, cnt, cnt - 1) == 0); > > dofree = (cnt == 1); > > > > -- > > John Baldwin <jhb@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/ > > "Power Users Use the Power to Serve" = http://www.FreeBSD.org > > Your suggestion will always enter the loop and do the atomic > regardless of what dofree is set to above that code (not shown in > Kris' paste): > > [...] > /* Account for lazy ref count assign. */ > if (m->m_ext.ref_cnt == NULL) > dofree = 1; > else > dofree = 0; > > /* > * This is tricky. We need to make sure to decrement the > * refcount in a safe way but to also clean up if we're the > * last reference. This method seems to do it without race. > */ > [...] > > The segment could still be reworked, but anyway: > > This does not appear to explain the livelock. What's m->m_ext.ref_cnt > point to? And what is the value at the location pointed to by > m->m_ext.ref_cnt? Regardless, though, the livelock itself, assuming it > is due to a long time being spent spinning in the above loop, should > not be caused by underruns or overruns of the reference count (those > may only cause leaking of the cluster). > > Furthermore, the above code has been around in that form for some time > now and in fact the loop was probably entered *more* often in the past > (before the 'dofree' variable was introduced there). Since when are > you able to cause the livelock to happen, and are you sure it is the > mb_free_ext() that is looping indefinitely? > > I do not know sparc64 well, but what are the semantics of > atomic_cmpset_int()? I see that it is defined to use the 'casa' > instruction; does atomic_cmpset_int() behave the same way as it does > on i386? > > -Bosko > > -- > Bosko Milekic - If I were a number, I'd be irrational. > Contact Info: http://bmilekic.unixdaemons.com/contact.txt > -- Bosko Milekic - If I were a number, I'd be irrational. Contact Info: http://bmilekic.unixdaemons.com/contact.txt
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bbebbd3d05030115104ac8f86e>