From owner-freebsd-sparc64@FreeBSD.ORG Tue Mar 1 23:04:30 2005
Date: Tue, 1 Mar 2005 18:04:27 -0500
From: Bosko Milekic
To: John Baldwin
Cc: Kris Kennaway, net@freebsd.org, rwatson@freebsd.org,
    bmilekic@freebsd.org, sparc64@freebsd.org, freebsd-sparc64@freebsd.org
In-Reply-To: <200503011340.18162.jhb@FreeBSD.org>
References: <20050301000436.GA33346@xor.obsecurity.org>
    <200503011340.18162.jhb@FreeBSD.org>
Subject: Re: Race condition in mb_free_ext()?
Reply-To: Bosko Milekic
List-Id: Porting FreeBSD to the Sparc

On Tue, 1 Mar 2005 13:40:18 -0500, John Baldwin wrote:
> On Monday 28 February 2005 07:04 pm, Kris Kennaway wrote:
> > I'm seeing an easily-provoked livelock on quad-CPU sparc64 machines
> > running RELENG_5.  It's hard to get a good trace because the processes
> > running on other CPUs cannot be traced from DDB, but I've been lucky a
> > few times:
> >
> > db> show alllocks
> > Process 15 (swi1: net) thread 0xfffff8001fb07480 (100008)
> > exclusive sleep mutex so_snd r = 0 (0xfffff800178432a8) locked @ netinet/tcp_input.c:2189
> > exclusive sleep mutex inp (tcpinp) r = 0 (0xfffff800155c3b08) locked @ netinet/tcp_input.c:744
> > exclusive sleep mutex tcp r = 0 (0xc0bdf788) locked @ netinet/tcp_input.c:617
> > db> wh 15
> > Tracing pid 15 tid 100008 td 0xfffff8001fb07480
> > sab_intr() at sab_intr+0x40
> > psycho_intr_stub() at psycho_intr_stub+0x8
> > intr_fast() at intr_fast+0x88
> > -- interrupt level=0xd pil=0 %o7=0xc01a0040 --
> > mb_free_ext() at mb_free_ext+0x28
> > sbdrop_locked() at sbdrop_locked+0x19c
> > tcp_input() at tcp_input+0x2aa0
> > ip_input() at ip_input+0x964
> > netisr_processqueue() at netisr_processqueue+0x7c
> > swi_net() at swi_net+0x120
> > ithread_loop() at ithread_loop+0x24c
> > fork_exit() at fork_exit+0xd4
> > fork_trampoline() at fork_trampoline+0x8
> > db>
> >
> > That code is here in mb_free_ext():
> >
> >         /*
> >          * This is tricky.  We need to make sure to decrement the
> >          * refcount in a safe way but to also clean up if we're the
> >          * last reference.  This method seems to do it without race.
> >          */
> >         while (dofree == 0) {
> >                 cnt = *(m->m_ext.ref_cnt);
> >                 if (atomic_cmpset_int(m->m_ext.ref_cnt, cnt, cnt - 1)) {
> >                         if (cnt == 1)
> >                                 dofree = 1;
> >                         break;
> >                 }
> >         }
>
> Well, this is obtuse at least.  A simpler version would be:
>
>         do {
>                 cnt = *m->m_ext.ref_cnt;
>         } while (atomic_cmpset_int(m->m_ext.ref_cnt, cnt, cnt - 1) == 0);
>         dofree = (cnt == 1);
>
> --
> John Baldwin <>< http://www.FreeBSD.org/~jhb/
> "Power Users Use the Power to Serve" = http://www.FreeBSD.org

Your suggestion will always enter the loop and do the atomic
regardless of what dofree is set to above that code (not shown in
Kris' paste):

[...]
        /* Account for lazy ref count assign. */
        if (m->m_ext.ref_cnt == NULL)
                dofree = 1;
        else
                dofree = 0;

        /*
         * This is tricky.  We need to make sure to decrement the
         * refcount in a safe way but to also clean up if we're the
         * last reference.  This method seems to do it without race.
         */
[...]

The segment could still be reworked, but anyway:

This does not appear to explain the livelock.  What's
m->m_ext.ref_cnt point to?  And what is the value at the location
pointed to by m->m_ext.ref_cnt?

Regardless, though, the livelock itself, assuming it is due to a long
time being spent spinning in the above loop, should not be caused by
underruns or overruns of the reference count (those may only cause
leaking of the cluster).

Furthermore, the above code has been around in that form for some time
now and in fact the loop was probably entered *more* often in the past
(before the 'dofree' variable was introduced there).  Since when are
you able to cause the livelock to happen, and are you sure it is the
mb_free_ext() that is looping indefinitely?

I do not know sparc64 well, but what are the semantics of
atomic_cmpset_int()?  I see that it is defined to use the 'casa'
instruction; does atomic_cmpset_int() behave the same way as it does
on i386?

-Bosko

-- 
Bosko Milekic - If I were a number, I'd be irrational.
Contact Info: http://bmilekic.unixdaemons.com/contact.txt
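
For readers following the thread, the decrement-and-test pattern being
discussed can be sketched in portable user-space C.  This is only an
illustration under stated assumptions: ref_release() is a hypothetical
name, C11 <stdatomic.h> stands in for the kernel's atomic_cmpset_int(),
and the NULL check mirrors the "lazy ref count assign" case quoted
above.  It is not the mb_free_ext() code itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a FreeBSD KPI): drop one reference and
 * report whether the caller held the last one.  A NULL counter is
 * treated as "never shared", so the caller frees unconditionally.
 */
static bool
ref_release(atomic_uint *ref_cnt)
{
	unsigned int cnt;

	if (ref_cnt == NULL)
		return (true);

	cnt = atomic_load(ref_cnt);
	do {
		/* An underrun would mean a reference-counting bug. */
		assert(cnt > 0);
		/*
		 * On failure the C11 call stores the value it observed
		 * back into cnt, so the loop retries with fresh data.
		 */
	} while (!atomic_compare_exchange_weak(ref_cnt, &cnt, cnt - 1));

	/* cnt still holds the value that was successfully replaced. */
	return (cnt == 1);
}
```

One difference worth noting: the C11 compare-exchange updates its
"expected" argument on failure, whereas atomic_cmpset_int() only
returns zero, which is why the kernel loops above re-read
*m->m_ext.ref_cnt explicitly on every iteration.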