Date: Mon, 28 Aug 2006 17:04:37 -0400
From: Randall Stewart <rrs@cisco.com>
To: freebsd-net@freebsd.org
Subject: Problem with uipc_mbuf.c
Message-ID: <44F35A65.3080605@cisco.com>
Hi all:

In 6.1 the function mb_free_ext(struct mbuf *m) looked like this:

...............................................
    void
    mb_free_ext(struct mbuf *m)
    {
        u_int cnt;
        int dofree;

        /* Account for lazy ref count assign. */
        if (m->m_ext.ref_cnt == NULL)
            dofree = 1;
        else
            dofree = 0;

        /*
         * This is tricky. We need to make sure to decrement the
         * refcount in a safe way but to also clean up if we're the
         * last reference. This method seems to do it without race.
         */
        while (dofree == 0) {
            cnt = *(m->m_ext.ref_cnt);
            if (atomic_cmpset_int(m->m_ext.ref_cnt, cnt, cnt - 1)) {
                if (cnt == 1)
                    dofree = 1;
                break;
            }
        }

        if (dofree) {
            /*
             * Do the free, should be safe.
             */
            switch (m->m_ext.ext_type) {
.................................
Other fine code that does the freeing...
..................................

Now, in 7.0 we have:

-------------------------------------------
    void
    mb_free_ext(struct mbuf *m)
    {
        KASSERT((m->m_flags & M_EXT) == M_EXT, ("%s: M_EXT not set", __func__));
        KASSERT(m->m_ext.ref_cnt != NULL, ("%s: ref_cnt not set", __func__));

        /* Free attached storage if this mbuf is the only reference to it. */
        if (*(m->m_ext.ref_cnt) == 1 ||
            atomic_fetchadd_int(m->m_ext.ref_cnt, -1) == 0) {
            switch (m->m_ext.ext_type) {
-------------------------------------
Other stuff that does the freeing
-------------------------------------

This new code is broken, I am sad to say. I have spent a LARGE amount
of time hunting an mbuf leak, and this is where I have traced it to.

Here is what I see happening. I have two Xeon PIV machines (Dell
servers) with Hyper-Threading. One runs 6.1-RELEASE, the other
7.0-CURRENT.

I am playing with SCTP, and when I run netpipe the 7.0 machine leaks
mbufs. I have traced it down to the fact that SCTP uses finer-grained
locks: the input processing is happening on one CPU while, on the
other CPU, the reader (netpipe) is reading the data. This causes
m_free() to be called on the same EXTs at about the same time, and
they don't get freed.

To prove that it was NOT the SCTP code, I went in and put a new mutex
around the actual m_free and m_freem code, letting only one caller in
at a time to free, and all my leaks went away. (My performance was
also dragged down a lot, but that's beside the point.)

So, has anyone else seen this? Or is SCTP the only one that is truly
exercising this bug?

I am thinking about restoring the old code, since it appears to work.

Any comments or help would be appreciated.

Thanks

R

-- 
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 815-342-5222 (cell)