Date:      Mon, 28 Feb 2005 16:04:36 -0800
From:      Kris Kennaway <kris@obsecurity.org>
To:        net@FreeBSD.org
Cc:        sparc64@FreeBSD.org
Subject:   Race condition in mb_free_ext()?
Message-ID:  <20050301000436.GA33346@xor.obsecurity.org>

I'm seeing an easily-provoked livelock on quad-CPU sparc64 machines
running RELENG_5.  It's hard to get a good trace because the processes
running on other CPUs cannot be traced from DDB, but I've been lucky a
few times:

db> show alllocks
Process 15 (swi1: net) thread 0xfffff8001fb07480 (100008)
exclusive sleep mutex so_snd r = 0 (0xfffff800178432a8) locked @ netinet/tcp_input.c:2189
exclusive sleep mutex inp (tcpinp) r = 0 (0xfffff800155c3b08) locked @ netinet/tcp_input.c:744
exclusive sleep mutex tcp r = 0 (0xc0bdf788) locked @ netinet/tcp_input.c:617
db> wh 15
Tracing pid 15 tid 100008 td 0xfffff8001fb07480
sab_intr() at sab_intr+0x40
psycho_intr_stub() at psycho_intr_stub+0x8
intr_fast() at intr_fast+0x88
-- interrupt level=0xd pil=0 %o7=0xc01a0040 --
mb_free_ext() at mb_free_ext+0x28
sbdrop_locked() at sbdrop_locked+0x19c
tcp_input() at tcp_input+0x2aa0
ip_input() at ip_input+0x964
netisr_processqueue() at netisr_processqueue+0x7c
swi_net() at swi_net+0x120
ithread_loop() at ithread_loop+0x24c
fork_exit() at fork_exit+0xd4
fork_trampoline() at fork_trampoline+0x8
db>

That code is here in mb_free_ext():

        /*
         * This is tricky.  We need to make sure to decrement the
         * refcount in a safe way but to also clean up if we're the
         * last reference.  This method seems to do it without race.
         */
        while (dofree == 0) {
                cnt = *(m->m_ext.ref_cnt);
                if (atomic_cmpset_int(m->m_ext.ref_cnt, cnt, cnt - 1)) {
                        if (cnt == 1)
                                dofree = 1;
                        break;
                }
        }

mb_free_ext+0x24:       casa            0x4 , %g2, %g1
mb_free_ext+0x28:       subcc           %g1, %g2, %g0

which is inside the inlined atomic_cmpset_int() (casa is the sparc64
compare-and-swap instruction), so the thread is probably spinning in
the while loop.
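
For anyone who wants to experiment with the decrement logic outside the
kernel, here is a minimal userland sketch of the same pattern using C11
<stdatomic.h>.  The names struct ext_ref, ext_ref_release_cas() and
ext_ref_release_sub() are hypothetical and are not in the tree, and C11
atomics are only a stand-in for atomic_cmpset_int(), so this will not
reproduce anything specific to the sparc64 implementation.  The second
function shows the same release done with a single fetch-and-subtract,
purely for comparison:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for the mbuf external-storage refcount. */
struct ext_ref {
        atomic_uint ref_cnt;
};

/*
 * Drop one reference with a compare-and-swap retry loop, mirroring the
 * structure of the loop in mb_free_ext().  Returns true if the caller
 * dropped the last reference and should free the storage.
 */
static bool
ext_ref_release_cas(struct ext_ref *r)
{
        unsigned int cnt = atomic_load(&r->ref_cnt);

        for (;;) {
                assert(cnt > 0);        /* catch refcount underflow */
                /* On failure, cnt is updated to the value currently stored. */
                if (atomic_compare_exchange_weak(&r->ref_cnt, &cnt, cnt - 1))
                        return (cnt == 1);
        }
}

/*
 * Same result with one atomic fetch-and-subtract, which has no retry
 * loop at the C level.
 */
static bool
ext_ref_release_sub(struct ext_ref *r)
{
        unsigned int old = atomic_fetch_sub(&r->ref_cnt, 1);

        assert(old > 0);                /* catch refcount underflow */
        return (old == 1);
}
```

Both functions return true only for the caller that takes the count from
1 to 0.  In the CAS version each retry works from the value the failed
compare-and-swap observed, so the loop can only keep spinning while
other CPUs keep changing the counter.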

Can anyone see if there's a problem with this code, or perhaps the
sparc64 implementation of atomic_cmpset_int()?

Kris


