Date:      Tue, 7 Jul 2015 12:10:19 +0200
From:      Philippe Jalaber <pjalaber@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   adaptive rwlock deadlock
Message-ID:  <CA%2Bi3ByK8TLb6cRCw3dJgGYCb81ENE=HrgsDX%2BMM-=yVn8P1hgg@mail.gmail.com>

Hi,

I am facing a strange problem with the network stack and adaptive rwlocks
on FreeBSD 9.3.
I can reproduce the problem with 3 threads:

1) thread 1 has taken the rwlock of structure inpcb in exclusive mode in
tcp_input.c. This thread also runs my own code and repeatedly takes a
rwlock (called g_rwlock) in shared mode and releases it, until a shared
object is marked not "busy" any more:

wlock(inp_lock); // inpcb rwlock, held in exclusive mode
....
do { // thread busy-waits in this loop
    rlock(g_rwlock);
    o = find();
    if ( o == NULL )
        break;
    busy = o.busy;
    if (o != NULL && busy)
        runlock(g_rwlock);
} while ( busy );

if ( o != NULL )
{
    // do something with o
    ....
}
runlock(g_rwlock);
....

2) thread 2 wants to mark the shared object as "ready" (no longer busy). So
it tries to take g_rwlock in exclusive mode and blocks in
_rw_wlock_hard@kern_rwlock.c:815
"turnstile_wait(ts, rw_owner(rw), TS_EXCLUSIVE_QUEUE)" because thread 1
already holds it in shared mode:

wlock(g_rwlock);
o = find();
if ( o != NULL )
    o.busy = 1;
wunlock(g_rwlock);

// o is busy so work on it without any lock
....

wlock(g_rwlock); // thread is blocked here
o.busy = 0;
maybe_delete(o);
wunlock(g_rwlock);

3) thread 3 spins on the same inpcb rwlock as thread 1 in
_rw_wlock_hard@kern_rwlock.c:721 "while ((struct
thread*)RW_OWNER(rw->rw_lock) == owner && TD_IS_RUNNING(owner))"


My target machine has two CPUs.
Thread 1 is pinned to cpu 0.
Thread 2 and Thread 3 are pinned to cpu 1.
Thread 1 and Thread 2 have a priority of 28.
Thread 3 has a priority of 127.

Now what seems to happen is that when thread 1 calls runlock(g_rwlock), it
calls turnstile_broadcast@kern_rwlock.c:650, but thread 2 never regains
control because thread 3 is spinning on the inpcb rwlock. Also, the
condition TD_IS_RUNNING(owner) is always true because thread 1 is
busy-waiting in its loop. So the 3 threads deadlock.
Note that if I compile the kernel without adaptive rwlocks it works without
any problem.
A workaround is to add a call to "sched_relinquish(curthread)" in thread 1
in the loop just after the call to runlock.

I am also wondering about the code in _rw_runlock after
"turnstile_broadcast(ts, queue)". Isn't the flag RW_LOCK_WRITE_WAITERS
permanently lost if the other thread, which is blocked in turnstile_wait,
never regains control?

Thank you for your time,
Regards,
Philippe


