Date:      Thu, 16 Jul 2015 16:13:27 +0200
From:      Philippe Jalaber <pjalaber@gmail.com>
To:        freebsd-hackers@freebsd.org
Cc:        jhb@freebsd.org, attilio@freebsd.org
Subject:   Re: adaptive rwlock deadlock
Message-ID:  <CA%2Bi3By%2BYeWgXLzPYVcU5dRpqB%2BEvGaQJ9AhWj4TAoqQiEaaGSw@mail.gmail.com>
In-Reply-To: <CA%2Bi3ByK8TLb6cRCw3dJgGYCb81ENE=HrgsDX%2BMM-=yVn8P1hgg@mail.gmail.com>
References:  <CA%2Bi3ByK8TLb6cRCw3dJgGYCb81ENE=HrgsDX%2BMM-=yVn8P1hgg@mail.gmail.com>

2015-07-07 12:10 GMT+02:00 Philippe Jalaber <pjalaber@gmail.com>:

> Hi,
>
> I am facing a strange problem with the network stack and adaptive rwlocks
> on FreeBSD 9.3.
> Basically, I can reproduce the problem with 3 threads:
>
> 1) Thread 1 holds the rwlock of the inpcb structure in exclusive mode in
> tcp_input.c. This thread also runs my own code, which repeatedly takes a
> rwlock (called g_rwlock) in shared mode and releases it until a shared
> object is no longer marked "busy":
>
> wlock(inp_lock);
> ....
> do { // thread busy-waits in this loop
>     rlock(g_rwlock);
>     o = find();
>     if ( o == NULL )
>         break;
>     busy = o->busy;
>     if ( busy )
>         runlock(g_rwlock);
> } while ( busy );
>
> if ( o != NULL )
> {
>     // do something with o
>     ....
> }
> runlock(g_rwlock);
> ....
>
> 2) Thread 2 wants to mark the shared object as "ready" (no longer busy),
> so it tries to take g_rwlock in exclusive mode and blocks in
> _rw_wlock_hard@kern_rwlock.c:815
> "turnstile_wait(ts, rw_owner(rw), TS_EXCLUSIVE_QUEUE)" because thread 1
> already holds it in shared mode:
>
> wlock(g_rwlock);
> o = find();
> if ( o != NULL )
>     o->busy = 1;
> wunlock(g_rwlock);
>
> // o is busy so work on it without any lock
> ....
>
> wlock(g_rwlock); // thread is blocked here
> o->busy = 0;
> maybe_delete(o);
> wunlock(g_rwlock);
>
> 3) Thread 3 spins on the same inpcb rwlock as thread 1, in
> _rw_wlock_hard@kern_rwlock.c:721 "while ((struct
> thread*)RW_OWNER(rw->rw_lock) == owner && TD_IS_RUNNING(owner)) "
>
>
> My target machine has two CPUs.
> Thread 1 is pinned to CPU 0.
> Thread 2 and thread 3 are pinned to CPU 1.
> Thread 1 and thread 2 have a priority of 28.
> Thread 3 has a priority of 127.
>
> Now what seems to happen is that when thread 1 calls runlock(g_rwlock), it
> calls turnstile_broadcast@kern_rwlock.c:650, but thread 2 never regains
> control because thread 3 is spinning on the inpcb rwlock. Also, the
> condition TD_IS_RUNNING(owner) is always true because thread 1 is
> busy-waiting in its loop. So the 3 threads deadlock.
> Note that if I compile the kernel without adaptive rwlocks, it works
> without any problem.
> A workaround is to add a call to "sched_relinquish(curthread)" in thread 1,
> inside the loop just after the call to runlock.
>
> I am also wondering about the code in _rw_runlock after
> "turnstile_broadcast(ts, queue)". Isn't the RW_LOCK_WRITE_WAITERS flag
> permanently lost if the other thread, which is blocked in turnstile_wait,
> never regains control?
>
> Thank you for your time,
> Regards,
> Philippe
>
>
The sched_relinquish workaround does not seem to work every time.
One possible solution (which seems to work) is for thread 1 to take the
lock in shared mode on its first pass and, if the busy flag is set, to take
it in exclusive mode on every later pass, like this:

shared_count = 0;
wlock(inp_lock);
....
do { // thread busy-waits in this loop
    // first pass takes the lock shared; retries take it exclusive
    if ( shared_count == 0 )
        rlock(g_rwlock);
    else
        wlock(g_rwlock);
    o = find();
    if ( o == NULL )
        break;
    busy = o->busy;
    if ( busy )
    {
        if ( shared_count == 0 )
            runlock(g_rwlock);
        else
            wunlock(g_rwlock);
        shared_count++;
    }
} while ( busy );

if ( o != NULL )
{
    // do something with o
    ....
}
if ( shared_count == 0 )
    runlock(g_rwlock);
else
    wunlock(g_rwlock);


With the loop above, the deadlock no longer happens, but I don't really see
why. Any ideas?

Thanks,
Philippe


