Date:      Tue, 04 Aug 2015 13:10:50 -0700
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-hackers@freebsd.org
Cc:        Philippe Jalaber <pjalaber@gmail.com>
Subject:   Re: adaptive rwlock deadlock
Message-ID:  <2768515.JZVZhYiQVE@ralph.baldwin.cx>
In-Reply-To: <CA+i3ByK8TLb6cRCw3dJgGYCb81ENE=HrgsDX+MM-=yVn8P1hgg@mail.gmail.com>
References:  <CA+i3ByK8TLb6cRCw3dJgGYCb81ENE=HrgsDX+MM-=yVn8P1hgg@mail.gmail.com>

On Tuesday, July 07, 2015 12:10:19 PM Philippe Jalaber wrote:
> Hi,
> 
> I am facing a strange problem using the network stack and adaptive rwlocks
> running FreeBSD 9.3.
> Basically I can reproduce the problem with 3 threads:
> 
> 1) thread 1 has taken the rwlock of structure inpcb in exclusive mode in
> tcp_input.c. This thread also runs my own code and repeatedly takes a
> rwlock (called g_rwlock) in shared mode and releases it, until a shared
> object is marked not "busy" any more:
> 
> rwlock(inp_lock);
> ....
> do { // thread is active waiting in the loop
>     rlock(g_rwlock);
>     o = find();
>     if ( o == NULL )
>         break;
>     busy = o.busy;
>     if (o != NULL && busy)
>         runlock(g_rwlock);
> } while ( busy );
> 
> if ( o != NULL )
> {
>     // do something with o
>     ....
> }
> runlock(g_rwlock);
> ....
> 
> 2) thread 2 wants to set the shared object as "ready". So it tries to take
> g_rwlock in exclusive mode and is blocked in _rw_wlock_hard@kern_rwlock.c:815
> "turnstile_wait(ts, rw_owner(rw), TS_EXCLUSIVE_QUEUE)" because thread 1 has
> already taken it in shared mode:
> 
> wlock(g_rwlock);
> o = find();
> if ( o != NULL )
>     o.busy = 1;
> wunlock(g_rwlock);
> 
> // o is busy so work on it without any lock
> ....
> 
> wlock(g_rwlock); // thread is blocked here
> o.busy = 0;
> maybe_delete(o);
> wunlock(g_rwlock);
> 
> 3) thread 3 spins on the same inpcb rwlock as thread 1 in
> _rw_wlock_hard@kern_rwlock.c:721 "while ((struct
> thread*)RW_OWNER(rw->rw_lock) == owner && TD_IS_RUNNING(owner)) "
> 
> 
> My target machine has two CPUs.
> Thread 1 is pinned to CPU 0.
> Thread 2 and Thread 3 are pinned to CPU 1.
> Thread 1 and Thread 2 have a priority of 28.
> Thread 3 has a priority of 127.
> 
> Now what seems to happen is that when thread 1 calls runlock(g_rwlock), it
> calls turnstile_broadcast@kern_rwlock.c:650, but thread 2 never regains
> control because thread 3 is spinning on the inpcb rwlock. Also the
> condition TD_IS_RUNNING(owner) is always true because thread 1 is active
> waiting in a loop. So the 3 threads deadlock.
> Note that if I compile the kernel without adaptive rwlocks it works without
> any problem.
> A workaround is to add a call to "sched_relinquish(curthread)" in thread 1
> in the loop just after the call to runlock.

It sounds like we are not forcing a preemption on CPU 1 in this case via
sched_add().

For SCHED_4BSD you could try the 'FULL_PREEMPTION' kernel option.
For ULE you can adjust 'preempt_thresh' on the fly, though I think the
default setting should actually still work.

Can you use KTR or some such to determine if IPI_PREEMPT is being sent by
CPU 0 to CPU 1 in this case?

> I am also wondering about the code in _rw_runlock after
> "turnstile_broadcast(ts, queue)". Isn't the flag RW_LOCK_WRITE_WAITERS
> definitely lost if the other thread which is blocked in turnstile_wait
> never regains control ?

All the write waiters are awakened by a broadcast (as opposed to a signal
operation).  They are on the run queue, not on the turnstile queue anymore,
so there aren't any write waiters left (the bit only tracks if there are
waiters on the turnstile).
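
To make the bookkeeping concrete, here is a toy userspace model of the
difference between a signal and a broadcast.  It is only a sketch, not the
code in kern_rwlock.c or subr_turnstile.c: struct toy_lock, the toy_*
functions and TOY_WRITE_WAITERS are made-up names that stand in for the lock
word, the turnstile queue and RW_LOCK_WRITE_WAITERS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RW_LOCK_WRITE_WAITERS; not the kernel value. */
#define TOY_WRITE_WAITERS 0x1u

/* Toy model only: counters replace the real turnstile and run queue. */
struct toy_lock {
	unsigned flags;    /* waiter bits kept in the "lock word" */
	size_t   blocked;  /* writers still queued on the "turnstile" */
	size_t   runnable; /* writers already handed to the "run queue" */
};

/* A writer blocks: it joins the turnstile and the waiter bit is set. */
static void
toy_block_writer(struct toy_lock *l)
{
	assert(l != NULL);
	l->blocked++;
	l->flags |= TOY_WRITE_WAITERS;
}

/* Signal: wake one writer; the bit stays set while others remain queued. */
static void
toy_signal(struct toy_lock *l)
{
	assert(l != NULL);
	if (l->blocked == 0)
		return;
	l->blocked--;
	l->runnable++;
	if (l->blocked == 0)
		l->flags &= ~TOY_WRITE_WAITERS;
}

/* Broadcast: wake every writer, so the turnstile is empty and the bit clears. */
static void
toy_broadcast(struct toy_lock *l)
{
	assert(l != NULL);
	l->runnable += l->blocked;
	l->blocked = 0;
	l->flags &= ~TOY_WRITE_WAITERS;
}
```

In this model a writer that has been made runnable but has not yet been
scheduled is counted in runnable, not in blocked, so clearing the bit after a
broadcast loses nothing.  Once that writer does get the CPU it retries the
lock and sets the bit again if it has to block.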

-- 
John Baldwin


