From owner-freebsd-threads@FreeBSD.ORG Wed Oct 27 22:30:17 2004 Return-Path: Delivered-To: freebsd-threads@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 235C516A4CE; Wed, 27 Oct 2004 22:30:17 +0000 (GMT) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id AA86943D2D; Wed, 27 Oct 2004 22:30:16 +0000 (GMT) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) i9RMUFKw022657; Wed, 27 Oct 2004 18:30:15 -0400 (EDT) Date: Wed, 27 Oct 2004 18:30:15 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: John Baldwin In-Reply-To: <200410271429.14164.jhb@FreeBSD.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) cc: threads@freebsd.org Subject: Re: Infinite loop bug in libc_r on 4.x with condition variables and signals X-BeenThere: freebsd-threads@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Daniel Eischen List-Id: Threading on FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Oct 2004 22:30:17 -0000 On Wed, 27 Oct 2004, John Baldwin wrote: > On Thursday 21 October 2004 07:04 pm, Daniel Eischen wrote: > > On Thu, 21 Oct 2004, John Baldwin wrote: > > > The behavior seems more to be this: > > > > > > - thread does pthread_cond_wait*(c1) > > > - thread enqueued on c1 > > > - thread interrupted by a signal while on c1 but still in PS_RUNNING > > > > This shouldn't happen when signals are deferred. It should > > only happen when the state is PS_COND_WAIT after we've > > context switched to the scheduler. > > > > > - thread saves state which excludes the PTHREAD_FLAGS_IN_CONDQ flag > > > (among others) > > > > Right, because it assumes that the thread will be backed out of > > any mutex or CV queues prior to invoking the signal handler. > > > > > - thread calls _cond_wait_backout() if state is PS_COND_WAIT (but it's > > > not in - this case, this is the normal case though, which is why it's ok > > > to not save the CONDQ flag in the saved state above) > > > > Right. The problem is, how is the thread getting setup for a signal > > while signals are deferred and the state has not yet been changed > > from PS_RUNNING to PS_COND_WAIT? > > > > > - thread executes signal handler > > > - thread restores state > > > - pthread_condwait*() see that interrupted is 0, so don't try to remove > > > the thread from the condition variable (also, PTHREAD_FLAGS_IN_CONDQ > > > isn't set either, so we can't detect this case that way) > > > - thread returns from pthread_cond_wait() (maybe due to timeout, etc.) > > > - thread calls pthread_cond_wait*(c2) > > > - thread enqueued on c2 > > > - another thread does pthread_cond_broadcast(c2), and bewm > > > > > > My question is is it possible for the thread to get interrupted and > > > chosen to run a signal while it is on c1 somehow given my patch to defer > > > signals around the wait loops (and is that patch correct btw given the > > > above scenario?) > > > > Yes (and yes I think). Defering signals just means that the signal handler > > won't try to install a signal frame on the current thread; instead it just > > queues the signal and the scheduler will pick it up and send it to the > > correct thread. > > > > I do think signals should be deferred for condition variables so > > that setting the thread state (to PS_COND_WAIT) is atomic. > > > > It's not obvious to be where the bug is. If you had a simple > > test case to reproduce it that would help. > > FWIW, we are having (I think) the same problem on 5.3 with libpthread. The > panic there is in the mutex code about an assertion failing because a thread > is on a syncq when it is not supposed to be. David and I recently fixed some races in pthread_join() and pthread_exit() in -current libpthread. Don't know if those were responsible... Here's a test program that shows correct behavior with both libc_r and libpthread in -current. -- Dan Eischen #include #include #include #include pthread_mutex_t gm; static void handler(int sig) { printf("Thread %p Got signal %d\n", pthread_self(), sig); } static void * waiter(void *arg) { sigset_t set; sigemptyset(&set); sigaddset(&set, SIGUSR1); sigprocmask(SIG_UNBLOCK, &set, NULL); for (;;) { pthread_mutex_lock(&gm); printf("Waiter locked mutex."); sleep(1); pthread_mutex_unlock(&gm); printf("Waiter unlocked mutex."); sleep(1); } return (NULL); } int main(int argc, char *argv[]) { struct sigaction act; sigset_t set; pthread_t tid; sigemptyset(&set); sigaddset(&set, SIGUSR1); sigprocmask(SIG_BLOCK, &set, NULL); /* Install a handler for SIGUSR1. */ act.sa_handler = handler; act.sa_flags = SA_RESTART; sigfillset(&act.sa_mask); sigaction(SIGUSR1, &act, NULL); pthread_mutex_init(&gm, NULL); pthread_mutex_lock(&gm); pthread_create(&tid, NULL, waiter, NULL); for (;;) { kill(getpid(), SIGUSR1); pthread_yield(); } return (0); }