Date:      Tue, 10 Jun 2008 21:44:37 +0200
From:      "Attilio Rao" <attilio@freebsd.org>
To:        kevinxlinuz <kevinxlinuz@163.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: mutex sleepq chain not owned at /usr/src/sys/kern/subr_sleepqueue.c
Message-ID:  <3bbf2fe10806101244t6627d759g404df7da58c728e5@mail.gmail.com>
In-Reply-To: <3760478.505011213039827828.JavaMail.coremail@bj163app90.163.com>
References:  <3760478.505011213039827828.JavaMail.coremail@bj163app90.163.com>

2008/6/9, kevinxlinuz <kevinxlinuz@163.com>:
> I recently hit a problem on FreeBSD 8.0/amd64.
>  See PR/124200
>  http://www.freebsd.org/cgi/query-pr.cgi?pr=124200&cat=
>
>  I tried to find the cause.
>
>  cv_broadcastpri(...) calls sleepq_lock(cvp) and then sleepq_broadcast(cvp, SLEEPQ_CONDVAR, pri, 0).
>  In sleepq_broadcast(void *wchan, int flags, int pri, int queue), sq = sleepq_lookup(wchan) checks wchan, so sq->sq_wchan == wchan == cvp (the value passed from cv_broadcastpri()).
>  I added an mtx_assert to sleepq_broadcast() in /usr/src/sys/kern/subr_sleepqueue.c:
>  sleepq_broadcast(void *wchan, int flags, int pri, int queue)
>  {
>         struct sleepqueue *sq;
>         struct thread *td;
>
>         struct sleepqueue_chain *sc;
>
>         CTR2(KTR_PROC, "sleepq_broadcast(%p, %d)", wchan, flags);
>         KASSERT(wchan != NULL, ("%s: invalid NULL wait channel", __func__));
>         MPASS((queue >= 0) && (queue < NR_SLEEPQS));
>         sq = sleepq_lookup(wchan);   /* wchan == cvp, passed from cv_broadcastpri(...) after sleepq_lock(cvp) */
>         /* here sq->sq_wchan == wchan == cvp */
>         if (sq == NULL)
>                 return;
>         KASSERT(sq->sq_type == (flags & SLEEPQ_TYPE),
>             ("%s: mismatch between sleep/wakeup and cv_*", __func__));
>
>         /* Resume all blocked threads on the sleep queue. */
>         while (!TAILQ_EMPTY(&sq->sq_blocked[queue])) {
>                 td = TAILQ_FIRST(&sq->sq_blocked[queue]);
>                 thread_lock(td);
>                 /* ------ test start ------ */
>                 sc = SC_LOOKUP(sq->sq_wchan);   /* sq->sq_wchan should be wchan */
>                 mtx_assert(&sc->sc_lock, MA_OWNED);   /* panics here: is sq->sq_wchan != wchan, or did another thread call sleepq_unlock(wchan)? */
>                 /* ------ test end ------ */
>                 sleepq_resume_thread(sq, td, pri);
>                 thread_unlock(td);
>         }
>  }
>
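
For reference, the added assertion can only fail if the chain found via
sq->sq_wchan is not the chain locked via wchan, or if that chain was
unlocked in between. Below is a minimal userspace sketch of the first
case. It is not the kernel code: the model_* names, the table size, the
shift, and the boolean standing in for sc_lock are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed hash parameters, for illustration only. */
#define MODEL_TABLESIZE 128
#define MODEL_MASK      (MODEL_TABLESIZE - 1)
#define MODEL_SHIFT     8

/* Hypothetical stand-in for a sleep queue chain and its mutex. */
struct model_chain {
	bool held;	/* true while the chain "lock" is taken */
};

static struct model_chain model_chains[MODEL_TABLESIZE];

/* Map a wait channel address to its chain, in the spirit of SC_LOOKUP(). */
static struct model_chain *
model_lookup(uintptr_t wchan)
{
	return (&model_chains[(wchan >> MODEL_SHIFT) & MODEL_MASK]);
}

/* Take the chain lock for a wait channel, like sleepq_lock(wchan). */
static void
model_lock(uintptr_t wchan)
{
	model_lookup(wchan)->held = true;
}

/* Drop the chain lock for a wait channel. */
static void
model_unlock(uintptr_t wchan)
{
	model_lookup(wchan)->held = false;
}

/* Report whether the chain for a wait channel is held. */
static bool
model_owned(uintptr_t wchan)
{
	return (model_lookup(wchan)->held);
}
```

In this model, locking the chain for one wait channel says nothing about
a channel that hashes to a different chain, so an ownership check made
through a different sq_wchan value would fail.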

Hello,
We are trying to track this down, but progress is slow because I
can't reproduce the bug.
I would need you to try some diagnostic patches. Do you think you can
work on that with me? Can you reproduce the bug easily?

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


