From owner-freebsd-current@FreeBSD.ORG Sat Aug 23 07:14:04 2008
Date: Sat, 23 Aug 2008 15:14:02 +0800 (CST)
From: kevin
To: "John Baldwin"
Cc: freebsd-current@freebsd.org
Subject: Re: [BUG] I think sleepqueue need to be protected in sleepq_broadcast
Message-ID: <32042518.170381219475642395.JavaMail.coremail@bj163app131.163.com>
In-Reply-To: <200808230003.44081.jhb@freebsd.org>
References: <200808230003.44081.jhb@freebsd.org> <11617822.2511219426408994.JavaMail.coremail@bj163app64.163.com>
List-Id: Discussions about the use of FreeBSD-current

>On Friday 22 August 2008 01:33:28 pm kevinxlinuz wrote:
>> Hi,
>> I'm looking into the problem (amd64/124200: kernel panic on mutex sleepq
>> chain). It has troubled me for a long time. I added a KASSERT in
>> sleepq_broadcast() to check the sleepqueue's wait channel. It turns out
>> that the sleepqueue's wait channel was changed before
>> sleepq_resume_thread(). In sleepq_lookup(), we can easily see that
>> sq->sq_wchan == wchan. But a short time later sq->sq_wchan is no longer
>> equal to wchan, so I think it was changed by another thread.
>
>The sleepq chain lock is already held for all of sleepq_broadcast() by the
>caller (see wakeup() and cv_broadcastpri()). That said, I don't have any
>other good ideas for the panic you are seeing. Do you have a crash dump? It
>might be interesting to see what other thread is using that sleep queue.
>

Sorry, dumping on panic does not work well for me. My system has 4G of
memory but only 1.6G of swap, and when I try to get a core dump the machine
eventually freezes. I can easily reproduce the panic, though. Here is some
of my panic info. Without the KASSERT in sleepq_broadcast(), it panics in
sleepq_resume_thread().

db> show thread 100069
Thread 100069 at 0xffffff0004c73000:
 proc (pid 153): 0xffffff0004c6a860
 name: txg_thread_enter
 stack: 0xfffffffea603c000-0xfffffffea603efff
 flags: 0x4  pflags: 0x200000
 state: RUNNING (CPU 1)
 priority: 120
 container lock: sched lock 1 (0xffffffff809a7300)

db> show lock 0xffffffff809a7300
 class: spin mutex
 name: sched lock 0
 flags: {SPIN, RECURSE}
 state: {UNOWNED}

db> show thread 100082          (thread on another CPU)
Thread 100082 at 0xffffff0004c76700:
 proc (pid 152): 0xffffff0004c89430
 name: txg_thread_enter
 stack: 0xfffffffea5f9c000-0xfffffffea5f9ffff
 flags: 0x4  pflags: 0x200000
 state: RUNNING (CPU 0)
 wmesg: tx-tx_sync_lock  wchan: 0xffffff0004e095b8
 priority: 160
 container lock: sched lock 0 (0xffffffff809a6700)

db> show lock 0xffffff0004e095b8
 class: sx
 name: tx-tx_sync_lock
 state: XLOCK: 0xffffff0004c73000 (tid 100069, pid 153, "txg_thread_enter")
 waiters: exclusive

db> bt 100069
Tracing pid 153 tid 100069 td 0xffffff0004c73000
kdb_enter() at kdb_enter+0x3d
panic() at panic+0x16c
assert_mtx() at assert_mtx
sleepq_resume_thread() at sleepq_resume_thread+0x96
sleepq_broadcast() at sleepq_broadcast+0x85
cv_broadcastpri() at cv_broadcastpri+0x3f
txg_sync_thread() at txg_sync_thread+0x4b4
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xfffffffea603ed30, rbp = 0 ---

db> bt 100082
..
_thread_lock_flags() at _thread_lock_flags+0xc9
sleepq_wait() at sleepq_wait+0x3b
_sx_xlock_hard() at _sx_xlock_hard+0x1a2
_sx_xlock() at _sx_xlock+0xa0
_cv_wait() at _cv_wait+0x1de
txg_thread_wait() at txg_thread_wait+0x7d
txg_quiesce_thread() at txg_quiesce_thread+0xb5
fork_exit() at fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xfffffffea5f9fd30, rbp = 0 ---

If I increase the swap size to 4G, will the core dump work correctly?

>> sleepq_broadcast(void *wchan, int flags, int pri, int queue)
>> {
>>         struct sleepqueue *sq;
>>         struct thread *td;
>>         int wakeup_swapper;
>>
>>         CTR2(KTR_PROC, "sleepq_broadcast(%p, %d)", wchan, flags);
>>         KASSERT(wchan != NULL, ("%s: invalid NULL wait channel",
>>             __func__));
>>         MPASS((queue >= 0) && (queue < NR_SLEEPQS));
>>         sq = sleepq_lookup(wchan);
>>         if (sq == NULL)
>>                 return (0);
>>         KASSERT(sq->sq_type == (flags & SLEEPQ_TYPE),
>>             ("%s: mismatch between sleep/wakeup and cv_*", __func__));
>>
>>         /* Resume all blocked threads on the sleep queue. */
>>         wakeup_swapper = 0;
>>         while (!TAILQ_EMPTY(&sq->sq_blocked[queue])) {
>>                 td = TAILQ_FIRST(&sq->sq_blocked[queue]);
>>                 thread_lock(td);
>>                 /* test */
>>                 KASSERT(sq->sq_wchan == wchan,
>>                     ("%s:mismatch between wchan and sq_wchan in sq",
>>                     __func__)); /* I find the panic here */
>>                 if (sleepq_resume_thread(sq, td, pri))
>>                         wakeup_swapper = 1;
>>                 thread_unlock(td);
>>         }
>>         return (wakeup_swapper);
>> }
>>
>> Thanks,
>> kevin 2008/08/23
>>
>> _______________________________________________
>> freebsd-current@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>
>--
>John Baldwin

--
Thanks,
kevin
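
For readers following the thread, here is a minimal userspace sketch of the
invariant under discussion: the chain lock for a wait channel must be held
by the caller for the whole broadcast, and the queue's wait channel must
still match on every wakeup (the check the added KASSERT performs). This is
only a model, not the kernel implementation. Every model_* name, the
MODEL_* return codes, the table size, the hash, and the "one queue per
chain" simplification are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_TABLESIZE 8               /* hypothetical number of chains */

enum { MODEL_ENOLOCK = -1, MODEL_EWCHAN = -2 };

struct model_sleepq {
        const void *sq_wchan;           /* channel this queue currently serves */
        int sq_waiters;                 /* count of blocked "threads" */
};

struct model_chain {
        bool locked;                    /* stands in for the chain spin mutex */
        struct model_sleepq sq;         /* model: a single queue per chain */
};

static struct model_chain model_chains[MODEL_TABLESIZE];

/* Map a wait channel to its chain (hash chosen arbitrarily for the model). */
static struct model_chain *
model_chain_for(const void *wchan)
{
        return (&model_chains[((uintptr_t)wchan >> 4) % MODEL_TABLESIZE]);
}

static void
model_sleepq_lock(const void *wchan)
{
        model_chain_for(wchan)->locked = true;
}

static void
model_sleepq_release(const void *wchan)
{
        model_chain_for(wchan)->locked = false;
}

/* Queue one waiter on wchan; the caller must hold the chain lock. */
static int
model_sleepq_add(const void *wchan)
{
        struct model_chain *sc = model_chain_for(wchan);

        if (!sc->locked)
                return (MODEL_ENOLOCK);
        if (sc->sq.sq_waiters > 0 && sc->sq.sq_wchan != wchan)
                return (MODEL_EWCHAN);  /* model limit: one channel per chain */
        sc->sq.sq_wchan = wchan;
        sc->sq.sq_waiters++;
        return (0);
}

/* Wake all waiters on wchan; returns the number woken or a MODEL_* error. */
static int
model_sleepq_broadcast(const void *wchan)
{
        struct model_chain *sc = model_chain_for(wchan);
        int woken = 0;

        if (!sc->locked)
                return (MODEL_ENOLOCK); /* the caller was supposed to lock */
        if (sc->sq.sq_waiters == 0 || sc->sq.sq_wchan != wchan)
                return (0);             /* lookup found no queue for wchan */
        while (sc->sq.sq_waiters > 0) {
                /* Same condition as the KASSERT added in the quoted code. */
                if (sc->sq.sq_wchan != wchan)
                        return (MODEL_EWCHAN);
                sc->sq.sq_waiters--;
                woken++;
        }
        return (woken);
}
```

In this single-threaded model the per-wakeup check can never fail, which is
the point of John's remark: with the chain lock held across the whole loop,
nothing else should be able to rebind the queue. Calling
model_sleepq_lock(&x), model_sleepq_add(&x) twice, then
model_sleepq_broadcast(&x) wakes both waiters, while a broadcast after
model_sleepq_release(&x) reports MODEL_ENOLOCK.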