Date: Wed, 24 Apr 2002 13:30:01 -0400 (EDT)
From: John Baldwin <jhb@FreeBSD.org>
To: Matthew Jacob <mjacob@feral.com>
Cc: hackers@FreeBSD.org
Subject: RE: mutex owned stuff fallible?
Message-ID: <XFMail.20020424133001.jhb@FreeBSD.org>
In-Reply-To: <Pine.BSF.4.21.0204240912560.60421-100000@beppo>
On 24-Apr-2002 Matthew Jacob wrote:
>
>
> On Wed, 24 Apr 2002, John Baldwin wrote:
>
>>
>> On 24-Apr-2002 Matthew Jacob wrote:
>> >
>> > This is a recent i386 SMP kernel:
>> >
>> >
>> > panic: mutex isp not owned at ../../../kern/kern_synch.c:449
>> > cpuid = 0; lapic.id = 00000000
>> > Debugger("panic")
>> > Stopped at Debugger+0x41: xorl %eax,%eax
>> > db>
>> > db> t
>> > Debugger(c031189a) at Debugger+0x41
>> > panic(c0310ae8,c030470d,c0312018,1c1,d2d08438) at panic+0xd8
>> > _mtx_assert(d2d0843c,9,c0312018,1c1,69) at _mtx_assert+0x59
>> > msleep(d2d08438,d2d0843c,4c,c0301260,7d0) at msleep+0x157
>> > isp_mboxcmd(d2d08400,d2d19c04,f,d07dee8,0) at isp_mboxcmd+0x19c
>> > isp_fw_state(d2d08400,d2d19c54,d2d08400,d2d09000,d2d08400) at
>> > isp_fw_state+0x2b
>> > isp_fclink_test(d2d08400,1e8480,d2d08400,d2d09000,d2d0843c) at
>> > isp_fclink_test+0x5d
>> > isp_control(d2d08400,4,d2d19d18) at isp_control+0x28b
>> > isp_kthread(d2d08400,d2d19d48,d2d02a3c,c017b25c,0) at isp_kthread+0x6d
>> > fork_exit(c017b25c,d2d08400,d2d19d48) at fork_exit+0x88
>> > fork_trampoline() at fork_trampoline+0x37
>>
>> Is this code that is checked into the tree?
>
> Yes.
>
>> If so I can't see where
>> isp_kthread() calls isp_control().
>
> isp_fc_runstate is an inline that calls isp_control.

Ah, ok.

>> mtx_owned() should always work. If
>> we own the lock then we were the last to write to it, so the value in our
>> cache can't be stale (at least, not the thread value, the contested bit
>> could be set by another CPU, but we mask off that bit when reading the
>> owner, so it's value doesn't matter). If we don't own the lock, it's
>> value but we don't care so long as we don't get a false positive. Since
>> we would have to write out the unowned cookie before another lock could
>> grab it though, we would at least have a value that up to date, so we
>> wouldn't read a stale value that had us owning the lock when we didn't.
>
> This pp is hard to parse, but I think we're in agreement that this occurrence
> is 'inconceivable'.

Yes.

> I am *very* puzzled.

Me, too. The next time this happens, try dumping the contents of the mutex
structure from ddb. The first argument to mtx_assert() and 2nd arg to
msleep() is a pointer to the mutex, so you have the address. (The pointer
looks right since the name was right in the panic message at least.) The
first bits of the structure will be a struct lock_object which contains
3 pointers, an int, and then 2 more pointers. The next word will be the
actual lock contents. You can use 'show pcpu' to get the per-CPU
information containing (among other things) curthread. The value of the
lock should be curthread (possibly with bits 1 or 2 set). If it is 0x4
(MTX_UNOWNED) it means the lock was released somehow. If that is the case,
you can compile KTR into your kernel with lock tracing using:

options KTR
options KTR_COMPILE=KTR_LOCK
options KTR_MASK=KTR_LOCK

Then when it breaks do a 'show ktr' in ddb to get a trace of the most
recent lock operations. You might want to turn on KTR_PROC as well
(s/KTR_LOCK/(KTR_LOCK|KTR_PROC)/ above) so that you see when we switch
processes so it is less confusing. This info might be useful to look at
anyways.

Hmm, I wonder if the mutex is recursed and mtx_assert() isn't printing the
right error message? Hmm, nope.

--

John Baldwin <jhb@FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
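
As a rough illustration of the lock-word check described above (read the word that follows the struct lock_object, mask off the low flag bits, compare the rest against curthread), here is a minimal userland sketch. It is not the kernel's code. The SK_ names and sk_classify() are made up for this example. The 0x4 unowned value comes from the message above; the flag values 0x1 and 0x2 are an assumed reading of "bits 1 or 2" and should be checked against sys/mutex.h in your tree.

```c
#include <stdint.h>

/*
 * Hypothetical names for illustration only. 0x4 is the MTX_UNOWNED value
 * quoted in the message; 0x1 and 0x2 are assumed flag values.
 */
#define SK_FLAG_RECURSED   ((uintptr_t)0x1)
#define SK_FLAG_CONTESTED  ((uintptr_t)0x2)
#define SK_UNOWNED_COOKIE  ((uintptr_t)0x4)
#define SK_FLAG_BITS       (SK_FLAG_RECURSED | SK_FLAG_CONTESTED)

enum sk_lock_state {
    SK_UNOWNED,         /* lock word holds the unowned cookie */
    SK_OWNED_BY_US,     /* owner field matches curthread */
    SK_OWNED_BY_OTHER   /* some other thread pointer is stored */
};

/*
 * Classify a lock word read from ddb against the curthread value
 * reported by 'show pcpu'. The flag bits are masked off first, so a
 * contested or recursed lock still compares equal to its owner.
 */
static enum sk_lock_state
sk_classify(uintptr_t lock_word, uintptr_t curthread)
{
    uintptr_t owner = lock_word & ~SK_FLAG_BITS;

    if (owner == SK_UNOWNED_COOKIE)
        return SK_UNOWNED;
    if (owner == curthread)
        return SK_OWNED_BY_US;
    return SK_OWNED_BY_OTHER;
}
```

If the word read from ddb classifies as SK_UNOWNED, that matches the "lock was released somehow" case, and the KTR lock trace is the next thing to look at.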