Date: Sun, 11 Oct 2009 21:38:33 -0700
From: Harsha <inpcb.harsha@gmail.com>
To: Robert Watson <rwatson@freebsd.org>
Cc: freebsd-current@freebsd.org, net@freebsd.org
Subject: Re: Page fault in IFNET_WLOCK_ASSERT [if.c and pccbb.c]
Message-ID: <e1b1c5880910112138x1b46ff0eo39c10691a978c164@mail.gmail.com>
In-Reply-To: <alpine.BSF.2.00.0910112126050.48605@fledge.watson.org>
References: <e1b1c5880910111226o65e0d1a9va975f4cd837271bb@mail.gmail.com> <alpine.BSF.2.00.0910112126050.48605@fledge.watson.org>

Hi Robert,

On Sun, Oct 11, 2009 at 1:30 PM, Robert Watson <rwatson@freebsd.org> wrote:
> Giant is a bit special in that the long-term sleep code in the kernel knows
> to drop it when sleeping, and re-acquire when waking up. So, unlike all
> other mutexes, it should be OK to hold it in this case, as Giant will simply
> get dropped if the kernel has to sleep waiting on a sleepable lock. This is
> because, historically in FreeBSD 3.x/4.x, the kernel was protected by a
> single spinlock, which would get released whenever the kernel stopped
> executing, such as during an I/O sleep. On the whole, Giant has disappeared
> from the modern kernel, but where it is used, it retains those curious
> historic properties.
>
> To break things down a bit further, IFNET_WLOCK is, itself, a bit special:
> notice that in FreeBSD 8, it's actually two locks, a sleep lock, and a
> mutex, which must both be acquired exclusively to ensure mutual exclusion.
> if_alloc() and associated calls are also sleepable because they perform
> potentially sleeping memory allocation (M_WAITOK), so it's an invariant of
> any code calling interface allocation that it must be able to tolerate a
> sleep.

Thanks a lot for the clarification. I had assumed that the lock was
non-sleepable from this log:

Kernel page fault with the following non-sleepable locks held:
exclusive rw ifnet_rw (ifnet_rw) r = 0 (0xc0f63464) locked @ /usr/src/sys/net/if.c:409

> Do you have a copy of the stack trace and fault information handy? In my
> experience, a NULL pointer deref or other page fault in the locking code for
> a global lock is almost always corrupted thread state, perhaps due to
> tripping over another thread having locked a corrupted/freed/uninitialized
> lock. We might be able to track that down by tracing other threads that
> were in execution at the time of the panic.

I just tried the textdump feature and I think it's an awesome tool.

Here is ddb.txt:
http://docs.google.com/View?id=dddwnxfj_0dh4x58hc

And msgbuf.txt:
http://docs.google.com/View?id=dddwnxfj_1cnmrb8fw

For some reason the output of "show alllocks" is not written into ddb.txt,
even though I have increased the buffer size to 2MB.

Thanks,
Harsha
