From: Lawrence Stewart
Date: Wed, 23 May 2007 11:25:15 +1000
To: Dag-Erling Smørgrav
Cc: freebsd-hackers@freebsd.org
Subject: Re: Writing a plain text file to disk from kernel space
Message-ID: <465397FB.9080309@room52.net>
In-Reply-To: <86r6p9md2n.fsf@dwp.des.no>

Comments inline...

Dag-Erling Smørgrav wrote:
> Lawrence Stewart writes:
>
>> After further investigation, it turns out that the pfil input hook I'm
>> using, which catches packets as they traverse up the network stack,
>> has no problems, and will happily write to the file using the
>> kio_write function. However, in the pfil output hook, a call to
>> kio_write causes a hard reset, with the following text shown on tty0:
>>
>> Sleeping thread (tid 100069, pid 613) owns a non-sleepable lock
>> panic: sleeping thread
>>
>
> This is a panic, not a hard reset.
>
> Since you are writing kernel code, I assume you have KDB/DDB in your
> kernel and know how to use it.
>

I don't know how to use them really. Thus far I haven't had a need for
really low level debugging tools... seems that may have changed though!
Any good tutorials/pointers on how to get started with kernel debugging?

>
>> If I comment out the kio_write code and put a printf instead, there
>> are no such problems, so it seems the kio_write function is doing
>> something that is upsetting the kernel, but only when called from a
>> function that is acting as a pfil output hook? Strikes me as odd
>> behaviour. I don't understand which thread the error is in relation
>> to, why that thread is sleeping or which lock it is referring to.
>>
>
> kio_write probably blocks waiting for the write to complete. You can't
> do that while holding a non-sleepable lock.
>

So this is where my knowledge/understanding gets very hazy...

When a thread blocks waiting for some operation to complete or event to
happen, the thread effectively goes to sleep, correct?

Looking at the kio_write code in subr_kernio.c, I'm guessing the lock
that is causing the trouble is related to the "vn_lock" function call?

I don't understand though why the vnode lock would be set up in such a
way that when the write blocks whilst waiting for the underlying
filesystem to signal everything is ok, it causes the kernel to panic!
How do I make the lock "sleepable" or make sure the thread doesn't try
to go to sleep whilst holding the lock?

>
>> I tried wrapping the call to kio_write in a mutex, in case there was a
>> race condition caused by multiple threads trying to write to the file
>> at the one time, but that hasn't made a difference at all.
>>
>
> It complains about sleeping with a non-sleepable lock held, and your
> solution is to add another non-sleepable lock?
>

I didn't realise and don't understand why a mutex is considered a
non-sleepable lock? Reading the mutex man page, it seems clear that
creation of a standard mutex can indeed allow an interrupt or other
kernel event to preempt the current thread holding the mutex, and
therefore allow the thread to sleep whilst the higher priority event is
handled? Doesn't sound like it's non-sleepable to me, but I could very
well be misunderstanding the terminology.

All of that aside, why don't I get a "sleeping thread" panic when only
the pfil input hook is put in place? If I was getting the panic when
either an input or output hook was set, I wouldn't be so perplexed. But
the fact that I only see this panic behaviour when the output hook
(catching packets travelling down the network stack) is installed
doesn't seem to add up. Any ideas?

Cheers,
Lawrence
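
On the question of how to avoid sleeping while a non-sleepable lock is
held: one common pattern is for the hook to copy its record into a
preallocated queue under a short spin-style lock, and to leave the
blocking write to a separate context that holds no such lock. What
follows is a minimal userspace C model of that pattern, not FreeBSD
kernel code. struct logq, logq_enqueue and logq_drain are hypothetical
names, the C11 atomic_flag spin lock merely stands in for a
non-sleepable kernel lock, and stdio stands in for kio_write.

```c
/* Userspace model only: every name here is hypothetical, not a FreeBSD API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define LOGQ_SLOTS  8   /* capacity of the ring */
#define LOGQ_MSGLEN 64  /* bytes per record, including the terminating NUL */

struct logq {
    atomic_flag lock;                    /* stand-in for a non-sleepable lock */
    char msgs[LOGQ_SLOTS][LOGQ_MSGLEN];  /* preallocated record storage */
    unsigned head;                       /* index of the oldest queued record */
    unsigned count;                      /* number of queued records */
    unsigned dropped;                    /* records rejected because the ring was full */
};

static void logq_init(struct logq *q)
{
    memset(q, 0, sizeof(*q));
    atomic_flag_clear(&q->lock);
}

static void logq_lock(struct logq *q)
{
    /* Busy-wait until the flag is acquired; the caller is never put to sleep. */
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;
}

static void logq_unlock(struct logq *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/*
 * "Hook" side: copy the record into the ring and return. No file I/O is
 * done here. Returns 0 on success, or -1 if the ring is full and the
 * record was dropped.
 */
static int logq_enqueue(struct logq *q, const char *msg)
{
    int rc = -1;

    logq_lock(q);
    if (q->count < LOGQ_SLOTS) {
        unsigned tail = (q->head + q->count) % LOGQ_SLOTS;
        snprintf(q->msgs[tail], LOGQ_MSGLEN, "%s", msg);  /* truncates long input */
        q->count++;
        rc = 0;
    } else {
        q->dropped++;
    }
    logq_unlock(q);
    return rc;
}

/*
 * "Writer" side: runs in a context that is allowed to block. Each record
 * is copied out under the lock, and the lock is released before the
 * write. Returns the number of records written.
 */
static unsigned logq_drain(struct logq *q, FILE *out)
{
    char buf[LOGQ_MSGLEN];
    unsigned written = 0;

    for (;;) {
        logq_lock(q);
        assert(q->count <= LOGQ_SLOTS);
        if (q->count == 0) {
            logq_unlock(q);
            break;
        }
        memcpy(buf, q->msgs[q->head], LOGQ_MSGLEN);
        q->head = (q->head + 1) % LOGQ_SLOTS;
        q->count--;
        logq_unlock(q);

        /* The lock is no longer held, so a blocking write is harmless here. */
        fprintf(out, "%s\n", buf);
        written++;
    }
    return written;
}
```

In a kernel module the drain side would presumably run in a dedicated
kernel thread or similar deferred context. That wiring is not shown, and
whether it suits this particular module is an assumption.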