From owner-freebsd-current@FreeBSD.ORG Sun Jan 4 15:35:00 2004
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A5F2016A4CE
	for ; Sun, 4 Jan 2004 15:35:00 -0800 (PST)
Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 174B843D49
	for ; Sun, 4 Jan 2004 15:34:52 -0800 (PST)
	(envelope-from truckman@FreeBSD.org)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2])
	by gw.catspoiler.org (8.12.9p2/8.12.9) with ESMTP id i04NYi7E009950;
	Sun, 4 Jan 2004 15:34:48 -0800 (PST)
	(envelope-from truckman@FreeBSD.org)
Message-Id: <200401042334.i04NYi7E009950@gw.catspoiler.org>
From: Don Lewis
To: shoesoft@gmx.net
In-Reply-To: <1073256379.801.24.camel@shoeserv.freebsd>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: current@FreeBSD.org
Subject: Re: page fault panic tracked down (selwakeuppri())
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Date: Sun, 04 Jan 2004 23:35:00 -0000
X-Original-Date: Sun, 4 Jan 2004 15:34:44 -0800 (PST)
X-List-Received-Date: Sun, 04 Jan 2004 23:35:00 -0000

On 4 Jan, Stefan Ehmann wrote:
> On Sun, 2004-01-04 at 23:24, Don Lewis wrote:
>> On 4 Jan, Stefan Ehmann wrote:
>> > I took out the debug options because it was just too slow. Put back
>> > INVARIANTS (but no WITNESS) now and speed is nice again.
>>
>> This problem is more likely to be caught by INVARIANTS than WITNESS.
>>
>> > Applied your suggested changes which resulted in a panic. No
>> > assertations were triggered though.
>>
>> Bummer!
>
> Updated to plain (= no patches/hacks) again, also put in the
> DEBUG_VFS_LOCKS.
>
> For the first time I got a backtrace that ended in the soundcard module
> - So maybe this is the right direction (on the other hand this might be
> some newly introduced error)
>
> panic: bad bufsize
> #0 doadump () at /usr/src/sys/kern/kern_shutdown.c:240
> #1 0xc04e5198 in boot (howto=256) at
> /usr/src/sys/kern/kern_shutdown.c:372
> #2 0xc04e5527 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #3 0xc07ec648 in feed_vchan_s16 () from /boot/kernel/snd_pcm.ko
> #4 0xc07e2c6d in sndbuf_feed () from /boot/kernel/snd_pcm.ko
> #5 0xc07e3225 in chn_wrfeed () from /boot/kernel/snd_pcm.ko
> #6 0xc07e327c in chn_wrintr () from /boot/kernel/snd_pcm.ko
> #7 0xc07e3990 in chn_intr () from /boot/kernel/snd_pcm.ko
> #8 0xc07fca2f in csa_intr () from /boot/kernel/snd_csa.ko
> #9 0xc07fb724 in csa_intr () from /boot/kernel/snd_csa.ko
> #10 0xc04d1692 in ithread_loop (arg=0xc1737b00)
> at /usr/src/sys/kern/kern_intr.c:544
> #11 0xc04d0684 in fork_exit (callout=0xc04d1500 , arg=0x0,
> frame=0x0) at /usr/src/sys/kern/kern_fork.c:796

I think this is an important clue. I'm guessing that the
	KASSERT(sndbuf_getsize(src) >= count, ("bad bufsize"))
in feed_vchan_s16() is getting tripped. Notice that a bit further down
we have the following code:

	count &= ~1;
	bzero(b, count);

	[ snip ]

	tmp = (int16_t *)sndbuf_getbuf(src);
	bzero(tmp, count);

As I recall from our previous debugging efforts, the data structures
that are getting corrupted are getting zeroed. I suspect that either
the source or b parameters to feed_vchan_s16() are bogus, causing some
unrelated part of the heap to get stomped on. Because the KASSERT() is
getting triggered here, I'm more suspicious of the source parameter.
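
To make the suspected failure mode concrete, here is a minimal userland
sketch in C of the "check the size before zeroing" pattern that the
KASSERT() stands in for. It is not the snd_pcm code: struct demo_buf and
demo_zero_checked() are hypothetical names invented for this
illustration, and memset() is used in place of bzero(). The point is
only that when the count exceeds what the buffer really holds, the
zeroing has to be refused, otherwise it runs past the buffer and clears
whatever happens to sit next to it in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a sound buffer; NOT the driver's real structure. */
struct demo_buf {
	uint8_t	*data;	/* backing storage */
	size_t	 size;	/* number of bytes available in data */
};

/*
 * Zero the first `count` bytes of both the output buffer `b` and the
 * source buffer, with `count` rounded down to a whole number of 16-bit
 * samples (the same effect as `count &= ~1` in the quoted code).
 *
 * Returns the number of bytes zeroed in each buffer, or -1 without
 * touching any memory if a pointer is NULL or either buffer is too small.
 */
long
demo_zero_checked(uint8_t *b, size_t bsize, struct demo_buf *src, size_t count)
{
	if (b == NULL || src == NULL || src->data == NULL)
		return (-1);

	/* Round down to an even byte count, i.e. whole int16_t samples. */
	count &= ~(size_t)1;

	/* The size check: refuse rather than write past the end of a buffer. */
	if (src->size < count || bsize < count)
		return (-1);

	memset(b, 0, count);
	memset(src->data, 0, count);
	return ((long)count);
}
```

With a 4-byte source buffer, a request for 7 bytes is rejected and
nothing is modified, while a request for 3 bytes zeroes exactly 2 bytes
in each buffer. In the kernel the check is only an assertion under
INVARIANTS, so without that option the equivalent oversized bzero()
would go ahead silently.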