From owner-freebsd-current Thu Oct 22 09:53:53 1998
Return-Path:
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id JAA20541
          for freebsd-current-outgoing; Thu, 22 Oct 1998 09:53:53 -0700 (PDT)
          (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from panzer.plutotech.com (panzer.plutotech.com [206.168.67.125])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id JAA20528
          for ; Thu, 22 Oct 1998 09:53:50 -0700 (PDT)
          (envelope-from ken@panzer.plutotech.com)
Received: (from ken@localhost)
          by panzer.plutotech.com (8.9.1/8.8.5) id KAA16575;
          Thu, 22 Oct 1998 10:53:15 -0600 (MDT)
From: "Kenneth D. Merry"
Message-Id: <199810221653.KAA16575@panzer.plutotech.com>
Subject: Re: cdda2wav == panic (/sys/vm/vm_page.c:516)
In-Reply-To: <19981021224539.A10190@znh.org> from Zach Heilig at "Oct 21, 98 10:45:39 pm"
To: zach@gaffaneys.com (Zach Heilig)
Date: Thu, 22 Oct 1998 10:53:15 -0600 (MDT)
Cc: FreeBSD-current@FreeBSD.ORG, dg@root.com
X-Mailer: ELM [version 2.4ME+ PL28s (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Zach Heilig wrote...
> This is an ELF kernel compiled from sources cvsup-ed ~01:50 GMT (Oct 22)
>
> Relevent hardware:
> ncr0: rev 0x04 int a irq 11 on pci0.13.0
> cd0 at ncr0 bus 0 target 5 lun 0
> cd0: Removable CD-ROM SCSI2 device
> cd0: 10.0MB/s transfers (10.0MHz, offset 8)
> cd0: cd present [176612 x 2048 byte records]
>
> The ncr0 is a diamond fireport 40, and the cdr is the only device on that bus
> (it is in an external case, with a terminator plugged into the passthrough
> connector). It works very well burning audio/data tracks and reading data
> tracks.
>
> I noticed this earlier today with a kernel from Oct 10. The panic with an up
> to date kernel is different from the Oct 10 kernel.
> That kernel would usually
> wait until cdda2wav exited before panic'ing (complaining about dirty pages --
> the last 5-10 megs or so of the track would be zero's after reboot), today's
> kernel panics 8-10 Mbytes into the track (at least the 3 times I tried to
> read an audio track).
>
> stack trace:

[ ... ]

> #0  boot (howto=256) at ../../kern/kern_shutdown.c:268
> 268             dumppcb.pcb_cr3 = rcr3();
> (kgdb) where
> #0  boot (howto=256) at ../../kern/kern_shutdown.c:268
> #1  0xf01478bc in at_shutdown (function=0xf022901e ,
>     arg=0xf48daba8, queue=-267228095) at ../../kern/kern_shutdown.c:430
> #2  0xf0126ca1 in db_panic (addr=-266451744, have_addr=0, count=-1,
>     modif=0xf48dab30 "") at ../../ddb/db_command.c:432
> #3  0xf0126c41 in db_command (last_cmdp=0xf0246804, cmd_table=0xf0246664,
>     aux_cmd_tablep=0xf025c4e4) at ../../ddb/db_command.c:332
> #4  0xf0126d06 in db_command_loop () at ../../ddb/db_command.c:454
> #5  0xf0129067 in db_trap (type=12, code=0) at ../../ddb/db_trap.c:71
> #6  0xf01ec719 in kdb_trap (type=12, code=0, regs=0xf48dac70)
>     at ../../i386/i386/db_interface.c:157
> #7  0xf01f6d93 in trap_fatal (frame=0xf48dac70) at ../../i386/i386/trap.c:874
> #8  0xf01f6a84 in trap_pfault (frame=0xf48dac70, usermode=0)
>     at ../../i386/i386/trap.c:772
> #9  0xf01f66d7 in trap (frame={tf_es = 16, tf_ds = 16, tf_edi = -236365800,
>       tf_esi = 8550, tf_ebp = -192041804, tf_isp = -192041832, tf_ebx = 51390,
>       tf_edx = 65470, tf_ecx = -192026224, tf_eax = -264085512,
>       tf_trapno = 12, tf_err = 0, tf_eip = -266451744, tf_cs = 8,
>       tf_eflags = 66054, tf_esp = 0, tf_ss = 0}) at ../../i386/i386/trap.c:396
> #10 0xf01e44e0 in vm_page_lookup (object=0xf48de990, pindex=8550)
>     at ../../vm/vm_page.c:516
> #11 0xf01630c7 in allocbuf (bp=0xf1e95818, size=8192)
>     at ../../kern/vfs_bio.c:1782
> #12 0xf0162cb2 in getblk (vp=0xf48a82c0, blkno=4275, size=8192, slpflag=0,
>     slptimeo=0) at ../../kern/vfs_bio.c:1557
> #13 0xf01cb09f in ffs_balloc (ap=0xf48dae98) at ../../ufs/ffs/ffs_balloc.c:297
> #14 0xf01d38a4 in ffs_write (ap=0xf48daeec) at vnode_if.h:1015
> #15 0xf016dc17 in vn_write (fp=0xf0b28640, uio=0xf48daf30, cred=0xf0a59b00)
>     at vnode_if.h:331
> #16 0xf014f9a2 in write (p=0xf4834e00, uap=0xf48daf84)
>     at ../../kern/sys_generic.c:270
> #17 0xf01f7017 in syscall (frame={tf_es = 39, tf_ds = 39, tf_edi = 805601292,
>       tf_esi = 805601292, tf_ebp = 805730652, tf_isp = -192041004,
>       tf_ebx = 129360, tf_edx = 805601292, tf_ecx = 0, tf_eax = 4,
>       tf_trapno = 7, tf_err = 2, tf_eip = 671874136, tf_cs = 31,
>       tf_eflags = 582, tf_esp = -272640664, tf_ss = 39})
>     at ../../i386/i386/trap.c:1031
> #18 0xf01ed06c in Xint0x80_syscall ()
> (kgdb)

This is a known problem. Daniel O'Connor first reported it with 2.2.7
and CAM. See PR kern/8112. I was also able to reproduce the problem
under -current/CAM last month. I haven't messed with it since.

Here's the stack trace from my panic last month (Sept. 8th):

==================================================================
login: vm_page_free: pindex(63), busy(0), PG_BUSY(1), hold(9)
panic: vm_page_free: freeing busy page
mp_lock = 01000001; cpuid = 1; lapic.id = 00000000
Debugger("panic")
Stopped at      _Debugger+0x35: movb    $0,_in_Debugger.98
db> trace
_Debugger(f0134343) at _Debugger+0x35
_panic(f01f17ff,f054241c,80000000,f83f3ea8,f01f19b0) at _panic+0x8d
_vm_page_freechk_and_unqueue(f054241c) at _vm_page_freechk_and_unqueue+0x6e
_vm_page_free(f054241c,f8976220,0,f83f3ed4,f01eeffd) at _vm_page_free+0x1c
_vm_object_terminate(f8976220,f4dd85f0,f091e220,f189e440,f83f3ee8) at _vm_object_terminate+0xb7
_vm_object_deallocate(f8976220,f4dd85f0,0,f83f3f00,f01423fe) at _vm_object_deallocate+0x1c9
_shm_deallocate_segment(f4dd85f0,f189e440,0,f8344cc0,f83f3f1c) at _shm_deallocate_segment+0x12
_shm_delete_mapping(f8344cc0,f189e440) at _shm_delete_mapping+0x6e
_shmexit(f8344cc0) at _shmexit+0x29
_exit1(f8344cc0,0,f83f3fb4,f020dbdf,f8344cc0) at _exit1+0x1bc
_exit(f8344cc0,f83f3f94,200c2060,ffffffff,0) at _exit+0x14
_syscall(27,27,0,ffffffff,efbfd2f4) at _syscall+0x187
_Xsyscall() at _Xsyscall+0x55
--- syscall 0x1, eip = 0x200b1c4d, esp = 0xefbfd2e0, ebp = 0xefbfd2f4 ---
db> panic
panic: from debugger
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1
==================================================================

It looks like your panic is somewhat different from the one I saw.
Daniel O'Connor was able to work around this by hacking cdda2wav so it
didn't remove shared memory segments. However, he got the same panic
later when he tried to remove the shared memory segments by hand. From
your later mail, it looks like you've found other ways to work around
it.

If I knew what the problem was, I would probably have fixed it by now. :)
I think it will take someone knowledgeable about the VM system to fix
this, so I'm CCing this to David. :)

The CAM passthrough driver uses vmapbuf() and vunmapbuf() (via
cam_periph_mapmem()) to map data segments into and out of kernel
virtual memory. My guess is that this, in combination with cdda2wav's
shared memory usage, exposes some VM bug.

Anyway, hopefully someone can shed some light on this.

Ken
-- 
Kenneth Merry
ken@plutotech.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message