Date: Tue, 24 Jul 2001 15:00:04 -0700 (PDT)
From: Ed Alley
Message-Id: <200107242200.f6OM04g00517@jordan.llnl.gov>
To: freebsd-hackers@freebsd.org
Subject: Kernel panic reading non-fixated CD

RE: Kernel panic reading a non-fixated CD

I am running FreeBSD 4.3 with an IDE HP cd-writer 9500 series. I have
been making CDs successfully with burncd since I installed it. However,
I mistakenly tried to mount a CD which I had failed to fixate, and I
got a kernel panic. I was able to debug the kernel code and found out
where the problem is. I have included a patch which works for me, and I
would like to hear whether it is sufficient or what I should do next.

I found out through my investigations that the ATAPI interface isn't
followed closely by manufacturers. For instance, before we installed
this HP CDRW we had installed a Yamaha CDRW which displayed other
problems (among them, it won't fixate using burncd under FreeBSD). In
addition, the CDROM on my home computer, which is running FreeBSD 4.2,
doesn't cause a panic when I try to mount a non-fixated CD; it just
refuses to do it. So ATAPI of one manufacturer is not ATAPI of another.

The problem with what I am doing is that most (if not all) of those
reading this will not have my hardware configuration to test this
problem on.
So I have included part of my gdb session below so you can see how I
came up with my patch.

So here is the panic message that I get when I try to mount the
non-fixated CD; you can see that it is a page fault:

(kgdb) symbol-file kernel.debug
Reading symbols from kernel.debug...done.
(kgdb) exec-file /var/crash.gdb/kernel.0
(kgdb) core-file /var/crash.gdb/vmcore.0
IdlePTD 2711552
initial pcb at 221800
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xc0d96000
fault code              = supervisor write, page not present
instruction pointer     = 0x8:0xc01b6c2e
stack pointer           = 0x10:0xc0206f10
frame pointer           = 0x10:0xc0206f20
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          = bio
trap number             = 12
panic: page fault

Here is the trace of the corpse:

(kgdb) where
#0  dumpsys () at ../../kern/kern_shutdown.c:469
#1  0xc01389c3 in boot (howto=256) at ../../kern/kern_shutdown.c:309
#2  0xc0138d40 in poweroff_wait (junk=0xc01ff28f, howto=0)
    at ../../kern/kern_shutdown.c:556
#3  0xc01d68f1 in trap_fatal (frame=0xc0206ed0, eva=3235471360)
    at ../../i386/i386/trap.c:951
#4  0xc01d65c9 in trap_pfault (frame=0xc0206ed0, usermode=0,
    eva=3235471360) at ../../i386/i386/trap.c:844
#5  0xc01d61af in trap (frame={tf_fs = -65520, tf_es = -973537264,
      tf_ds = 6488080, tf_edi = -1059495936, tf_esi = 32768,
      tf_ebp = -1071616224, tf_isp = -1071616260, tf_ebx = -1059685120,
      tf_edx = 368, tf_ecx = 7168, tf_eax = -1060624128, tf_trapno = 12,
      tf_err = 2, tf_eip = -1071944658, tf_cs = 8, tf_eflags = 66054,
      tf_esp = -1063045216, tf_ss = -1059685120})
    at ../../i386/i386/trap.c:443
#6  0xc01b6c2e in atapi_read (request=0xc0d67d00, length=32768)
    at machine/cpufunc.h:222
#7  0xc01b66cb in atapi_interrupt (request=0xc0d67d00)
    at ../../dev/ata/atapi-all.c:391
#8  0xc01afcee in ata_intr (data=0xc0c82900)
    at ../../dev/ata/ata-all.c:1154
(kgdb)

The
routine atapi_read() is where the error occurred. By poking around I
discovered that the bytecount request was enormous:

(kgdb) print request->bytecount
$1 = 4294934528
(kgdb) x/x &request->bytecount
0xc0d67d18:     0xffff8000
(kgdb) x/d &request->bytecount
0xc0d67d18:     -32768
(kgdb)

So you can see that 32768 was subtracted from an unsigned zero! If the
first request was for a bytecount of zero, then atapi_read() will read
nothing but still subtract size = 32768 from bytecount before
returning. Since bytecount is unsigned, this causes it to roll over to
a big number (2^32 - 32768 = 4294934528). The next call then attempts
to read a bytecount of over 4G.

My patch is very simple: in atapi-all.c, in routine atapi_interrupt(),
for case ATAPI_P_READ, I cast bytecount to a long and check for zero or
negative. If it is zero or negative I write an error message and break
out. This avoids atapi_read() and returns with an error message.

The patch must be executed in /usr/src/sys/dev/ata as:

patch -p < patch.file

Patch file:

*** atapi-all.c.orig    Tue Jul 24 13:21:03 2001
--- atapi-all.c Tue Jul 24 13:28:45 2001
***************
*** 382,387 ****
--- 382,393 ----
            return ATA_OP_CONTINUES;
  
      case ATAPI_P_READ:
+       if ((long)request->bytecount <= 0) {
+           printf("%s: %s trying to read with bytecount = %ld\n",
+                  atp->devname, atapi_cmd2str(atp->cmd),
+                  (long)request->bytecount);
+           break;
+       }
        if (!(request->flags & ATPR_F_READ)) {
            request->result = inb(atp->controller->ioaddr + ATA_ERROR);
            printf("%s: %s trying to read on write buffer\n",

Thank you in anticipation of your comments. I am a newbie at kernel
debugging, so if I have done anything stupid please go easy on me. :)

Ed Alley
wea@llnl.gov
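
P.S. For anyone without this hardware, the wrap-around itself can be
reproduced in user space. What follows is only a minimal sketch, not
driver code: struct fake_request, consume(), bytecount_is_sane() and
demo() are hypothetical names made up for illustration, and it assumes
bytecount is a 32-bit unsigned quantity, as the gdb values above
suggest. It shows the subtraction from zero wrapping to 0xffff8000 and
a signed check of the same kind as the patch catching it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's request structure; it carries
 * only the one field under discussion. */
struct fake_request {
    uint32_t bytecount;     /* assumed unsigned and 32 bits wide */
};

/* Mimics the unconditional subtraction described above; no I/O is done.
 * Unsigned arithmetic wraps modulo 2^32 when size > bytecount. */
uint32_t consume(struct fake_request *req, uint32_t size)
{
    req->bytecount -= size;
    return req->bytecount;
}

/* Same idea as the check in the patch: view the count as signed and
 * reject zero or negative values. The conversion of an out-of-range
 * value to int32_t wraps on the usual two's-complement compilers. */
int bytecount_is_sane(const struct fake_request *req)
{
    return (int32_t)req->bytecount > 0;
}

/* Walks through the scenario from the gdb session. */
void demo(void)
{
    struct fake_request req = { 0 };        /* first request: zero bytes */

    consume(&req, 32768);                   /* size taken from the trace */
    assert(req.bytecount == 0xffff8000u);   /* 4294934528 as unsigned */
    assert((int32_t)req.bytecount == -32768);
    assert(!bytecount_is_sane(&req));       /* the guard rejects it */
}
```

Calling demo() from a small main() should run silently if the
assumptions hold. A count that reaches exactly zero is also rejected by
the signed check, which matches the "zero or negative" test in the
patch.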