From owner-freebsd-bugs@FreeBSD.ORG Fri Jul 18 13:20:06 2003
Date: Fri, 18 Jul 2003 13:20:05 -0700 (PDT)
Message-Id: <200307182020.h6IKK5Q8094965@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Daniel Lang
Subject: Re: kern/54616: System hangs writing CD-Rs with "atapicam" device
Reply-To: Daniel Lang
List-Id: Bug reports

The following reply was made to PR kern/54616; it has been noted by GNATS.

From: Daniel Lang
To: freebsd-gnats-submit@freebsd.org
Cc: dl@leo.org
Subject: Re: kern/54616: System hangs writing CD-Rs with "atapicam" device
Date: Fri, 18 Jul 2003 22:11:31 +0200

OK,

here is my own follow-up to my PR; I have more data.

On Marius' suggestions:

1. Try cdrdao with WITHOUT_SCGLIB=yes:
   I did that and found that it did not make any difference,
   so the libscg code is probably not the cause.

2. Try DMA mode for the ATAPI CD-RW:
   I also tried that, by booting with hw.ata.atapi_dma=1.
   Burning the CDs went faster, but the problem still happened.
(I've tried all four combinations of these options.)

Then I did more debugging with remote GDB:

I broke into DDB/GDB during a hang. Then of course the
backtrace was useless, as described before.
However, I set a breakpoint in camisr() and continued.
Soon the breakpoint was reached, and I tried to get some
more info. I'm not sure if that was successful...

Here is a transcript of my gdb session:

===============remote kGDB transcript=========================
(kgdb) b camisr
Breakpoint 1 at 0xc0127958: file /usr/src/sys/cam/cam_xpt.c, line 6298.
(kgdb) c
Continuing.

Breakpoint 1, camisr (queue=0xc02f3250) at /usr/src/sys/cam/cam_xpt.c:6298
6298
(kgdb) bt
#0  camisr (queue=0xc02f3250) at /usr/src/sys/cam/cam_xpt.c:6298
#1  0xc0127945 in swi_cambio () at /usr/src/sys/cam/cam_xpt.c:6288
#2  0xc0253aa3 in doreti_swi ()
(kgdb) p queue
$1 = (cam_isrq_t *) 0xc02f3250
(kgdb) p *queue
$2 = {tqh_first = 0x0, tqh_last = 0xc143f814}
(kgdb) p tqh_last
No symbol "tqh_last" in current context.
(kgdb) n
6390    {
(kgdb) bt
#0  camisr (queue=0xc02f3250) at /usr/src/sys/cam/cam_xpt.c:6390
#1  0xc0127945 in swi_cambio () at /usr/src/sys/cam/cam_xpt.c:6288
#2  0xc0253aa3 in doreti_swi ()
(kgdb) n
0xc0127945 in swi_cambio () at /usr/src/sys/cam/cam_xpt.c:6288
6288            spi3caps &= inq_data->spi3data;
(kgdb) n
0xc0253aa3 in doreti_swi ()
(kgdb) n
Single stepping until exit from function doreti_swi,
which has no line number information.
Ignoring packet error, continuing...
Reply contains invalid hex digit 116
(kgdb)
=======================================================

After issuing "next" in doreti_swi the system wedged. The
console screen filled with garbled characters, a continuous
beep came from the box, and I hit reset ASAP.
The same happened during a second attempt.

The line numbers seem to be garbled: camisr() is claimed
to be at cam_xpt.c:6390, but the file doesn't have that
many lines. Maybe I've missed something. I only copied the
kernel sources (/usr/src/sys); maybe something else,
like /usr/src/include, should have been copied as well.
In any case, every structure I wanted to examine was claimed
not to be in context.

Then I did something else to get more info, and at least
got some confirmation of a suspicion:

I rebuilt the kernel with the CAMDEBUG options. All debug
options were turned on, but restricted to CAM_DEBUG_BUS=2
(the atapicam channel bus with the CD-RW).

Of course, during a burn session, zillions of debug messages
scrolled by. Still, I hoped to get a hint about what happens
during a hang. Unfortunately I could not make out what happened
immediately before the hang, but I could determine that during
a hang the system sits in what seems to be an endless loop
in camisr(). CAMDEBUG just printed endless camisr() lines
when the hang happened again.

So I hope that with this information some of the atapicam
and SCSI folks have an idea where to look.

Best regards,
 Daniel
--
IRCnet: Mr-Spock - ceterum censeo Microsoftinem esse delendam -
Daniel Lang * dl@leo.org * +49 89 289 18532 * http://www.leo.org/~dl/