Date:      Wed, 2 Aug 2000 11:04:37 -0600
From:      "Kenneth D. Merry" <ken@kdm.org>
To:        Alexander Leidinger <Alexander@Leidinger.net>
Cc:        scsi@FreeBSD.ORG, Matthew Jacob <mjacob@feral.com>
Subject:   Re: Fwd: Panic in xpt_setup_ccb (cam_xpt.c)
Message-ID:  <20000802110437.A35016@panzer.kdm.org>
In-Reply-To: <200007301035.MAA15223@Magelan.Leidinger.net>; from Alexander@Leidinger.net on Sun, Jul 30, 2000 at 12:35:50PM +0200
References:  <200007301035.MAA15223@Magelan.Leidinger.net>

On Sun, Jul 30, 2000 at 12:35:50 +0200, Alexander Leidinger wrote:
> Hi,
> 
> forwarded to -scsi at request of Matthew Jacob (Note: I'm not subscribed
> to -scsi). My hardware is:
> ---snip---
> ahc0: <Adaptec aic7880 Ultra SCSI adapter> port 0xb000-0xb0ff mem 0xd9800000-0xd9800fff irq 9 at device 6.0 on pci0
> ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
> cd0 at ahc0 bus 0 target 1 lun 0
> cd0: <TEAC CD-R55S 1.0Q> Removable CD-ROM SCSI-2 device 
> cd0: 10.000MB/s transfers (10.000MHz, offset 15)
> cd1 at ahc0 bus 0 target 2 lun 0
> cd1: <PIONEER CD-ROM DR-U16S 1.01> Removable CD-ROM SCSI-2 device 
> cd1: 20.000MB/s transfers (20.000MHz, offset 15)
> ---snip---
> 
> The panic happened with cd0.
> 
> Matthew:
> It didn't happen because of syncing prior to a reboot, it was a
> "dd if=/dev/cd0c of=normal_file_on_ad0" (it's a "damaged" cd (someone
> bought crappy CD-R's and I have to get some data out of them) and I
> tried to duplicate the contents with dd). I was able to reproduce it,
> and I'm sure I'm able to reproduce it with an actual kernel (at the
> moment I'm recovering from a loss of / because of some shit from M$, ->
> the crashdump is lost).

I think Nick Hibma's analysis of this on -current was on target.

The most likely explanation for this is that the device has gone away while
there was an I/O pending.  In this case, it was the error recovery code
that hit the problem.

When a device goes away, we return any I/O queued to the peripheral driver
(in this case the cd driver) with errors, but we don't return any I/O
queued to the card or the device.

So the peripheral and path stored in the CCB have been freed, but the CCB
is still outstanding, and the error recovery code is using a freed path.
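
To make the lifetime problem concrete, here is a minimal userland sketch in C.
It is not CAM code: every name in it (model_path, model_request, and the
model_* functions) is hypothetical and exists only for illustration.  It
assumes one possible way of avoiding the situation, namely letting each
outstanding request hold its own reference on the path, so that removing the
device entry cannot free the path while a request that still points at it is
in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real CAM structures. */
struct model_path {
    int  refs;   /* holders: the device entry plus each in-flight request */
    bool valid;  /* cleared once the device entry has been removed */
};

struct model_request {
    struct model_path *path;  /* path used by the request while outstanding */
};

/* Allocate a path owned by the device entry (one reference). */
struct model_path *model_path_new(void)
{
    struct model_path *p = malloc(sizeof(*p));
    if (p != NULL) {
        p->refs = 1;
        p->valid = true;
    }
    return p;
}

/* Drop one reference; returns true if this call freed the path. */
bool model_path_release(struct model_path *p)
{
    assert(p != NULL && p->refs > 0);
    if (--p->refs == 0) {
        free(p);
        return true;
    }
    return false;
}

/* Hand a request to the hardware: the request takes its own reference. */
void model_request_start(struct model_request *r, struct model_path *p)
{
    p->refs++;
    r->path = p;
}

/* The device goes away: mark the path invalid, drop the entry's reference. */
bool model_device_lost(struct model_path *p)
{
    p->valid = false;
    return model_path_release(p);
}

/*
 * Completion / error recovery for an outstanding request.  Returns true
 * only if the path is still valid and recovery may go on using it.
 */
bool model_request_done(struct model_request *r)
{
    bool usable = r->path->valid;

    model_path_release(r->path);
    r->path = NULL;
    return usable;
}
```

In this model, calling model_device_lost() while a request is outstanding
only marks the path invalid.  The memory is released when
model_request_done() drops the last reference, and the caller is told not
to attempt recovery on that path.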

This is just a guess at what is going on, though, since we don't know for
sure that the device has gone away.

If you can get the dmesg output from just before the panic, that would be
helpful.  If the device has gone away for some reason, you'll see "lost
device" and "removing device entry" printed on the screen.

I'm not sure how the device could have gone away, though, since we don't
make the device go away automatically anymore.  It used to be that if we
got selection timeouts, the device would be automatically removed.

We don't do that anymore, since it seemed to cause problems, so the only
way a device can go away is if it doesn't make it through the probe stage
or if someone initiates a rescan (usually via camcontrol(8)).

[ ... ]

> #12 0xc02650bf in trap (frame={tf_fs = 16, tf_es = -791543792, 
>       tf_ds = -1072562160, tf_edi = 1, tf_esi = -1056803840, 
>       tf_ebp = -791482800, tf_isp = -791482820, tf_ebx = 0, tf_edx = 64, 
>       tf_ecx = -791482776, tf_eax = 1, tf_trapno = 12, tf_err = 0, 
>       tf_eip = -1072541595, tf_cs = 8, tf_eflags = 2163330, 
>       tf_esp = -791482664, tf_ss = -1072555584}) at ../../i386/i386/trap.c:427
> #13 0xc0125065 in xpt_setup_ccb (ccb_h=0xd0d2ee68, path=0x40, priority=1)
>     at ../../cam/cam_xpt.c:3734
> #14 0xc01219c0 in cam_release_devq (path=0x40, relsim_flags=0, openings=0, 
>     timeout=0, getcount_only=0) at ../../cam/cam_periph.c:855
> ---Type <return> to continue, or q <return> to quit---
> #15 0xc0121b37 in camperiphdone (periph=0xc1021480, done_ccb=0xc1027400)
>     at ../../cam/cam_periph.c:1021
> #16 0xc0127997 in camisr (queue=0xc03189b0) at ../../cam/cam_xpt.c:6328
> #17 0xc01277a9 in swi_cambio () at ../../cam/cam_xpt.c:6231
> #18 0xc025b900 in splz_swi ()
> #19 0xc01a7451 in softclock () at ../../kern/kern_timeout.c:131
> #20 0xc025b85f in doreti_swi ()
> Cannot access memory at address 0x91992874.
> (kgdb) up 13
> #13 0xc0125065 in xpt_setup_ccb (ccb_h=0xd0d2ee68, path=0x40, priority=1)
>     at ../../cam/cam_xpt.c:3734
> 3734            ccb_h->path = path;
> (kgdb) 
> (kgdb) print path
> $1 = (struct cam_path *) 0x0
> (kgdb) print ccb_h
> $2 = (struct ccb_hdr *) 0x0
> (kgdb) up          
> #14 0xc01219c0 in cam_release_devq (path=0x40, relsim_flags=0, openings=0, 
>     timeout=0, getcount_only=0) at ../../cam/cam_periph.c:855
> 855             xpt_setup_ccb(&crs.ccb_h, path,
> (kgdb) print path
> $5 = (struct cam_path *) 0x40
> (kgdb) print *path
> Cannot access memory at address 0x40.
> ---snip---
> 
> It's a kernel from yesterday.
[ ... ]

Ken
-- 
Kenneth Merry
ken@kdm.org


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message



