Date: Mon, 24 Nov 2003 15:08:03 -0800
From: Richard Bass <rbass@netraverse.com>
To: aic7xxx@freebsd.org
Subject: Deadlock on linux 2.6.0-test10
Message-ID: <3FC28F53.1060407@netraverse.com>
I am not sure whether this is the right mailing list to report this to,
so please excuse me if it is not.

A deadlock was introduced in going from linux-2.6.0-test9 to
linux-2.6.0-test10.

The hardware info is as follows (as printed by the test9 kernel):

-------------------------
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.35
        <Adaptec 19160B Ultra160 SCSI adapter>
        aic7892: Ultra160 Wide Channel A, SCSI Id=8, 32/253 SCBs

(scsi0:A:1): 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
(scsi0:A:4): 10.000MB/s transfers (10.000MHz, offset 16)
  Vendor: QUANTUM   Model: ATLAS_V_18_WLS    Rev: 0230
  Type:   Direct-Access                      ANSI SCSI revision: 03
scsi0:A:1:0: Tagged Queuing enabled.  Depth 253
  Vendor: TOSHIBA   Model: CD-ROM XM-6401TA  Rev: 1009
  Type:   CD-ROM                             ANSI SCSI revision: 02
SCSI device sda: 35861388 512-byte hdwr sectors (18361 MB)
SCSI device sda: drive cache: write back
 sda: sda1 sda2
--------------------------

The change that was made is in the generic Linux code, but after a
little hunting around, it looks like the problem may be in the Adaptec
driver. In any event, here is the traceback:

ahc_linux_register_host
  ahc_lock(ahc, &s)
    spin_lock_irqsave(&ahc->platform_data->spin_lock, *flags);
      (obtains the spinlock)
  scsi_assign_lock(host, &ahc->platform_data->spin_lock);
      (assigns ahc->platform_data->spin_lock to shost->host_lock)
  ahc_linux_initialize_scsi_bus
    ahc_reset_channel(ahc, 'A', /*initiate_reset*/TRUE);
      ahc_send_async
        scsi_report_device_reset
          shost_for_each_device(sdev, shost)
            __scsi_iterate_devices
              spin_lock_irqsave(shost->host_lock, flags);
              ^^^ DEADLOCK

The change is that shost_for_each_device() now uses
__scsi_iterate_devices(), which takes the host_lock spinlock. Because
scsi_assign_lock() has made host_lock the same lock that
ahc_linux_register_host is already holding, the second
spin_lock_irqsave() never returns.

It looked to me like ahc_send_async() should not be called with a lock
held, but I could be wrong here. The problem may be in the generic
code. If so, I am sure the aic7xxx maintainer can better explain what
is going wrong there.
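The lock re-acquisition in the traceback can be modeled in user space. What follows is a minimal sketch, not driver or kernel code: every `toy_*` name is hypothetical, the spinlock is a C11 `atomic_flag`, and the lock is tried once instead of spun on, so the deadlock shows up as a `false` return rather than a hang. The `drop_lock_before_async` flag only illustrates the guess that the async path should run without the lock held; it is not a proposed patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive spinlock standing in for the kernel's spinlock_t. */
typedef struct {
    atomic_flag held;
} toy_spinlock;

/* Try once instead of spinning, so a self-deadlock is reported, not hung on. */
static bool toy_trylock(toy_spinlock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_unlock(toy_spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* Stand-in for the SCSI host: host_lock is only a pointer to some lock. */
struct toy_host {
    toy_spinlock *host_lock;
};

/* Models __scsi_iterate_devices() as of test10: it takes host_lock. */
static bool toy_iterate_devices(struct toy_host *host)
{
    if (!toy_trylock(host->host_lock))
        return false;            /* a real spinlock would spin forever here */
    toy_unlock(host->host_lock);
    return true;
}

/* Models ahc_send_async() -> scsi_report_device_reset(). */
static bool toy_send_async(struct toy_host *host)
{
    return toy_iterate_devices(host);
}

/*
 * Models ahc_linux_register_host(). Returns true if the call chain
 * completes and false if it would deadlock on its own lock.
 */
bool toy_register_host(bool drop_lock_before_async)
{
    toy_spinlock driver_lock = { ATOMIC_FLAG_INIT };
    struct toy_host host = { .host_lock = NULL };
    bool ok;

    toy_trylock(&driver_lock);          /* ahc_lock(): driver lock taken */
    host.host_lock = &driver_lock;      /* scsi_assign_lock(): same lock */

    if (drop_lock_before_async)
        toy_unlock(&driver_lock);       /* hypothetical variant only */

    ok = toy_send_async(&host);         /* reset path reaches host_lock */

    if (!drop_lock_before_async)
        toy_unlock(&driver_lock);
    return ok;
}
```

Calling `toy_register_host(false)` follows the order in the traceback and reports the failed re-acquisition, while `toy_register_host(true)` completes because the lock is free by the time the iteration helper asks for it.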
Hope this helps,

Richard <rwb>

--
Richard W. Bass
Systems Software Architect
NeTraverse, Inc.