From owner-freebsd-bugs@FreeBSD.ORG Wed Dec 3 15:40:03 2008 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CB6CD1065672 for ; Wed, 3 Dec 2008 15:40:03 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8EE1C8FC12 for ; Wed, 3 Dec 2008 15:40:03 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id mB3Fe3E6002584 for ; Wed, 3 Dec 2008 15:40:03 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id mB3Fe3c0002583; Wed, 3 Dec 2008 15:40:03 GMT (envelope-from gnats) Date: Wed, 3 Dec 2008 15:40:03 GMT Message-Id: <200812031540.mB3Fe3c0002583@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: Jaakko Heinonen Cc: Subject: Re: kern/88823: [modules] [atapicam] atapicam - kernel trap 12 on loading and unloading X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Jaakko Heinonen List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 03 Dec 2008 15:40:04 -0000 The following reply was made to PR kern/88823; it has been noted by GNATS. From: Jaakko Heinonen To: bug-followup@FreeBSD.org, per.qu@email.it Cc: freebsd-scsi@FreeBSD.org Subject: Re: kern/88823: [modules] [atapicam] atapicam - kernel trap 12 on loading and unloading Date: Wed, 3 Dec 2008 17:32:58 +0200 Hi, There is a CAM(4)/pass(4) bug which causes passcleanup() (in sys/cam/scsi/scsi_pass.c) to call destroy_dev(9) with the device mutex held. It's not allowed to call destroy_dev() with sleepable locks held. Here's the call trace: destroy_dev(c7b28400,0,c569f754,c7b15080,f46c6a38,...) at destroy_dev+0x10 passcleanup(c7b15080,c0b8f83b,c0bdf975,c585d058,c0e5afe0,...) at passcleanup+0x2e camperiphfree(c7b15080,0,f46c6a58,c0477b7d,c7b15080,...) at camperiphfree+0xc2 cam_periph_invalidate(c7b15080,c59328d0,f46c6a8c,c0492b4a,c7b15080,...) at cam_periph_invalidate+0x3e cam_periph_async(c7b15080,100,f46c6b18,0,0,...) at cam_periph_async+0x2d passasync(c7b15080,100,f46c6b18,0,c7ae0000,...) at passasync+0xca xpt_async_bcast(0,4,c0b6dbbf,11a5,c7b428c0,...) at xpt_async_bcast+0x32 xpt_async(100,f46c6b18,0,10,c575ccb8,...) at xpt_async+0x194 xpt_bus_deregister(0,0,c7b75b30,378,c577fc00,...) at xpt_bus_deregister+0x4e free_softc(c577fe64,0,c7b75b30,103,c7b18100,...) at free_softc+0xe1 atapi_cam_detach(c7b18100,c7b30858,c0caa340,9a4,1,...) at atapi_cam_detach+0x7f device_detach(c7b18100,c081bf09,c7691840,1,c7b760f8,...) at device_detach+0x8c devclass_delete_driver(c554b6c0,c7b7610c,c0bd0dfd,2d,0,...) at devclass_delete_driver+0x91 driver_module_handler(c7692040,1,c7b760f8,ef,c7692040,...) at driver_module_handler+0xdf module_unload(c7692040,0,253,250,f46c6c40,...) at module_unload+0x75 linker_file_unload(c7ab9d00,0,c0bcf326,400,c7b73000,...) at linker_file_unload+0xc9 kern_kldunload(c5957460,6,0,f46c6d2c,c0b11ff3,...) at kern_kldunload+0xd5 kldunloadf(c5957460,f46c6cf8,8,c0bd96d0,c0cad660,...) at kldunloadf+0x2b Calling xpt_bus_deregister() in atapicam results this code path. xpt_bus_deregister() must be called with the device mutex held. Following change fixes the atapicam problem; however the patch may be incorrect because I am not sure if passcleanup() is always called with the lock held. I have tried the patch with atapicam(4) and umass(4) (both use pass(4)). %%% Index: sys/cam/scsi/scsi_pass.c =================================================================== --- sys/cam/scsi/scsi_pass.c (revision 185331) +++ sys/cam/scsi/scsi_pass.c (working copy) @@ -167,7 +167,9 @@ passcleanup(struct cam_periph *periph) devstat_remove_entry(softc->device_stats); + mtx_unlock(periph->sim->mtx); destroy_dev(softc->dev); + mtx_lock(periph->sim->mtx); if (bootverbose) { xpt_print(periph->path, "removing device entry\n"); %%% There are also other bugs involved in unloading the atapicam module. * If there are pending hcbs kernel will panic on unload. There's an obvious bug in free_softc(): it uses TAILQ_FOREACH() instead of TAILQ_FOREACH_SAFE(). However fixing that is not enough. There are additional problem(s) and I don't have a fix for them. Here's a patch that changes it to refuse to detach if there are pending hcbs: %%% Index: sys/dev/ata/atapi-cam.c =================================================================== --- sys/dev/ata/atapi-cam.c (revision 185519) +++ sys/dev/ata/atapi-cam.c (working copy) @@ -254,6 +254,13 @@ atapi_cam_detach(device_t dev) struct atapi_xpt_softc *scp = device_get_softc(dev); mtx_lock(&scp->state_lock); + /* + * XXX: Detaching when pending hcbs exist is broken. + */ + if (!TAILQ_EMPTY(&scp->pending_hcbs)) { + mtx_unlock(&scp->state_lock); + return (EBUSY); + } xpt_freeze_simq(scp->sim, 1 /*count*/); scp->flags |= DETACHING; mtx_unlock(&scp->state_lock); @@ -882,11 +889,11 @@ free_hcb(struct atapi_hcb *hcb) static void free_softc(struct atapi_xpt_softc *scp) { - struct atapi_hcb *hcb; + struct atapi_hcb *hcb, *tmp_hcb; if (scp != NULL) { mtx_lock(&scp->state_lock); - TAILQ_FOREACH(hcb, &scp->pending_hcbs, chain) { + TAILQ_FOREACH_SAFE(hcb, &scp->pending_hcbs, chain, tmp_hcb) { free_hcb_and_ccb_done(hcb, CAM_UNREC_HBA_ERROR); } if (scp->path != NULL) { %%% * cd(4) doesn't tolerate well disappearing devices. There's code in cdinvalidate() to invalidate further I/O operations but calling for example d_close causes a crash. Thus you can't unmount a file system after the device has disappeared. This patch makes it to survive unmounting. %%% Index: sys/cam/scsi/scsi_cd.c =================================================================== --- sys/cam/scsi/scsi_cd.c (revision 185331) +++ sys/cam/scsi/scsi_cd.c (working copy) @@ -382,6 +382,9 @@ cdoninvalidate(struct cam_periph *periph camq_remove(&softc->changer->devq, softc->pinfo.index); disk_gone(softc->disk); + softc->disk->d_drv1 = NULL; + softc->disk->d_close = NULL; /* allow closing the disk */ + xpt_print(periph->path, "lost device\n"); } %%% -- Jaakko