From owner-freebsd-scsi@FreeBSD.ORG Mon Jan 2 11:02:59 2006
Return-Path:
X-Original-To: freebsd-scsi@freebsd.org
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id E061F16A420
    for ; Mon, 2 Jan 2006 11:02:59 +0000 (GMT)
    (envelope-from owner-bugmaster@freebsd.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21])
    by mx1.FreeBSD.org (Postfix) with ESMTP id BA02543D78
    for ; Mon, 2 Jan 2006 11:02:49 +0000 (GMT)
    (envelope-from owner-bugmaster@freebsd.org)
Received: from freefall.freebsd.org (peter@localhost [127.0.0.1])
    by freefall.freebsd.org (8.13.4/8.13.4) with ESMTP id k02B2mMB037654
    for ; Mon, 2 Jan 2006 11:02:49 GMT
    (envelope-from owner-bugmaster@freebsd.org)
Received: (from peter@localhost)
    by freefall.freebsd.org (8.13.4/8.13.4/Submit) id k02B2lrS037647
    for freebsd-scsi@freebsd.org; Mon, 2 Jan 2006 11:02:47 GMT
    (envelope-from owner-bugmaster@freebsd.org)
Date: Mon, 2 Jan 2006 11:02:47 GMT
Message-Id: <200601021102.k02B2lrS037647@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: peter set sender to owner-bugmaster@freebsd.org using -f
From: FreeBSD bugmaster
To: freebsd-scsi@FreeBSD.org
Cc:
Subject: Current problem reports assigned to you
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 02 Jan 2006 11:03:00 -0000

Current FreeBSD problem reports

Critical problems

Serious problems

S  Submitted   Tracker     Resp.  Description
-------------------------------------------------------------------------------
o [2001/05/03] kern/27059  scsi   [sym] SCSI subsystem hangs under heavy lo
o [2001/06/29] kern/28508  scsi   problems with backup to Tandberg SLR40 st
o [2002/06/17] kern/39388  scsi   ncr/sym drivers fail with 53c810 and more
o [2002/07/22] kern/40895  scsi   wierd kernel / device driver bug
o [2003/05/24] kern/52638  scsi   [panic] SCSI U320 on SMP server won't run
s [2003/09/30] kern/57398  scsi   [mly] Current fails to install on mly(4)
o [2003/12/26] kern/60598  scsi   wire down of scsi devices conflicts with
o [2003/12/27] kern/60641  scsi   [sym] Sporadic SCSI bus resets with 53C81
s [2004/01/10] kern/61165  scsi   [panic] kernel page fault after calling c
o [2004/12/02] kern/74627  scsi   [ahc] [hang] Adaptec 2940U2W Can't boot 5
o [2005/06/04] kern/81887  scsi   [aac] Adaptec SCSI 2130S aac0: GetDeviceP
o [2005/12/12] kern/90282  scsi   [sym] SCSI bus resets cause loss of ch de

12 problems total.

Non-critical problems

S  Submitted   Tracker     Resp.  Description
-------------------------------------------------------------------------------
o [2000/12/06] kern/23314  scsi   aic driver fails to detect Adaptec 1520B
o [2001/08/15] kern/29727  scsi   [amr] [patch] amr_enquiry3 structure in a
o [2002/02/23] kern/35234  scsi   World access to /dev/pass? (for scanner)
o [2002/06/02] kern/38828  scsi   [feature request] DPT PM2012B/90 doesn't
o [2002/10/29] kern/44587  scsi   dev/dpt/dpt.h is missing defines required
o [2003/10/01] kern/57469  scsi   [scsi] [patch] Quirk for Conner CP3500
o [2005/01/12] kern/76178  scsi   [ahd] Problem with ahd and large SCSI Rai

7 problems total.

From owner-freebsd-scsi@FreeBSD.ORG Fri Jan 6 02:12:14 2006
Return-Path:
X-Original-To: freebsd-scsi@freebsd.org
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 98B6E16A41F
    for ; Fri, 6 Jan 2006 02:12:14 +0000 (GMT)
    (envelope-from iedowse@iedowse.com)
Received: from nowhere.iedowse.com (nowhere.iedowse.com [82.195.144.75])
    by mx1.FreeBSD.org (Postfix) with SMTP id E599243D48
    for ; Fri, 6 Jan 2006 02:12:13 +0000 (GMT)
    (envelope-from iedowse@iedowse.com)
Received: from localhost ([127.0.0.1] helo=iedowse.com)
    by nowhere.iedowse.com via local-iedowse id ;
    6 Jan 2006 02:12:12 +0000 (GMT)
To: freebsd-scsi@freebsd.org
Date: Fri, 06 Jan 2006 02:12:12 +0000
From: Ian Dowse
Message-ID: <200601060212.aa00704@nowhere.iedowse.com>
Subject: Patch for CAM SIM removal panics
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 06 Jan 2006 02:12:14 -0000

The following patch, which is also at

    http://people.freebsd.org/~iedowse/cam_remove.diff

attempts to improve CAM's handling of SIMs going away while operations
are still in progress. This appears to be particularly helpful at probe
time for umass devices, as it avoids most of the usual camisr() panics.

The change is not particularly clean, so any suggestions are welcome.
Basically, when the SIM goes away it tries to guarantee that all new
CCBs will be completed immediately with a CAM_DEV_NOT_THERE status, and
it also ensures that xpt_schedule() immediately calls the peripheral
periph_start() routine when the SIM is gone. xpt_bus_deregister() now
tries quite hard to flush out all outstanding operations.
Additionally it fixes a problem where mid-probe devices could be added
after their AC_LOST_DEVICE message was sent, resulting in the devices
never going away (this seemed to happen a lot with "pass" devices).

The change does add some conditional code to the normal code path, so
it will have a small negative impact on performance.

Ian

Index: cam_periph.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/cam/cam_periph.c,v
retrieving revision 1.60
diff -u -r1.60 cam_periph.c
--- cam_periph.c	1 Jul 2005 15:21:29 -0000	1.60
+++ cam_periph.c	6 Jan 2006 01:17:19 -0000
@@ -1656,6 +1656,8 @@
 	case CAM_NO_HBA:
 	case CAM_PROVIDE_FAIL:
 	case CAM_REQ_TOO_BIG:
+	case CAM_LUN_INVALID:
+	case CAM_TID_INVALID:
 		error = EINVAL;
 		break;
 	case CAM_SCSI_BUS_RESET:
Index: cam_xpt.c
===================================================================
RCS file: /dump/FreeBSD-CVS/src/sys/cam/cam_xpt.c,v
retrieving revision 1.156
diff -u -r1.156 cam_xpt.c
--- cam_xpt.c	16 Sep 2005 01:26:17 -0000	1.156
+++ cam_xpt.c	6 Jan 2006 01:33:52 -0000
@@ -682,6 +682,18 @@
 
 static struct intr_config_hook *xpt_config_hook;
 
+static void dead_sim_action(struct cam_sim *sim, union ccb *ccb);
+static void dead_sim_poll(struct cam_sim *sim);
+
+/* Dummy SIM that is used when the real one has gone. */
+static struct cam_sim cam_dead_sim = {
+	.sim_action =	dead_sim_action,
+	.sim_poll =	dead_sim_poll,
+	.sim_name =	"dead_sim",
+};
+
+#define SIM_DEAD(sim)	((sim) == &cam_dead_sim)
+
 /* Registered busses */
 static TAILQ_HEAD(,cam_eb) xpt_busses;
 static u_int bus_generation;
@@ -3055,12 +3067,22 @@
 	case XPT_ENG_EXEC:
 	{
 		struct cam_path *path;
+		struct cam_sim *ccbsim;
 		int s;
 		int runq;
 
 		path = start_ccb->ccb_h.path;
 		s = splsoftcam();
 
+		if (SIM_DEAD(path->bus->sim)) {
+			/* The SIM has gone; just execute the CCB directly. */
+			cam_ccbq_send_ccb(&path->device->ccbq, start_ccb);
+			ccbsim = start_ccb->ccb_h.path->bus->sim;
+			(*(ccbsim->sim_action))(ccbsim, start_ccb);
+			splx(s);
+			break;
+		}
+
 		cam_ccbq_insert_ccb(&path->device->ccbq, start_ccb);
 		if (path->device->qfrozen_cnt == 0)
 			runq = xpt_schedule_dev_sendq(path->bus, path->device);
@@ -3641,8 +3663,8 @@
 	dev->ccbq.devq_openings--;
 	dev->ccbq.dev_openings--;
 
-	while((devq->send_openings <= 0 || dev->ccbq.dev_openings < 0)
-	   && (--timeout > 0)) {
+	while(((devq != NULL && devq->send_openings <= 0) ||
+	   dev->ccbq.dev_openings < 0) && (--timeout > 0)) {
 		DELAY(1000);
 		(*(sim->sim_poll))(sim);
 		camisr(&cam_bioq);
@@ -3684,6 +3706,7 @@
 xpt_schedule(struct cam_periph *perph, u_int32_t new_priority)
 {
 	struct cam_ed *device;
+	union ccb *work_ccb;
 	int s;
 	int runq;
 
@@ -3702,6 +3725,16 @@
 					     new_priority);
 		}
 		runq = 0;
+	} else if (SIM_DEAD(perph->path->bus->sim)) {
+		/* The SIM is gone so just call periph_start directly. */
+		work_ccb = xpt_get_ccb(perph->path->device);
+		splx(s);
+		if (work_ccb == NULL)
+			return; /* XXX */
+		xpt_setup_ccb(&work_ccb->ccb_h, perph->path, new_priority);
+		perph->pinfo.priority = new_priority;
+		perph->periph_start(perph, work_ccb);
+		return;
 	} else {
 		/* New entry on the queue */
 		CAM_DEBUG(perph->path, CAM_DEBUG_SUBTRACE,
@@ -4337,6 +4370,10 @@
 	} else {
 		SLIST_INSERT_HEAD(&ccb_freeq, &free_ccb->ccb_h, xpt_links.sle);
 	}
+	if (bus->sim->devq == NULL) {
+		splx(s);
+		return;
+	}
 	bus->sim->devq->alloc_openings++;
 	bus->sim->devq->alloc_active--;
 	/* XXX Turn this into an inline function - xpt_run_device?? */
@@ -4422,6 +4459,12 @@
 xpt_bus_deregister(path_id_t pathid)
 {
 	struct cam_path bus_path;
+	struct cam_ed *device;
+	struct cam_ed_qinfo *qinfo;
+	struct cam_devq *devq;
+	struct cam_periph *periph;
+	struct cam_sim *ccbsim;
+	union ccb *work_ccb;
 	cam_status status;
 
 	GIANT_REQUIRED;
@@ -4433,11 +4476,51 @@
 
 	xpt_async(AC_LOST_DEVICE, &bus_path, NULL);
 	xpt_async(AC_PATH_DEREGISTERED, &bus_path, NULL);
-	
+
+	/* The SIM may be gone, so use a dummy SIM for any stray operations. */
+	devq = bus_path.bus->sim->devq;
+	bus_path.bus->sim = &cam_dead_sim;
+
+	/* Execute any pending operations now. */
+	while ((qinfo = (struct cam_ed_qinfo *)camq_remove(&devq->send_queue,
+	    CAMQ_HEAD)) != NULL ||
+	    (qinfo = (struct cam_ed_qinfo *)camq_remove(&devq->alloc_queue,
+	    CAMQ_HEAD)) != NULL) {
+		do {
+			device = qinfo->device;
+			work_ccb = cam_ccbq_peek_ccb(&device->ccbq, CAMQ_HEAD);
+			if (work_ccb != NULL) {
+				devq->active_dev = device;
+				cam_ccbq_remove_ccb(&device->ccbq, work_ccb);
+				cam_ccbq_send_ccb(&device->ccbq, work_ccb);
+				ccbsim = work_ccb->ccb_h.path->bus->sim;
+				(*(ccbsim->sim_action))(ccbsim, work_ccb);
+			}
+
+			periph = (struct cam_periph *)camq_remove(&device->drvq,
+			    CAMQ_HEAD);
+			if (periph != NULL)
+				xpt_schedule(periph, periph->pinfo.priority);
+		} while (work_ccb != NULL || periph != NULL);
+	}
+
+	/* Make sure all completed CCBs are processed. */
+	while (!TAILQ_EMPTY(&cam_bioq)) {
+		camisr(&cam_bioq);
+
+		/* Repeat the async's for the benefit of any new devices. */
+		xpt_async(AC_LOST_DEVICE, &bus_path, NULL);
+		xpt_async(AC_PATH_DEREGISTERED, &bus_path, NULL);
+	}
+
 	/* Release the reference count held while registered. */
 	xpt_release_bus(bus_path.bus);
 	xpt_release_path(&bus_path);
 
+	/* Recheck for more completed CCBs. */
+	while (!TAILQ_EMPTY(&cam_bioq))
+		camisr(&cam_bioq);
+
 	return (CAM_REQ_CMP);
 }
 
@@ -5021,6 +5104,9 @@
 	struct cam_devq *devq;
 	cam_status status;
 
+	if (SIM_DEAD(bus->sim))
+		return (NULL);
+
 	/* Make space for us in the device queue on our bus */
 	devq = bus->sim->devq;
 	status = cam_devq_resize(devq, devq->alloc_queue.array_size + 1);
@@ -5131,9 +5217,11 @@
 		TAILQ_REMOVE(&target->ed_entries, device,links);
 		target->generation++;
 		xpt_max_ccbs -= device->ccbq.devq_openings;
-		/* Release our slot in the devq */
-		devq = bus->sim->devq;
-		cam_devq_resize(devq, devq->alloc_queue.array_size - 1);
+		if (!SIM_DEAD(bus->sim)) {
+			/* Release our slot in the devq */
+			devq = bus->sim->devq;
+			cam_devq_resize(devq, devq->alloc_queue.array_size - 1);
+		}
 		splx(s);
 		camq_fini(&device->drvq);
 		camq_fini(&device->ccbq.queue);
@@ -7096,8 +7184,10 @@
 			s = splcam();
 			cam_ccbq_ccb_done(&dev->ccbq, (union ccb *)ccb_h);
 
-			ccb_h->path->bus->sim->devq->send_active--;
-			ccb_h->path->bus->sim->devq->send_openings++;
+			if (!SIM_DEAD(ccb_h->path->bus->sim)) {
+				ccb_h->path->bus->sim->devq->send_active--;
+				ccb_h->path->bus->sim->devq->send_openings++;
+			}
 			splx(s);
 
 			if (((dev->flags & CAM_DEV_REL_ON_COMPLETE) != 0
@@ -7145,3 +7235,16 @@
 	}
 	splx(s);
 }
+
+static void
+dead_sim_action(struct cam_sim *sim, union ccb *ccb)
+{
+
+	ccb->ccb_h.status = CAM_DEV_NOT_THERE;
+	xpt_done(ccb);
+}
+
+static void
+dead_sim_poll(struct cam_sim *sim)
+{
+}

From owner-freebsd-scsi@FreeBSD.ORG Fri Jan 6 02:59:51 2006
Return-Path:
X-Original-To: freebsd-scsi@freebsd.org
Delivered-To: freebsd-scsi@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 7737716A41F
    for ; Fri, 6 Jan 2006 02:59:51 +0000 (GMT)
    (envelope-from lydianconcepts@gmail.com)
Received: from zproxy.gmail.com (zproxy.gmail.com [64.233.162.204])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 0055843D45
    for ; Fri, 6 Jan 2006 02:59:50 +0000 (GMT)
    (envelope-from lydianconcepts@gmail.com)
Received:
    by zproxy.gmail.com with SMTP id i11so2822627nzi
    for ; Thu, 05 Jan 2006 18:59:50 -0800 (PST)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com;
    h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references;
    b=WrP94TkW3ZaIy6VEdmzU7vet/D8Evw3ZL2xMesJMNP0sAYazqITD+cH9UtZmG65nr38iXbVuwMOgza5tPI684a4v5OMdEgiIZFByoILs8Nr1qfdKpVQTdSZ1bFJL7BguR5GYaSJEDQsj7TypXDM6QJMBJSI0+SH7YbavWC4PkwA=
Received: by 10.65.52.3 with SMTP id e3mr1768627qbk;
    Thu, 05 Jan 2006 18:59:49 -0800 (PST)
Received: by 10.65.155.20 with HTTP; Thu, 5 Jan 2006 18:59:49 -0800 (PST)
Message-ID: <7579f7fb0601051859q68fb90a5u95634f796939831a@mail.gmail.com>
Date: Thu, 5 Jan 2006 18:59:49 -0800
From: Matthew Jacob
To: Ian Dowse
In-Reply-To: <200601060212.aa00704@nowhere.iedowse.com>
MIME-Version: 1.0
References: <200601060212.aa00704@nowhere.iedowse.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-scsi@freebsd.org
Subject: Re: Patch for CAM SIM removal panics
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 06 Jan 2006 02:59:51 -0000

This looks interesting. I wonder how this works with FC devices going
away?
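
For readers following the thread, the core idea in the patch is to
replace a departed SIM with a static placeholder whose action routine
fails every request, and to flush whatever is already queued when the
bus is deregistered. The following is a minimal user-space sketch of
that idea only, not the kernel code. Every name in it (struct bus,
struct backend, bus_submit(), bus_detach(), ST_DEV_NOT_THERE) is a
hypothetical stand-in and not a CAM interface; locking, spl levels and
the peripheral driver queues are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request states, loosely modelled on a CCB status field. */
enum req_status { ST_PENDING, ST_DEV_NOT_THERE };

struct request {
	enum req_status status;
	struct request *next;
};

/* Hypothetical stand-in for a SIM: a name plus an action routine. */
struct backend {
	const char *name;
	void (*action)(struct backend *be, struct request *req);
};

/* Hypothetical stand-in for a bus with a FIFO of queued requests. */
struct bus {
	struct backend *be;
	struct request *head;
	struct request *tail;
};

/* The placeholder's action routine fails every request immediately. */
static void
dead_action(struct backend *be, struct request *req)
{
	(void)be;
	req->status = ST_DEV_NOT_THERE;
}

/* Single static placeholder, so a pointer comparison identifies it. */
static struct backend dead_backend = { "dead", dead_action };

#define BACKEND_DEAD(be)	((be) == &dead_backend)

/* Queue a request, or fail it on the spot if the backend is gone. */
static void
bus_submit(struct bus *b, struct request *req)
{
	assert(b != NULL && req != NULL);
	req->next = NULL;
	if (BACKEND_DEAD(b->be)) {
		b->be->action(b->be, req);
		return;
	}
	req->status = ST_PENDING;
	if (b->tail != NULL)
		b->tail->next = req;
	else
		b->head = req;
	b->tail = req;
}

/*
 * Swap in the placeholder first, then push every queued request
 * through it. Returns the number of requests flushed.
 */
static int
bus_detach(struct bus *b)
{
	struct request *req;
	int flushed = 0;

	assert(b != NULL);
	b->be = &dead_backend;
	while ((req = b->head) != NULL) {
		b->head = req->next;
		req->next = NULL;
		b->be->action(b->be, req);
		flushed++;
	}
	b->tail = NULL;
	return (flushed);
}
```

In this sketch the swap happens before the flush, so anything submitted
while the queue is being drained also fails straight away instead of
being queued behind a backend that no longer exists.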