From owner-freebsd-scsi@FreeBSD.ORG Mon Jan 31 11:07:09 2011 Return-Path: Delivered-To: freebsd-scsi@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 093C410656B1 for ; Mon, 31 Jan 2011 11:07:09 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id EC76D8FC33 for ; Mon, 31 Jan 2011 11:07:08 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p0VB78W1091881 for ; Mon, 31 Jan 2011 11:07:08 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p0VB78to091879 for freebsd-scsi@FreeBSD.org; Mon, 31 Jan 2011 11:07:08 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 31 Jan 2011 11:07:08 GMT Message-Id: <201101311107.p0VB78to091879@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-scsi@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-scsi@FreeBSD.org X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 31 Jan 2011 11:07:09 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/153361 scsi [ciss] Smart Array 5300 boot/detect drive problem o kern/152250 scsi [ciss] [patch] Kernel panic when hw.ciss.expose_hidden o kern/151564 scsi [ciss] ciss(4) should increase CISS_MAX_LOGICAL to 10 o docs/151336 scsi Missing documentation of scsi_ and ata_ functions in c o kern/148083 scsi [aac] Strange device reporting o kern/147704 scsi [mpt] sys/dev/mpt: new chip revision, partially unsupp o kern/146287 scsi [ciss] ciss(4) cannot see more than one SmartArray con o kern/145768 scsi [mpt] can't perform I/O on SAS based SAN disk in freeb o kern/144648 scsi [aac] Strange values of speed and bus width in dmesg o kern/144301 scsi [ciss] [hang] HP proliant server locks when using ciss o kern/142351 scsi [mpt] LSILogic driver performance problems o kern/141934 scsi [cam] [patch] add support for SEAGATE DAT Scopion 130 o kern/134488 scsi [mpt] MPT SCSI driver probes max. 8 LUNs per device o kern/132250 scsi [ciss] ciss driver does not support more then 15 drive o kern/132206 scsi [mpt] system panics on boot when mirroring and 2nd dri o kern/130621 scsi [mpt] tranfer rate is inscrutable slow when use lsi213 o kern/129602 scsi [ahd] ahd(4) gets confused and wedges SCSI bus o kern/128452 scsi [sa] [panic] Accessing SCSI tape drive randomly crashe o kern/128245 scsi [scsi] "inquiry data fails comparison at DV1 step" [re o kern/127927 scsi [isp] isp(4) target driver crashes kernel when set up o kern/127717 scsi [ata] [patch] [request] - support write cache toggling o kern/124667 scsi [amd] [panic] FreeBSD-7 kernel page faults at amd-scsi o kern/123674 scsi [ahc] ahc driver dumping o kern/123520 scsi [ahd] unable to boot from net while using ahd o sparc/121676 scsi [iscsi] iscontrol do not connect iscsi-target on sparc o kern/120487 scsi [sg] scsi_sg incompatible with scanners o kern/120247 scsi [mpt] FreeBSD 6.3 and LSI Logic 1030 = only 3.300MB/s o kern/114597 scsi 
[sym] System hangs at SCSI bus reset with dual HBAs o kern/110847 scsi [ahd] Tyan U320 onboard problem with more than 3 disks o kern/99954 scsi [ahc] reading from DVD failes on 6.x [regression] o kern/94838 scsi Kernel panic while mounting SD card with lock switch o o kern/92798 scsi [ahc] SCSI problem with timeouts o kern/90282 scsi [sym] SCSI bus resets cause loss of ch device o kern/76178 scsi [ahd] Problem with ahd and large SCSI Raid system o kern/74627 scsi [ahc] [hang] Adaptec 2940U2W Can't boot 5.3 s kern/61165 scsi [panic] kernel page fault after calling cam_send_ccb o kern/60641 scsi [sym] Sporadic SCSI bus resets with 53C810 under load o kern/60598 scsi wire down of scsi devices conflicts with config s kern/57398 scsi [mly] Current fails to install on mly(4) based RAID di o bin/57088 scsi [cam] [patch] for a possible fd leak in libcam.c o kern/52638 scsi [panic] SCSI U320 on SMP server won't run faster than o kern/44587 scsi dev/dpt/dpt.h is missing defines required for DPT_HAND o kern/40895 scsi wierd kernel / device driver bug o kern/39388 scsi ncr/sym drivers fail with 53c810 and more than 256MB m o kern/35234 scsi World access to /dev/pass? (for scanner) requires acce 45 problems total. 
From owner-freebsd-scsi@FreeBSD.ORG Thu Feb 3 16:44:31 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9C612106566C; Thu, 3 Feb 2011 16:44:31 +0000 (UTC) (envelope-from joachim@tingvold.com) Received: from smtp.domeneshop.no (smtp.domeneshop.no [194.63.248.54]) by mx1.freebsd.org (Postfix) with ESMTP id 5246C8FC1D; Thu, 3 Feb 2011 16:44:31 +0000 (UTC) Received: from aannecy-552-1-139-161.w86-200.abo.wanadoo.fr ([86.200.147.161] helo=keklolwtf.home) by smtp.domeneshop.no with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1Pl2IH-00045V-Tp; Thu, 03 Feb 2011 17:44:30 +0100 Mime-Version: 1.0 (Apple Message framework v1076) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes From: Joachim Tingvold In-Reply-To: Date: Thu, 3 Feb 2011 17:44:24 +0100 Content-Transfer-Encoding: 7bit Message-Id: <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> References: <4D2DAA45.30602@FreeBSD.org> <41C64262-4300-4187-B5FD-04A5EFB7F87C@tingvold.com> <20110113203750.GA39494@nargothrond.kdm.org> <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> To: freebsd-scsi@freebsd.org X-Mailer: Apple Mail (2.1076) Cc: Alexander Motin , "Kenneth D. Merry" Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Feb 2011 16:44:31 -0000 On Fri, Jan 21, 2011, at 20:15:15PM GMT+01:00, Joachim Tingvold wrote: >> It does look like the out of chain problem was fixed by increasing >> the >> number, so that's good at least. > > Yes. For now, at least. (-: So, it didn't last for long; . 
[jocke@filserver ~]$ cat /sys/dev/mps/mpsvar.h|grep "MPS_CHAIN_FRAMES" #define MPS_CHAIN_FRAMES 2048 It seems to be happening when copying larger files from 'zroot' to 'storage' (that is, files over 1.5GB in size), or when moving files right after each other (f.ex. using scripts, or moving folders with many files of medium size (300MB+)). Moving a file now-and-then (of sizes no larger than 1.5GB) doesn't seem to trigger it. -- Joachim From owner-freebsd-scsi@FreeBSD.ORG Thu Feb 3 22:10:58 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2A1A7106564A; Thu, 3 Feb 2011 22:10:58 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id C916F8FC1F; Thu, 3 Feb 2011 22:10:57 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id p13MAulZ025581; Thu, 3 Feb 2011 15:10:56 -0700 (MST) (envelope-from ken@nargothrond.kdm.org) Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id p13MAuuZ025580; Thu, 3 Feb 2011 15:10:56 -0700 (MST) (envelope-from ken) Date: Thu, 3 Feb 2011 15:10:56 -0700 From: "Kenneth D. 
Merry" To: Joachim Tingvold Message-ID: <20110203221056.GA25389@nargothrond.kdm.org> References: <41C64262-4300-4187-B5FD-04A5EFB7F87C@tingvold.com> <20110113203750.GA39494@nargothrond.kdm.org> <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="gKMricLos+KVdGMg" Content-Disposition: inline In-Reply-To: <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> User-Agent: Mutt/1.4.2i Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Feb 2011 22:10:58 -0000 --gKMricLos+KVdGMg Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Feb 03, 2011 at 17:44:24 +0100, Joachim Tingvold wrote: > On Fri, Jan 21, 2011, at 20:15:15PM GMT+01:00, Joachim Tingvold wrote: > >>It does look like the out of chain problem was fixed by increasing > >>the > >>number, so that's good at least. > > > >Yes. For now, at least. (-: > > So, it didn't last for long; > . > > [jocke@filserver ~]$ cat /sys/dev/mps/mpsvar.h|grep > "MPS_CHAIN_FRAMES" > #define MPS_CHAIN_FRAMES 2048 > > It seems to be happening when copying larger files from 'zroot' to > 'storage' (that is, files over 1.5GB in size), or when moving files > right after each other (f.ex. using scripts, or moving folders with > many files of medium size (300MB+)). Moving a file now-and-then (of > sizes no larger than 1.5GB) doesn't seem to trigger it. That's strange that you're still running into the problem with that many chain frames. In my tests, I haven't seen more than 80 chain frames used. You can probably work around it again by bumping up the number of chain frames (e.g. 
to 3072 or 4096), but it would probably be good to make sure there isn't a leak in the driver somewhere.

I've attached a patch that has a number of debugging sysctls, a change from gibbs@ that has to do with device removal, and some other extra debugging cruft. (i.e. this patch won't go into the tree as-is, it's just for debugging.)

Try running this, and then do 'sysctl hw.mps' and let's see what your low water mark is for free chain elements. We'll also want to make sure your chain_free value is about equal to MPS_CHAIN_FRAMES when the system is idle.

On my system with an LSI 9201-16i controller, I see:

hw.mps.1.debug_level: 0
hw.mps.1.allow_multiple_tm_cmds: 0
hw.mps.1.io_cmds_active: 24
hw.mps.1.io_cmds_highwater: 252
hw.mps.1.chain_free: 1024
hw.mps.1.chain_free_lowwater: 948
hw.mps.1.chain_alloc_fail: 0

I know what the root cause is for this bug, I just haven't had time to fix it. Unfortunately I've been chasing bugs in the driver that are a little higher priority for $REAL_JOB. The good news at least is that any fixes we make will go back into FreeBSD.
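
[Editorial sketch, not part of the attached patch.] The chain_free, chain_free_lowwater and chain_alloc_fail counters described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct chain_pool, pool_init(), pool_alloc(), pool_free() and NUM_CHAIN_FRAMES are hypothetical names invented here, and the free list is a plain array rather than the TAILQ the driver uses. Only the counter updates are meant to mirror what the debugging patch adds to the chain alloc/free paths.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for MPS_CHAIN_FRAMES; the value is arbitrary. */
#define NUM_CHAIN_FRAMES 8

/* Hypothetical user-space model of the chain frame pool and its counters. */
struct chain_pool {
	int	 free_stack[NUM_CHAIN_FRAMES];	/* indices of free frames */
	int	 chain_free;			/* frames currently free */
	int	 chain_free_lowwater;		/* fewest free frames seen so far */
	uint64_t chain_alloc_fail;		/* allocations that found the pool empty */
};

/* Put every frame on the free list; the low-water mark starts at the total. */
void
pool_init(struct chain_pool *p)
{
	p->chain_free = 0;
	p->chain_free_lowwater = 0;
	p->chain_alloc_fail = 0;
	for (int i = 0; i < NUM_CHAIN_FRAMES; i++) {
		p->free_stack[p->chain_free++] = i;
		p->chain_free_lowwater++;
	}
}

/* Returns a frame index, or -1 (and counts a failure) if none are free. */
int
pool_alloc(struct chain_pool *p)
{
	if (p->chain_free == 0) {
		p->chain_alloc_fail++;
		return (-1);
	}
	p->chain_free--;
	if (p->chain_free < p->chain_free_lowwater)
		p->chain_free_lowwater = p->chain_free;
	return (p->free_stack[p->chain_free]);
}

/* Return a frame to the pool. */
void
pool_free(struct chain_pool *p, int frame)
{
	p->free_stack[p->chain_free++] = frame;
}

/* Print the counters in a layout loosely resembling the sysctl output. */
void
pool_print(const struct chain_pool *p)
{
	printf("chain_free: %d\n", p->chain_free);
	printf("chain_free_lowwater: %d\n", p->chain_free_lowwater);
	printf("chain_alloc_fail: %llu\n",
	    (unsigned long long)p->chain_alloc_fail);
}
```

In this model, asking for more frames than exist bumps chain_alloc_fail and drives chain_free_lowwater to 0. If every frame except one is later returned, chain_free settles one short of NUM_CHAIN_FRAMES while idle, which is the kind of gap that would point to a leak.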
Ken -- Kenneth Merry ken@FreeBSD.ORG --gKMricLos+KVdGMg Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="mps_debug_diffs.20110203.txt" Index: mps.c =================================================================== --- mps.c (revision 218241) +++ mps.c (working copy) @@ -387,6 +387,15 @@ mps_dprint(sc, MPS_TRACE, "%s\n", __func__); + if (sc->mps_flags & MPS_FLAGS_ATTACH_DONE) + mtx_assert(&sc->mps_mtx, MA_OWNED); + + if ((cm->cm_desc.Default.SMID < 1) + || (cm->cm_desc.Default.SMID >= sc->num_reqs)) { + mps_printf(sc, "%s: invalid SMID %d, desc %#x %#x\n", + __func__, cm->cm_desc.Default.SMID, + cm->cm_desc.Words.High, cm->cm_desc.Words.Low); + } mps_regwrite(sc, MPI2_REQUEST_DESCRIPTOR_POST_LOW_OFFSET, cm->cm_desc.Words.Low); mps_regwrite(sc, MPI2_REQUEST_DESCRIPTOR_POST_HIGH_OFFSET, @@ -732,6 +741,7 @@ chain->chain_busaddr = sc->chain_busaddr + i * sc->facts->IOCRequestFrameSize * 4; mps_free_chain(sc, chain); + sc->chain_free_lowwater++; } /* XXX Need to pick a more precise value */ @@ -811,7 +821,7 @@ int mps_attach(struct mps_softc *sc) { - int i, error; + int i, error, old_debug; char tmpstr[80], tmpstr2[80]; /* @@ -846,15 +856,35 @@ if (sc->sysctl_tree == NULL) return (ENOMEM); - SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), OID_AUTO, "debug_level", CTLFLAG_RW, &sc->mps_debug, 0, "mps debug level"); - SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), OID_AUTO, "allow_multiple_tm_cmds", CTLFLAG_RW, &sc->allow_multiple_tm_cmds, 0, "allow multiple simultaneous task management cmds"); + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "io_cmds_active", CTLFLAG_RD, + &sc->io_cmds_active, 0, "number of currently active commands"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, 
"io_cmds_highwater", CTLFLAG_RD, + &sc->io_cmds_highwater, 0, "maximum active commands seen"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_free", CTLFLAG_RD, + &sc->chain_free, 0, "number of free chain elements"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_free_lowwater", CTLFLAG_RD, + &sc->chain_free_lowwater, 0,"lowest number of free chain elements"); + + SYSCTL_ADD_UQUAD(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_alloc_fail", CTLFLAG_RD, + &sc->chain_alloc_fail, "chain allocation failures"); + if ((error = mps_transition_ready(sc)) != 0) return (error); @@ -863,7 +893,10 @@ if ((error = mps_get_iocfacts(sc, sc->facts)) != 0) return (error); + old_debug = sc->mps_debug; + sc->mps_debug = MPS_INFO; mps_print_iocfacts(sc, sc->facts); + sc->mps_debug = old_debug; mps_printf(sc, "Firmware: %02d.%02d.%02d.%02d\n", sc->facts->FWVersion.Struct.Major, @@ -895,9 +928,12 @@ sc->num_reqs = MIN(MPS_REQ_FRAMES, sc->facts->RequestCredit); sc->num_replies = MIN(MPS_REPLY_FRAMES + MPS_EVT_REPLY_FRAMES, sc->facts->MaxReplyDescriptorPostQueueDepth) - 1; + mps_printf(sc, "num_reqs %d, num_replies %d\n", sc->num_reqs, + sc->num_replies); TAILQ_INIT(&sc->req_list); TAILQ_INIT(&sc->chain_list); TAILQ_INIT(&sc->tm_list); + TAILQ_INIT(&sc->io_list); if (((error = mps_alloc_queues(sc)) != 0) || ((error = mps_alloc_replies(sc)) != 0) || @@ -967,6 +1003,8 @@ error = EINVAL; } + sc->mps_flags |= MPS_FLAGS_ATTACH_DONE; + return (error); } @@ -1299,8 +1337,11 @@ if (cm->cm_complete != NULL) cm->cm_complete(sc, cm); - if (cm->cm_flags & MPS_CM_FLAGS_WAKEUP) + if (cm->cm_flags & MPS_CM_FLAGS_WAKEUP) { + mps_printf(sc, "%s: waking up %p\n", __func__, + cm); wakeup(cm); + } } desc->Words.Low = 0xffffffff; Index: mps_sas.c =================================================================== --- mps_sas.c (revision 218241) +++ mps_sas.c (working copy) @@ -486,7 +486,10 @@ return; } + 
mps_dprint(sc, MPS_INFO, "Preparing to remove target %d\n", targ->tid); + req = (MPI2_SCSI_TASK_MANAGE_REQUEST *)cm->cm_req; + memset(req, 0, sizeof(*req)); req->DevHandle = targ->handle; req->Function = MPI2_FUNCTION_SCSI_TASK_MGMT; req->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET; @@ -507,6 +510,7 @@ MPI2_SCSI_TASK_MANAGE_REPLY *reply; MPI2_SAS_IOUNIT_CONTROL_REQUEST *req; struct mpssas_target *targ; + struct mps_command *next_cm; uint16_t handle; mps_dprint(sc, MPS_TRACE, "%s\n", __func__); @@ -523,11 +527,13 @@ return; } - mps_printf(sc, "Reset aborted %d commands\n", reply->TerminationCount); + mps_dprint(sc, MPS_INFO, "Reset aborted %u commands\n", + reply->TerminationCount); mps_free_reply(sc, cm->cm_reply_data); /* Reuse the existing command */ req = (MPI2_SAS_IOUNIT_CONTROL_REQUEST *)cm->cm_req; + memset(req, 0, sizeof(*req)); req->Function = MPI2_FUNCTION_SAS_IO_UNIT_CONTROL; req->Operation = MPI2_SAS_OP_REMOVE_DEVICE; req->DevHandle = handle; @@ -539,6 +545,17 @@ mps_map_command(sc, cm); mps_dprint(sc, MPS_INFO, "clearing target handle 0x%04x\n", handle); + TAILQ_FOREACH_SAFE(cm, &sc->io_list, cm_link, next_cm) { + union ccb *ccb; + + if (cm->cm_targ->handle != handle) + continue; + + mps_dprint(sc, MPS_INFO, "Completing missed command %p\n", cm); + ccb = cm->cm_complete_data; + ccb->ccb_h.status = CAM_DEV_NOT_THERE; + mpssas_scsiio_complete(sc, cm); + } targ = mpssas_find_target(sc->sassc, 0, handle); if (targ != NULL) { targ->handle = 0x0; @@ -1349,6 +1366,7 @@ } req = (MPI2_SCSI_IO_REQUEST *)cm->cm_req; + bzero(req, sizeof(*req)); req->DevHandle = targ->handle; req->Function = MPI2_FUNCTION_SCSI_IO_REQUEST; req->MsgFlags = 0; @@ -1430,6 +1448,11 @@ cm->cm_complete_data = ccb; cm->cm_targ = targ; + sc->io_cmds_active++; + if (sc->io_cmds_active > sc->io_cmds_highwater) + sc->io_cmds_highwater = sc->io_cmds_active; + + TAILQ_INSERT_TAIL(&sc->io_list, cm, cm_link); callout_reset(&cm->cm_callout, (ccb->ccb_h.timeout * hz) / 1000, 
mpssas_scsiio_timeout, cm); @@ -1449,6 +1472,8 @@ mps_dprint(sc, MPS_TRACE, "%s\n", __func__); callout_stop(&cm->cm_callout); + TAILQ_REMOVE(&sc->io_list, cm, cm_link); + sc->io_cmds_active--; sassc = sc->sassc; ccb = cm->cm_complete_data; @@ -1470,8 +1495,10 @@ /* Take the fast path to completion */ if (cm->cm_reply == NULL) { - ccb->ccb_h.status = CAM_REQ_CMP; - ccb->csio.scsi_status = SCSI_STATUS_OK; + if ((ccb->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_INPROG) { + ccb->ccb_h.status = CAM_REQ_CMP; + ccb->csio.scsi_status = SCSI_STATUS_OK; + } mps_free_command(sc, cm); xpt_done(ccb); return; @@ -1526,7 +1553,16 @@ break; case MPI2_IOCSTATUS_SCSI_IOC_TERMINATED: case MPI2_IOCSTATUS_SCSI_EXT_TERMINATED: +#if 0 ccb->ccb_h.status = CAM_REQ_ABORTED; +#endif + mps_printf(sc, "(%d:%d:%d) terminated ioc %x scsi %x state %x " + "xfer %u\n", xpt_path_path_id(ccb->ccb_h.path), + xpt_path_target_id(ccb->ccb_h.path), + xpt_path_lun_id(ccb->ccb_h.path), + rep->IOCStatus, rep->SCSIStatus, rep->SCSIState, + rep->TransferCount); + ccb->ccb_h.status = CAM_REQUEUE_REQ; break; case MPI2_IOCSTATUS_INVALID_SGL: mps_print_scsiio_cmd(sc, cm); @@ -1904,7 +1940,6 @@ xpt_done(ccb); } - #endif /* __FreeBSD_version >= 900026 */ static void Index: mps_table.c =================================================================== --- mps_table.c (revision 218241) +++ mps_table.c (working copy) @@ -46,6 +46,8 @@ #include #include +#include +#include #include #include @@ -486,8 +488,16 @@ mps_print_scsiio_cmd(struct mps_softc *sc, struct mps_command *cm) { MPI2_SCSI_IO_REQUEST *req; + union ccb *ccb; req = (MPI2_SCSI_IO_REQUEST *)cm->cm_req; + printf("SCSI command [SMID %d]: len %u data %p flags %#x\n", + cm->cm_desc.Default.SMID, cm->cm_length, cm->cm_data, + cm->cm_flags); + ccb = (union ccb *)cm->cm_complete_data; + scsi_sense_print(&ccb->csio); mps_print_sgl(sc, cm, req->SGLOffset0); + + hexdump(req, sizeof(*req), "mps: ", 0); } Index: mpsvar.h 
=================================================================== --- mpsvar.h (revision 218241) +++ mpsvar.h (working copy) @@ -119,9 +119,15 @@ #define MPS_FLAGS_MSI (1 << 1) #define MPS_FLAGS_BUSY (1 << 2) #define MPS_FLAGS_SHUTDOWN (1 << 3) +#define MPS_FLAGS_ATTACH_DONE (1 << 4) u_int mps_debug; u_int allow_multiple_tm_cmds; int tm_cmds_active; + int io_cmds_active; + int io_cmds_highwater; + int chain_free; + int chain_free_lowwater; + uint64_t chain_alloc_fail; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; struct mps_command *commands; @@ -133,6 +139,7 @@ TAILQ_HEAD(, mps_command) req_list; TAILQ_HEAD(, mps_chain) chain_list; TAILQ_HEAD(, mps_command) tm_list; + TAILQ_HEAD(, mps_command) io_list; int replypostindex; int replyfreeindex; @@ -228,8 +235,13 @@ { struct mps_chain *chain; - if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) + if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) { TAILQ_REMOVE(&sc->chain_list, chain, chain_link); + sc->chain_free--; + if (sc->chain_free < sc->chain_free_lowwater) + sc->chain_free_lowwater = sc->chain_free; + } else + sc->chain_alloc_fail++; return (chain); } @@ -239,6 +251,7 @@ #if 0 bzero(chain->chain, 128); #endif + sc->chain_free++; TAILQ_INSERT_TAIL(&sc->chain_list, chain, chain_link); } Index: mps_user.c =================================================================== --- mps_user.c (revision 218241) +++ mps_user.c (working copy) @@ -400,10 +400,12 @@ if (cmd->len == 0) return (EINVAL); + printf("%s: about to copyin firmware\n", __func__); error = copyin(cmd->buf, cm->cm_data, cmd->len); if (error != 0) return (error); + printf("%s: about to init sge\n", __func__); mpi_init_sge(cm, req, &req->SGL); bzero(&tc, sizeof tc); @@ -425,7 +427,10 @@ tc.ImageSize = cmd->len; cm->cm_flags |= MPS_CM_FLAGS_DATAOUT; + cm->cm_max_segs = 1; + printf("%s: about to push sge\n", __func__); + return (mps_push_sge(cm, &tc, sizeof tc, 0)); } @@ -595,7 +600,7 @@ hdr = (MPI2_REQUEST_HEADER *)cm->cm_req; - 
mps_dprint(sc, MPS_INFO, "mps_user_command: req %p %d rpl %p %d\n", + mps_printf(sc, "mps_user_command: req %p %d rpl %p %d\n", cmd->req, cmd->req_len, cmd->rpl, cmd->rpl_len ); if (cmd->req_len > (int)sc->facts->IOCRequestFrameSize * 4) { @@ -606,17 +611,11 @@ if (err != 0) goto RetFreeUnlocked; - mps_dprint(sc, MPS_INFO, "mps_user_command: Function %02X " + mps_printf(sc, "mps_user_command: Function %02X " "MsgFlags %02X\n", hdr->Function, hdr->MsgFlags ); - err = mps_user_setup_request(cm, cmd); - if (err != 0) { - mps_printf(sc, "mps_user_command: unsupported function 0x%X\n", - hdr->Function ); - goto RetFreeUnlocked; - } - if (cmd->len > 0) { + mps_printf(sc, "%s: allocating %d bytes\n", __func__, cmd->len); buf = malloc(cmd->len, M_MPSUSER, M_WAITOK|M_ZERO); cm->cm_data = buf; cm->cm_length = cmd->len; @@ -625,6 +624,13 @@ cm->cm_length = 0; } + err = mps_user_setup_request(cm, cmd); + if (err != 0) { + mps_printf(sc, "mps_user_command: unsupported function 0x%X\n", + hdr->Function ); + goto RetFreeUnlocked; + } + cm->cm_flags = MPS_CM_FLAGS_SGE_SIMPLE | MPS_CM_FLAGS_WAKEUP; cm->cm_desc.Default.RequestFlags = MPI2_REQ_DESCRIPT_FLAGS_DEFAULT_TYPE; @@ -653,7 +659,7 @@ copyout(rpl, cmd->rpl, sz); if (buf != NULL) copyout(buf, cmd->buf, cmd->len); - mps_dprint(sc, MPS_INFO, "mps_user_command: reply size %d\n", sz ); + mps_printf(sc, "mps_user_command: reply size %d\n", sz ); RetFreeUnlocked: mps_lock(sc); --gKMricLos+KVdGMg-- From owner-freebsd-scsi@FreeBSD.ORG Thu Feb 3 23:01:11 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 225E410656B2 for ; Thu, 3 Feb 2011 23:01:11 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id DF3338FC08 for ; Thu, 3 Feb 2011 23:01:10 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by 
nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id p13N1AAD026048; Thu, 3 Feb 2011 16:01:10 -0700 (MST) (envelope-from ken@nargothrond.kdm.org)
Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id p13N19O2026047; Thu, 3 Feb 2011 16:01:09 -0700 (MST) (envelope-from ken)
Date: Thu, 3 Feb 2011 16:01:09 -0700
From: "Kenneth D. Merry"
To: Borja Marcos
Message-ID: <20110203230109.GA25718@nargothrond.kdm.org>
References: <20110120170046.GA40879@nargothrond.kdm.org> <13EDCBDE-6C2C-4CE5-9B0E-7FCF6FB02FA1@sarenet.es>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <13EDCBDE-6C2C-4CE5-9B0E-7FCF6FB02FA1@sarenet.es>
User-Agent: Mutt/1.4.2i
Cc: freebsd-scsi@freebsd.org
Subject: Re: mps questions
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 03 Feb 2011 23:01:11 -0000

On Wed, Jan 26, 2011 at 11:16:57 +0100, Borja Marcos wrote:
>
> On Jan 20, 2011, at 6:00 PM, Kenneth D. Merry wrote:
>
> > If you see all the disks in the LSI BIOS, it most likely means there is a
> > driver bug that is keeping FreeBSD from seeing all the disks.
> >
> > Try booting with -v (boot -v at the boot loader prompt) and send the full
> > dmesg output. That will cause the driver to print some additional
> > information when it probes.
> >
> > After that, try booting with hw.mps.N.debug_level=1 set in loader.conf,
> > where N is either 0 or 1, depending on which mps instance is having the
> > problem.
> >
> > You can turn it off via sysctl after you boot.
>
> Sorry, more complete output follows. boot -v and debug enabled for the driver.
>
> I've omitted plenty of "mps0: writing postindex NNN" messages

It looks like the driver does see 8 disks initially. Here are the SAS device pages, interleaved with their respective PHY pages.
The last two disks don't have PHY pages associated with them, so that means that the probe failed to get the PHY page from the controller for those devices. Try booting with hw.mps.N.debug_level=4 That should give us a printout with the IOC status, and perhaps it'll be clearer why those devices are failing to probe. They're in slots 6 and 7 of the enclosure, by the way. I'm not sure how that maps to their physical location. > mps0: SAS Device Page 0 : > Slot: 0 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c4f7d2d > ParentDevHandle: 0xa > PhyNum: 2 > AccessStatus: 0x0 > DevHandle: 0xb > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 > mps0: SAS PHY Page 0 : > OwnerDevHandle: 0x0001 > AttachedDevHandle: 0x000a > AttachedPhyIdentifier: 21 > AttachedPhyInfo Reason: Loss DWORD Sync (0x14) > ProgrammedLinkRate: 6.0Gbps (0xa8) > HwLinkRate: 6.0Gbps (0xa8) > ChangeCount: 1 > Flags: 0x1 > PhyInfo Reason: Power On (0x10000) > NegotiatedLinkRate: 6.0Gbps (0xaa) > mps0: Found device ,End Device> <6.0Gbps> <0x000b> <2/0> > mps0: Triggering rescan of 0:1:-1 > mps0: SAS Device Page 0 : > Slot: 4 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c4f6ec1 > ParentDevHandle: 0xa > PhyNum: 3 > AccessStatus: 0x0 > DevHandle: 0xc > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 > mps0: SAS PHY Page 0 : > OwnerDevHandle: 0x0001 > AttachedDevHandle: 0x000a > AttachedPhyIdentifier: 20 > AttachedPhyInfo Reason: Loss DWORD Sync (0x14) > ProgrammedLinkRate: 6.0Gbps (0xa8) > HwLinkRate: 6.0Gbps (0xa8) > ChangeCount: 1 > Flags: 0x1 > PhyInfo Reason: Power On (0x10000) > NegotiatedLinkRate: 6.0Gbps (0xaa) > mps0: Found device ,End Device> <6.0Gbps> <0x000c> <2/4> > mps0: Triggering rescan of 0:4:-1 > mps0: SAS 
Device Page 0 : > Slot: 5 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c4f7425 > ParentDevHandle: 0xa > PhyNum: 4 > AccessStatus: 0x0 > DevHandle: 0xd > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 > mps0: SAS PHY Page 0 : > OwnerDevHandle: 0x0001 > AttachedDevHandle: 0x000a > AttachedPhyIdentifier: 15 > AttachedPhyInfo Reason: Loss DWORD Sync (0x14) > ProgrammedLinkRate: 6.0Gbps (0xa8) > HwLinkRate: 6.0Gbps (0xa8) > ChangeCount: 1 > Flags: 0x1 > PhyInfo Reason: Power On (0x10000) > NegotiatedLinkRate: 6.0Gbps (0xaa) > mps0: Found device ,End Device> <6.0Gbps> <0x000d> <2/5> > mps0: Triggering rescan of 0:5:-1 > mps0: SAS Device Page 0 : > Slot: 1 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c4f7875 > ParentDevHandle: 0xa > PhyNum: 5 > AccessStatus: 0x0 > DevHandle: 0xe > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 > mps0: SAS PHY Page 0 : > OwnerDevHandle: 0x0001 > AttachedDevHandle: 0x000a > AttachedPhyIdentifier: 14 > AttachedPhyInfo Reason: Loss DWORD Sync (0x14) > ProgrammedLinkRate: 6.0Gbps (0xa8) > HwLinkRate: 6.0Gbps (0xa8) > ChangeCount: 1 > Flags: 0x1 > PhyInfo Reason: Power On (0x10000) > NegotiatedLinkRate: 6.0Gbps (0xaa) > mps0: Found device ,End Device> <6.0Gbps> <0x000e> <2/1> > mps0: Triggering rescan of 0:2:-1 > mps0: SAS Device Page 0 : > Slot: 2 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c54578d > ParentDevHandle: 0xa > PhyNum: 6 > AccessStatus: 0x0 > DevHandle: 0xf > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 > mps0: SAS PHY Page 0 : > OwnerDevHandle: 0x0001 > AttachedDevHandle: 0x000a > 
AttachedPhyIdentifier: 13 > AttachedPhyInfo Reason: Loss DWORD Sync (0x14) > ProgrammedLinkRate: 6.0Gbps (0xa8) > HwLinkRate: 6.0Gbps (0xa8) > ChangeCount: 1 > Flags: 0x1 > PhyInfo Reason: Power On (0x10000) > NegotiatedLinkRate: 6.0Gbps (0xaa) > mps0: Found device ,End Device> <6.0Gbps> <0x000f> <2/2> > mps0: Triggering rescan of 0:3:-1 > mps0: SAS Device Page 0 : > Slot: 3 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c545701 > ParentDevHandle: 0xa > PhyNum: 7 > AccessStatus: 0x0 > DevHandle: 0x10 > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 > mps0: SAS PHY Page 0 : > OwnerDevHandle: 0x0001 > AttachedDevHandle: 0x000a > AttachedPhyIdentifier: 12 > AttachedPhyInfo Reason: Loss DWORD Sync (0x14) > ProgrammedLinkRate: 6.0Gbps (0xa8) > HwLinkRate: 6.0Gbps (0xa8) > ChangeCount: 1 > Flags: 0x1 > PhyInfo Reason: Power On (0x10000) > NegotiatedLinkRate: 6.0Gbps (0xaa) > mps0: Found device ,End Device> <6.0Gbps> <0x0010> <2/3> > mps0: Triggering rescan of 0:6:-1 These two disks don't have PHY pages: > mps0: SAS Device Page 0 : > Slot: 7 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c5454bd > ParentDevHandle: 0xa > PhyNum: 10 > AccessStatus: 0x0 > DevHandle: 0x11 > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 > mps0: SAS Device Page 0 : > Slot: 6 > EnclosureHandle: 0x2 > SASAddress: 0x5000c5002c4f5561 > ParentDevHandle: 0xa > PhyNum: 16 > AccessStatus: 0x0 > DevHandle: 0x12 > AttachedPhyIdentifier: 0x0 > ZoneGroup: 0 > DeviceInfo: c01,End Device > Flags: 0x1 > PhysicalPort: 0 > MaxPortConnections: 1 > DeviceName: 0x0 > PortGroups: 4 > DmaGroup: 0 > ControlGroup: 4 Ken -- Kenneth Merry ken@FreeBSD.ORG From owner-freebsd-scsi@FreeBSD.ORG Thu Feb 3 23:53:14 2011 Return-Path: 
Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DF7E9106564A; Thu, 3 Feb 2011 23:53:14 +0000 (UTC) (envelope-from joachim@tingvold.com) Received: from smtp.domeneshop.no (smtp.domeneshop.no [194.63.248.54]) by mx1.freebsd.org (Postfix) with ESMTP id 985328FC18; Thu, 3 Feb 2011 23:53:14 +0000 (UTC) Received: from aannecy-552-1-139-161.w86-200.abo.wanadoo.fr ([86.200.147.161] helo=keklolwtf.home) by smtp.domeneshop.no with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1Pl8zA-00066g-OV; Fri, 04 Feb 2011 00:53:13 +0100 Mime-Version: 1.0 (Apple Message framework v1076) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes From: Joachim Tingvold In-Reply-To: <20110203221056.GA25389@nargothrond.kdm.org> Date: Fri, 4 Feb 2011 00:53:07 +0100 Content-Transfer-Encoding: 7bit Message-Id: References: <41C64262-4300-4187-B5FD-04A5EFB7F87C@tingvold.com> <20110113203750.GA39494@nargothrond.kdm.org> <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> To: Kenneth D. Merry X-Mailer: Apple Mail (2.1076) Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Feb 2011 23:53:15 -0000 On Thu, Feb 03, 2011, at 23:10:56PM GMT+01:00, Kenneth D. Merry wrote: > I've attached a patch that has a number of debugging sysctls, a > change from > gibbs@ that has to do with device removal, and some other extra > debugging > cruft. (i.e. this patch won't go into the tree as-is, it's just for > debugging.) 
Index: mps.c |=================================================================== |--- mps.c (revision 218241) |+++ mps.c (working copy) -------------------------- Patching file mps.c using Plan A... Hunk #1 succeeded at 386 (offset -1 lines). Hunk #2 succeeded at 734 (offset -7 lines). Hunk #3 succeeded at 812 (offset -9 lines). Hunk #4 failed at 847. The patch is looking for SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), [...] SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), [...] but my mps.c has SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), [...] SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), [...] That, and all the offsets; maybe my mps-driver is a bit outdated? Maybe this is a stupid question, but I'm no FreeBSD-expert. (-: -- Joachim From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 00:02:51 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 77FAE1065679; Fri, 4 Feb 2011 00:02:51 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id 92AF78FC14; Fri, 4 Feb 2011 00:02:50 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id p1402new026695; Thu, 3 Feb 2011 17:02:49 -0700 (MST) (envelope-from ken@nargothrond.kdm.org) Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id p1402nYo026694; Thu, 3 Feb 2011 17:02:49 -0700 (MST) (envelope-from ken) Date: Thu, 3 Feb 2011 17:02:49 -0700 From: "Kenneth D. 
Merry" To: Joachim Tingvold Message-ID: <20110204000249.GA26663@nargothrond.kdm.org> References: <20110113203750.GA39494@nargothrond.kdm.org> <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="4Ckj6UjgE2iN1+kY" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2i Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Feb 2011 00:02:51 -0000 --4Ckj6UjgE2iN1+kY Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Fri, Feb 04, 2011 at 00:53:07 +0100, Joachim Tingvold wrote: > On Thu, Feb 03, 2011, at 23:10:56PM GMT+01:00, Kenneth D. Merry wrote: > >I've attached a patch that has a number of debugging sysctls, a > >change from > >gibbs@ that has to do with device removal, and some other extra > >debugging > >cruft. (i.e. this patch won't go into the tree as-is, it's just for > >debugging.) > > Index: mps.c > |=================================================================== > |--- mps.c (revision 218241) > |+++ mps.c (working copy) > -------------------------- > Patching file mps.c using Plan A... > Hunk #1 succeeded at 386 (offset -1 lines). > Hunk #2 succeeded at 734 (offset -7 lines). > Hunk #3 succeeded at 812 (offset -9 lines). > Hunk #4 failed at 847. > > The patch is looking for > > SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), > [...] > SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), > [...] > > but my mps.c has > > SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), > [...] 
> SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), > [...] > > That, and all the offsets; maybe my mps-driver is a bit outdated? > Maybe this is a stupid question, but I'm no FreeBSD-expert. (-: That change went in a few weeks ago. Try this patch, perhaps it will work. Ken -- Kenneth Merry ken@FreeBSD.ORG --4Ckj6UjgE2iN1+kY Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="mps_debug_diffs.20110203.2.txt" Index: mps.c =================================================================== --- mps.c (revision 218241) +++ mps.c (working copy) @@ -387,6 +387,15 @@ mps_dprint(sc, MPS_TRACE, "%s\n", __func__); + if (sc->mps_flags & MPS_FLAGS_ATTACH_DONE) + mtx_assert(&sc->mps_mtx, MA_OWNED); + + if ((cm->cm_desc.Default.SMID < 1) + || (cm->cm_desc.Default.SMID >= sc->num_reqs)) { + mps_printf(sc, "%s: invalid SMID %d, desc %#x %#x\n", + __func__, cm->cm_desc.Default.SMID, + cm->cm_desc.Words.High, cm->cm_desc.Words.Low); + } mps_regwrite(sc, MPI2_REQUEST_DESCRIPTOR_POST_LOW_OFFSET, cm->cm_desc.Words.Low); mps_regwrite(sc, MPI2_REQUEST_DESCRIPTOR_POST_HIGH_OFFSET, @@ -732,6 +741,7 @@ chain->chain_busaddr = sc->chain_busaddr + i * sc->facts->IOCRequestFrameSize * 4; mps_free_chain(sc, chain); + sc->chain_free_lowwater++; } /* XXX Need to pick a more precise value */ @@ -811,7 +821,7 @@ int mps_attach(struct mps_softc *sc) { - int i, error; + int i, error, old_debug; char tmpstr[80], tmpstr2[80]; /* @@ -855,6 +865,26 @@ &sc->allow_multiple_tm_cmds, 0, "allow multiple simultaneous task management cmds"); + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "io_cmds_active", CTLFLAG_RD, + &sc->io_cmds_active, 0, "number of currently active commands"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "io_cmds_highwater", CTLFLAG_RD, + &sc->io_cmds_highwater, 0, "maximum active commands seen"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + 
OID_AUTO, "chain_free", CTLFLAG_RD, + &sc->chain_free, 0, "number of free chain elements"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_free_lowwater", CTLFLAG_RD, + &sc->chain_free_lowwater, 0,"lowest number of free chain elements"); + + SYSCTL_ADD_UQUAD(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_alloc_fail", CTLFLAG_RD, + &sc->chain_alloc_fail, "chain allocation failures"); + if ((error = mps_transition_ready(sc)) != 0) return (error); @@ -863,7 +893,10 @@ if ((error = mps_get_iocfacts(sc, sc->facts)) != 0) return (error); + old_debug = sc->mps_debug; + sc->mps_debug = MPS_INFO; mps_print_iocfacts(sc, sc->facts); + sc->mps_debug = old_debug; mps_printf(sc, "Firmware: %02d.%02d.%02d.%02d\n", sc->facts->FWVersion.Struct.Major, @@ -895,9 +928,12 @@ sc->num_reqs = MIN(MPS_REQ_FRAMES, sc->facts->RequestCredit); sc->num_replies = MIN(MPS_REPLY_FRAMES + MPS_EVT_REPLY_FRAMES, sc->facts->MaxReplyDescriptorPostQueueDepth) - 1; + mps_printf(sc, "num_reqs %d, num_replies %d\n", sc->num_reqs, + sc->num_replies); TAILQ_INIT(&sc->req_list); TAILQ_INIT(&sc->chain_list); TAILQ_INIT(&sc->tm_list); + TAILQ_INIT(&sc->io_list); if (((error = mps_alloc_queues(sc)) != 0) || ((error = mps_alloc_replies(sc)) != 0) || @@ -967,6 +1003,8 @@ error = EINVAL; } + sc->mps_flags |= MPS_FLAGS_ATTACH_DONE; + return (error); } @@ -1299,8 +1337,11 @@ if (cm->cm_complete != NULL) cm->cm_complete(sc, cm); - if (cm->cm_flags & MPS_CM_FLAGS_WAKEUP) + if (cm->cm_flags & MPS_CM_FLAGS_WAKEUP) { + mps_printf(sc, "%s: waking up %p\n", __func__, + cm); wakeup(cm); + } } desc->Words.Low = 0xffffffff; Index: mps_sas.c =================================================================== --- mps_sas.c (revision 218241) +++ mps_sas.c (working copy) @@ -486,7 +486,10 @@ return; } + mps_dprint(sc, MPS_INFO, "Preparing to remove target %d\n", targ->tid); + req = (MPI2_SCSI_TASK_MANAGE_REQUEST *)cm->cm_req; + memset(req, 0, sizeof(*req)); 
req->DevHandle = targ->handle; req->Function = MPI2_FUNCTION_SCSI_TASK_MGMT; req->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET; @@ -507,6 +510,7 @@ MPI2_SCSI_TASK_MANAGE_REPLY *reply; MPI2_SAS_IOUNIT_CONTROL_REQUEST *req; struct mpssas_target *targ; + struct mps_command *next_cm; uint16_t handle; mps_dprint(sc, MPS_TRACE, "%s\n", __func__); @@ -523,11 +527,13 @@ return; } - mps_printf(sc, "Reset aborted %d commands\n", reply->TerminationCount); + mps_dprint(sc, MPS_INFO, "Reset aborted %u commands\n", + reply->TerminationCount); mps_free_reply(sc, cm->cm_reply_data); /* Reuse the existing command */ req = (MPI2_SAS_IOUNIT_CONTROL_REQUEST *)cm->cm_req; + memset(req, 0, sizeof(*req)); req->Function = MPI2_FUNCTION_SAS_IO_UNIT_CONTROL; req->Operation = MPI2_SAS_OP_REMOVE_DEVICE; req->DevHandle = handle; @@ -539,6 +545,17 @@ mps_map_command(sc, cm); mps_dprint(sc, MPS_INFO, "clearing target handle 0x%04x\n", handle); + TAILQ_FOREACH_SAFE(cm, &sc->io_list, cm_link, next_cm) { + union ccb *ccb; + + if (cm->cm_targ->handle != handle) + continue; + + mps_dprint(sc, MPS_INFO, "Completing missed command %p\n", cm); + ccb = cm->cm_complete_data; + ccb->ccb_h.status = CAM_DEV_NOT_THERE; + mpssas_scsiio_complete(sc, cm); + } targ = mpssas_find_target(sc->sassc, 0, handle); if (targ != NULL) { targ->handle = 0x0; @@ -1349,6 +1366,7 @@ } req = (MPI2_SCSI_IO_REQUEST *)cm->cm_req; + bzero(req, sizeof(*req)); req->DevHandle = targ->handle; req->Function = MPI2_FUNCTION_SCSI_IO_REQUEST; req->MsgFlags = 0; @@ -1430,6 +1448,11 @@ cm->cm_complete_data = ccb; cm->cm_targ = targ; + sc->io_cmds_active++; + if (sc->io_cmds_active > sc->io_cmds_highwater) + sc->io_cmds_highwater = sc->io_cmds_active; + + TAILQ_INSERT_TAIL(&sc->io_list, cm, cm_link); callout_reset(&cm->cm_callout, (ccb->ccb_h.timeout * hz) / 1000, mpssas_scsiio_timeout, cm); @@ -1449,6 +1472,8 @@ mps_dprint(sc, MPS_TRACE, "%s\n", __func__); callout_stop(&cm->cm_callout); + TAILQ_REMOVE(&sc->io_list, cm, cm_link); + 
sc->io_cmds_active--; sassc = sc->sassc; ccb = cm->cm_complete_data; @@ -1470,8 +1495,10 @@ /* Take the fast path to completion */ if (cm->cm_reply == NULL) { - ccb->ccb_h.status = CAM_REQ_CMP; - ccb->csio.scsi_status = SCSI_STATUS_OK; + if ((ccb->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_INPROG) { + ccb->ccb_h.status = CAM_REQ_CMP; + ccb->csio.scsi_status = SCSI_STATUS_OK; + } mps_free_command(sc, cm); xpt_done(ccb); return; @@ -1526,7 +1553,16 @@ break; case MPI2_IOCSTATUS_SCSI_IOC_TERMINATED: case MPI2_IOCSTATUS_SCSI_EXT_TERMINATED: +#if 0 ccb->ccb_h.status = CAM_REQ_ABORTED; +#endif + mps_printf(sc, "(%d:%d:%d) terminated ioc %x scsi %x state %x " + "xfer %u\n", xpt_path_path_id(ccb->ccb_h.path), + xpt_path_target_id(ccb->ccb_h.path), + xpt_path_lun_id(ccb->ccb_h.path), + rep->IOCStatus, rep->SCSIStatus, rep->SCSIState, + rep->TransferCount); + ccb->ccb_h.status = CAM_REQUEUE_REQ; break; case MPI2_IOCSTATUS_INVALID_SGL: mps_print_scsiio_cmd(sc, cm); @@ -1904,7 +1940,6 @@ xpt_done(ccb); } - #endif /* __FreeBSD_version >= 900026 */ static void Index: mps_table.c =================================================================== --- mps_table.c (revision 218241) +++ mps_table.c (working copy) @@ -46,6 +46,8 @@ #include #include +#include +#include #include #include @@ -486,8 +488,16 @@ mps_print_scsiio_cmd(struct mps_softc *sc, struct mps_command *cm) { MPI2_SCSI_IO_REQUEST *req; + union ccb *ccb; req = (MPI2_SCSI_IO_REQUEST *)cm->cm_req; + printf("SCSI command [SMID %d]: len %u data %p flags %#x\n", + cm->cm_desc.Default.SMID, cm->cm_length, cm->cm_data, + cm->cm_flags); + ccb = (union ccb *)cm->cm_complete_data; + scsi_sense_print(&ccb->csio); mps_print_sgl(sc, cm, req->SGLOffset0); + + hexdump(req, sizeof(*req), "mps: ", 0); } Index: mpsvar.h =================================================================== --- mpsvar.h (revision 218241) +++ mpsvar.h (working copy) @@ -119,9 +119,15 @@ #define MPS_FLAGS_MSI (1 << 1) #define MPS_FLAGS_BUSY (1 << 2) #define 
MPS_FLAGS_SHUTDOWN (1 << 3) +#define MPS_FLAGS_ATTACH_DONE (1 << 4) u_int mps_debug; u_int allow_multiple_tm_cmds; int tm_cmds_active; + int io_cmds_active; + int io_cmds_highwater; + int chain_free; + int chain_free_lowwater; + uint64_t chain_alloc_fail; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; struct mps_command *commands; @@ -133,6 +139,7 @@ TAILQ_HEAD(, mps_command) req_list; TAILQ_HEAD(, mps_chain) chain_list; TAILQ_HEAD(, mps_command) tm_list; + TAILQ_HEAD(, mps_command) io_list; int replypostindex; int replyfreeindex; @@ -228,8 +235,13 @@ { struct mps_chain *chain; - if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) + if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) { TAILQ_REMOVE(&sc->chain_list, chain, chain_link); + sc->chain_free--; + if (sc->chain_free < sc->chain_free_lowwater) + sc->chain_free_lowwater = sc->chain_free; + } else + sc->chain_alloc_fail++; return (chain); } @@ -239,6 +251,7 @@ #if 0 bzero(chain->chain, 128); #endif + sc->chain_free++; TAILQ_INSERT_TAIL(&sc->chain_list, chain, chain_link); } Index: mps_user.c =================================================================== --- mps_user.c (revision 218241) +++ mps_user.c (working copy) @@ -400,10 +400,12 @@ if (cmd->len == 0) return (EINVAL); + printf("%s: about to copyin firmware\n", __func__); error = copyin(cmd->buf, cm->cm_data, cmd->len); if (error != 0) return (error); + printf("%s: about to init sge\n", __func__); mpi_init_sge(cm, req, &req->SGL); bzero(&tc, sizeof tc); @@ -425,7 +427,10 @@ tc.ImageSize = cmd->len; cm->cm_flags |= MPS_CM_FLAGS_DATAOUT; + cm->cm_max_segs = 1; + printf("%s: about to push sge\n", __func__); + return (mps_push_sge(cm, &tc, sizeof tc, 0)); } @@ -595,7 +600,7 @@ hdr = (MPI2_REQUEST_HEADER *)cm->cm_req; - mps_dprint(sc, MPS_INFO, "mps_user_command: req %p %d rpl %p %d\n", + mps_printf(sc, "mps_user_command: req %p %d rpl %p %d\n", cmd->req, cmd->req_len, cmd->rpl, cmd->rpl_len ); if (cmd->req_len > 
(int)sc->facts->IOCRequestFrameSize * 4) { @@ -606,17 +611,11 @@ if (err != 0) goto RetFreeUnlocked; - mps_dprint(sc, MPS_INFO, "mps_user_command: Function %02X " + mps_printf(sc, "mps_user_command: Function %02X " "MsgFlags %02X\n", hdr->Function, hdr->MsgFlags ); - err = mps_user_setup_request(cm, cmd); - if (err != 0) { - mps_printf(sc, "mps_user_command: unsupported function 0x%X\n", - hdr->Function ); - goto RetFreeUnlocked; - } - if (cmd->len > 0) { + mps_printf(sc, "%s: allocating %d bytes\n", __func__, cmd->len); buf = malloc(cmd->len, M_MPSUSER, M_WAITOK|M_ZERO); cm->cm_data = buf; cm->cm_length = cmd->len; @@ -625,6 +624,13 @@ cm->cm_length = 0; } + err = mps_user_setup_request(cm, cmd); + if (err != 0) { + mps_printf(sc, "mps_user_command: unsupported function 0x%X\n", + hdr->Function ); + goto RetFreeUnlocked; + } + cm->cm_flags = MPS_CM_FLAGS_SGE_SIMPLE | MPS_CM_FLAGS_WAKEUP; cm->cm_desc.Default.RequestFlags = MPI2_REQ_DESCRIPT_FLAGS_DEFAULT_TYPE; @@ -653,7 +659,7 @@ copyout(rpl, cmd->rpl, sz); if (buf != NULL) copyout(buf, cmd->buf, cmd->len); - mps_dprint(sc, MPS_INFO, "mps_user_command: reply size %d\n", sz ); + mps_printf(sc, "mps_user_command: reply size %d\n", sz ); RetFreeUnlocked: mps_lock(sc); --4Ckj6UjgE2iN1+kY-- From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 00:20:15 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 65AA1106564A; Fri, 4 Feb 2011 00:20:15 +0000 (UTC) (envelope-from joachim@tingvold.com) Received: from smtp.domeneshop.no (smtp.domeneshop.no [194.63.248.54]) by mx1.freebsd.org (Postfix) with ESMTP id 1F7C38FC18; Fri, 4 Feb 2011 00:20:15 +0000 (UTC) Received: from aannecy-552-1-139-161.w86-200.abo.wanadoo.fr ([86.200.147.161] helo=keklolwtf.home) by smtp.domeneshop.no with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1Pl9PK-00063R-2Q; Fri, 04 Feb 2011 01:20:14 +0100 Mime-Version: 
1.0 (Apple Message framework v1076) Content-Type: text/plain; charset=us-ascii; format=flowed From: Joachim Tingvold In-Reply-To: <20110204000249.GA26663@nargothrond.kdm.org> Date: Fri, 4 Feb 2011 01:20:10 +0100 Content-Transfer-Encoding: 7bit Message-Id: <5B54C6A9-02C8-48B2-B693-012B8C828DA4@tingvold.com> References: <20110113203750.GA39494@nargothrond.kdm.org> <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> <20110204000249.GA26663@nargothrond.kdm.org> To: Kenneth D. Merry X-Mailer: Apple Mail (2.1076) Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Feb 2011 00:20:15 -0000 On Fri, Feb 04, 2011, at 01:02:49AM GMT+01:00, Kenneth D. Merry wrote: > That change went in a few weeks ago. Ahh, okay. I'll try updating it this weekend, then. > Try this patch, perhaps it will work. 
No errors this time, but it didn't seem to do anything; [root@filserver ~]# sysctl hw.mps hw.mps.disable_msi: 0 hw.mps.disable_msix: 0 hw.mps.0.debug_level: 0 hw.mps.0.allow_multiple_tm_cmds: 0 (-: -- Joachim From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 00:25:37 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AF2561065740; Fri, 4 Feb 2011 00:25:37 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id 74FF38FC19; Fri, 4 Feb 2011 00:25:37 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id p140PbFr026943; Thu, 3 Feb 2011 17:25:37 -0700 (MST) (envelope-from ken@nargothrond.kdm.org) Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id p140PbS4026942; Thu, 3 Feb 2011 17:25:37 -0700 (MST) (envelope-from ken) Date: Thu, 3 Feb 2011 17:25:37 -0700 From: "Kenneth D. 
Merry" To: Joachim Tingvold Message-ID: <20110204002537.GA26904@nargothrond.kdm.org> References: <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> <20110204000249.GA26663@nargothrond.kdm.org> <5B54C6A9-02C8-48B2-B693-012B8C828DA4@tingvold.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5B54C6A9-02C8-48B2-B693-012B8C828DA4@tingvold.com> User-Agent: Mutt/1.4.2i Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Feb 2011 00:25:37 -0000 On Fri, Feb 04, 2011 at 01:20:10 +0100, Joachim Tingvold wrote: > On Fri, Feb 04, 2011, at 01:02:49AM GMT+01:00, Kenneth D. Merry wrote: > >That change went in a few weeks ago. > > Ahh, okay. I'll try updating it this weekend, then. > > >Try this patch, perhaps it will work. > > No errors this time, but it didn't seem to do anything; > > [root@filserver ~]# sysctl hw.mps > hw.mps.disable_msi: 0 > hw.mps.disable_msix: 0 > hw.mps.0.debug_level: 0 > hw.mps.0.allow_multiple_tm_cmds: 0 Hmm. Lots of possibilities, but a few things to look at: - Did you do a buildkernel or a make cleandepend && make depend && make? - Look at the timestamp on mps.o in your kernel build directory - touch mps.c, rebuild your kernel, and look to see that it got built - 'grep io_cmds_active *' in sys/dev/mps, and make sure the patch got applied. 
- make sure you installed the kernel in the right place - make sure you booted the correct kernel (look at uname -a to see when it was built) Ken -- Kenneth Merry ken@FreeBSD.ORG From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 09:32:23 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 49112106564A; Fri, 4 Feb 2011 09:32:23 +0000 (UTC) (envelope-from joachim@tingvold.com) Received: from smtp.domeneshop.no (smtp.domeneshop.no [194.63.248.54]) by mx1.freebsd.org (Postfix) with ESMTP id 04E798FC0C; Fri, 4 Feb 2011 09:32:22 +0000 (UTC) Received: from aannecy-552-1-139-161.w86-200.abo.wanadoo.fr ([86.200.147.161] helo=keklolwtf.home) by smtp.domeneshop.no with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1PlI1d-0006HL-Ce; Fri, 04 Feb 2011 10:32:21 +0100 Mime-Version: 1.0 (Apple Message framework v1076) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes From: Joachim Tingvold In-Reply-To: <20110204002537.GA26904@nargothrond.kdm.org> Date: Fri, 4 Feb 2011 10:32:17 +0100 Content-Transfer-Encoding: 7bit Message-Id: <1E541FAC-24CB-4AB1-AF9B-020D16B0B195@tingvold.com> References: <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> <20110204000249.GA26663@nargothrond.kdm.org> <5B54C6A9-02C8-48B2-B693-012B8C828DA4@tingvold.com> <20110204002537.GA26904@nargothrond.kdm.org> To: Kenneth D. 
Merry X-Mailer: Apple Mail (2.1076) Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Feb 2011 09:32:23 -0000 On Fri, Feb 04, 2011, at 01:25:37AM GMT+01:00, Kenneth D. Merry wrote: > Hmm. Lots of possibilities, but a few things to look at: > > - Did you do a buildkernel or a make cleandepend && make depend && > make? > - Look at the timestamp on mps.o in your kernel build directory > - touch mps.c, rebuild your kernel, and look to see that it got built > - 'grep io_cmds_active *' in sys/dev/mps, and make sure the patch got > applied. > - make sure you installed the kernel in the right place > - make sure you booted the correct kernel (look at uname -a to see > when it > was built) Okay, now I feel like an idiot; I haven't rebuilt any kernel at any time, so the 1024->2048 change of 'MPS_CHAIN_FRAMES' probably had no effect (literally). Now, I've run into another problem; [root@filserver /usr/src]# make buildworld && make buildkernel [...] /usr/src/sys/dev/ath/if_ath.c:3895:5: error: "NOTYET" is not defined *** Error code 1 Stop in /usr/obj/usr/src/sys/GENERIC. *** Error code 1 Stop in /usr/src. *** Error code 1 Stop in /usr/src. 
-- Joachim From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 09:52:56 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F26D5106566C; Fri, 4 Feb 2011 09:52:55 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 48F788FC16; Fri, 4 Feb 2011 09:52:54 +0000 (UTC) Received: by bwz12 with SMTP id 12so2496027bwz.13 for ; Fri, 04 Feb 2011 01:52:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:sender:message-id:date:from:user-agent :mime-version:to:cc:subject:references:in-reply-to :x-enigmail-version:content-type:content-transfer-encoding; bh=gQFRd/X06U4TbZCsRu8Mq/EamQF7ubSSMhyl7tE/FPg=; b=QoR7TKnecQp1SgHgU5Jv/GCW0nOSGZbk169WcKbbxyf3NDfoDI4H/rIGLZZypovZeW tQp/nsQWEDUL1P4QAL2iokhztuNB/K6v0RkgWy/PDOIe0HSzvVlFGAb0z4qWwJUC5JhG PMzNd44iKI6gZQ15DT3ZrlW7rYw6nE7mUu1FE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:x-enigmail-version:content-type :content-transfer-encoding; b=uAN1qwxrtLBC7QvDmiauiwo8rORP7jercPaKuQgyPJClVWzqf4t59xKWXb74/R1S51 krOwy01OFKvGSE/lzh/nKSIrjEx9UWUMiBmzfLB2kft+kmki5U2bvUwaH5aVP4V910zY q0w2ecakEkAr2+3asIJrIKmGtiPADlo7/BuJE= Received: by 10.204.135.217 with SMTP id o25mr5858768bkt.15.1296813173989; Fri, 04 Feb 2011 01:52:53 -0800 (PST) Received: from mavbook2.mavhome.dp.ua (pc.mavhome.dp.ua [212.86.226.226]) by mx.google.com with ESMTPS id v1sm275809bkt.5.2011.02.04.01.52.51 (version=SSLv3 cipher=RC4-MD5); Fri, 04 Feb 2011 01:52:52 -0800 (PST) Sender: Alexander Motin Message-ID: <4D4BCC3A.1090402@FreeBSD.org> Date: Fri, 04 Feb 2011 11:51:54 +0200 From: Alexander Motin User-Agent: Thunderbird 2.0.0.23 (X11/20091212) MIME-Version: 1.0 To: Joachim Tingvold 
References: <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> <20110204000249.GA26663@nargothrond.kdm.org> <5B54C6A9-02C8-48B2-B693-012B8C828DA4@tingvold.com> <20110204002537.GA26904@nargothrond.kdm.org> <1E541FAC-24CB-4AB1-AF9B-020D16B0B195@tingvold.com> In-Reply-To: <1E541FAC-24CB-4AB1-AF9B-020D16B0B195@tingvold.com> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-scsi@freebsd.org, "Kenneth D. Merry" Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Feb 2011 09:52:56 -0000 Joachim Tingvold wrote: > On Fri, Feb 04, 2011, at 01:25:37AM GMT+01:00, Kenneth D. Merry wrote: >> Hmm. Lots of possibilities, but a few things to look at: >> >> - Did you do a buildkernel or a make cleandepend && make depend && make? >> - Look at the timestamp on mps.o in your kernel build directory >> - touch mps.c, rebuild your kernel, and look to see that it got built >> - 'grep io_cmds_active *' in sys/dev/mps, and make sure the patch got >> applied. >> - make sure you installed the kernel in the right place >> - make sure you booted the correct kernel (look at uname -a to see >> when it >> was built) > > Okay, now I feel like an idiot; I haven't rebuilt any kernel at any > time, so the 1024->2048 change of 'MPS_CHAIN_FRAMES' probably had no > effect (literally). > > Now, I've run into another problem; > > [root@filserver /usr/src]# make buildworld && make buildkernel > [...] > /usr/src/sys/dev/ath/if_ath.c:3895:5: error: "NOTYET" is not defined > *** Error code 1 > > Stop in /usr/obj/usr/src/sys/GENERIC. > *** Error code 1 > > Stop in /usr/src. 
> *** Error code 1 > > Stop in /usr/src. It was already fixed. Update your sources and retry. -- Alexander Motin From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 13:35:23 2011 Return-Path: Delivered-To: freebsd-scsi@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E9500106564A; Fri, 4 Feb 2011 13:35:23 +0000 (UTC) (envelope-from joachim@tingvold.com) Received: from smtp.domeneshop.no (smtp.domeneshop.no [194.63.248.54]) by mx1.freebsd.org (Postfix) with ESMTP id 6F5A78FC18; Fri, 4 Feb 2011 13:35:23 +0000 (UTC) Received: from aannecy-552-1-139-161.w86-200.abo.wanadoo.fr ([86.200.147.161] helo=keklolwtf.home) by smtp.domeneshop.no with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1PlLoo-0000DY-7Y; Fri, 04 Feb 2011 14:35:22 +0100 Mime-Version: 1.0 (Apple Message framework v1076) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes From: Joachim Tingvold In-Reply-To: <4D4BCC3A.1090402@FreeBSD.org> Date: Fri, 4 Feb 2011 14:35:17 +0100 Content-Transfer-Encoding: 7bit Message-Id: <927E9E50-D4D4-45BD-AE0C-C16C7A6B58AE@tingvold.com> References: <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> <20110204000249.GA26663@nargothrond.kdm.org> <5B54C6A9-02C8-48B2-B693-012B8C828DA4@tingvold.com> <20110204002537.GA26904@nargothrond.kdm.org> <1E541FAC-24CB-4AB1-AF9B-020D16B0B195@tingvold.com> <4D4BCC3A.1090402@FreeBSD.org> To: Alexander Motin X-Mailer: Apple Mail (2.1076) Cc: freebsd-scsi@FreeBSD.org, "Kenneth D. 
Merry" Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Feb 2011 13:35:24 -0000 On Fri, Feb 04, 2011, at 10:51:54AM GMT+01:00, Alexander Motin wrote: > It was already fixed. Update your sources and retry. Ahhh, thanks for the heads up. I updated it when I sent the email, and the fix came just an hour later. -- Joachim From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 13:50:41 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03A071065674; Fri, 4 Feb 2011 13:50:41 +0000 (UTC) (envelope-from joachim@tingvold.com) Received: from smtp.domeneshop.no (smtp.domeneshop.no [194.63.248.54]) by mx1.freebsd.org (Postfix) with ESMTP id B19A58FC0C; Fri, 4 Feb 2011 13:50:40 +0000 (UTC) Received: from aannecy-552-1-139-161.w86-200.abo.wanadoo.fr ([86.200.147.161] helo=keklolwtf.home) by smtp.domeneshop.no with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1PlM3b-0003M1-2M; Fri, 04 Feb 2011 14:50:39 +0100 Mime-Version: 1.0 (Apple Message framework v1076) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes From: Joachim Tingvold In-Reply-To: <20110203221056.GA25389@nargothrond.kdm.org> Date: Fri, 4 Feb 2011 14:50:34 +0100 Content-Transfer-Encoding: 7bit Message-Id: References: <41C64262-4300-4187-B5FD-04A5EFB7F87C@tingvold.com> <20110113203750.GA39494@nargothrond.kdm.org> <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> <20110203221056.GA25389@nargothrond.kdm.org> To: Kenneth D. 
Merry X-Mailer: Apple Mail (2.1076) Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 Feb 2011 13:50:41 -0000 On Thu, Feb 03, 2011, at 23:10:56PM GMT+01:00, Kenneth D. Merry wrote: > Try running this, and then do 'sysctl hw.mps' and let's see what > your low > water mark is for free chain elements. We'll also want to make sure > your > chain_free value is about equal to MPS_CHAIN_FRAMES when the system is > idle. On my system with a LSI 9201-16i controller, I see: While idle; [jocke@filserver ~]$ sysctl hw.mps.0 hw.mps.0.debug_level: 0 hw.mps.0.allow_multiple_tm_cmds: 0 hw.mps.0.io_cmds_active: 0 hw.mps.0.io_cmds_highwater: 53 hw.mps.0.chain_free: 2048 hw.mps.0.chain_free_lowwater: 2029 hw.mps.0.chain_alloc_fail: 0 After I copied a 22G file from 'storage' to 'zroot'; [jocke@filserver ~]$ sysctl hw.mps.0 hw.mps.0.debug_level: 0 hw.mps.0.allow_multiple_tm_cmds: 0 hw.mps.0.io_cmds_active: 0 hw.mps.0.io_cmds_highwater: 99 hw.mps.0.chain_free: 2048 hw.mps.0.chain_free_lowwater: 2029 hw.mps.0.chain_alloc_fail: 0 After I copied a 22G file from 'zroot' to 'storage'; [jocke@filserver ~]$ sysctl hw.mps.0 hw.mps.0.debug_level: 0 hw.mps.0.allow_multiple_tm_cmds: 0 hw.mps.0.io_cmds_active: 0 hw.mps.0.io_cmds_highwater: 897 hw.mps.0.chain_free: 2048 hw.mps.0.chain_free_lowwater: 2029 hw.mps.0.chain_alloc_fail: 0 -- Joachim From owner-freebsd-scsi@FreeBSD.ORG Fri Feb 4 18:00:25 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D7D7B10656A6; Fri, 4 Feb 2011 18:00:25 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id 5EA6F8FC22; 
        Fri, 4 Feb 2011 18:00:12 +0000 (UTC)
Received: from nargothrond.kdm.org (localhost [127.0.0.1])
        by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id p14I0BPb038167;
        Fri, 4 Feb 2011 11:00:11 -0700 (MST)
        (envelope-from ken@nargothrond.kdm.org)
Received: (from ken@localhost)
        by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id p14I0BOY038166;
        Fri, 4 Feb 2011 11:00:11 -0700 (MST)
        (envelope-from ken)
Date: Fri, 4 Feb 2011 11:00:11 -0700
From: "Kenneth D. Merry"
To: Joachim Tingvold
Message-ID: <20110204180011.GA38067@nargothrond.kdm.org>
References: <20110113203750.GA39494@nargothrond.kdm.org>
        <20110114001758.GA12793@nargothrond.kdm.org>
        <07392102-4584-4690-9188-5202728CC7CA@tingvold.com>
        <20110120155746.GA22515@nargothrond.kdm.org>
        <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com>
        <20110203221056.GA25389@nargothrond.kdm.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4.2i
Cc: freebsd-scsi@freebsd.org, Alexander Motin
Subject: Re: mps0-troubles
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: SCSI subsystem
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 04 Feb 2011 18:00:26 -0000

On Fri, Feb 04, 2011 at 14:50:34 +0100, Joachim Tingvold wrote:
> On Thu, Feb 03, 2011, at 23:10:56PM GMT+01:00, Kenneth D. Merry wrote:
> >Try running this, and then do 'sysctl hw.mps' and let's see what your low
> >water mark is for free chain elements. We'll also want to make sure your
> >chain_free value is about equal to MPS_CHAIN_FRAMES when the system is
> >idle.
On my system with a LSI 9201-16i controller, I see:
>
> While idle;
>
> [jocke@filserver ~]$ sysctl hw.mps.0
> hw.mps.0.debug_level: 0
> hw.mps.0.allow_multiple_tm_cmds: 0
> hw.mps.0.io_cmds_active: 0
> hw.mps.0.io_cmds_highwater: 53
> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 2029
> hw.mps.0.chain_alloc_fail: 0
>
> After I copied a 22G file from 'storage' to 'zroot';
>
> [jocke@filserver ~]$ sysctl hw.mps.0
> hw.mps.0.debug_level: 0
> hw.mps.0.allow_multiple_tm_cmds: 0
> hw.mps.0.io_cmds_active: 0
> hw.mps.0.io_cmds_highwater: 99
> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 2029
> hw.mps.0.chain_alloc_fail: 0
>
> After I copied a 22G file from 'zroot' to 'storage';
>
> [jocke@filserver ~]$ sysctl hw.mps.0
> hw.mps.0.debug_level: 0
> hw.mps.0.allow_multiple_tm_cmds: 0
> hw.mps.0.io_cmds_active: 0
> hw.mps.0.io_cmds_highwater: 897

Looks like there were a lot of commands active at once.

> hw.mps.0.chain_free: 2048
> hw.mps.0.chain_free_lowwater: 2029
> hw.mps.0.chain_alloc_fail: 0

But no more than 19 chain elements used at any one time. Perhaps it could
depend on memory fragmentation somewhat. Over time you may see the low
water mark go down a bit.

The good news is that it doesn't look like we have a leak.

Ken
--
Kenneth Merry
ken@FreeBSD.ORG
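
For anyone following along, the "19 chain elements" figure is simply chain_free minus chain_free_lowwater (2048 - 2029). Below is a rough Python sketch of that arithmetic, run against the sysctl output quoted above. The helper names (parse_sysctl, peak_chain_usage, looks_leak_free) are made up for this illustration and are not part of the mps(4) driver or any FreeBSD tool. It also assumes that chain_free on an idle system equals the full size of the chain frame pool, which the numbers in this thread suggest but which is not confirmed here.

```python
# Illustrative only: helper names are hypothetical, not part of mps(4).

SAMPLE = """\
hw.mps.0.debug_level: 0
hw.mps.0.allow_multiple_tm_cmds: 0
hw.mps.0.io_cmds_active: 0
hw.mps.0.io_cmds_highwater: 897
hw.mps.0.chain_free: 2048
hw.mps.0.chain_free_lowwater: 2029
hw.mps.0.chain_alloc_fail: 0
"""


def parse_sysctl(text):
    """Turn 'name: value' lines into a dict, keeping integer values only."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line (e.g. a shell prompt)
        try:
            values[name.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric values
    return values


def peak_chain_usage(values, unit=0):
    """Most chain elements in use at once, assuming chain_free is read
    while idle and therefore equals the size of the pool."""
    prefix = "hw.mps.%d." % unit
    return values[prefix + "chain_free"] - values[prefix + "chain_free_lowwater"]


def looks_leak_free(values, pool_size, unit=0):
    """True if every chain element is back in the pool and no allocation
    ever failed. pool_size is whatever MPS_CHAIN_FRAMES is in your build."""
    prefix = "hw.mps.%d." % unit
    return (values[prefix + "chain_free"] == pool_size
            and values[prefix + "chain_alloc_fail"] == 0)


stats = parse_sysctl(SAMPLE)
print("peak chain elements in use:", peak_chain_usage(stats))
print("leak-free at idle:", looks_leak_free(stats, pool_size=2048))
```

The pool_size of 2048 passed in the last line is taken from the idle chain_free value reported in this thread; substitute the value from your own build if it differs.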