From owner-freebsd-scsi@FreeBSD.ORG Thu Feb 3 22:10:58 2011 Return-Path: Delivered-To: freebsd-scsi@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2A1A7106564A; Thu, 3 Feb 2011 22:10:58 +0000 (UTC) (envelope-from ken@kdm.org) Received: from nargothrond.kdm.org (nargothrond.kdm.org [70.56.43.81]) by mx1.freebsd.org (Postfix) with ESMTP id C916F8FC1F; Thu, 3 Feb 2011 22:10:57 +0000 (UTC) Received: from nargothrond.kdm.org (localhost [127.0.0.1]) by nargothrond.kdm.org (8.14.2/8.14.2) with ESMTP id p13MAulZ025581; Thu, 3 Feb 2011 15:10:56 -0700 (MST) (envelope-from ken@nargothrond.kdm.org) Received: (from ken@localhost) by nargothrond.kdm.org (8.14.2/8.14.2/Submit) id p13MAuuZ025580; Thu, 3 Feb 2011 15:10:56 -0700 (MST) (envelope-from ken) Date: Thu, 3 Feb 2011 15:10:56 -0700 From: "Kenneth D. Merry" To: Joachim Tingvold Message-ID: <20110203221056.GA25389@nargothrond.kdm.org> References: <41C64262-4300-4187-B5FD-04A5EFB7F87C@tingvold.com> <20110113203750.GA39494@nargothrond.kdm.org> <20110114001758.GA12793@nargothrond.kdm.org> <07392102-4584-4690-9188-5202728CC7CA@tingvold.com> <20110120155746.GA22515@nargothrond.kdm.org> <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="gKMricLos+KVdGMg" Content-Disposition: inline In-Reply-To: <070C12D5-A54F-4A48-A151-EBA16EF32A13@tingvold.com> User-Agent: Mutt/1.4.2i Cc: freebsd-scsi@freebsd.org, Alexander Motin Subject: Re: mps0-troubles X-BeenThere: freebsd-scsi@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SCSI subsystem List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 Feb 2011 22:10:58 -0000 --gKMricLos+KVdGMg Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Feb 03, 2011 at 17:44:24 +0100, Joachim Tingvold wrote: > On Fri, Jan 21, 2011, at 20:15:15PM GMT+01:00, Joachim Tingvold wrote: > >>It does look like the out of chain problem was fixed by increasing > >>the > >>number, so that's good at least. > > > >Yes. For now, at least. (-: > > So, it didn't last for long; > . > > [jocke@filserver ~]$ cat /sys/dev/mps/mpsvar.h|grep > "MPS_CHAIN_FRAMES" > #define MPS_CHAIN_FRAMES 2048 > > It seems to be happening when copying larger files from 'zroot' to > 'storage' (that is, files over 1.5GB in size), or when moving files > right after each other (f.ex. using scripts, or moving folders with > many files of medium size (300MB+)). Moving a file now-and-then (of > sizes no larger than 1.5GB) doesn't seem to trigger it. That's strange that you're still running into the problem with that many chain frames. In my tests, I haven't seen more than 80 chain frames used. You can probably work around it again by bumping up the number of chain frames (e.g. to 3072 or 4096), but it would probably be good to make sure there isn't a leak in the driver somewhere. I've attached a patch that has a number of debugging sysctls, a change from gibbs@ that has to do with device removal, and some other extra debugging cruft. (i.e. this patch won't go into the tree as-is, it's just for debugging.) Try running this, and then do 'sysctl hw.mps' and let's see what your low water mark is for free chain elements. We'll also want to make sure your chain_free value is about equal to MPS_CHAIN_FRAMES when the system is idle. On my system with a LSI 9201-16i controller, I see: hw.mps.1.debug_level: 0 hw.mps.1.allow_multiple_tm_cmds: 0 hw.mps.1.io_cmds_active: 24 hw.mps.1.io_cmds_highwater: 252 hw.mps.1.chain_free: 1024 hw.mps.1.chain_free_lowwater: 948 hw.mps.1.chain_alloc_fail: 0 I know what the root cause is for this bug, I just haven't had time to fix it. Unfortunately I've been chasing bugs in the driver that are a little higher priority for $REAL_JOB. The good news at least is that any fixes we make will go back into FreeBSD. Ken -- Kenneth Merry ken@FreeBSD.ORG --gKMricLos+KVdGMg Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="mps_debug_diffs.20110203.txt" Index: mps.c =================================================================== --- mps.c (revision 218241) +++ mps.c (working copy) @@ -387,6 +387,15 @@ mps_dprint(sc, MPS_TRACE, "%s\n", __func__); + if (sc->mps_flags & MPS_FLAGS_ATTACH_DONE) + mtx_assert(&sc->mps_mtx, MA_OWNED); + + if ((cm->cm_desc.Default.SMID < 1) + || (cm->cm_desc.Default.SMID >= sc->num_reqs)) { + mps_printf(sc, "%s: invalid SMID %d, desc %#x %#x\n", + __func__, cm->cm_desc.Default.SMID, + cm->cm_desc.Words.High, cm->cm_desc.Words.Low); + } mps_regwrite(sc, MPI2_REQUEST_DESCRIPTOR_POST_LOW_OFFSET, cm->cm_desc.Words.Low); mps_regwrite(sc, MPI2_REQUEST_DESCRIPTOR_POST_HIGH_OFFSET, @@ -732,6 +741,7 @@ chain->chain_busaddr = sc->chain_busaddr + i * sc->facts->IOCRequestFrameSize * 4; mps_free_chain(sc, chain); + sc->chain_free_lowwater++; } /* XXX Need to pick a more precise value */ @@ -811,7 +821,7 @@ int mps_attach(struct mps_softc *sc) { - int i, error; + int i, error, old_debug; char tmpstr[80], tmpstr2[80]; /* @@ -846,15 +856,35 @@ if (sc->sysctl_tree == NULL) return (ENOMEM); - SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), OID_AUTO, "debug_level", CTLFLAG_RW, &sc->mps_debug, 0, "mps debug level"); - SYSCTL_ADD_UINT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), OID_AUTO, "allow_multiple_tm_cmds", CTLFLAG_RW, &sc->allow_multiple_tm_cmds, 0, "allow multiple simultaneous task management cmds"); + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "io_cmds_active", CTLFLAG_RD, + &sc->io_cmds_active, 0, "number of currently active commands"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "io_cmds_highwater", CTLFLAG_RD, + &sc->io_cmds_highwater, 0, "maximum active commands seen"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_free", CTLFLAG_RD, + &sc->chain_free, 0, "number of free chain elements"); + + SYSCTL_ADD_INT(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_free_lowwater", CTLFLAG_RD, + &sc->chain_free_lowwater, 0,"lowest number of free chain elements"); + + SYSCTL_ADD_UQUAD(&sc->sysctl_ctx, SYSCTL_CHILDREN(sc->sysctl_tree), + OID_AUTO, "chain_alloc_fail", CTLFLAG_RD, + &sc->chain_alloc_fail, "chain allocation failures"); + if ((error = mps_transition_ready(sc)) != 0) return (error); @@ -863,7 +893,10 @@ if ((error = mps_get_iocfacts(sc, sc->facts)) != 0) return (error); + old_debug = sc->mps_debug; + sc->mps_debug = MPS_INFO; mps_print_iocfacts(sc, sc->facts); + sc->mps_debug = old_debug; mps_printf(sc, "Firmware: %02d.%02d.%02d.%02d\n", sc->facts->FWVersion.Struct.Major, @@ -895,9 +928,12 @@ sc->num_reqs = MIN(MPS_REQ_FRAMES, sc->facts->RequestCredit); sc->num_replies = MIN(MPS_REPLY_FRAMES + MPS_EVT_REPLY_FRAMES, sc->facts->MaxReplyDescriptorPostQueueDepth) - 1; + mps_printf(sc, "num_reqs %d, num_replies %d\n", sc->num_reqs, + sc->num_replies); TAILQ_INIT(&sc->req_list); TAILQ_INIT(&sc->chain_list); TAILQ_INIT(&sc->tm_list); + TAILQ_INIT(&sc->io_list); if (((error = mps_alloc_queues(sc)) != 0) || ((error = mps_alloc_replies(sc)) != 0) || @@ -967,6 +1003,8 @@ error = EINVAL; } + sc->mps_flags |= MPS_FLAGS_ATTACH_DONE; + return (error); } @@ -1299,8 +1337,11 @@ if (cm->cm_complete != NULL) cm->cm_complete(sc, cm); - if (cm->cm_flags & MPS_CM_FLAGS_WAKEUP) + if (cm->cm_flags & MPS_CM_FLAGS_WAKEUP) { + mps_printf(sc, "%s: waking up %p\n", __func__, + cm); wakeup(cm); + } } desc->Words.Low = 0xffffffff; Index: mps_sas.c =================================================================== --- mps_sas.c (revision 218241) +++ mps_sas.c (working copy) @@ -486,7 +486,10 @@ return; } + mps_dprint(sc, MPS_INFO, "Preparing to remove target %d\n", targ->tid); + req = (MPI2_SCSI_TASK_MANAGE_REQUEST *)cm->cm_req; + memset(req, 0, sizeof(*req)); req->DevHandle = targ->handle; req->Function = MPI2_FUNCTION_SCSI_TASK_MGMT; req->TaskType = MPI2_SCSITASKMGMT_TASKTYPE_TARGET_RESET; @@ -507,6 +510,7 @@ MPI2_SCSI_TASK_MANAGE_REPLY *reply; MPI2_SAS_IOUNIT_CONTROL_REQUEST *req; struct mpssas_target *targ; + struct mps_command *next_cm; uint16_t handle; mps_dprint(sc, MPS_TRACE, "%s\n", __func__); @@ -523,11 +527,13 @@ return; } - mps_printf(sc, "Reset aborted %d commands\n", reply->TerminationCount); + mps_dprint(sc, MPS_INFO, "Reset aborted %u commands\n", + reply->TerminationCount); mps_free_reply(sc, cm->cm_reply_data); /* Reuse the existing command */ req = (MPI2_SAS_IOUNIT_CONTROL_REQUEST *)cm->cm_req; + memset(req, 0, sizeof(*req)); req->Function = MPI2_FUNCTION_SAS_IO_UNIT_CONTROL; req->Operation = MPI2_SAS_OP_REMOVE_DEVICE; req->DevHandle = handle; @@ -539,6 +545,17 @@ mps_map_command(sc, cm); mps_dprint(sc, MPS_INFO, "clearing target handle 0x%04x\n", handle); + TAILQ_FOREACH_SAFE(cm, &sc->io_list, cm_link, next_cm) { + union ccb *ccb; + + if (cm->cm_targ->handle != handle) + continue; + + mps_dprint(sc, MPS_INFO, "Completing missed command %p\n", cm); + ccb = cm->cm_complete_data; + ccb->ccb_h.status = CAM_DEV_NOT_THERE; + mpssas_scsiio_complete(sc, cm); + } targ = mpssas_find_target(sc->sassc, 0, handle); if (targ != NULL) { targ->handle = 0x0; @@ -1349,6 +1366,7 @@ } req = (MPI2_SCSI_IO_REQUEST *)cm->cm_req; + bzero(req, sizeof(*req)); req->DevHandle = targ->handle; req->Function = MPI2_FUNCTION_SCSI_IO_REQUEST; req->MsgFlags = 0; @@ -1430,6 +1448,11 @@ cm->cm_complete_data = ccb; cm->cm_targ = targ; + sc->io_cmds_active++; + if (sc->io_cmds_active > sc->io_cmds_highwater) + sc->io_cmds_highwater = sc->io_cmds_active; + + TAILQ_INSERT_TAIL(&sc->io_list, cm, cm_link); callout_reset(&cm->cm_callout, (ccb->ccb_h.timeout * hz) / 1000, mpssas_scsiio_timeout, cm); @@ -1449,6 +1472,8 @@ mps_dprint(sc, MPS_TRACE, "%s\n", __func__); callout_stop(&cm->cm_callout); + TAILQ_REMOVE(&sc->io_list, cm, cm_link); + sc->io_cmds_active--; sassc = sc->sassc; ccb = cm->cm_complete_data; @@ -1470,8 +1495,10 @@ /* Take the fast path to completion */ if (cm->cm_reply == NULL) { - ccb->ccb_h.status = CAM_REQ_CMP; - ccb->csio.scsi_status = SCSI_STATUS_OK; + if ((ccb->ccb_h.status & CAM_STATUS_MASK) == CAM_REQ_INPROG) { + ccb->ccb_h.status = CAM_REQ_CMP; + ccb->csio.scsi_status = SCSI_STATUS_OK; + } mps_free_command(sc, cm); xpt_done(ccb); return; @@ -1526,7 +1553,16 @@ break; case MPI2_IOCSTATUS_SCSI_IOC_TERMINATED: case MPI2_IOCSTATUS_SCSI_EXT_TERMINATED: +#if 0 ccb->ccb_h.status = CAM_REQ_ABORTED; +#endif + mps_printf(sc, "(%d:%d:%d) terminated ioc %x scsi %x state %x " + "xfer %u\n", xpt_path_path_id(ccb->ccb_h.path), + xpt_path_target_id(ccb->ccb_h.path), + xpt_path_lun_id(ccb->ccb_h.path), + rep->IOCStatus, rep->SCSIStatus, rep->SCSIState, + rep->TransferCount); + ccb->ccb_h.status = CAM_REQUEUE_REQ; break; case MPI2_IOCSTATUS_INVALID_SGL: mps_print_scsiio_cmd(sc, cm); @@ -1904,7 +1940,6 @@ xpt_done(ccb); } - #endif /* __FreeBSD_version >= 900026 */ static void Index: mps_table.c =================================================================== --- mps_table.c (revision 218241) +++ mps_table.c (working copy) @@ -46,6 +46,8 @@ #include #include +#include +#include #include #include @@ -486,8 +488,16 @@ mps_print_scsiio_cmd(struct mps_softc *sc, struct mps_command *cm) { MPI2_SCSI_IO_REQUEST *req; + union ccb *ccb; req = (MPI2_SCSI_IO_REQUEST *)cm->cm_req; + printf("SCSI command [SMID %d]: len %u data %p flags %#x\n", + cm->cm_desc.Default.SMID, cm->cm_length, cm->cm_data, + cm->cm_flags); + ccb = (union ccb *)cm->cm_complete_data; + scsi_sense_print(&ccb->csio); mps_print_sgl(sc, cm, req->SGLOffset0); + + hexdump(req, sizeof(*req), "mps: ", 0); } Index: mpsvar.h =================================================================== --- mpsvar.h (revision 218241) +++ mpsvar.h (working copy) @@ -119,9 +119,15 @@ #define MPS_FLAGS_MSI (1 << 1) #define MPS_FLAGS_BUSY (1 << 2) #define MPS_FLAGS_SHUTDOWN (1 << 3) +#define MPS_FLAGS_ATTACH_DONE (1 << 4) u_int mps_debug; u_int allow_multiple_tm_cmds; int tm_cmds_active; + int io_cmds_active; + int io_cmds_highwater; + int chain_free; + int chain_free_lowwater; + uint64_t chain_alloc_fail; struct sysctl_ctx_list sysctl_ctx; struct sysctl_oid *sysctl_tree; struct mps_command *commands; @@ -133,6 +139,7 @@ TAILQ_HEAD(, mps_command) req_list; TAILQ_HEAD(, mps_chain) chain_list; TAILQ_HEAD(, mps_command) tm_list; + TAILQ_HEAD(, mps_command) io_list; int replypostindex; int replyfreeindex; @@ -228,8 +235,13 @@ { struct mps_chain *chain; - if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) + if ((chain = TAILQ_FIRST(&sc->chain_list)) != NULL) { TAILQ_REMOVE(&sc->chain_list, chain, chain_link); + sc->chain_free--; + if (sc->chain_free < sc->chain_free_lowwater) + sc->chain_free_lowwater = sc->chain_free; + } else + sc->chain_alloc_fail++; return (chain); } @@ -239,6 +251,7 @@ #if 0 bzero(chain->chain, 128); #endif + sc->chain_free++; TAILQ_INSERT_TAIL(&sc->chain_list, chain, chain_link); } Index: mps_user.c =================================================================== --- mps_user.c (revision 218241) +++ mps_user.c (working copy) @@ -400,10 +400,12 @@ if (cmd->len == 0) return (EINVAL); + printf("%s: about to copyin firmware\n", __func__); error = copyin(cmd->buf, cm->cm_data, cmd->len); if (error != 0) return (error); + printf("%s: about to init sge\n", __func__); mpi_init_sge(cm, req, &req->SGL); bzero(&tc, sizeof tc); @@ -425,7 +427,10 @@ tc.ImageSize = cmd->len; cm->cm_flags |= MPS_CM_FLAGS_DATAOUT; + cm->cm_max_segs = 1; + printf("%s: about to push sge\n", __func__); + return (mps_push_sge(cm, &tc, sizeof tc, 0)); } @@ -595,7 +600,7 @@ hdr = (MPI2_REQUEST_HEADER *)cm->cm_req; - mps_dprint(sc, MPS_INFO, "mps_user_command: req %p %d rpl %p %d\n", + mps_printf(sc, "mps_user_command: req %p %d rpl %p %d\n", cmd->req, cmd->req_len, cmd->rpl, cmd->rpl_len ); if (cmd->req_len > (int)sc->facts->IOCRequestFrameSize * 4) { @@ -606,17 +611,11 @@ if (err != 0) goto RetFreeUnlocked; - mps_dprint(sc, MPS_INFO, "mps_user_command: Function %02X " + mps_printf(sc, "mps_user_command: Function %02X " "MsgFlags %02X\n", hdr->Function, hdr->MsgFlags ); - err = mps_user_setup_request(cm, cmd); - if (err != 0) { - mps_printf(sc, "mps_user_command: unsupported function 0x%X\n", - hdr->Function ); - goto RetFreeUnlocked; - } - if (cmd->len > 0) { + mps_printf(sc, "%s: allocating %d bytes\n", __func__, cmd->len); buf = malloc(cmd->len, M_MPSUSER, M_WAITOK|M_ZERO); cm->cm_data = buf; cm->cm_length = cmd->len; @@ -625,6 +624,13 @@ cm->cm_length = 0; } + err = mps_user_setup_request(cm, cmd); + if (err != 0) { + mps_printf(sc, "mps_user_command: unsupported function 0x%X\n", + hdr->Function ); + goto RetFreeUnlocked; + } + cm->cm_flags = MPS_CM_FLAGS_SGE_SIMPLE | MPS_CM_FLAGS_WAKEUP; cm->cm_desc.Default.RequestFlags = MPI2_REQ_DESCRIPT_FLAGS_DEFAULT_TYPE; @@ -653,7 +659,7 @@ copyout(rpl, cmd->rpl, sz); if (buf != NULL) copyout(buf, cmd->buf, cmd->len); - mps_dprint(sc, MPS_INFO, "mps_user_command: reply size %d\n", sz ); + mps_printf(sc, "mps_user_command: reply size %d\n", sz ); RetFreeUnlocked: mps_lock(sc); --gKMricLos+KVdGMg--