From owner-svn-src-stable-10@freebsd.org Mon Oct 5 08:57:17 2015 Return-Path: Delivered-To: svn-src-stable-10@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id BF34CA10171; Mon, 5 Oct 2015 08:57:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id AF03EA84; Mon, 5 Oct 2015 08:57:17 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.70]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id t958vHrF018839; Mon, 5 Oct 2015 08:57:17 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id t958vHpL018836; Mon, 5 Oct 2015 08:57:17 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201510050857.t958vHpL018836@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Mon, 5 Oct 2015 08:57:17 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org Subject: svn commit: r288732 - in stable/10/sys: cam/ctl conf modules/ctl X-SVN-Group: stable-10 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-10@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: SVN commit messages for only the 10-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 05 Oct 2015 08:57:17 -0000 Author: mav Date: Mon Oct 5 08:57:16 2015 New Revision: 288732 URL: https://svnweb.freebsd.org/changeset/base/288732 Log: MFC r287621: Reimplement CTL High Availability. CTL HA functionality was originally implemented by Copan many years ago, but large part of the sources was never published. This change includes clean room implementation of the missing code and fixes for many bugs. This code supports dual-node HA with ALUA in four modes: - Active/Unavailable without interlink between nodes; - Active/Standby with second node handling only basic LUN discovery and reservation, synchronizing with the first node through the interlink; - Active/Active with both nodes processing commands and accessing the backing storage, synchronizing with the first node through the interlink; - Active/Active with second node working as proxy, transfering all commands to the first node for execution through the interlink. Unlike original Copan's implementation, depending on specific hardware, this code uses simple custom TCP-based protocol for interlink. It has no authentication, so it should never be enabled on public interfaces. The code may still need some polishing, but generally it is functional. Relnotes: yes Sponsored by: iXsystems, Inc. Added: stable/10/sys/cam/ctl/ctl_ha.c - copied unchanged from r287621, head/sys/cam/ctl/ctl_ha.c Modified: stable/10/sys/cam/ctl/README.ctl.txt stable/10/sys/cam/ctl/ctl.c stable/10/sys/cam/ctl/ctl.h stable/10/sys/cam/ctl/ctl_backend.h stable/10/sys/cam/ctl/ctl_backend_block.c stable/10/sys/cam/ctl/ctl_backend_ramdisk.c stable/10/sys/cam/ctl/ctl_cmd_table.c stable/10/sys/cam/ctl/ctl_error.c stable/10/sys/cam/ctl/ctl_error.h stable/10/sys/cam/ctl/ctl_frontend.c stable/10/sys/cam/ctl/ctl_frontend_cam_sim.c stable/10/sys/cam/ctl/ctl_frontend_ioctl.c stable/10/sys/cam/ctl/ctl_frontend_iscsi.c stable/10/sys/cam/ctl/ctl_ha.h stable/10/sys/cam/ctl/ctl_io.h stable/10/sys/cam/ctl/ctl_private.h stable/10/sys/cam/ctl/ctl_tpc.c stable/10/sys/cam/ctl/ctl_tpc_local.c stable/10/sys/cam/ctl/scsi_ctl.c stable/10/sys/conf/files stable/10/sys/modules/ctl/Makefile Directory Properties: stable/10/ (props changed) Modified: stable/10/sys/cam/ctl/README.ctl.txt ============================================================================== --- stable/10/sys/cam/ctl/README.ctl.txt Mon Oct 5 08:55:59 2015 (r288731) +++ stable/10/sys/cam/ctl/README.ctl.txt Mon Oct 5 08:57:16 2015 (r288732) @@ -43,12 +43,9 @@ Features: - Persistent reservation support - Mode sense/select support - Error injection support - - High Availability support (1) + - High Availability support - All I/O handled in-kernel, no userland context switch overhead. -(1) HA Support is just an API stub, and needs much more to be fully - functional. See the to-do list below. - Configuring and Running CTL: =========================== @@ -245,27 +242,6 @@ To Do List: another data structure in the stack, more memory allocations, etc. This will also require changes to the CAM CCB structure to support CTL. - - Full-featured High Availability support. The HA API that is in ctl_ha.h - is essentially a renamed version of Copan's HA API. There is no - substance to it, but it remains in CTL to show what needs to be done to - implement active/active HA from a CTL standpoint. The things that would - need to be done include: - - A kernel level software API for message passing as well as DMA - between at least two nodes. - - Hardware support and drivers for inter-node communication. This - could be as simples as ethernet hardware and drivers. - - A "supervisor", or startup framework to control and coordinate - HA startup, failover (going from active/active to single mode), - and failback (going from single mode to active/active). - - HA support in other components of the stack. The goal behind HA - is that one node can fail and another node can seamlessly take - over handling I/O requests. This requires support from pretty - much every component in the storage stack, from top to bottom. - CTL is one piece of it, but you also need support in the RAID - stack/filesystem/backing store. You also need full configuration - mirroring, and all peer nodes need to be able to talk to the - underlying storage hardware. - Code Roadmap: ============ @@ -365,11 +341,11 @@ This is a CTL frontend port that is also frontend allows for using CTL without any target-capable hardware. So any LUNs you create in CTL are visible via this port. +ctl_ha.c: ctl_ha.h: -------- -This is a stubbed-out High Availability API. See the comments in the -header and the description of what is needed as far as HA support above. +This is a High Availability API and TCP-based interlink implementation. ctl_io.h: -------- Modified: stable/10/sys/cam/ctl/ctl.c ============================================================================== --- stable/10/sys/cam/ctl/ctl.c Mon Oct 5 08:55:59 2015 (r288731) +++ stable/10/sys/cam/ctl/ctl.c Mon Oct 5 08:57:16 2015 (r288732) @@ -1,6 +1,7 @@ /*- * Copyright (c) 2003-2009 Silicon Graphics International Corp. * Copyright (c) 2012 The FreeBSD Foundation + * Copyright (c) 2015 Alexander Motin * All rights reserved. * * Portions of this software were developed by Edward Tomasz Napierala @@ -84,25 +85,6 @@ __FBSDID("$FreeBSD$"); struct ctl_softc *control_softc = NULL; /* - * Size and alignment macros needed for Copan-specific HA hardware. These - * can go away when the HA code is re-written, and uses busdma for any - * hardware. - */ -#define CTL_ALIGN_8B(target, source, type) \ - if (((uint32_t)source & 0x7) != 0) \ - target = (type)(source + (0x8 - ((uint32_t)source & 0x7)));\ - else \ - target = (type)source; - -#define CTL_SIZE_8B(target, size) \ - if ((size & 0x7) != 0) \ - target = size + (0x8 - (size & 0x7)); \ - else \ - target = size; - -#define CTL_ALIGN_8B_MARGIN 16 - -/* * Template mode pages. */ @@ -351,12 +333,6 @@ const static struct ctl_logical_block_pr } }; -/* - * XXX KDM move these into the softc. - */ -static int rcv_sync_msg; -static uint8_t ctl_pause_rtr; - SYSCTL_NODE(_kern_cam, OID_AUTO, ctl, CTLFLAG_RD, 0, "CAM Target Layer"); static int worker_threads = -1; TUNABLE_INT("kern.cam.ctl.worker_threads", &worker_threads); @@ -375,11 +351,10 @@ SYSCTL_INT(_kern_cam_ctl, OID_AUTO, debu */ #define SCSI_EVPD_NUM_SUPPORTED_PAGES 10 -#ifdef notyet static void ctl_isc_event_handler(ctl_ha_channel chanel, ctl_ha_event event, int param); static void ctl_copy_sense_data(union ctl_ha_msg *src, union ctl_io *dest); -#endif +static void ctl_copy_sense_data_back(union ctl_io *src, union ctl_ha_msg *dest); static int ctl_init(void); void ctl_shutdown(void); static int ctl_open(struct cdev *dev, int flags, int fmt, struct thread *td); @@ -395,10 +370,6 @@ static int ctl_alloc_lun(struct ctl_soft static int ctl_free_lun(struct ctl_lun *lun); static void ctl_create_lun(struct ctl_be_lun *be_lun); static struct ctl_port * ctl_io_port(struct ctl_io_hdr *io_hdr); -/** -static void ctl_failover_change_pages(struct ctl_softc *softc, - struct ctl_scsiio *ctsio, int master); -**/ static int ctl_do_mode_select(union ctl_io *io); static int ctl_pro_preempt(struct ctl_softc *softc, struct ctl_lun *lun, @@ -435,10 +406,11 @@ static int ctl_check_blocked(struct ctl_ static int ctl_scsiio_lun_check(struct ctl_lun *lun, const struct ctl_cmd_entry *entry, struct ctl_scsiio *ctsio); -//static int ctl_check_rtr(union ctl_io *pending_io, struct ctl_softc *softc); -#ifdef notyet -static void ctl_failover(void); -#endif +static void ctl_failover_lun(struct ctl_lun *lun); +static void ctl_est_ua(struct ctl_lun *lun, uint32_t initidx, ctl_ua_type ua); +static void ctl_est_ua_all(struct ctl_lun *lun, uint32_t except, ctl_ua_type ua); +static void ctl_clr_ua(struct ctl_lun *lun, uint32_t initidx, ctl_ua_type ua); +static void ctl_clr_ua_all(struct ctl_lun *lun, uint32_t except, ctl_ua_type ua); static void ctl_clr_ua_allluns(struct ctl_softc *ctl_softc, uint32_t initidx, ctl_ua_type ua_type); static int ctl_scsiio_precheck(struct ctl_softc *ctl_softc, @@ -477,9 +449,7 @@ static void ctl_work_thread(void *arg); static void ctl_enqueue_incoming(union ctl_io *io); static void ctl_enqueue_rtr(union ctl_io *io); static void ctl_enqueue_done(union ctl_io *io); -#ifdef notyet static void ctl_enqueue_isc(union ctl_io *io); -#endif static const struct ctl_cmd_entry * ctl_get_cmd_entry(struct ctl_scsiio *ctsio, int *sa); static const struct ctl_cmd_entry * @@ -487,6 +457,11 @@ static const struct ctl_cmd_entry * static int ctl_cmd_applicable(uint8_t lun_type, const struct ctl_cmd_entry *entry); +static uint64_t ctl_get_prkey(struct ctl_lun *lun, uint32_t residx); +static void ctl_clr_prkey(struct ctl_lun *lun, uint32_t residx); +static void ctl_alloc_prkey(struct ctl_lun *lun, uint32_t residx); +static void ctl_set_prkey(struct ctl_lun *lun, uint32_t residx, uint64_t key); + /* * Load the serialization table. This isn't very pretty, but is probably * the easiest way to do it. @@ -519,7 +494,11 @@ static moduledata_t ctl_moduledata = { DECLARE_MODULE(ctl, ctl_moduledata, SI_SUB_CONFIGURE, SI_ORDER_THIRD); MODULE_VERSION(ctl, 1); -#ifdef notyet +static struct ctl_frontend ha_frontend = +{ + .name = "ha", +}; + static void ctl_isc_handler_finish_xfer(struct ctl_softc *ctl_softc, union ctl_ha_msg *msg_info) @@ -541,7 +520,7 @@ ctl_isc_handler_finish_xfer(struct ctl_s ctsio->sense_residual = msg_info->scsi.sense_residual; ctsio->residual = msg_info->scsi.residual; memcpy(&ctsio->sense_data, &msg_info->scsi.sense_data, - sizeof(ctsio->sense_data)); + msg_info->scsi.sense_len); memcpy(&ctsio->io_hdr.ctl_private[CTL_PRIV_LBA_LEN].bytes, &msg_info->scsi.lbalen, sizeof(msg_info->scsi.lbalen)); ctl_enqueue_isc((union ctl_io *)ctsio); @@ -560,38 +539,327 @@ ctl_isc_handler_finish_ser_only(struct c } ctsio = &msg_info->hdr.serializing_sc->scsiio; -#if 0 - /* - * Attempt to catch the situation where an I/O has - * been freed, and we're using it again. - */ - if (ctsio->io_hdr.io_type == 0xff) { - union ctl_io *tmp_io; - tmp_io = (union ctl_io *)ctsio; - printf("%s: %p use after free!\n", __func__, - ctsio); - printf("%s: type %d msg %d cdb %x iptl: " - "%u:%u:%u tag 0x%04x " - "flag %#x status %x\n", - __func__, - tmp_io->io_hdr.io_type, - tmp_io->io_hdr.msg_type, - tmp_io->scsiio.cdb[0], - tmp_io->io_hdr.nexus.initid, - tmp_io->io_hdr.nexus.targ_port, - tmp_io->io_hdr.nexus.targ_lun, - (tmp_io->io_hdr.io_type == - CTL_IO_TASK) ? - tmp_io->taskio.tag_num : - tmp_io->scsiio.tag_num, - tmp_io->io_hdr.flags, - tmp_io->io_hdr.status); - } -#endif ctsio->io_hdr.msg_type = CTL_MSG_FINISH_IO; ctl_enqueue_isc((union ctl_io *)ctsio); } +void +ctl_isc_announce_lun(struct ctl_lun *lun) +{ + struct ctl_softc *softc = lun->ctl_softc; + union ctl_ha_msg *msg; + struct ctl_ha_msg_lun_pr_key pr_key; + int i, k; + + if (softc->ha_link != CTL_HA_LINK_ONLINE) + return; + mtx_lock(&lun->lun_lock); + i = sizeof(msg->lun); + if (lun->lun_devid) + i += lun->lun_devid->len; + i += sizeof(pr_key) * lun->pr_key_count; +alloc: + mtx_unlock(&lun->lun_lock); + msg = malloc(i, M_CTL, M_WAITOK); + mtx_lock(&lun->lun_lock); + k = sizeof(msg->lun); + if (lun->lun_devid) + k += lun->lun_devid->len; + k += sizeof(pr_key) * lun->pr_key_count; + if (i < k) { + free(msg, M_CTL); + i = k; + goto alloc; + } + bzero(&msg->lun, sizeof(msg->lun)); + msg->hdr.msg_type = CTL_MSG_LUN_SYNC; + msg->hdr.nexus.targ_lun = lun->lun; + msg->hdr.nexus.targ_mapped_lun = lun->lun; + msg->lun.flags = lun->flags; + msg->lun.pr_generation = lun->PRGeneration; + msg->lun.pr_res_idx = lun->pr_res_idx; + msg->lun.pr_res_type = lun->res_type; + msg->lun.pr_key_count = lun->pr_key_count; + i = 0; + if (lun->lun_devid) { + msg->lun.lun_devid_len = lun->lun_devid->len; + memcpy(&msg->lun.data[i], lun->lun_devid->data, + msg->lun.lun_devid_len); + i += msg->lun.lun_devid_len; + } + for (k = 0; k < CTL_MAX_INITIATORS; k++) { + if ((pr_key.pr_key = ctl_get_prkey(lun, k)) == 0) + continue; + pr_key.pr_iid = k; + memcpy(&msg->lun.data[i], &pr_key, sizeof(pr_key)); + i += sizeof(pr_key); + } + mtx_unlock(&lun->lun_lock); + ctl_ha_msg_send(CTL_HA_CHAN_CTL, &msg->port, sizeof(msg->port) + i, + M_WAITOK); + free(msg, M_CTL); +} + +void +ctl_isc_announce_port(struct ctl_port *port) +{ + struct ctl_softc *softc = control_softc; + union ctl_ha_msg *msg; + int i; + + if (port->targ_port < softc->port_min || + port->targ_port >= softc->port_max || + softc->ha_link != CTL_HA_LINK_ONLINE) + return; + i = sizeof(msg->port) + strlen(port->port_name) + 1; + if (port->lun_map) + i += sizeof(uint32_t) * CTL_MAX_LUNS; + if (port->port_devid) + i += port->port_devid->len; + if (port->target_devid) + i += port->target_devid->len; + msg = malloc(i, M_CTL, M_WAITOK); + bzero(&msg->port, sizeof(msg->port)); + msg->hdr.msg_type = CTL_MSG_PORT_SYNC; + msg->hdr.nexus.targ_port = port->targ_port; + msg->port.port_type = port->port_type; + msg->port.physical_port = port->physical_port; + msg->port.virtual_port = port->virtual_port; + msg->port.status = port->status; + i = 0; + msg->port.name_len = sprintf(&msg->port.data[i], + "%d:%s", softc->ha_id, port->port_name) + 1; + i += msg->port.name_len; + if (port->lun_map) { + msg->port.lun_map_len = sizeof(uint32_t) * CTL_MAX_LUNS; + memcpy(&msg->port.data[i], port->lun_map, + msg->port.lun_map_len); + i += msg->port.lun_map_len; + } + if (port->port_devid) { + msg->port.port_devid_len = port->port_devid->len; + memcpy(&msg->port.data[i], port->port_devid->data, + msg->port.port_devid_len); + i += msg->port.port_devid_len; + } + if (port->target_devid) { + msg->port.target_devid_len = port->target_devid->len; + memcpy(&msg->port.data[i], port->target_devid->data, + msg->port.target_devid_len); + i += msg->port.target_devid_len; + } + ctl_ha_msg_send(CTL_HA_CHAN_CTL, &msg->port, sizeof(msg->port) + i, + M_WAITOK); + free(msg, M_CTL); +} + +static void +ctl_isc_ha_link_up(struct ctl_softc *softc) +{ + struct ctl_port *port; + struct ctl_lun *lun; + + STAILQ_FOREACH(port, &softc->port_list, links) + ctl_isc_announce_port(port); + STAILQ_FOREACH(lun, &softc->lun_list, links) + ctl_isc_announce_lun(lun); +} + +static void +ctl_isc_ha_link_down(struct ctl_softc *softc) +{ + struct ctl_port *port; + struct ctl_lun *lun; + union ctl_io *io; + + mtx_lock(&softc->ctl_lock); + STAILQ_FOREACH(lun, &softc->lun_list, links) { + mtx_lock(&lun->lun_lock); + lun->flags &= ~CTL_LUN_PEER_SC_PRIMARY; + mtx_unlock(&lun->lun_lock); + + mtx_unlock(&softc->ctl_lock); + io = ctl_alloc_io(softc->othersc_pool); + mtx_lock(&softc->ctl_lock); + ctl_zero_io(io); + io->io_hdr.msg_type = CTL_MSG_FAILOVER; + io->io_hdr.nexus.targ_mapped_lun = lun->lun; + ctl_enqueue_isc(io); + } + + STAILQ_FOREACH(port, &softc->port_list, links) { + if (port->targ_port >= softc->port_min && + port->targ_port < softc->port_max) + continue; + port->status &= ~CTL_PORT_STATUS_ONLINE; + } + mtx_unlock(&softc->ctl_lock); +} + +static void +ctl_isc_ua(struct ctl_softc *softc, union ctl_ha_msg *msg, int len) +{ + struct ctl_lun *lun; + uint32_t iid = ctl_get_initindex(&msg->hdr.nexus); + + if (msg->hdr.nexus.targ_lun < CTL_MAX_LUNS && + (lun = softc->ctl_luns[msg->hdr.nexus.targ_lun]) != NULL) { + if (msg->ua.ua_all) { + if (msg->ua.ua_set) + ctl_est_ua_all(lun, iid, msg->ua.ua_type); + else + ctl_clr_ua_all(lun, iid, msg->ua.ua_type); + } else { + if (msg->ua.ua_set) + ctl_est_ua(lun, iid, msg->ua.ua_type); + else + ctl_clr_ua(lun, iid, msg->ua.ua_type); + } + } +} + +static void +ctl_isc_lun_sync(struct ctl_softc *softc, union ctl_ha_msg *msg, int len) +{ + struct ctl_lun *lun; + struct ctl_ha_msg_lun_pr_key pr_key; + int i, k; + + lun = softc->ctl_luns[msg->hdr.nexus.targ_lun]; + if (lun == NULL) { + CTL_DEBUG_PRINT(("%s: Unknown LUN %d\n", __func__, + msg->hdr.nexus.targ_lun)); + } else { + mtx_lock(&lun->lun_lock); + i = (lun->lun_devid != NULL) ? lun->lun_devid->len : 0; + if (msg->lun.lun_devid_len != i || (i > 0 && + memcmp(&msg->lun.data[0], lun->lun_devid->data, i) != 0)) { + mtx_unlock(&lun->lun_lock); + printf("%s: Received conflicting HA LUN %d\n", + __func__, msg->hdr.nexus.targ_lun); + return; + } else { + /* Record whether peer is primary. */ + if ((msg->lun.flags & CTL_LUN_PRIMARY_SC) && + (msg->lun.flags & CTL_LUN_DISABLED) == 0) + lun->flags |= CTL_LUN_PEER_SC_PRIMARY; + else + lun->flags &= ~CTL_LUN_PEER_SC_PRIMARY; + + /* If peer is primary and we are not -- use data */ + if ((lun->flags & CTL_LUN_PRIMARY_SC) == 0 && + (lun->flags & CTL_LUN_PEER_SC_PRIMARY)) { + lun->PRGeneration = msg->lun.pr_generation; + lun->pr_res_idx = msg->lun.pr_res_idx; + lun->res_type = msg->lun.pr_res_type; + lun->pr_key_count = msg->lun.pr_key_count; + for (k = 0; k < CTL_MAX_INITIATORS; k++) + ctl_clr_prkey(lun, k); + for (k = 0; k < msg->lun.pr_key_count; k++) { + memcpy(&pr_key, &msg->lun.data[i], + sizeof(pr_key)); + ctl_alloc_prkey(lun, pr_key.pr_iid); + ctl_set_prkey(lun, pr_key.pr_iid, + pr_key.pr_key); + i += sizeof(pr_key); + } + } + + mtx_unlock(&lun->lun_lock); + CTL_DEBUG_PRINT(("%s: Known LUN %d, peer is %s\n", + __func__, msg->hdr.nexus.targ_lun, + (msg->lun.flags & CTL_LUN_PRIMARY_SC) ? + "primary" : "secondary")); + + /* If we are primary but peer doesn't know -- notify */ + if ((lun->flags & CTL_LUN_PRIMARY_SC) && + (msg->lun.flags & CTL_LUN_PEER_SC_PRIMARY) == 0) + ctl_isc_announce_lun(lun); + } + } +} + +static void +ctl_isc_port_sync(struct ctl_softc *softc, union ctl_ha_msg *msg, int len) +{ + struct ctl_port *port; + int i, new; + + port = softc->ctl_ports[msg->hdr.nexus.targ_port]; + if (port == NULL) { + CTL_DEBUG_PRINT(("%s: New port %d\n", __func__, + msg->hdr.nexus.targ_port)); + new = 1; + port = malloc(sizeof(*port), M_CTL, M_WAITOK | M_ZERO); + port->frontend = &ha_frontend; + port->targ_port = msg->hdr.nexus.targ_port; + } else if (port->frontend == &ha_frontend) { + CTL_DEBUG_PRINT(("%s: Updated port %d\n", __func__, + msg->hdr.nexus.targ_port)); + new = 0; + } else { + printf("%s: Received conflicting HA port %d\n", + __func__, msg->hdr.nexus.targ_port); + return; + } + port->port_type = msg->port.port_type; + port->physical_port = msg->port.physical_port; + port->virtual_port = msg->port.virtual_port; + port->status = msg->port.status; + i = 0; + free(port->port_name, M_CTL); + port->port_name = strndup(&msg->port.data[i], msg->port.name_len, + M_CTL); + i += msg->port.name_len; + if (msg->port.lun_map_len != 0) { + if (port->lun_map == NULL) + port->lun_map = malloc(sizeof(uint32_t) * CTL_MAX_LUNS, + M_CTL, M_WAITOK); + memcpy(port->lun_map, &msg->port.data[i], + sizeof(uint32_t) * CTL_MAX_LUNS); + i += msg->port.lun_map_len; + } else { + free(port->lun_map, M_CTL); + port->lun_map = NULL; + } + if (msg->port.port_devid_len != 0) { + if (port->port_devid == NULL || + port->port_devid->len != msg->port.port_devid_len) { + free(port->port_devid, M_CTL); + port->port_devid = malloc(sizeof(struct ctl_devid) + + msg->port.port_devid_len, M_CTL, M_WAITOK); + } + memcpy(port->port_devid->data, &msg->port.data[i], + msg->port.port_devid_len); + port->port_devid->len = msg->port.port_devid_len; + i += msg->port.port_devid_len; + } else { + free(port->port_devid, M_CTL); + port->port_devid = NULL; + } + if (msg->port.target_devid_len != 0) { + if (port->target_devid == NULL || + port->target_devid->len != msg->port.target_devid_len) { + free(port->target_devid, M_CTL); + port->target_devid = malloc(sizeof(struct ctl_devid) + + msg->port.target_devid_len, M_CTL, M_WAITOK); + } + memcpy(port->target_devid->data, &msg->port.data[i], + msg->port.target_devid_len); + port->target_devid->len = msg->port.target_devid_len; + i += msg->port.target_devid_len; + } else { + free(port->port_devid, M_CTL); + port->port_devid = NULL; + } + if (new) { + if (ctl_port_register(port) != 0) { + printf("%s: ctl_port_register() failed with error\n", + __func__); + } + } +} + /* * ISC (Inter Shelf Communication) event handler. Events from the HA * subsystem come in here. @@ -605,54 +873,33 @@ ctl_isc_event_handler(ctl_ha_channel cha ctl_ha_status isc_status; softc = control_softc; - io = NULL; - - -#if 0 - printf("CTL: Isc Msg event %d\n", event); -#endif + CTL_DEBUG_PRINT(("CTL: Isc Msg event %d\n", event)); if (event == CTL_HA_EVT_MSG_RECV) { - union ctl_ha_msg msg_info; + union ctl_ha_msg *msg, msgbuf; - isc_status = ctl_ha_msg_recv(CTL_HA_CHAN_CTL, &msg_info, - sizeof(msg_info), /*wait*/ 0); -#if 0 - printf("CTL: msg_type %d\n", msg_info.msg_type); -#endif - if (isc_status != 0) { - printf("Error receiving message, status = %d\n", - isc_status); + if (param > sizeof(msgbuf)) + msg = malloc(param, M_CTL, M_WAITOK); + else + msg = &msgbuf; + isc_status = ctl_ha_msg_recv(CTL_HA_CHAN_CTL, msg, param, + M_WAITOK); + if (isc_status != CTL_HA_STATUS_SUCCESS) { + printf("%s: Error receiving message: %d\n", + __func__, isc_status); + if (msg != &msgbuf) + free(msg, M_CTL); return; } - switch (msg_info.hdr.msg_type) { + CTL_DEBUG_PRINT(("CTL: msg_type %d\n", msg->msg_type)); + switch (msg->hdr.msg_type) { case CTL_MSG_SERIALIZE: -#if 0 - printf("Serialize\n"); -#endif - io = ctl_alloc_io_nowait(softc->othersc_pool); - if (io == NULL) { - printf("ctl_isc_event_handler: can't allocate " - "ctl_io!\n"); - /* Bad Juju */ - /* Need to set busy and send msg back */ - msg_info.hdr.msg_type = CTL_MSG_BAD_JUJU; - msg_info.hdr.status = CTL_SCSI_ERROR; - msg_info.scsi.scsi_status = SCSI_STATUS_BUSY; - msg_info.scsi.sense_len = 0; - if (ctl_ha_msg_send(CTL_HA_CHAN_CTL, &msg_info, - sizeof(msg_info), 0) > CTL_HA_STATUS_SUCCESS){ - } - goto bailout; - } + io = ctl_alloc_io(softc->othersc_pool); ctl_zero_io(io); - // populate ctsio from msg_info + // populate ctsio from msg io->io_hdr.io_type = CTL_IO_SCSI; io->io_hdr.msg_type = CTL_MSG_SERIALIZE; - io->io_hdr.original_sc = msg_info.hdr.original_sc; -#if 0 - printf("pOrig %x\n", (int)msg_info.original_sc); -#endif + io->io_hdr.original_sc = msg->hdr.original_sc; io->io_hdr.flags |= CTL_FLAG_FROM_OTHER_SC | CTL_FLAG_IO_ACTIVE; /* @@ -662,18 +909,23 @@ ctl_isc_event_handler(ctl_ha_channel cha * * XXX KDM add another flag that is more specific. */ - if (softc->ha_mode == CTL_HA_MODE_SER_ONLY) + if (softc->ha_mode != CTL_HA_MODE_XFER) io->io_hdr.flags |= CTL_FLAG_INT_COPY; - io->io_hdr.nexus = msg_info.hdr.nexus; + io->io_hdr.nexus = msg->hdr.nexus; #if 0 printf("port %u, iid %u, lun %u\n", io->io_hdr.nexus.targ_port, io->io_hdr.nexus.initid, io->io_hdr.nexus.targ_lun); #endif - io->scsiio.tag_num = msg_info.scsi.tag_num; - io->scsiio.tag_type = msg_info.scsi.tag_type; - memcpy(io->scsiio.cdb, msg_info.scsi.cdb, + io->scsiio.tag_num = msg->scsi.tag_num; + io->scsiio.tag_type = msg->scsi.tag_type; +#ifdef CTL_TIME_IO + io->io_hdr.start_time = time_uptime; + getbintime(&io->io_hdr.start_bt); +#endif /* CTL_TIME_IO */ + io->scsiio.cdb_len = msg->scsi.cdb_len; + memcpy(io->scsiio.cdb, msg->scsi.cdb, CTL_MAX_CDBLEN); if (softc->ha_mode == CTL_HA_MODE_XFER) { const struct ctl_cmd_entry *entry; @@ -691,7 +943,7 @@ ctl_isc_event_handler(ctl_ha_channel cha struct ctl_sg_entry *sgl; int i, j; - io = msg_info.hdr.original_sc; + io = msg->hdr.original_sc; if (io == NULL) { printf("%s: original_sc == NULL!\n", __func__); /* XXX KDM do something here */ @@ -703,97 +955,66 @@ ctl_isc_event_handler(ctl_ha_channel cha * Keep track of this, we need to send it back over * when the datamove is complete. */ - io->io_hdr.serializing_sc = msg_info.hdr.serializing_sc; + io->io_hdr.serializing_sc = msg->hdr.serializing_sc; - if (msg_info.dt.sg_sequence == 0) { - /* - * XXX KDM we use the preallocated S/G list - * here, but we'll need to change this to - * dynamic allocation if we need larger S/G - * lists. - */ - if (msg_info.dt.kern_sg_entries > - sizeof(io->io_hdr.remote_sglist) / - sizeof(io->io_hdr.remote_sglist[0])) { - printf("%s: number of S/G entries " - "needed %u > allocated num %zd\n", - __func__, - msg_info.dt.kern_sg_entries, - sizeof(io->io_hdr.remote_sglist)/ - sizeof(io->io_hdr.remote_sglist[0])); - - /* - * XXX KDM send a message back to - * the other side to shut down the - * DMA. The error will come back - * through via the normal channel. - */ - break; - } - sgl = io->io_hdr.remote_sglist; - memset(sgl, 0, - sizeof(io->io_hdr.remote_sglist)); + if (msg->dt.sg_sequence == 0) { + i = msg->dt.kern_sg_entries + + io->scsiio.kern_data_len / + CTL_HA_DATAMOVE_SEGMENT + 1; + sgl = malloc(sizeof(*sgl) * i, M_CTL, + M_WAITOK | M_ZERO); + io->io_hdr.remote_sglist = sgl; + io->io_hdr.local_sglist = + &sgl[msg->dt.kern_sg_entries]; io->scsiio.kern_data_ptr = (uint8_t *)sgl; io->scsiio.kern_sg_entries = - msg_info.dt.kern_sg_entries; + msg->dt.kern_sg_entries; io->scsiio.rem_sg_entries = - msg_info.dt.kern_sg_entries; + msg->dt.kern_sg_entries; io->scsiio.kern_data_len = - msg_info.dt.kern_data_len; + msg->dt.kern_data_len; io->scsiio.kern_total_len = - msg_info.dt.kern_total_len; + msg->dt.kern_total_len; io->scsiio.kern_data_resid = - msg_info.dt.kern_data_resid; + msg->dt.kern_data_resid; io->scsiio.kern_rel_offset = - msg_info.dt.kern_rel_offset; - /* - * Clear out per-DMA flags. - */ - io->io_hdr.flags &= ~CTL_FLAG_RDMA_MASK; - /* - * Add per-DMA flags that are set for this - * particular DMA request. - */ - io->io_hdr.flags |= msg_info.dt.flags & - CTL_FLAG_RDMA_MASK; + msg->dt.kern_rel_offset; + io->io_hdr.flags &= ~CTL_FLAG_BUS_ADDR; + io->io_hdr.flags |= msg->dt.flags & + CTL_FLAG_BUS_ADDR; } else sgl = (struct ctl_sg_entry *) io->scsiio.kern_data_ptr; - for (i = msg_info.dt.sent_sg_entries, j = 0; - i < (msg_info.dt.sent_sg_entries + - msg_info.dt.cur_sg_entries); i++, j++) { - sgl[i].addr = msg_info.dt.sg_list[j].addr; - sgl[i].len = msg_info.dt.sg_list[j].len; + for (i = msg->dt.sent_sg_entries, j = 0; + i < (msg->dt.sent_sg_entries + + msg->dt.cur_sg_entries); i++, j++) { + sgl[i].addr = msg->dt.sg_list[j].addr; + sgl[i].len = msg->dt.sg_list[j].len; #if 0 printf("%s: L: %p,%d -> %p,%d j=%d, i=%d\n", __func__, - msg_info.dt.sg_list[j].addr, - msg_info.dt.sg_list[j].len, + msg->dt.sg_list[j].addr, + msg->dt.sg_list[j].len, sgl[i].addr, sgl[i].len, j, i); #endif } -#if 0 - memcpy(&sgl[msg_info.dt.sent_sg_entries], - msg_info.dt.sg_list, - sizeof(*sgl) * msg_info.dt.cur_sg_entries); -#endif /* * If this is the last piece of the I/O, we've got * the full S/G list. Queue processing in the thread. * Otherwise wait for the next piece. */ - if (msg_info.dt.sg_last != 0) + if (msg->dt.sg_last != 0) ctl_enqueue_isc(io); break; } /* Performed on the Serializing (primary) SC, XFER mode only */ case CTL_MSG_DATAMOVE_DONE: { - if (msg_info.hdr.serializing_sc == NULL) { + if (msg->hdr.serializing_sc == NULL) { printf("%s: serializing_sc == NULL!\n", __func__); /* XXX KDM now what? */ @@ -804,33 +1025,35 @@ ctl_isc_event_handler(ctl_ha_channel cha * there was a failure, so we can return status * back to the initiator. */ - io = msg_info.hdr.serializing_sc; + io = msg->hdr.serializing_sc; io->io_hdr.msg_type = CTL_MSG_DATAMOVE_DONE; - io->io_hdr.status = msg_info.hdr.status; - io->scsiio.scsi_status = msg_info.scsi.scsi_status; - io->scsiio.sense_len = msg_info.scsi.sense_len; - io->scsiio.sense_residual =msg_info.scsi.sense_residual; - io->io_hdr.port_status = msg_info.scsi.fetd_status; - io->scsiio.residual = msg_info.scsi.residual; - memcpy(&io->scsiio.sense_data,&msg_info.scsi.sense_data, - sizeof(io->scsiio.sense_data)); + io->io_hdr.flags |= CTL_FLAG_IO_ACTIVE; + io->io_hdr.port_status = msg->scsi.fetd_status; + io->scsiio.residual = msg->scsi.residual; + if (msg->hdr.status != CTL_STATUS_NONE) { + io->io_hdr.status = msg->hdr.status; + io->scsiio.scsi_status = msg->scsi.scsi_status; + io->scsiio.sense_len = msg->scsi.sense_len; + io->scsiio.sense_residual =msg->scsi.sense_residual; + memcpy(&io->scsiio.sense_data, + &msg->scsi.sense_data, + msg->scsi.sense_len); + } ctl_enqueue_isc(io); break; } /* Preformed on Originating SC, SER_ONLY mode */ case CTL_MSG_R2R: - io = msg_info.hdr.original_sc; + io = msg->hdr.original_sc; if (io == NULL) { - printf("%s: Major Bummer\n", __func__); - return; - } else { -#if 0 - printf("pOrig %x\n",(int) ctsio); -#endif + printf("%s: original_sc == NULL!\n", + __func__); + break; } + io->io_hdr.flags |= CTL_FLAG_IO_ACTIVE; io->io_hdr.msg_type = CTL_MSG_R2R; - io->io_hdr.serializing_sc = msg_info.hdr.serializing_sc; + io->io_hdr.serializing_sc = msg->hdr.serializing_sc; ctl_enqueue_isc(io); break; @@ -842,22 +1065,20 @@ ctl_isc_event_handler(ctl_ha_channel cha */ case CTL_MSG_FINISH_IO: if (softc->ha_mode == CTL_HA_MODE_XFER) - ctl_isc_handler_finish_xfer(softc, - &msg_info); + ctl_isc_handler_finish_xfer(softc, msg); else - ctl_isc_handler_finish_ser_only(softc, - &msg_info); + ctl_isc_handler_finish_ser_only(softc, msg); break; /* Preformed on Originating SC */ case CTL_MSG_BAD_JUJU: - io = msg_info.hdr.original_sc; + io = msg->hdr.original_sc; if (io == NULL) { printf("%s: Bad JUJU!, original_sc is NULL!\n", __func__); break; } - ctl_copy_sense_data(&msg_info, io); + ctl_copy_sense_data(msg, io); /* * IO should have already been cleaned up on other * SC so clear this flag so we won't send a message @@ -866,7 +1087,7 @@ ctl_isc_event_handler(ctl_ha_channel cha io->io_hdr.flags &= ~CTL_FLAG_SENT_2OTHER_SC; io->io_hdr.flags |= CTL_FLAG_IO_ACTIVE; - /* io = msg_info.hdr.serializing_sc; */ + /* io = msg->hdr.serializing_sc; */ io->io_hdr.msg_type = CTL_MSG_BAD_JUJU; ctl_enqueue_isc(io); break; @@ -874,91 +1095,99 @@ ctl_isc_event_handler(ctl_ha_channel cha /* Handle resets sent from the other side */ case CTL_MSG_MANAGE_TASKS: { struct ctl_taskio *taskio; - taskio = (struct ctl_taskio *)ctl_alloc_io_nowait( + taskio = (struct ctl_taskio *)ctl_alloc_io( softc->othersc_pool); - if (taskio == NULL) { - printf("ctl_isc_event_handler: can't allocate " - "ctl_io!\n"); - /* Bad Juju */ - /* should I just call the proper reset func - here??? */ - goto bailout; - } ctl_zero_io((union ctl_io *)taskio); taskio->io_hdr.io_type = CTL_IO_TASK; taskio->io_hdr.flags |= CTL_FLAG_FROM_OTHER_SC; - taskio->io_hdr.nexus = msg_info.hdr.nexus; - taskio->task_action = msg_info.task.task_action; - taskio->tag_num = msg_info.task.tag_num; - taskio->tag_type = msg_info.task.tag_type; + taskio->io_hdr.nexus = msg->hdr.nexus; + taskio->task_action = msg->task.task_action; + taskio->tag_num = msg->task.tag_num; + taskio->tag_type = msg->task.tag_type; #ifdef CTL_TIME_IO taskio->io_hdr.start_time = time_uptime; getbintime(&taskio->io_hdr.start_bt); -#if 0 - cs_prof_gettime(&taskio->io_hdr.start_ticks); -#endif #endif /* CTL_TIME_IO */ ctl_run_task((union ctl_io *)taskio); break; } /* Persistent Reserve action which needs attention */ case CTL_MSG_PERS_ACTION: - presio = (struct ctl_prio *)ctl_alloc_io_nowait( + presio = (struct ctl_prio *)ctl_alloc_io( softc->othersc_pool); - if (presio == NULL) { - printf("ctl_isc_event_handler: can't allocate " - "ctl_io!\n"); - /* Bad Juju */ - /* Need to set busy and send msg back */ - goto bailout; - } ctl_zero_io((union ctl_io *)presio); presio->io_hdr.msg_type = CTL_MSG_PERS_ACTION; - presio->pr_msg = msg_info.pr; + presio->io_hdr.flags |= CTL_FLAG_FROM_OTHER_SC; + presio->io_hdr.nexus = msg->hdr.nexus; + presio->pr_msg = msg->pr; ctl_enqueue_isc((union ctl_io *)presio); break; - case CTL_MSG_SYNC_FE: - rcv_sync_msg = 1; + case CTL_MSG_UA: + ctl_isc_ua(softc, msg, param); + break; + case CTL_MSG_PORT_SYNC: + ctl_isc_port_sync(softc, msg, param); + break; + case CTL_MSG_LUN_SYNC: + ctl_isc_lun_sync(softc, msg, param); break; default: - printf("How did I get here?\n"); + printf("Received HA message of unknown type %d\n", + msg->hdr.msg_type); + break; } - } else if (event == CTL_HA_EVT_MSG_SENT) { - if (param != CTL_HA_STATUS_SUCCESS) { - printf("Bad status from ctl_ha_msg_send status %d\n", - param); + if (msg != &msgbuf) + free(msg, M_CTL); + } else if (event == CTL_HA_EVT_LINK_CHANGE) { + printf("CTL: HA link status changed from %d to %d\n", + softc->ha_link, param); + if (param == softc->ha_link) + return; + if (softc->ha_link == CTL_HA_LINK_ONLINE) { + softc->ha_link = param; + ctl_isc_ha_link_down(softc); + } else { + softc->ha_link = param; + if (softc->ha_link == CTL_HA_LINK_ONLINE) + ctl_isc_ha_link_up(softc); } return; - } else if (event == CTL_HA_EVT_DISCONNECT) { - printf("CTL: Got a disconnect from Isc\n"); - return; } else { printf("ctl_isc_event_handler: Unknown event %d\n", event); return; } - -bailout: - return; } static void ctl_copy_sense_data(union ctl_ha_msg *src, union ctl_io *dest) { - struct scsi_sense_data *sense; - sense = &dest->scsiio.sense_data; - bcopy(&src->scsi.sense_data, sense, sizeof(*sense)); + memcpy(&dest->scsiio.sense_data, &src->scsi.sense_data, + src->scsi.sense_len); dest->scsiio.scsi_status = src->scsi.scsi_status; dest->scsiio.sense_len = src->scsi.sense_len; dest->io_hdr.status = src->hdr.status; } -#endif + +static void +ctl_copy_sense_data_back(union ctl_io *src, union ctl_ha_msg *dest) +{ + + memcpy(&dest->scsi.sense_data, &src->scsiio.sense_data, + src->scsiio.sense_len); + dest->scsi.scsi_status = src->scsiio.scsi_status; *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***