To: freebsd-scsi@freebsd.org
From: Mykel@mWare.ca
Subject: Informal(?) sesX messages
Message-Id: <566B4F68.2040807@mWare.ca>
Date: Fri, 11 Dec 2015 17:34:16 -0500
List-Id: SCSI subsystem

Hi all, please CC me on reply as I'm not subscribed to this list.

I've got one of those Supermicro 72-drive monster machines, all ZFS'd up.
https://www.supermicro.com/products/system/4u/6048/SSG-6048R-E1CR72L.cfm

And before & after replacing a faulty SAS Expander and a pair of cables (gobs of WRITE/ABORT errors), I'm still occasionally seeing these kernel messages (in groups), and I'm not sure if they're benign, or pointing to a SAS expander event... or what.
I admit, this is my first time dealing with a machine with SAS expanders, so I'm a bit out of my depth in diagnosing them.

Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: Element descriptor: 'Slot00'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: SAS Device Slot Element: 1 Phys at Slot 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bd449

Every now and then (a couple of times per week) I'll get messages like that for a number of drives, but not all, and not specific to any particular drive - but always spinning rust, not SSDs. It seems to happen under high load, but I'm not completely certain of that.

- Smartctl is happy with all the drives.
- We've got redundant power supplies and have rotated them around and run with spares.
- ZFS has logged zero checksum errors, many scrubs and massive writes/copies/send|recvs & bonnies later.
- The messages come when we're using either istgt, ctld, local bonnie++s, dd'ing 0s, or just scrubbing.
- They seem to occur only while there's regular activity on the pools, NOT when importing/exporting/snapshotting.
- Tried without SSDs connected, and with TRIM off: no change (saw a post from a few years ago about the older version of the LSI SAS card having some issues with that).
- It's a new machine and we're still commissioning it, but we plan to press it into service as a VMware storage machine later next week.
- I've compiled and fired up sesd, but haven't had any messages go by recently.
- The OS is on a pair of Intel SSDs using the motherboard's controller, unrelated/unaffected. Root on ZFS, though.

Do I have anything to worry about? Are these normal things to see popping up sporadically? Could this be a firmware bug in the expander(s) or with the driver?
I don't have any important data on there at this time, so data-destructive testing is possible right now.

I'm including a bunch of info that might be helpful for diagnosis.

Thanks for any advice/suggestions!

Myke

PS: Please CC me in your reply.
PPS: That's the kind of hostname you get when you don't give me something better, or tell me that you don't care ;)

FreeBSD ZFS-AF.$CLIENT.fqdn 10.2-RELEASE-p7 FreeBSD 10.2-RELEASE-p7 #0: Mon Nov 2 14:19:39 UTC 2015 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

(The device IDs are a bit skewed as I hot-rearranged a number of them (camcontrol stopping and reimporting the zpools), but it'll do it after a reboot as well.)

Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: Element descriptor: 'Slot00'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: SAS Device Slot Element: 1 Phys at Slot 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bd449
Dec 11 16:06:54 ZFS-AF kernel: ses5: da4,pass4: Element descriptor: 'Slot01'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da4,pass4: SAS Device Slot Element: 1 Phys at Slot 1
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bc9b1
Dec 11 16:06:54 ZFS-AF kernel: ses5: da2,pass2: Element descriptor: 'Slot02'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da2,pass2: SAS Device Slot Element: 1 Phys at Slot 2
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844d6ea1
Dec 11 16:06:54 ZFS-AF kernel: ses5: da1,pass1: Element descriptor: 'Slot03'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da1,pass1: SAS Device Slot Element: 1 Phys at Slot 3
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844b9785
Dec 11 16:06:54 ZFS-AF kernel: ses5: da32,pass39: Element descriptor: 'Slot04'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da32,pass39: SAS Device Slot Element: 1 Phys at Slot 4
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bda21
Dec 11 16:06:54 ZFS-AF kernel: ses5: da33,pass40: Element descriptor: 'Slot05'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da33,pass40: SAS Device Slot Element: 1 Phys at Slot 5
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844d050d
Dec 11 16:06:54 ZFS-AF kernel: ses5: da34,pass41: Element descriptor: 'Slot06'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da34,pass41: SAS Device Slot Element: 1 Phys at Slot 6
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844c81cd
Dec 11 16:06:54 ZFS-AF kernel: ses5: da35,pass42: Element descriptor: 'Slot07'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da35,pass42: SAS Device Slot Element: 1 Phys at Slot 7
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bf42d
Dec 11 16:06:54 ZFS-AF kernel: ses5: da36,pass43: Element descriptor: 'Slot08'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da36,pass43: SAS Device Slot Element: 1 Phys at Slot 8
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844cf71d
Dec 11 16:06:54 ZFS-AF kernel: ses5: da37,pass44: Element descriptor: 'Slot09'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da37,pass44: SAS Device Slot Element: 1 Phys at Slot 9
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bde8d
Dec 11 16:06:54 ZFS-AF kernel: ses5: da38,pass45: Element descriptor: 'Slot10'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da38,pass45: SAS Device Slot Element: 1 Phys at Slot 10
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844d66c9
Dec 11 16:06:54 ZFS-AF kernel: ses5: da39,pass46: Element descriptor: 'Slot11'
Dec 11 16:06:54 ZFS-AF kernel: ses5: da39,pass46: SAS Device Slot Element: 1 Phys at Slot 11
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )
Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844cf01d

[root@ZFS-AF ~]# camcontrol devlist
at scbus0 target 13 lun 0 (da12,pass13)
at scbus0 target 14 lun 0 (da14,pass15)
at scbus0 target 15 lun 0 (da16,pass17)
at scbus0 target 23 lun 0 (da19,pass20)
at scbus0 target 32 lun 0 (ses0,pass8)
at scbus1 target 8 lun 0 (pass9,da8)
at scbus1 target 9 lun 0 (da10,pass11)
at scbus1 target 10 lun 0 (da9,pass10)
at scbus1 target 11 lun 0 (pass12,da11)
at scbus1 target 12 lun 0 (da22,pass28)
at scbus1 target 13 lun 0 (da13,pass14)
at scbus1 target 14 lun 0 (da15,pass16)
at scbus1 target 15 lun 0 (da17,pass18)
at scbus1 target 20 lun 0 (da23,pass29)
at scbus1 target 21 lun 0 (da21,pass27)
at scbus1 target 22 lun 0 (da18,pass19)
at scbus1 target 23 lun 0 (da20,pass26)
at scbus1 target 32 lun 0 (ses1,pass21)
at scbus6 target 0 lun 0 (ses2,pass22)
at scbus11 target 0 lun 0 (ada0,pass23)
at scbus12 target 0 lun 0 (ada1,pass24)
at scbus13 target 0 lun 0 (ses3,pass25)
at scbus14 target 8 lun 0 (da6,pass6)
at scbus14 target 9 lun 0 (da5,pass5)
at scbus14 target 10 lun 0 (da3,pass3)
at scbus14 target 11 lun 0 (da0,pass0)
at scbus14 target 12 lun 0 (pass30,da24)
at scbus14 target 13 lun 0 (pass31,da25)
at scbus14 target 14 lun 0 (pass32,da26)
at scbus14 target 15 lun 0 (pass33,da27)
at scbus14 target 16 lun 0 (pass34,da28)
at scbus14 target 17 lun 0 (pass35,da29)
at scbus14 target 18 lun 0 (pass36,da30)
at scbus14 target 19 lun 0 (pass37,da31)
at scbus14 target 20 lun 0 (ses4,pass38)
at scbus14 target 39 lun 0 (da7,pass7)
at scbus14 target 40 lun 0 (da4,pass4)
at scbus14 target 41 lun 0 (da2,pass2)
at scbus14 target 42 lun 0 (da1,pass1)
at scbus14 target 43 lun 0 (pass39,da32)
at scbus14 target 44 lun 0 (pass40,da33)
at scbus14 target 45 lun 0 (pass41,da34)
at scbus14 target 46 lun 0 (pass42,da35)
at scbus14 target 47 lun 0 (pass43,da36)
at scbus14 target 48 lun 0 (pass44,da37)
at scbus14 target 49 lun 0 (pass45,da38)
at scbus14 target 50 lun 0 (pass46,da39)
at scbus14 target 63 lun 0 (ses5,pass47)
[root@ZFS-AF ~]#

[root@ZFS-AF ~]# smp_discover ses0
phy 21:D:attached:[5000c500844c64e5:00 t(SSP)] 12 Gbps
phy 22:D:attached:[5000c500844d8de5:00 t(SSP)] 12 Gbps
phy 23:D:attached:[5000c500844d6865:00 t(SSP)] 12 Gbps
phy 31:D:attached:[5000c500844c9a3d:00 t(SSP)] 12 Gbps
phy 40:U:attached:[5003048018c77101:03 i(SSP+STP+SMP)] 12 Gbps
phy 41:U:attached:[5003048018c77101:02 i(SSP+STP+SMP)] 12 Gbps
phy 42:U:attached:[5003048018c77101:01 i(SSP+STP+SMP)] 12 Gbps
phy 43:U:attached:[5003048018c77101:00 i(SSP+STP+SMP)] 12 Gbps
phy 44:U:attached:[5003048018c77101:07 i(SSP+STP+SMP)] 12 Gbps
phy 45:U:attached:[5003048018c77101:06 i(SSP+STP+SMP)] 12 Gbps
phy 46:U:attached:[5003048018c77101:05 i(SSP+STP+SMP)] 12 Gbps
phy 47:U:attached:[5003048018c77101:04 i(SSP+STP+SMP)] 12 Gbps
phy 48:D:attached:[50030480090b8d3d:00 V i(SMP) t(SSP)] 12 Gbps

[root@ZFS-AF ~]# smp_discover ses1
phy 0:U:attached:[500304801970b401:00 i(SSP+STP+SMP)] 12 Gbps
phy 1:U:attached:[500304801970b401:01 i(SSP+STP+SMP)] 12 Gbps
phy 2:U:attached:[500304801970b401:03 i(SSP+STP+SMP)] 12 Gbps
phy 3:U:attached:[500304801970b401:02 i(SSP+STP+SMP)] 12 Gbps
phy 4:D:attached:[5000c5007788ad75:00 t(SSP)] 12 Gbps
phy 5:D:attached:[5000c5007799d161:00 t(SSP)] 12 Gbps
phy 6:D:attached:[5000c5007788fd71:00 t(SSP)] 12 Gbps
phy 7:D:attached:[5000c5007788d7f9:00 t(SSP)] 12 Gbps
phy 8:D:attached:[50030480090b8d88:00 t(SATA)] 12 Gbps
phy 9:D:attached:[5000c500844c2185:00 t(SSP)] 12 Gbps
phy 10:D:attached:[5000c500844d7ca1:00 t(SSP)] 12 Gbps
phy 11:D:attached:[5000c500844c5db5:00 t(SSP)] 12 Gbps
phy 16:D:attached:[50030480090b8d90:00 t(SATA)] 12 Gbps
phy 17:D:attached:[50030480090b8d91:00 t(SATA)] 12 Gbps
phy 18:D:attached:[50030480090b8d92:00 t(SATA)] 12 Gbps
phy 19:D:attached:[5000c500844d028d:00 t(SSP)] 12 Gbps
phy 22:U:disabled
phy 23:U:disabled
phy 24:U:attached:[500304801970b401:06 i(SSP+STP+SMP)] 12 Gbps
phy 25:U:attached:[500304801970b401:07 i(SSP+STP+SMP)] 12 Gbps
phy 26:U:attached:[500304801970b401:05 i(SSP+STP+SMP)] 12 Gbps
phy 27:U:attached:[500304801970b401:04 i(SSP+STP+SMP)] 12 Gbps
phy 31:U:disabled
phy 32:U:disabled
phy 36:D:attached:[50030480090b8dbd:00 V i(SMP) t(SSP)] 12 Gbps

[root@ZFS-AF ~]# smp_discover ses4
phy 4:D:attached:[5000c500844bb641:00 t(SSP)] 12 Gbps
phy 5:D:attached:[5000c500844ba865:00 t(SSP)] 12 Gbps
phy 6:D:disabled
phy 7:D:disabled
phy 8:D:disabled
phy 9:D:disabled
phy 10:D:disabled
phy 11:D:disabled
phy 12:D:attached:[5000c500844c6699:00 t(SSP)] 12 Gbps
phy 13:D:attached:[5000c500844bd9d5:00 t(SSP)] 12 Gbps
phy 14:D:attached:[5000c500844d8c29:00 t(SSP)] 12 Gbps
phy 15:D:attached:[5000c500844ba3c1:00 t(SSP)] 12 Gbps
phy 16:D:attached:[5000c500844ca251:00 t(SSP)] 12 Gbps
phy 17:D:attached:[5000c500844bcc31:00 t(SSP)] 12 Gbps
phy 18:D:attached:[5000c500844bda45:00 t(SSP)] 12 Gbps
phy 19:D:attached:[5000c500844c8579:00 t(SSP)] 12 Gbps
phy 20:D:attached:[5000c500844c7ee1:00 t(SSP)] 12 Gbps
phy 21:D:attached:[5000c500844d01fd:00 t(SSP)] 12 Gbps
phy 22:U:disabled
phy 23:U:disabled
phy 24:U:attached:[500304801970b101:02 i(SSP+STP+SMP)] 12 Gbps
phy 25:U:attached:[500304801970b101:03 i(SSP+STP+SMP)] 12 Gbps
phy 26:U:attached:[500304801970b101:01 i(SSP+STP+SMP)] 12 Gbps
phy 27:U:attached:[500304801970b101:00 i(SSP+STP+SMP)] 12 Gbps
phy 28:D:attached:[500304801ea2dfbd:00 V i(SMP) t(SSP)] 12 Gbps

[root@ZFS-AF ~]# smp_discover ses5
phy 0:D:attached:[5000c500844bd449:00 t(SSP)] 12 Gbps
phy 1:D:attached:[5000c500844bc9b1:00 t(SSP)] 12 Gbps
phy 2:D:attached:[5000c500844d6ea1:00 t(SSP)] 12 Gbps
phy 3:D:attached:[5000c500844b9785:00 t(SSP)] 12 Gbps
phy 20:D:attached:[5000c500844bda21:00 t(SSP)] 12 Gbps
phy 21:D:attached:[5000c500844d050d:00 t(SSP)] 12 Gbps
phy 22:D:attached:[5000c500844c81cd:00 t(SSP)] 12 Gbps
phy 23:D:attached:[5000c500844bf42d:00 t(SSP)] 12 Gbps
phy 24:D:attached:[5000c500844cf71d:00 t(SSP)] 12 Gbps
phy 25:D:attached:[5000c500844bde8d:00 t(SSP)] 12 Gbps
phy 26:D:attached:[5000c500844d66c9:00 t(SSP)] 12 Gbps
phy 27:D:attached:[5000c500844cf01d:00 t(SSP)] 12 Gbps
phy 40:U:attached:[500304801970b102:07 i(SSP+STP+SMP)] 12 Gbps
phy 41:U:attached:[500304801970b102:06 i(SSP+STP+SMP)] 12 Gbps
phy 42:U:attached:[500304801970b102:05 i(SSP+STP+SMP)] 12 Gbps
phy 43:U:attached:[500304801970b102:04 i(SSP+STP+SMP)] 12 Gbps
phy 48:D:attached:[500304801ea2df3d:00 V i(SMP) t(SSP)] 12 Gbps
[root@ZFS-AF ~]#

mpr0@pci0:2:0:0: class=0x010700 card=0x080815d9 chip=0x00971000 rev=0x02 hdr=0x00
    vendor   = 'LSI Logic / Symbios Logic'
    device   = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
    class    = mass storage
    subclass = SAS
mpr1@pci0:3:0:0: class=0x010700 card=0x080815d9 chip=0x00971000 rev=0x02 hdr=0x00
    vendor   = 'LSI Logic / Symbios Logic'
    device   = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
    class    = mass storage
    subclass = SAS
mpr2@pci0:133:0:0: class=0x010700 card=0x080815d9 chip=0x00971000 rev=0x02 hdr=0x00
    vendor   = 'LSI Logic / Symbios Logic'
    device   = 'SAS3008 PCI-Express Fusion-MPT SAS-3'
    class    = mass storage
    subclass = SAS
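
In case it helps anyone compare bursts, here is a minimal Python sketch for tallying which enclosure and slots each group of ses messages touches. It assumes the syslog layout quoted in this message; the names `summarize`, `DESC_RE`, `ADDR_RE` and `SAMPLE` are made up for illustration and are not part of any FreeBSD tool. It only groups the log lines and says nothing about what triggers them.

```python
import re
from collections import defaultdict

# Both patterns assume the kernel log layout quoted in this message.
DESC_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+(?P<ses>ses\d+):\s+"
    r"(?P<devs>[\w,]+):\s+Element descriptor:\s+'(?P<slot>[^']+)'"
)
ADDR_RE = re.compile(
    r"kernel:\s+(?P<ses>ses\d+):\s+phy\s+\d+:\s+"
    r"parent\s+(?P<parent>[0-9a-f]+)\s+addr\s+(?P<addr>[0-9a-f]+)"
)


def summarize(lines):
    """Group element reports into bursts keyed by (timestamp, enclosure)."""
    bursts = defaultdict(list)
    current = None  # the most recent "Element descriptor" entry seen
    for line in lines:
        desc = DESC_RE.search(line)
        if desc:
            current = {
                "ses": desc["ses"],
                "slot": desc["slot"],
                "devs": desc["devs"],
                "addr": None,
            }
            bursts[(desc["ts"], desc["ses"])].append(current)
            continue
        addr = ADDR_RE.search(line)
        # Attach the SAS address to the slot reported just before it.
        if addr and current is not None and addr["ses"] == current["ses"]:
            current["addr"] = addr["addr"]
    return dict(bursts)


# Sample lines copied from the log excerpt in this message.
SAMPLE = [
    "Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: Element descriptor: 'Slot00'",
    "Dec 11 16:06:54 ZFS-AF kernel: ses5: da7,pass7: SAS Device Slot Element: 1 Phys at Slot 0",
    "Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: SAS device type 1 id 0",
    "Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: protocols: Initiator( None ) Target( SSP )",
    "Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bd449",
    "Dec 11 16:06:54 ZFS-AF kernel: ses5: da4,pass4: Element descriptor: 'Slot01'",
    "Dec 11 16:06:54 ZFS-AF kernel: ses5: phy 0: parent 500304801ea2df3f addr 5000c500844bc9b1",
]

if __name__ == "__main__":
    for (ts, ses), entries in summarize(SAMPLE).items():
        print(ts, ses, len(entries), [e["slot"] for e in entries])
```

Feeding it the full log instead of `SAMPLE` would show whether the bursts always land on the same enclosure and the same slots, which might help separate an expander-side event from something drive-specific.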