From owner-freebsd-scsi@freebsd.org Sun Jul 1 21:01:34 2018
Delivered-To: freebsd-scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 2B474FDC14D for ; Sun, 1 Jul 2018 21:01:34 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org)
Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 117887DF81 for ; Sun, 1 Jul 2018 21:01:33 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org)
Received: by mailman.ysv.freebsd.org (Postfix) id C98D1FDC117; Sun, 1 Jul 2018 21:01:32 +0000 (UTC)
Delivered-To: scsi@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id B5DEEFDC10E for ; Sun, 1 Jul 2018 21:01:32 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org)
Received: from mxrelay.ysv.freebsd.org (mxrelay.ysv.freebsd.org [IPv6:2001:1900:2254:206a::19:3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "mxrelay.ysv.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4F6BD7DF79 for ; Sun, 1 Jul 2018 21:01:32 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.ysv.freebsd.org (Postfix) with ESMTPS id 815FB21B8E for ; Sun, 1 Jul 2018 21:01:31 +0000 (UTC) (envelope-from bugzilla-noreply@FreeBSD.org)
Received: from kenobi.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id w61L1Vnd053458 for ; Sun, 1 Jul 2018 21:01:31 GMT (envelope-from bugzilla-noreply@FreeBSD.org)
Received: (from bugzilla@localhost) by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id w61L1V4O053449 for scsi@FreeBSD.org; Sun, 1 Jul 2018 21:01:31 GMT (envelope-from bugzilla-noreply@FreeBSD.org)
Message-Id: <201807012101.w61L1V4O053449@kenobi.freebsd.org>
X-Authentication-Warning: kenobi.freebsd.org: bugzilla set sender to bugzilla-noreply@FreeBSD.org using -f
From: bugzilla-noreply@FreeBSD.org
To: scsi@FreeBSD.org
Subject: Problem reports for scsi@FreeBSD.org that need special attention
Date: Sun, 1 Jul 2018 21:01:31 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.27
X-BeenThere: freebsd-scsi@freebsd.org
X-Mailman-Version: 2.1.27
Precedence: list
List-Id: SCSI subsystem
X-List-Received-Date: Sun, 01 Jul 2018 21:01:34 -0000

To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering
all versions including experimental development code and obsolete releases.

Status      |    Bug Id | Description
------------+-----------+---------------------------------------------------
New         |    221952 | cam iosched: Fix trim statistics

1 problem total for which you should take action.

From owner-freebsd-scsi@freebsd.org Tue Jul 3 09:57:03 2018
From: "Oliver Sech"
To: freebsd-scsi@freebsd.org
Subject: problems with SAS JBODs
Date: Tue, 3 Jul 2018 11:56:53 +0200
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.27

From owner-freebsd-scsi@freebsd.org Tue Jul 3 12:29:07 2018
From: "Oliver Sech"
To: freebsd-scsi@freebsd.org
Subject: problems with SAS JBODs 2
Date: Tue, 3 Jul 2018 14:28:58 +0200
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

Hi!

I use FreeBSD for a large ZFS pool (over 1PB) and I recently encountered
a lot of problems with the JBODs. Generally everything works fine until
I replug the shelves.

When I start with a clean system and attach a single shelf, everything
seems fine.
-> 44 disks show up, I can use the enclosure services (sesutil), and the
   system continues to run without problems.
Once I disconnect the SAS cable, wait until all devices disappear, and
reconnect, I get all sorts of problems.
-> a random number of disks shows up and the enclosure "ses" devices do
   not show up
Once I restart the system I can start over again.

On the server with the large pool there are only certain ports on the HBA
that I can use; otherwise disks will be missing after a reboot and my ZFS
pool won't go online.
I tried different firmware on the HBA. I tried the mpr.ko module from the
Broadcom site. (I replaced the one in /boot/kernel?)
I tested all the things above with Linux as the OS and everything seems
to work.

Is there anything I'm missing? A command that can reset the SAS components?
FreeBSD version: 11.1-RELEASE-p11
HBA: Broadcom LSI 9305-16e (latest firmware)
JBOD: SC847E2C-R1K28JBOD (two expanders, internally daisy chained)

From owner-freebsd-scsi@freebsd.org Tue Jul 3 12:54:27 2018
From: Ben RUBSON
To: FreeBSD-scsi
Subject: Re: problems with SAS JBODs 2
Date: Tue, 3 Jul 2018 14:54:22 +0200
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
Content-Type: text/plain; charset=us-ascii; delsp=yes; format=flowed
Content-Transfer-Encoding: 7bit
X-Mailer: Apple Mail (2.3124)

On 03 Jul 2018 14:28, Oliver Sech wrote:

> Once I disconnect the SAS cable, wait until all devices disappear and
> reconnect I get all sorts of problems.
> -> a random number of disks shows up and the enclosure "ses" do not show up
> Once I restart the system I can start over again.

Hi,

I faced the same sort of issue, but with iSCSI disks.
At least disks did not disconnect properly, and did not reconnect until a
reboot was performed.

Among the needed iSCSI patches, a GEOM one has been pushed:
https://github.com/freebsd/freebsd/commit/ea40366602be7548eba0bec35fec46ea4509dbb7
(it's in 11.2, but not in your 11.1).
Perhaps this could help.

Ben

From owner-freebsd-scsi@freebsd.org Tue Jul 3 14:06:26 2018
Sender: asomers@gmail.com
From: Alan Somers
To: Oliver Sech
Cc: FreeBSD-scsi
Subject: Re: problems with SAS JBODs 2
Date: Tue, 3 Jul 2018 08:06:22 -0600
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Content-Filtered-By: Mailman/MimeDel 2.1.27

On Tue, Jul 3, 2018 at 6:28 AM, Oliver Sech wrote:

> Hi!
>
> I use FreeBSD for a large ZFS pool (over 1PB) and I recently
> encountered a lot of problems with the JBODs. Generally everything works
> fine until I replug the shelves.
>
> When I start with a clean system and attach a single shelf everything
> seems fine.
> -> 44 disks show up, I can use the enclosure services (sesutil) and the
> system continues to run without problems.
> Once I disconnect the SAS cable, wait until all devices disappear and
> reconnect I get all sorts of problems.
> -> a random number of disks shows up and the enclosure "ses" do not show up
> Once I restart the system I can start over again.
>
> On the server with the large pool there are only certain ports on the HBA
> that I can use, otherwise disks will be missing after a reboot and my ZFS
> pool won't go online.
> I tried different firmware on the HBA.
> I tried the mpr.ko module from the Broadcom site. (I replaced the one
> in /boot/kernel?)
> I tested all the things above with Linux as the OS and everything seems
> to work.
>
> Is there anything I'm missing? A command that can reset the SAS components?
>
> FreeBSD version: 11.1-RELEASE-p11
> HBA: Broadcom LSI 9305-16e (latest firmware)
> JBOD: SC847E2C-R1K28JBOD (two expanders, internally daisy chained)

1) Are the expanders daisy chained? Some SAS expanders don't work reliably
when daisy chained. Best to direct connect each one to the server.
2) Are the expanders connected in multipath or single path? You need
geom_multipath if you're going to do that.
3) Are you attempting to use wide ports (two SAS cables connecting each
expander to the HBA)? If so, you'll need to make sure that each pair of
SAS cables goes to the same HBA chip (not merely the same card, as some
cards contain two HBA chips).
4) Are you trying to remove an expander while ZFS is active on that
expander? That will suspend your pool, and ZFS doesn't always recover from
a suspended state.
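
[Editorial sketch, not part of the original message.] Question 2 above can be checked by looking for the same physical disk behind more than one device node. The following is a minimal illustration in Python, assuming you have already collected a mapping of device node to drive serial number by whatever means is convenient. The `inventory` data and both function names are hypothetical and do not come from the thread.

```python
from collections import defaultdict


def group_paths_by_serial(device_serials):
    """Map each serial number to the sorted device nodes that report it."""
    paths = defaultdict(list)
    for dev, serial in device_serials.items():
        paths[serial].append(dev)
    return {serial: sorted(devs) for serial, devs in paths.items()}


def multipath_disks(device_serials):
    """Return only the serials that are visible through more than one node."""
    grouped = group_paths_by_serial(device_serials)
    return {serial: devs for serial, devs in grouped.items() if len(devs) > 1}


# Hypothetical inventory: da0 and da44 are two paths to the same drive.
inventory = {"da0": "SERIAL-A", "da1": "SERIAL-B", "da44": "SERIAL-A"}
print(multipath_disks(inventory))
```

An empty result suggests a single-path setup; any serial listed with two or more nodes suggests the shelf is wired for multipath.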
-Alan

From owner-freebsd-scsi@freebsd.org Tue Jul 3 14:26:37 2018
Date: Tue, 3 Jul 2018 10:26:29 -0400
From: "Kenneth D. Merry"
To: Oliver Sech
Cc: freebsd-scsi@freebsd.org, slm@freebsd.org
Subject: Re: problems with SAS JBODs 2
Message-ID: <20180703142629.GF26046@mithlond.kdm.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)

On Tue, Jul 03, 2018 at 14:28:58 +0200, Oliver Sech wrote:
> Hi!
>
> I use FreeBSD for a large ZFS pool (over 1PB) and I recently encountered
> a lot of problems with the JBODs. Generally everything works fine until
> I replug the shelves.
>
> When I start with a clean system and attach a single shelf everything
> seems fine.
> -> 44 disks show up, I can use the enclosure services (sesutil) and the
> system continues to run without problems.
> Once I disconnect the SAS cable, wait until all devices disappear and
> reconnect I get all sorts of problems.
> -> a random number of disks shows up and the enclosure "ses" do not show up
> Once I restart the system I can start over again.
>
> On the server with the large pool there are only certain ports on the HBA
> that I can use, otherwise disks will be missing after a reboot and my ZFS
> pool won't go online.
> I tried different firmware on the HBA. I tried the mpr.ko module from the
> Broadcom site. (I replaced the one in /boot/kernel?)
> I tested all the things above with Linux as the OS and everything seems
> to work.
>
> Is there anything I'm missing? A command that can reset the SAS components?
>
> FreeBSD version: 11.1-RELEASE-p11
> HBA: broadcom lsi 9305-16e (latest firmware)
> JBOD: SC847E2C-R1K28JBOD (two expanders, internally daisy chained)

Steve McConnell (CCed) and I have been corresponding with someone else who
has a problem very similar to yours.

The most likely issue is that the mapping table stored on the card is messed
up. Can you send dmesg output with the following loader tunable set:

hw.mpr.debug_level=0x203

That will turn on debugging for the mapping code and may show the problem.

If you see messages like this:

mpr0: Attempting to reuse target id 63 handle 0x000b
mpr0: Attempting to reuse target id 64 handle 0x000c
mpr0: Attempting to reuse target id 65 handle 0x000d
mpr0: Attempting to reuse target id 66 handle 0x000e
mpr0: Attempting to reuse target id 67 handle 0x000f
mpr0: Attempting to reuse target id 68 handle 0x0010
mpr0: Attempting to reuse target id 69 handle 0x0011
mpr0: Attempting to reuse target id 70 handle 0x0012
mpr0: Attempting to reuse target id 66 handle 0x000e

It indicates that the mapping code is preventing some of the drives from
fully probing because there are collisions in the table.

Unfortunately we have not yet fixed the problem in the other situation.
(He is running with multipathing, which could be contributing to the
problem.)

I have a script and utility that will clear the mapping table in the card,
but that hasn't been enough to fix the other situation. If you do have a
mapping problem, I can give you the script/utility to clear the table and
we can see whether it fixes your problem.

If not, it'll probably have to wait until Steve gets back from vacation.
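
[Editorial sketch, not part of the original message.] Once the tunable is set (loader tunables normally go in /boot/loader.conf) and dmesg output has been captured, the "Attempting to reuse target id" lines can be pulled out mechanically. The Python below is a minimal sketch that assumes the log lines look exactly like the ones quoted in this message. The function names are hypothetical, and the sample text is copied from the message rather than from a real capture.

```python
import re
from collections import Counter

# Matches the driver debug lines quoted in the message, for example:
#   mpr0: Attempting to reuse target id 66 handle 0x000e
REUSE_RE = re.compile(
    r"(?P<unit>mpr\d+): Attempting to reuse target id (?P<target>\d+) "
    r"handle (?P<handle>0x[0-9a-fA-F]+)"
)


def find_reuse_attempts(lines):
    """Return a (unit, target_id, handle) tuple for every matching line."""
    hits = []
    for line in lines:
        match = REUSE_RE.search(line)
        if match:
            hits.append(
                (match.group("unit"), int(match.group("target")), match.group("handle"))
            )
    return hits


def repeated_targets(hits):
    """Return {(unit, target_id): count} for targets reported more than once."""
    counts = Counter((unit, target) for unit, target, _handle in hits)
    return {key: n for key, n in counts.items() if n > 1}


# Sample copied from the message above.
SAMPLE = """\
mpr0: Attempting to reuse target id 63 handle 0x000b
mpr0: Attempting to reuse target id 64 handle 0x000c
mpr0: Attempting to reuse target id 65 handle 0x000d
mpr0: Attempting to reuse target id 66 handle 0x000e
mpr0: Attempting to reuse target id 67 handle 0x000f
mpr0: Attempting to reuse target id 68 handle 0x0010
mpr0: Attempting to reuse target id 69 handle 0x0011
mpr0: Attempting to reuse target id 70 handle 0x0012
mpr0: Attempting to reuse target id 66 handle 0x000e
"""

hits = find_reuse_attempts(SAMPLE.splitlines())
print(len(hits), "reuse attempts")
print("reported more than once:", repeated_targets(hits))
```

Any hit at all is worth reporting back to the list; targets that show up repeatedly are simply the easiest ones to spot in a long log.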
Ken
--
Kenneth Merry
ken@FreeBSD.ORG

From owner-freebsd-scsi@freebsd.org Wed Jul 4 10:15:40 2018
Subject: Re: problems with SAS JBODs 2
To: Alan Somers
Cc: FreeBSD-scsi
From: Oliver Sech
Message-ID: <237f77ab-89e2-188b-b2b1-84c6d88609b0@gmx.net>
Date: Wed, 4 Jul 2018 12:15:36 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit

> 1) Are the expanders daisy chained? Some SAS expanders don't work reliably
> when daisy chained. Best to direct connect each one to the server.

At the moment I have 1 JBOD connected to 1 HBA port with 1 cable (4 lanes?).
Unfortunately the JBOD has 24 slots in the front and 20 in the back, and
those are connected via internal SAS daisy chaining. I could rewire and
connect each backplane directly to the server, but unfortunately I do not
have enough ports.
JBOD model: Supermicro 847E2C-R1K28JBOD

> 2) Are the expanders connected in multipath or single path? You need
> geom_multipath if you're going to do that.

See answer 1. There is a single path from the host to the first expander.

> 3) Are you attempting to use wide ports (two SAS cables connecting each
> expander to the HBA)? If so, you'll need to make sure that each pair of
> SAS cables goes to the same HBA chip (not merely the same card, as some
> cards contain two HBA chips).

See 1. The last time I opened one of those JBODs there were 8 SAS cables
between the front and back expanders. I assume that wide ports are being
used. (2 expanders per backplane as well)

> 4) Are you trying to remove an expander while ZFS is active on that
> expander?
> That will suspend your pool, and ZFS doesn't always recover from
> a suspended state.

I'm testing with a new unused disk shelf that was never part of the ZFS
pool. There were

From owner-freebsd-scsi@freebsd.org Wed Jul 4 10:28:33 2018
Subject: Re: problems with SAS JBODs 2
To: "Kenneth D. Merry"
Cc: freebsd-scsi@freebsd.org, slm@freebsd.org
References: <20180703142629.GF26046@mithlond.kdm.org>
From: Oliver Sech
Date: Wed, 4 Jul 2018 12:28:29 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <20180703142629.GF26046@mithlond.kdm.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit

> The most likely issue is that the mapping table stored on the card is messed
> up.
> Can you send dmesg output with the following loader tunable set:
>
> hw.mpr.debug_level=0x203
>
> That will turn on debugging for the mapping code and may show the problem.
>
> If you see messages like this:
>
> mpr0: Attempting to reuse target id 63 handle 0x000b
> mpr0: Attempting to reuse target id 64 handle 0x000c
> mpr0: Attempting to reuse target id 65 handle 0x000d
> mpr0: Attempting to reuse target id 66 handle 0x000e
> mpr0: Attempting to reuse target id 67 handle 0x000f
> mpr0: Attempting to reuse target id 68 handle 0x0010
> mpr0: Attempting to reuse target id 69 handle 0x0011
> mpr0: Attempting to reuse target id 70 handle 0x0012
> mpr0: Attempting to reuse target id 66 handle 0x000e
>
> It indicates that the mapping code is preventing some of the drives from
> fully probing because there are collisions in the table.
>
> Unfortunately we have not yet fixed the problem in the other situation.
> (He is running with multipathing, which could be contributing to the
> problem.)
>
> I have a script and utility that will clear the mapping table in the card,
> but that hasn't been enough to fix the other situation. If you do have a
> mapping problem, I can give you the script/utility to clear the table and
> we can see whether it fixes your problem.
>
> If not, it'll probably have to wait until Steve gets back from vacation.
>
> Ken

I added the "hw.mpr.debug_level" tunable and collected logs for the whole
connect -> disconnect -> connect sequence.

Logs collected:
first connect log: https://paste.docker.ist.ac.at/?6ec80dde0e1f236f#NufbXSs6o+dTDTPgZgWbU8vRQ6B47tMbQ8LHPkMXfIg=
first connect sesutil: https://paste.docker.ist.ac.at/?256810338f87adc1#/N3m6iFH304SxSxpnHCt0ocOeAU8zkBennul2/BcKpQ=
disconnected shelf log: https://paste.docker.ist.ac.at/?07ff1129a6cb6117#8WH8AjO1sO2hZlHE39h314CoQxxFZmBVZNo+Q8+qp4Q=
disconnected shelf mprutil: https://paste.docker.ist.ac.at/?eebaee72dc9e1cfe#WTlnO5vlPb7997lJCMswWfwtcq1rN04CaFbxmMWHqrU=
second connect log: https://paste.docker.ist.ac.at/?684ff32c6dae185b#nZ32x023ApRvNKrVUhvCr7xi5cYJnPhs9XNTfEW6sMw=
second connect sesutil: https://paste.docker.ist.ac.at/?f0302ce3aa8e55d7#+ZaJsCUiLh/7VsqBJ5oPHxZtRbM1dVS2RankrXePikw=
second connect mprutil: https://paste.docker.ist.ac.at/?4b8d347aed941c1f#wX7y0cjtb2gYKLU99IIftmDcFpKiV2QqjcC7YN96nB0=

If you are interested in investigating this further, I can try to organize
a "test environment", as I'm pretty sure this issue is not limited to my
hardware.

Best regards,
Oliver
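
[Editorial sketch, not part of the original message.] The connect -> disconnect -> connect comparison described above can also be done on device lists rather than by eye. The Python below is a minimal sketch that assumes two text snapshots taken with `camcontrol devlist`, one after the first connect and one after the second, and that each line ends with a parenthesised peripheral list such as `(pass0,da0)`. The snapshot contents, vendor strings, and function names are hypothetical placeholders, not data from the linked pastes.

```python
import re

# The peripheral names are assumed to sit in the trailing parentheses of
# each devlist line, e.g. "... lun 0 (pass0,da0)".
PERIPH_RE = re.compile(r"\(([^)]*)\)\s*$")


def peripherals(devlist_text):
    """Return the set of non-pass peripheral names found in a snapshot."""
    names = set()
    for line in devlist_text.splitlines():
        match = PERIPH_RE.search(line)
        if not match:
            continue
        for name in match.group(1).split(","):
            name = name.strip()
            # pass(4) nodes accompany every device, so they carry no signal.
            if name and not name.startswith("pass"):
                names.add(name)
    return names


def missing_after_replug(before_text, after_text):
    """Return devices present in the first snapshot but absent in the second."""
    return sorted(peripherals(before_text) - peripherals(after_text))


# Hypothetical snapshots (placeholder vendor/model strings).
BEFORE = """\
<HYPO DISK 0001>        at scbus0 target 8 lun 0 (pass0,da0)
<HYPO DISK 0001>        at scbus0 target 9 lun 0 (pass1,da1)
<HYPO EXPANDER 0001>    at scbus0 target 52 lun 0 (ses0,pass2)
"""
AFTER = """\
<HYPO DISK 0001>        at scbus0 target 8 lun 0 (pass0,da0)
"""

print(missing_after_replug(BEFORE, AFTER))
```

With real snapshots, the result lists which `da` and `ses` devices failed to return after the replug, which may help correlate against the mapping debug output.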