From: Ben RUBSON
Date: Mon, 28 Sep 2015 17:44:24 +0200
Subject: Re: Cannot replace broken hard drive with LSI HBA
To: "freebsd-fs@freebsd.org"

Hello,

I also plan to use an LSI SAS adapter (9211-8i) with FreeBSD / ZFS.

Two types of firmware exist for this card: the IT (Initiator Target) one and the IR (Integrated RAID) one.
According to my findings, the IT one is the recommended one to use with ZFS.

I am not sure it is related, but did you try the IT firmware?

Best regards,

Ben

> On 28 Sep 2015, at 17:06, Graham Allan wrote:
>
> I have seen this and keep experiencing it. I posted a question about it a while back but I don't think there was much response.
>
> https://lists.freebsd.org/pipermail/freebsd-fs/2014-July/019715.html
>
> My original question was with 9.1, and at the time we discovered that if you ran the LSI utility "sas2ircu", for example simply "sas2ircu 0 DISPLAY", it would seem to hang for a while, then issue a bus reset, and the replaced drives would be detected.
>
> Now that I also see the same issue on 9.3, running sas2ircu in this situation usually seems to cause a panic, so it's not exactly progress.
>
> https://lists.freebsd.org/pipermail/freebsd-scsi/2015-August/006794.html
>
> I am using Dell servers, generally R710 and R720, with LSI 9207-8e controllers, Supermicro JBOD chassis, and mostly WD drives. I got the above problems using firmware 16 (probably) with both 9.1 and 9.3.
>
> Regarding your experience with firmware 20, I believe it is "known bad", though some seem to disagree. Certainly when building my recent-ish large 9.3 servers I specifically tested it and got consistent data corruption. There is now a newer release of firmware 20, "20.00.04.00", which seems to be fixed - see this thread:
>
> https://lists.freebsd.org/pipermail/freebsd-scsi/2015-August/006793.html
>
> This is kind of painful as the new firmware was posted by LSI with no comment and no release notes, yet if you follow all the references there are hints that it was known internally to be problematic. It's bad if selecting the HBA firmware for FreeBSD has degenerated into a "black art", but that seems to be where it is right now.
>
> I don't know that there are any other viable choices for SAS HBA besides LSI - I've never heard of any.
>
> Your bugzilla link is interesting. We are also using WD drives and Supermicro enclosures so there is a lot in common. I wonder if these changes are in 10.2-RELEASE?
>
> Graham
>
> On 9/28/2015 8:36 AM, Karli Sjöberg wrote:
>> Hey all!
>>
>> I'm just giving a shout out here to see if anyone else has had similar
>> experiences working with LSI/Avago HBAs in FreeBSD.
>>
>> For some time now, about a year or so, we've had several times where hard
>> drives have dropped out; you pull one out, pop a new one back in, but it
>> never shows up in the OS. When inserted, nothing prints in the logs, and
>> physically, it just blinks for half a second, then nothing. The entire
>> server then needs to be rebooted to get the drive back.
>>
>> As for the hardware, we have several SuperMicro servers, an HP, and an
>> old SUN server that all have this problem. It's happened with both old
>> and new drives from different manufacturers and sizes. The only thing in
>> common has been the LSI/Avago HBA.
>>
>> The software is FreeBSD-10.1-STABLE as per this[*] bug, very close to
>> 10.2-RELEASE, mps driver version 20 and the firmware has been flashed to
>> 19. Also tried firmware version 20 but ZFS went nuts, displaying
>> checksum errors on just about every disk in the pool.
>>
>> It's gotten to the point I'm fed up and have to ask if someone else
>> could think of a fix, since neither software nor firmware upgrade seems
>> to make a difference. Or to suggest another HBA instead?
>>
>> Thanks in advance!
>>
>> /K
>>
>> [*]: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=191348