From: George Kontostanos
To: Steven Hartland
Cc: "freebsd-fs@freebsd.org"
Date: Fri, 26 Dec 2014 12:21:13 +0200
Subject: Re: LSI SAS 9300-8i weird ZFS checksum errors
In-Reply-To: <549C838B.1070302@multiplay.co.uk>
References: <549C65FF.4010702@multiplay.co.uk> <549C838B.1070302@multiplay.co.uk>

On Thu, Dec 25, 2014 at 11:37 PM, Steven Hartland wrote:

> On 25/12/2014 21:03, George Kontostanos wrote:
>
> On Thu, Dec 25, 2014 at 9:31 PM, Steven Hartland wrote:
>
>> On 25/12/2014 14:39, George Kontostanos wrote:
>>
>>> Hello, list and Merry Christmas to all
>>>
>>> I am facing some weird checksum errors during scrub. The configuration
>>> is the following:
>>>
>>> Board:      Supermicro Motherboard X10DRi-T4+
>>>             (http://www.supermicro.com/products/motherboard/xeon/c600/x10dri-t4_.cfm)
>>> Controller: LSI SAS 9300-8i
>>>             (http://www.lsi.com/products/host-bus-adapters/pages/lsi-sas-9300-8i.aspx)
>>> HDD:        21 x 6TB Western Digital WD60EFRX
>>> HDD:        2 x Intel SATA 600GB Solid-State Drive SSDSC2BB600G401
>>>             DC S3500 (SWAP, ZIL, CACHE)
>>> Chassis:    Supermicro 847BE1C-R1K28LPB 4U Storage Chassis
>>> RAM:        64 GB
>>>
>>> I initially installed FreeBSD 10.1-RELEASE and created one pool
>>> consisting of 3 x 7-disk VDEVs in RAIDZ3. I used NFS to start copying
>>> some data. After copying around 3TB I initiated a scrub.
>>> The result was the following: http://pastebin.com/rswgCY2A and
>>> http://pastebin.com/DQ2urGXk
>>>
>>> I tried to flash the controller but the LSI utility did not recognize
>>> the controller. I installed FreeBSD 9.3-RELEASE and used LSI's mpslsi3
>>> driver. I was able to flash the latest BIOS and firmware that way.
>>>
>>> LSI Corporation SAS3 Flash Utility
>>> Version 07.00.00.00 (2014.08.14)
>>> Copyright (c) 2008-2014 LSI Corporation.
>>> All rights reserved
>>>
>>> Adapter Selected is a LSI SAS: SAS3008(C0)
>>>
>>> Controller Number            : 0
>>> Controller                   : SAS3008(C0)
>>> PCI Address                  : 00:82:00:00
>>> SAS Address                  : 500605b-0-06ce-27e0
>>> NVDATA Version (Default)     : 06.03.00.05
>>> NVDATA Version (Persistent)  : 06.03.00.05
>>> Firmware Product ID          : 0x2221 (IT)
>>> Firmware Version             : 06.00.00.00
>>> NVDATA Vendor                : LSI
>>> NVDATA Product ID            : SAS9300-8i
>>> BIOS Version                 : 08.13.00.00
>>> UEFI BSD Version             : 02.00.00.00
>>> FCODE Version                : N/A
>>> Board Name                   : SAS9300-8i
>>> Board Assembly               : H3-25573-00E
>>> Board Tracer Number          : SV32928040
>>>
>>> I recreated the pool again and started writing data via NFS again.
>>> After 3 TB of data I started a scrub and I am still getting checksum
>>> errors though there are no messages regarding the drives anymore in
>>> /var/log/messages
>>>
>>>   pool: Pool
>>>  state: ONLINE
>>> status: One or more devices has experienced an unrecoverable error. An
>>>         attempt was made to correct the error. Applications are
>>>         unaffected.
>>> action: Determine if the device needs to be replaced, and clear the
>>>         errors using 'zpool clear' or replace the device with
>>>         'zpool replace'.
>>>    see: http://illumos.org/msg/ZFS-8000-9P
>>>   scan: scrub in progress since Thu Dec 25 08:46:21 2014
>>>         2.28T scanned out of 5.54T at 816M/s, 1h9m to go
>>>         11.9M repaired, 41.26% done
>>> config:
>>>
>>>         NAME                     STATE     READ WRITE CKSUM
>>>         Pool                     ONLINE       0     0     0
>>>           raidz3-0               ONLINE       0     0     0
>>>             gpt/WD-WX41D94RN5A3  ONLINE       0     0    15  (repairing)
>>>             gpt/WD-WX41D948YE1U  ONLINE       0     0    14  (repairing)
>>>             gpt/WD-WX41D94RN879  ONLINE       0     0    16  (repairing)
>>>             gpt/WD-WX21D947NC83  ONLINE       0     0    24  (repairing)
>>>             gpt/WD-WX21D947NT77  ONLINE       0     0    15  (repairing)
>>>             gpt/WD-WX41D948YAKV  ONLINE       0     0    19  (repairing)
>>>             gpt/WD-WX21D9421SCV  ONLINE       0     0    20  (repairing)
>>>           raidz3-1               ONLINE       0     0     0
>>>             gpt/WD-WX21D9421F6F  ONLINE       0     0    16  (repairing)
>>>             gpt/WD-WX41D948YPN4  ONLINE       0     0    14  (repairing)
>>>             gpt/WD-WX21D947NE2K  ONLINE       0     0    22  (repairing)
>>>             gpt/WD-WX41D948Y2PX  ONLINE       0     0    19  (repairing)
>>>             gpt/WD-WX41D94RNAX7  ONLINE       0     0    17  (repairing)
>>>             gpt/WD-WX21D947N1RP  ONLINE       0     0    12  (repairing)
>>>             gpt/WD-WX21D94216X7  ONLINE       0     0    20  (repairing)
>>>           raidz3-2               ONLINE       0     0     0
>>>             gpt/WD-WX41D948YAHP  ONLINE       0     0    25  (repairing)
>>>             gpt/WD-WX21D947N06F  ONLINE       0     0    18  (repairing)
>>>             gpt/WD-WX21D947N3T1  ONLINE       0     0    21  (repairing)
>>>             gpt/WD-WX41D94RNT7D  ONLINE       0     0     5  (repairing)
>>>             gpt/WD-WX41D948Y9VV  ONLINE       0     0    18  (repairing)
>>>             gpt/WD-WX41D94RNS62  ONLINE       0     0    24  (repairing)
>>>             gpt/WD-WX21D9421ZP9  ONLINE       0     0    28  (repairing)
>>>         logs
>>>           mirror-3               ONLINE       0     0     0
>>>             gpt/zil0             ONLINE       0     0     0
>>>             gpt/zil1             ONLINE       0     0     0
>>>         cache
>>>           gpt/cache0             ONLINE       0     0     0
>>>           gpt/cache1             ONLINE       0     0     0
>>>
>>> errors: No known data errors
>>>
>>> This is really driving me crazy since smartmon tools do not display
>>> any errors on the drives.
>>>
>>> Any suggestions are most welcomed!!!
>>
>> Check for bad hardware, first guess would be memory, next would be
>> hotswap backplane.
>>
>> Regards
>> Steve
>
> Hi Steve,
>
> Memory looks good in memtest. I am not sure what you mean regarding
> hotswap backplane.
>
> How are the disks attached?
>
> The most common way is your controller being attached to a hotswap
> backplane, which you then plug the disks into.
>
> Unfortunately these backplanes are one of the most common sources of
> issues, especially at higher speeds and even more so if they aren't
> direct passthrough, i.e. they are actually expanders which do
> processing of their own.
>
> You report the chassis is a 847BE1C-R1K28LPB which includes such
> expanders, specifically BPN-SAS3-846EL1 and BPN-SAS3-826EL1.
>
> If this is how you are connecting the disks I would strongly advise
> eliminating this from the equation by connecting the disks directly to
> the LSI controller.
>
> You can also check to see if there are any firmware updates for the
> expanders.
>
> Regards
> Steve

Thanks for your reply Steve. Unfortunately I am thousands of miles away
from the DC. On another continent, actually!

I have contacted SuperMicro support to see if they have any firmware
updates. I might also need to find someone to go to the DC and physically
attach the disks directly to the controller.

Best!

-- 
George Kontostanos
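
For anyone following the thread, here is a minimal sketch of how the CKSUM
column in the quoted `zpool status` output could be tallied per vdev. It only
illustrates the observation behind the advice above: every data disk in all
three raidz3 vdevs reports checksum errors while the log devices report none,
which fits a shared component (memory, backplane) better than a single failing
drive. The helper names `cksum_by_vdev` and `all_disks_affected` are
hypothetical, and the embedded text is an abbreviated excerpt (two disks per
vdev) of the reported output, not the full table.

```python
import re

# Abbreviated excerpt of the `zpool status` config table quoted in the thread.
# Only two disks per raidz3 vdev are kept; the counts are copied from the report.
STATUS = """\
NAME                     STATE     READ WRITE CKSUM
Pool                     ONLINE       0     0     0
  raidz3-0               ONLINE       0     0     0
    gpt/WD-WX41D94RN5A3  ONLINE       0     0    15  (repairing)
    gpt/WD-WX41D948YE1U  ONLINE       0     0    14  (repairing)
  raidz3-1               ONLINE       0     0     0
    gpt/WD-WX21D9421F6F  ONLINE       0     0    16  (repairing)
    gpt/WD-WX41D948YPN4  ONLINE       0     0    14  (repairing)
  raidz3-2               ONLINE       0     0     0
    gpt/WD-WX41D948YAHP  ONLINE       0     0    25  (repairing)
    gpt/WD-WX41D94RNT7D  ONLINE       0     0     5  (repairing)
logs
  mirror-3               ONLINE       0     0     0
    gpt/zil0             ONLINE       0     0     0
    gpt/zil1             ONLINE       0     0     0
"""

# A config row is: name, state, then the READ, WRITE and CKSUM counters.
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")


def cksum_by_vdev(status_text):
    """Return {raidz_vdev: {disk: cksum_count}} for raidz member disks only."""
    result = {}
    current = None  # name of the raidz vdev whose members we are reading
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            # Column header or section labels such as "logs" / "cache".
            current = None
            continue
        name, cksum = match.group(1), int(match.group(5))
        if name.startswith("raidz"):
            current = name
            result[current] = {}
        elif name.startswith("gpt/") and current is not None:
            result[current][name] = cksum
        else:
            # Pool row, mirror vdevs, or disks outside a raidz vdev: skip.
            current = None
    return result


def all_disks_affected(by_vdev):
    """True if every listed raidz member disk has a non-zero CKSUM count."""
    return all(count > 0 for disks in by_vdev.values() for count in disks.values())


counts = cksum_by_vdev(STATUS)
for vdev, disks in counts.items():
    print(vdev, "total CKSUM errors:", sum(disks.values()))
print("every data disk affected:", all_disks_affected(counts))
```

On a live system the same function could be fed the text of `zpool status`
for the pool instead of the hard-coded excerpt; the parsing assumes the column
layout shown in the report.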