From: Steven Hartland
Date: Thu, 25 Dec 2014 21:37:15 +0000
To: George Kontostanos
Cc: "freebsd-fs@freebsd.org"
Subject: Re: LSI SAS 9300-8i weird ZFS checksum errors
Message-ID: <549C838B.1070302@multiplay.co.uk>
References: <549C65FF.4010702@multiplay.co.uk>

On 25/12/2014 21:03, George Kontostanos wrote:
>
> On Thu, Dec 25, 2014 at 9:31 PM, Steven Hartland wrote:
>
>     On 25/12/2014 14:39, George Kontostanos wrote:
>
>         Hello, list and Merry Christmas to all
>
>         I am facing some weird checksum errors during scrub. The
>         configuration is the following:
>
>         Board: Supermicro Motherboard X10DRi-T4+
>         (http://www.supermicro.com/products/motherboard/xeon/c600/x10dri-t4_.cfm)
>         Controller: LSI SAS 9300-8i
>         (http://www.lsi.com/products/host-bus-adapters/pages/lsi-sas-9300-8i.aspx)
>         HDD: 21X6TB Western Digital WD60EFRX
>         HDD: 2XIntel SATA 600GB Solid-State Drive SSDSC2BB600G401 DC S3500
>         (SWAP, ZIL, CACHE)
>         Chassis: Supermicro 847BE1C-R1K28LPB 4U Storage Chassis
>         RAM: 64 GB
>
>         I installed initially FreeBSD 10.1-RELEASE created one pool
>         consistent by 3 X7disk VDEVs in RAIDZ3. I used NFS to start
>         copying some data. After copying around 3TB I initiated a scrub.
>         The result was the following: http://pastebin.com/rswgCY2A and
>         http://pastebin.com/DQ2urGXk
>
>         I tried to flash the controller but the LSI utility did not
>         recognize the controller. I installed FreeBSD 9.3-RELEASE and
>         used LSI's mpslsi3 driver. I was able to flash the latest bios
>         and firmware that way.
>
>         LSI Corporation SAS3 Flash Utility
>         Version 07.00.00.00 (2014.08.14)
>         Copyright (c) 2008-2014 LSI Corporation. All rights reserved
>
>         Adapter Selected is a LSI SAS: SAS3008(C0)
>
>         Controller Number              : 0
>         Controller                     : SAS3008(C0)
>         PCI Address                    : 00:82:00:00
>         SAS Address                    : 500605b-0-06ce-27e0
>         NVDATA Version (Default)       : 06.03.00.05
>         NVDATA Version (Persistent)    : 06.03.00.05
>         Firmware Product ID            : 0x2221 (IT)
>         Firmware Version               : 06.00.00.00
>         NVDATA Vendor                  : LSI
>         NVDATA Product ID              : SAS9300-8i
>         BIOS Version                   : 08.13.00.00
>         UEFI BSD Version               : 02.00.00.00
>         FCODE Version                  : N/A
>         Board Name                     : SAS9300-8i
>         Board Assembly                 : H3-25573-00E
>         Board Tracer Number            : SV32928040
>
>         I recreated the pool again and started writing data via NFS
>         again. After 3 TB of data I started a scrub and I am still
>         getting checksum errors though there are no messages regarding
>         the drives anymore in /var/log/messages
>
>           pool: Pool
>          state: ONLINE
>         status: One or more devices has experienced an unrecoverable error. An
>                 attempt was made to correct the error. Applications are unaffected.
>         action: Determine if the device needs to be replaced, and clear the errors
>                 using 'zpool clear' or replace the device with 'zpool replace'.
>            see: http://illumos.org/msg/ZFS-8000-9P
>
>           scan: scrub in progress since Thu Dec 25 08:46:21 2014
>                 2.28T scanned out of 5.54T at 816M/s, 1h9m to go
>                 11.9M repaired, 41.26% done
>         config:
>
>                 NAME                       STATE     READ WRITE CKSUM
>                 Pool                       ONLINE       0     0     0
>                   raidz3-0                 ONLINE       0     0     0
>                     gpt/WD-WX41D94RN5A3    ONLINE       0     0    15  (repairing)
>                     gpt/WD-WX41D948YE1U    ONLINE       0     0    14  (repairing)
>                     gpt/WD-WX41D94RN879    ONLINE       0     0    16  (repairing)
>                     gpt/WD-WX21D947NC83    ONLINE       0     0    24  (repairing)
>                     gpt/WD-WX21D947NT77    ONLINE       0     0    15  (repairing)
>                     gpt/WD-WX41D948YAKV    ONLINE       0     0    19  (repairing)
>                     gpt/WD-WX21D9421SCV    ONLINE       0     0    20  (repairing)
>                   raidz3-1                 ONLINE       0     0     0
>                     gpt/WD-WX21D9421F6F    ONLINE       0     0    16  (repairing)
>                     gpt/WD-WX41D948YPN4    ONLINE       0     0    14  (repairing)
>                     gpt/WD-WX21D947NE2K    ONLINE       0     0    22  (repairing)
>                     gpt/WD-WX41D948Y2PX    ONLINE       0     0    19  (repairing)
>                     gpt/WD-WX41D94RNAX7    ONLINE       0     0    17  (repairing)
>                     gpt/WD-WX21D947N1RP    ONLINE       0     0    12  (repairing)
>                     gpt/WD-WX21D94216X7    ONLINE       0     0    20  (repairing)
>                   raidz3-2                 ONLINE       0     0     0
>                     gpt/WD-WX41D948YAHP    ONLINE       0     0    25  (repairing)
>                     gpt/WD-WX21D947N06F    ONLINE       0     0    18  (repairing)
>                     gpt/WD-WX21D947N3T1    ONLINE       0     0    21  (repairing)
>                     gpt/WD-WX41D94RNT7D    ONLINE       0     0     5  (repairing)
>                     gpt/WD-WX41D948Y9VV    ONLINE       0     0    18  (repairing)
>                     gpt/WD-WX41D94RNS62    ONLINE       0     0    24  (repairing)
>                     gpt/WD-WX21D9421ZP9    ONLINE       0     0    28  (repairing)
>                 logs
>                   mirror-3                 ONLINE       0     0     0
>                     gpt/zil0               ONLINE       0     0     0
>                     gpt/zil1               ONLINE       0     0     0
>                 cache
>                   gpt/cache0               ONLINE       0     0     0
>                   gpt/cache1               ONLINE       0     0     0
>
>         errors: No known data errors
>
>         This is really driving me crazy since smartmon tools do not
>         display any errors on the drives.
>
>         Any suggestions are most welcomed!!!
>
>     Check for bad hardware, first guess would be memory, next would be
>     hotswap backplane.
>
>     Regards
>     Steve
>
> Hi Steve,
>
> Memory looks good in memtest. I am not sure what you mean
> regarding hotswap backplane.

How are the disks attached?

The most common way is for the controller to be attached to a hotswap
backplane, which you then plug the disks into.

Unfortunately these backplanes are one of the most common sources of
issues, especially at higher speeds, and even more so if they aren't
direct passthrough, i.e. they are actually expanders which do
processing of their own.

You report the chassis is a 847BE1C-R1K28LPB, which includes such
expanders, specifically BPN-SAS3-846EL1 and BPN-SAS3-826EL1.

If this is how you are connecting the disks I would strongly advise
eliminating this from the equation by connecting the disks directly to
the LSI controller.

You can also check to see if there are any firmware updates for the
expanders.

Regards
Steve
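
The case for suspecting a shared component (memory, backplane, expander) rather than individual drives rests on how the errors are spread in the quoted `zpool status` output: every data disk shows a similar CKSUM count. A minimal sketch of how one might tally that from saved output is below. It is only an illustration: the function names `cksum_by_device` and `summarize` are hypothetical and not part of any ZFS tooling, the sample is a few rows copied from the output above, and the parser assumes leaf devices are named `gpt/...` with plain integer counters (rows with other counter formats are skipped).

```python
import re

# A few rows copied from the `zpool status` output quoted in this thread
# (abridged; this is an illustrative sample, not the full pool).
SAMPLE = """\
NAME                       STATE     READ WRITE CKSUM
Pool                       ONLINE       0     0     0
  raidz3-0                 ONLINE       0     0     0
    gpt/WD-WX41D94RN5A3    ONLINE       0     0    15  (repairing)
    gpt/WD-WX41D948YE1U    ONLINE       0     0    14  (repairing)
  raidz3-1                 ONLINE       0     0     0
    gpt/WD-WX21D9421F6F    ONLINE       0     0    16  (repairing)
logs
  mirror-3                 ONLINE       0     0     0
    gpt/zil0               ONLINE       0     0     0
"""

# name, state, then three integer counters (READ, WRITE, CKSUM).
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")


def cksum_by_device(status_text):
    """Return {device: CKSUM count} for leaf devices named 'gpt/...'."""
    counts = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        # Assumption: leaf devices carry a gpt/ label, as in this pool.
        if m and m.group(1).startswith("gpt/"):
            counts[m.group(1)] = int(m.group(5))
    return counts


def summarize(counts):
    """Return (devices with errors, devices seen, total CKSUM errors)."""
    affected = [dev for dev, n in counts.items() if n > 0]
    return len(affected), len(counts), sum(counts.values())


if __name__ == "__main__":
    affected, total, errors = summarize(cksum_by_device(SAMPLE))
    print(f"{affected}/{total} devices report checksum errors ({errors} total)")
```

If nearly every data disk shows errors while SMART reports the drives as healthy, the common path between controller and disks is the more likely suspect, which is why testing with the disks attached directly to the controller is a useful next step.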