From owner-freebsd-fs@freebsd.org Thu Sep 8 07:45:40 2016
Subject: Re: ZFS-8000-8A: assistance needed
From: Maurizio Vairani <maurizio.vairani@cloverinformatica.it>
To: Ruslan Makhmatkhanov
Cc: freebsd-fs@freebsd.org
Date: Thu, 8 Sep 2016 09:39:59 +0200
List-Id: Filesystems

Hi Ruslan,

On 06/09/2016 22:00, Ruslan Makhmatkhanov wrote:
> Hello,
>
> I've got something new here and just not sure where to start on
> solving that. It's on 10.2-RELEASE-p7 amd64.
>
> """
> root:~ # zpool status -xv
>   pool: storage_ssd
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://illumos.org/msg/ZFS-8000-8A
>   scan: scrub repaired 0 in 0h26m with 5 errors on Tue Aug 23 00:40:24 2016
> config:
>
>         NAME              STATE     READ WRITE CKSUM
>         storage_ssd       ONLINE       0     0 59.3K
>           mirror-0        ONLINE       0     0     0
>             gpt/drive-06  ONLINE       0     0     0
>             gpt/drive-07  ONLINE       0     0     9
>           mirror-1        ONLINE       0     0  119K
>             gpt/drive-08  ONLINE       0     0  119K
>             gpt/drive-09  ONLINE       0     0  119K
>         cache
>           mfid5           ONLINE       0     0     0
>           mfid6           ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>
>         <0x1bd0a>:<0x8>
>         <0x31f23>:<0x8>
>         /storage_ssd/f262f6ebaf5011e39ca7047d7bb28f4a/disk
>         /storage_ssd/7ba3f661fa9811e3bd9d047d7bb28f4a/disk
>         /storage_ssd/2751d305ecba11e3aef0047d7bb28f4a/disk
>         /storage_ssd/6aa805bd22e911e4b470047d7bb28f4a/disk
> """
>
> The pool looks ok, if I understand correctly, but we have a slowdown
> in Xen VMs that are using these disks via iSCSI. So can anybody
> please explain what exactly that means?

The OS retries the failing read and/or write operations, and you notice those retries as a slowdown.

> 1. Am I right that we have a hardware failure that led to data
> corruption?

Yes.

> If so, how to identify failed disk(s)

The disks holding gpt/drive-07, gpt/drive-08 and gpt/drive-09: every leaf vdev with a nonzero CKSUM count. With smartctl you can read the SMART status of those disks for more information. I use smartd with HDDs and SSDs, and it usually warns me about a failing disk before ZFS does.

> and how it is possible that data is corrupted on zfs mirror?

It can happen when both disks of a mirror have damaged sectors holding the same data, so neither side can supply a good copy.

> Is there anything I can do to recover except restoring from backup?

Probably not, but you can check from inside the Xen VM whether the iSCSI disk is still usable.

> 2. What first and second damaged "files" are and why they are shown
> like that?

They are ZFS metadata objects; since they have no path name, zpool prints them as <dataset>:<object> numbers.
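By the way, the leaf vdevs with a nonzero CKSUM counter can be picked out of the zpool output mechanically. A small sketch: the sample lines below are pasted from your status above; on the live system you would pipe `zpool status storage_ssd` through the same awk filter instead.

```shell
# Filter vdev lines whose CKSUM column (field 5) is nonzero.
# Sample lines pasted from the `zpool status` output above;
# on the live system: zpool status storage_ssd | awk '$5 != "0" { print $1 }'
zpool_status='    gpt/drive-06  ONLINE  0  0  0
    gpt/drive-07  ONLINE  0  0  9
    gpt/drive-08  ONLINE  0  0  119K
    gpt/drive-09  ONLINE  0  0  119K'
failing=$(printf '%s\n' "$zpool_status" | awk '$5 != "0" { print $1 }')
echo "$failing"
```

On the real output you would only feed in the indented device lines; header and summary rows would need to be skipped first.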
>
> I have this in /var/log/messages, but to me it looks like an iSCSI
> message that springs up when accessing the damaged files:
>
> """
> kernel: (1:32:0/28): WRITE command returned errno 122
> """

Probably in /var/log/messages you can also read messages like these:

Aug 27 03:02:19 clover-nas2 kernel: (ada3:ahcich15:0:0:0): CAM status: ATA Status Error
Aug 27 03:02:19 clover-nas2 kernel: (ada3:ahcich15:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
Aug 27 03:02:19 clover-nas2 kernel: (ada3:ahcich15:0:0:0): RES: 51 40 e8 0f a6 40 44 00 00 08 00
Aug 27 03:02:19 clover-nas2 kernel: (ada3:ahcich15:0:0:0): Error 5, Retries exhausted

In these messages the /dev/ada3 HDD is failing.

> Manual zpool scrub was tried on this pool to no avail. The pool
> capacity is only 66% full.
>
> Thanks for any hints in advance.

Maurizio
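P.S. A quick way to pull the failing device names out of such log lines; a sketch that assumes the usual FreeBSD "kernel: (device:channel:...)" prefix, with two sample lines inlined (on the live system you would point sed at /var/log/messages):

```shell
# Extract the CAM peripheral name (e.g. ada3) from kernel error lines.
# Sample lines inlined; live: sed -n '...' /var/log/messages | sort -u
log='Aug 27 03:02:19 clover-nas2 kernel: (ada3:ahcich15:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
Aug 27 03:02:19 clover-nas2 kernel: (ada3:ahcich15:0:0:0): Error 5, Retries exhausted'
devs=$(printf '%s\n' "$log" |
    sed -n 's/.*kernel: (\([a-z]\{1,\}[0-9]\{1,\}\):.*/\1/p' |
    sort -u)
echo "$devs"
```

Each unique name it prints is a disk worth checking with smartctl.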