Date:      Thu, 11 Jul 2019 10:39:34 +0300
From:      Daniel Braniss <danny@cs.huji.ac.il>
To:        Allan Jude <allanjude@freebsd.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: zpool errors
Message-ID:  <05D8BD75-78B4-4336-8A8A-C84A901CB3D4@cs.huji.ac.il>
In-Reply-To: <70f1be10-e37a-de20-e188-6155fda2d06a@freebsd.org>
References:  <52CE32B1-7E01-4C35-A2AB-84D3D5BD4E2F@cs.huji.ac.il> <27c3e59a-07ea-5df3-9de2-302d5290a477@freebsd.org> <831204B6-3F3B-4736-89FA-1207C4C46A7E@cs.huji.ac.il> <70f1be10-e37a-de20-e188-6155fda2d06a@freebsd.org>

> On 10 Jul 2019, at 20:23, Allan Jude <allanjude@freebsd.org> wrote:
>
> On 2019-07-10 11:37, Daniel Braniss wrote:
>>
>>
>>> On 10 Jul 2019, at 18:24, Allan Jude <allanjude@freebsd.org> wrote:
>>>
>>> On 2019-07-10 10:48, Daniel Braniss wrote:
>>>> hi,
>>>> I got a degraded pool, but can’t make sense of the file name:
>>>>
>>>> protonew-2# zpool status -vx
>>>> pool: h
>>>> state: ONLINE
>>>> status: One or more devices has experienced an error resulting in data
>>>>      corruption.  Applications may be affected.
>>>> action: Restore the file in question if possible.  Otherwise restore the
>>>>      entire pool from backup.
>>>> see: http://illumos.org/msg/ZFS-8000-8A
>>>> scan: scrub repaired 6.50K in 17h30m with 0 errors on Wed Jul 10 12:06:14 2019
>>>> config:
>>>>
>>>>      NAME          STATE     READ WRITE CKSUM
>>>>      h             ONLINE       0     0 14.4M
>>>>        gpt/r5/zfs  ONLINE       0     0 57.5M
>>>>=20
>>>> errors: Permanent errors have been detected in the following files:
>>>>=20
>>>>      <0x102>:<0x30723>
>>>>      <0x102>:<0x30726>
>>>>      <0x102>:<0x3062a>
>>>> …
>>>>      <0x281>:<0x0>
>>>>      <0x6aa>:<0x305cd>
>>>>      <0xffffffffffffffff>:<0x305cd>
>>>>
>>>>
>>>> any hints as to how I can identify these files?
>>>>=20
>>>> thanks,
>>>> 	danny
>>>>
>>>
>>> Once a file has been deleted, ZFS can have a hard time determining its
>>> filename.
>>>
>>> It is inode 198186 (0x3062a) on dataset 0x102. The file has been
>>> deleted, but still exists in at least one snapshot.
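>>>
>>> A rough way to chase that down (a sketch only; "h/somedataset" is a
>>> placeholder for whatever dataset 0x102 turns out to be):
>>>
>>>   printf '%d\n' 0x3062a                   # hex object id -> 198186
>>>   zdb -dddd h/somedataset 198186          # dump that object's metadata
>>>   # if the file survives in a snapshot, search by inode number:
>>>   find /h/somedataset/.zfs/snapshot -inum 198186 -print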
>>>
>>> Although, 57 million checksum errors suggests there may be some other
>>> problem. You might look for and resolve the problem with what appears
>>> to be a RAID5 array you have built your ZFS pool on top of. Then do
>>> 'zpool clear' to reset the counters to zero, and 'zpool scrub' to try
>>> to read everything again.
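>>>
>>> That is, something like:
>>>
>>>   zpool clear h
>>>   zpool scrub h
>>>   zpool status -v h    # then watch whether CKSUM starts climbing again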
>>>
>>> --
>>> Allan Jude
>>>
>> I don’t know when the first error was detected, and this host has been
>> up for 367 days!
>> I did a scrub but no change.
>> I will remove old snapshots and see if it helps.
>>
>> Is it possible to at least know which volume is affected?
>>
>> thanks,
>> 	danny
>>
>
> zdb -ddddd h 0x102
>
> That should tell you which dataset it is.
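>
> For instance (0x102 is 258 in decimal; the exact output format may
> vary, so treat this as a sketch):
>
>   zdb -d h | grep 'ID 258'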
>
> --
> Allan Jude
>

Firstly, thanks for your help!
Now, after doing a zpool clear, I notice that the CKSUM count is growing.
The pool is on a RAID5 built by a hardware RAID controller (a Dell PERC),
which is reporting that it’s correcting the errors (‘Corrected medium
error during recovery on PD …’).

So what can be the cause? BTW, the FreeBSD version is 10.3-stable.
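
(Assuming the PERC attaches via mfi(4), I can also watch it from the OS
side with something like:

   mfiutil show drives    # physical drive states
   mfiutil show events    # controller event log, incl. the medium errors
)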



