Date:      Tue, 10 Sep 2024 11:35:25 +0100 (BST)
From:      andy thomas <andy@time-domain.co.uk>
To:        Allan Jude <allanjude@freebsd.org>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Does a failed separate ZIL disk mean the entire zpool is lost?
Message-ID:  <alpine.BSF.2.22.395.2409101105040.74876@mail0.time-domain.net>
In-Reply-To: <dabea42c-65d7-40ea-bd37-840148e855c5@freebsd.org>
References:  <alpine.BSF.2.22.395.2409091634020.50467@mail0.time-domain.net> <535969cf-0b0b-48ca-a163-fc238f316bb7@gmx.at> <dabea42c-65d7-40ea-bd37-840148e855c5@freebsd.org>


Thank you, but I'm afraid I didn't use two mirrored ZIL devices, as I didn't 
know this was possible when I set this server up (late 2017, before I was 
even aware of the 'FreeBSD Mastery: ZFS' book!). There were also no spare 
disk bays in the server's chassis to add another device, and at the time 
PCIe-to-NVMe adapters were not available. For data resilience I relied on an 
identical mirror server in the same rack, linked via a 2 x 10 Gbit/sec bonded 
point-to-point network link, but that server also failed in the data centre 
melt-down...
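
(For the record, in case it helps anyone finding this thread in the archive: 
a mirrored log vdev can be added to an existing pool after the fact. 
Something along these lines should do it; the device names below are 
placeholders rather than my actual hardware:

    # add a mirrored SLOG to an existing pool (hypothetical device names)
    zpool add clustor2 log mirror nvd0 nvd1

'zpool status' should then show the two devices under a mirrored "logs" 
vdev.)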

It looks like the data is now lost, so I won't waste any more time trying to 
recover it. Hopefully this incident will persuade my employer to heed the 
advice given years ago about locating mirror servers in a different data 
centre linked by a fast multi-gigabit connection.

Andy

PS: the ZFS and Advanced ZFS books are truly excellent, by the way!
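
PPS: for the archive, the recovery sequence suggested below would have 
looked roughly like this (the pool name is from my earlier mail; treat it as 
a sketch rather than a tested procedure):

    # import the pool while ignoring the missing/failed log device
    zpool import -m clustor2

    # once imported, remove the dead log device from the configuration
    # (identify it by name or GUID in 'zpool status')
    zpool remove clustor2 <failed-log-device>

Any writes still sitting in the ZIL at the moment of the crash would be 
lost, but the rest of the pool should come back.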

On Mon, 9 Sep 2024, Allan Jude wrote:

> As the last person mentioned, you should be able to import with the -m flag, 
> and only lose about 5 seconds worth of writes.
>
> The pool is already partially imported at boot by other mechanisms; you 
> might need to disable that automatic import so that you can do the manual 
> import.
>
> On 2024-09-09 12:20 p.m., infoomatic wrote:
>> did you use two mirrored ZIL devices?
>> 
>> You can "zpool import -m", but you will probably be confronted with some
>> errors - you will probably lose the data the ZIL has not committed, but
>> most of your data in your pool should be there
>> 
>> 
>> On 09.09.24 17:51, andy thomas wrote:
>>> A server I look after had a 65TB ZFS RAIDz1 pool with 8 x 8TB hard disks
>>> plus one hot spare and separate ZFS intent log (ZIL) and L2ARC cache
>>> disks that used a pair of 256GB SSDs. This ran really well for 6 years
>>> until 2 weeks ago, when the main cooling system in the data centre where
>>> it was installed failed and the backup cooling system failed to start up.
>>> 
>>> The upshot was that the ZIL SSD short-circuited across its power
>>> connector, shorting out the server's PSUs and shutting down the server.
>>> After replacing the failed SSD and verifying that all the spinning hard
>>> disks and the cache SSD are undamaged, attempts to import the pool fail
>>> with the following message:
>>> 
>>> NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH   ALTROOT
>>> clustor2      -      -      -        -         -      -      -      -  UNAVAIL  -
>>> 
>>> Does this mean the pool's contents are now lost and unrecoverable?
>>> 
>>> Andy
>>> 
>> 
>
>


----------------------------
Andy Thomas,
Time Domain Systems

Tel: +44 (0)7866 556626
http://www.time-domain.co.uk


