Date:      Mon, 27 Jun 2016 08:57:14 -0700
From:      Freddie Cash <fjwcash@gmail.com>
To:        Holger Freyther <holger@freyther.de>
Cc:        Karli Sjöberg <karli.sjoberg@slu.se>,  FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject:   Re: Deadlock in zpool import with degraded pool
Message-ID:  <CAOjFWZ7_W9JoNr9CAN5UmmTpKKso04rQ-LOj-u4hxR1MMwCsWg@mail.gmail.com>
In-Reply-To: <9DF3E719-5184-419E-B81A-599D5ECCD969@freyther.de>
References:  <8a4cb87252c04ebfbf71451c5dc1a41e@exch2-4.slu.se> <9DF3E719-5184-419E-B81A-599D5ECCD969@freyther.de>

On Jun 27, 2016 8:21 AM, "Holger Freyther" <holger@freyther.de> wrote:
>
>
> > On 26 Jun 2016, at 19:28, Karli Sjöberg <karli.sjoberg@slu.se> wrote:
> >
>
> Hi,
>
>
>
> > That's your problem right there; dedup! You need to throw more RAM into
> > it until the destroy can complete. If the mobo is 'full', you need
> > new/other hw to cram more RAM into or you can kiss your data goodbye. I've
> > been in the exact same situation as you are now so I sympathize :(
>
> did you look at it further?
>
> * Why does it only start after I zfs destroyed something? The dedup
> hash/table/??? grows by that?

Because every reference to every deleted block needs to be updated
(decremented) in the DDT (dedupe table), which means the DDT needs to be
pulled into ARC first. It's the pathological case for RAM use with dedupe
enabled. :(
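To put rough, purely illustrative numbers on it (not measured from your
pool): destroying a 1 TB dataset written with the default 128 KB recordsize
means on the order of 1 TB / 128 KB = ~8 million blocks, and every one of
those blocks has a DDT entry whose reference count has to be read,
decremented, and written back. If the table doesn't fit in ARC, that turns
into millions of small random reads before the destroy can finish.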

> * Why a plain dead-lock and no panic?

It's stuck trying to free RAM for ARC to load the DDT.
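You can usually confirm that from another shell while the import is wedged
(assuming a FreeBSD box where these sysctls are exposed; exact names can
differ between versions):

    # current ARC size vs. its target and maximum
    sysctl kstat.zfs.misc.arcstats.size
    sysctl kstat.zfs.misc.arcstats.c
    sysctl kstat.zfs.misc.arcstats.c_max

    # overall memory pressure
    vmstat 5

Typically the ARC sits pinned near its ceiling and the box is simply out of
free pages, rather than hitting anything that would panic.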

> * Is there an easy way to see how much RAM is needed? (In the end I can
> use Linux/KVM with RAM backed in a file/disk and just wait...)

There's a zdb command (-S or something like that) that will show the block
distribution in the DDT, along with how many unique data blocks there are.
You need approx 1 GB of ARC per TB of unique data, over and above any other
RAM requirements for normal operation.

And then double that for deleting snapshots. :(
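If the pool can still be imported at all (even read-only), you can get
actual entry counts instead of guessing. "tank" below is just a placeholder
pool name, and the exact flags and output format vary by ZFS version:

    # per-refcount histogram of the real DDT on a dedup'ed pool
    zdb -DD tank

    # short summary, including entry count and in-core bytes per entry
    zpool status -D tank

    # for a pool *without* dedup, simulate what the DDT would look like
    zdb -S tank

Multiplying the entry count by the reported in-core size per entry gives a
rough lower bound on the ARC the full table needs.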

> * Would you know if zpool import -o readonly avoids loading/building that
> big table? From common sense this block table would only be needed on write
> to map from checksum to block?

If you are in the "hang on import due to out-of-memory" situation, the only
solution is to add more RAM (if possible) and just keep rebooting the
server. Every import process will delete a little more data from the pool,
update a little more of the DDT, and eventually the destroy process will
complete, and the pool will be imported.
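One thing that can make the reboot loop less blind, assuming your pool is
recent enough to have the async_destroy feature and zpool commands still
respond (either from another terminal during the hung import, or after any
attempt that does finish): the data still queued for the interrupted
destroy shows up as the pool's "freeing" property, so you can check whether
it is actually shrinking between attempts:

    zpool get freeing tank    # "tank" is a placeholder pool name

Once freeing reaches 0, the destroy has completed and the import should go
through normally.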

The longest one for me took a little over a week of rebooting the server
multiple times per day. :(

We've since moved away from using dedupe. It was a great feature to have
when we could only afford 400 GB drives and could get 3-5x combined
compress + dedupe ratios. Now that we can get 4-8 TB drives, it's not worth
it.

Cheers,
Freddie


