Date:      Wed, 15 Aug 2012 09:31:35 +0200
From:      Hugo Lombard <hal@elizium.za.net>
To:        Karli Sjöberg <Karli.Sjoberg@slu.se>
Cc:        "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject:   Re: Hang when importing pool
Message-ID:  <20120815073135.GO6757@squishy.elizium.za.net>
In-Reply-To: <49C9D08A-85EF-4D23-B07F-F3980CBA5A97@slu.se>
References:  <D13A3EA7-B229-4B78-915E-A3CC3162DB8A@slu.se> <CAOjFWZ52XvMO%2BA7cwa3fnkJcXMCbGgWD91gvZsmW8Navh0AZ9A@mail.gmail.com> <49C9D08A-85EF-4D23-B07F-F3980CBA5A97@slu.se>

On Wed, Aug 15, 2012 at 08:45:38AM +0200, Karli Sjöberg wrote:
> 
> I took your advice. I replaced my Core i5 with a Xeon X3470 and ramped
> up the RAM to 32GB, maxing out the HW. Sadly enough, it still stalls
> in the exact same manner:( This has to be the most frustrating thing
> ever, since there's tons of data there that I really need and if it
> wasn't for that stupid destroy operation, it would still be
> accessible.
> 
> I feel that FreeBSD is partly to blame since it was completely
> possible in the originating SUN machine with Solaris that only has
> 16GB RAM to do the same destroy to the same dataset without any
> problem. Sure, it took forever and then some (about two weeks) but it
> stayed afloat during the whole time.
> 

Sorry to hear about your pain.

I've recently run into a similar problem where destroying a lot of
snapshots on de-duped filesystems caused two boxes (one a replica of the
other) to strangle themselves.  After much struggling, I opted to redo
the slave box, mounted the master box's pool readonly, and rsync'ed the
datasets across.

In retrospect, I shouldn't have deleted so many snapshots at once.
Boxes are both quad-core Opterons with 16GB RAM each.  On the newly
re-done box, I've decided not to use de-dupe.

In the process of searching for an answer I came across this thread:

  http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg47526.html

The person who noted the issue originally finally managed to recover
their pool with a loan machine from Oracle that had 120GB RAM:

  http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg47529.html
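
For a rough sense of why RAM matters so much here, below is a
back-of-the-envelope sketch of how large a dedup table (DDT) can get.
It assumes the commonly quoted rule of thumb of about 320 bytes of RAM
per unique block, and the 10 TiB / 64 KiB figures are made-up
illustration values, not measurements from any of the pools mentioned
in this thread.

```python
# Rough estimate of the RAM needed to keep a ZFS dedup table (DDT) in core.
# ASSUMPTION: ~320 bytes per in-core DDT entry (a rule of thumb, not a
# measured value for the pools discussed in this thread).
BYTES_PER_DDT_ENTRY = 320

GIB = 1024 ** 3
TIB = 1024 ** 4


def ddt_ram_bytes(unique_data_bytes, avg_block_bytes=64 * 1024,
                  bytes_per_entry=BYTES_PER_DDT_ENTRY):
    """Approximate in-core DDT size, assuming one entry per unique block."""
    # Ceiling division: a partially filled block still needs its own entry.
    unique_blocks = -(-unique_data_bytes // avg_block_bytes)
    return unique_blocks * bytes_per_entry


if __name__ == "__main__":
    # Hypothetical pool: 10 TiB of unique data, 64 KiB average block size.
    estimate = ddt_ram_bytes(10 * TIB)
    print(f"{estimate / GIB:.1f} GiB")  # prints "50.0 GiB"
```

Smaller average block sizes push the estimate up proportionally, so
treat the result as an order-of-magnitude guide only.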

Personally, I don't think the problem is purely FreeBSD's fault.

-- 
Hugo Lombard


