Date: Thu, 4 Jul 2013 13:56:44 -0700
From: Freddie Cash <fjwcash@gmail.com>
To: Jeremy Chadwick <jdc@koitsu.org>
Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>
Subject: Re: Slow resilvering with mirrored ZIL
Message-ID: <CAOjFWZ7dr45twGCP=9_6wTy_RewVJsoK__KCVDtLYEQ3RAVeDQ@mail.gmail.com>
In-Reply-To: <20130704191203.GA95642@icarus.home.lan>
References: <CABBFC07-68C2-4F43-9AFC-920D8C34282E@unixconn.com> <51D42107.1050107@digsys.bg> <2EF46A8C-6908-4160-BF99-EC610B3EA771@alumni.chalmers.se> <51D437E2.4060101@digsys.bg> <E5CCC8F551CA4627A3C7376AD63A83CC@multiplay.co.uk> <CBCA1716-A3EC-4E3B-AE0A-3C8028F6AACF@alumni.chalmers.se> <20130704000405.GA75529@icarus.home.lan> <C8C696C0-2963-4868-8BB8-6987B47C3460@alumni.chalmers.se> <20130704171637.GA94539@icarus.home.lan> <2A261BEA-4452-4F6A-8EFB-90A54D79CBB9@alumni.chalmers.se> <20130704191203.GA95642@icarus.home.lan>

On Thu, Jul 4, 2013 at 12:12 PM, Jeremy Chadwick <jdc@koitsu.org> wrote:

> I believe -- but I need someone else to chime in here with confirmation,
> particularly someone who is familiar with ZFS's internals -- once your
> pool is ashift 12, you can do a disk replacement ***without*** having to
> do the gnop procedure (because the pool itself is already using ashift
> 12). But again, I need someone to confirm that.
>

Correct. The ashift property of a vdev is set at creation time and cannot
be changed (AFAIK) without destroying/recreating the pool. Thus, you can
use gnop to create the vdev with ashift=12, and then just do normal "zpool
replace" or "zpool detach/attach" to replace drives in the vdevs (512b or
4K drives) without gnop.

Haven't read the code :) but I have done many, many drive replacements on
ashift=9 and ashift=12 vdevs and watched what happens via zdb. :)

> The WD10EARS are known for excessively parking their heads, which
> causes massive performance problems with both reads and writes. This is
> known by PC enthusiasts as the "LCC issue" (LCC = Load Cycle Count,
> referring to SMART attribute 193).
>
> On these drives there are ways to work around this issue -- it
> specifically involves disabling drive-level APM. To do so, you have to
> initiate a specific ATA CDB to the drive using "camcontrol cmd", and
> this has to be done every time the system reboots. There is one
> drawback to disabling APM as well: the drives run hotter.
>

On some WD Green drives, depending on the firmware and manufacturing date,
you can use the wdidle3.exe program (via a DOS boot) to set the timeout to
either "disabled" or "15 minutes" which is usually enough to prevent most
of the head-parking wear-out issues. However, I believe this only worked
up until Dec 2011 or Dec 2012?

We had the misfortune of using 12 of these in a ZFS storage box when they
were first released (2 TB for under $150? Hell Yeah! Ooops, you get what
you pay for ...). We quickly replaced them.
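
On the ashift point earlier in this message: ashift is the base-2 logarithm of the sector size the vdev assumes, so ashift=9 corresponds to 512-byte sectors and ashift=12 to 4096-byte sectors. The sketch below only illustrates that arithmetic and why an ashift=12 vdev can take either kind of drive as a replacement. The function names are made up for illustration and are not part of zpool, zdb, or any other ZFS tool.

```python
def ashift_for_sector_size(sector_bytes: int) -> int:
    """Return log2 of a power-of-two sector size (the ashift value)."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


def allocations_are_sector_aligned(vdev_ashift: int, disk_sector_bytes: int) -> bool:
    """True if the vdev's minimum allocation (2**ashift bytes) is a whole
    multiple of the disk's sector size. Hypothetical helper, not a ZFS API."""
    return (1 << vdev_ashift) % disk_sector_bytes == 0


if __name__ == "__main__":
    for size in (512, 4096):
        print(f"{size}-byte sectors -> ashift={ashift_for_sector_size(size)}")
    for ashift in (9, 12):
        for size in (512, 4096):
            ok = allocations_are_sector_aligned(ashift, size)
            print(f"ashift={ashift} vdev, {size}-byte disk: aligned={ok}")
```

The only combination that comes out unaligned is an ashift=9 vdev on a 4096-byte-sector disk, which is the case the gnop step at pool creation is meant to avoid.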

> You really need to be running stable/9 if you want to use SSDs with ZFS.
> I cannot stress this enough. I will not bend on this fact. I do not
> care if what people have are SLC rather than MLC or TLC -- it doesn't
> matter. TRIM on ZFS is a downright necessity for long-term reliability
> of an SSD. Anyway...
>

One can mitigate this a little by leaving 25% of the SSD
unpartitioned/unformatted, thus allowing the background GC process to work
without impacting performance and providing long-term performance that's
close to (but not quite 100%) after-TRIM performance. Takes a lot of
will-power to leave 8-16-odd GB free on an SSD that cost close to $200,
though. :)

It's not perfect, it's not as good as using TRIM, but at least it's doable
on FreeBSD pre-9.1-STABLE.

>
> You should probably be made aware of the fact that SSDs need to be
> kept roughly 30-40% unused to get the most benefits out of wear
> levelling. Once you hit the 20% remaining mark, performance takes a
> hit, and the drive begins hurting more and more. Low-capacity SSDs
> are therefore generally worthless given the capacity limitation need.
>

Ah, I see you mention what I did above. :) Guess that's what I get for
not reading all the way through before starting a reply. :)

-- 
Freddie Cash
fjwcash@gmail.com
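
To put numbers on the "leave 25% unpartitioned" suggestion in the message above, here is a minimal sketch of the sizing arithmetic. The 32 GB and 64 GB capacities are assumptions, chosen only because 25% of them gives the 8 and 16 GB mentioned. The function name is illustrative.

```python
from typing import Tuple


def split_for_overprovisioning(ssd_gb: float,
                               reserve_fraction: float = 0.25) -> Tuple[float, float]:
    """Return (partitioned_gb, unpartitioned_gb) for an SSD of ssd_gb,
    leaving reserve_fraction of it unpartitioned for background GC."""
    if ssd_gb <= 0:
        raise ValueError("capacity must be positive")
    if not 0 <= reserve_fraction < 1:
        raise ValueError("reserve_fraction must be in [0, 1)")
    reserved = ssd_gb * reserve_fraction
    return ssd_gb - reserved, reserved


if __name__ == "__main__":
    # Assumed example capacities, not taken from the thread.
    for capacity in (32, 64):
        used, free = split_for_overprovisioning(capacity)
        print(f"{capacity} GB SSD: partition {used:g} GB, leave {free:g} GB free")
```

Passing reserve_fraction=0.3 or 0.4 gives the sizes for the 30-40% figure from the quoted text.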