Date:      Wed, 14 Nov 2012 06:06:22 -0800
From:      Gary Buhrmaster <gary.buhrmaster@gmail.com>
To:        Chris BeHanna <chris@behanna.org>
Cc:        FreeBSD FS <freebsd-fs@freebsd.org>
Subject:   Re: SSD recommendations for ZFS cache/log
Message-ID:  <CAMfXtQxqds=jxMndFnLNkCnGxuG5vwDiCUoz0kZynzRk0MDieA@mail.gmail.com>
In-Reply-To: <943159E4-8824-4767-96E1-89E8EC69DCDF@behanna.org>
References:  <CAFHbX1K-NPuAy5tW0N8=sJD=CU0Q1Pm3ZDkVkE%2BdjpCsD1U8_Q@mail.gmail.com> <57ac1f$gf3rkl@ipmail05.adl6.internode.on.net> <943159E4-8824-4767-96E1-89E8EC69DCDF@behanna.org>

On Tue, Nov 13, 2012 at 8:18 PM, Chris BeHanna <chris@behanna.org> wrote:
....
> If you'll pardon what may be an ignorant question, does this matter if you have your machine on a UPS, especially if you run upsmon or nut to do a graceful shutdown when there are n minutes of battery remaining?

In the real world, UPSs aren't actually uninterruptible: people pull
power cords (even redundant ones), power supplies fail, the redundant
power supply backplane fails, the motherboard fries and shuts down the
power supply, and disks/SSDs sometimes corrupt themselves for other
random reasons.

And, of course, the reason any of this is so important with SSDs is
that (almost) all SSDs lie about having written the data to the
sectors.  Writing to flash is slow (you have to read a 4KB/8KB flash
sector, update it with your (usually/often) smaller block, erase the
flash sector, and then write the new data), and the drive may also be
doing internal scrubs and defragmentation at the time of the request.
So they buffer written data in onboard RAM and report immediate
success.

ZFS depends heavily on the ZIL being correct for recovery (smart
people have added code so that a corrupted ZIL no longer results in
complete loss, but the result can still be some data loss), so the ZFS
code that updates the ZIL expects that when the device indicates
"written to disk complete", the data has actually been written.  Since
the SSD has buffered the ZIL data in RAM, a power failure can violate
this presumption and, with it, the integrity of the ZIL.

A common solution on SSDs is sometimes called a "super capacitor": in
the event of a power failure the SSD still has enough power (time) to
finish in-flight writes.  Marketing departments at various companies
call the solution different things.
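
To make the failure mode concrete, here is a minimal toy model in
Python.  It is only a sketch: ToySSD, its methods, and the
power_loss_protection flag are hypothetical names invented for
illustration, not any real drive or ZFS interface.  It shows a device
that acknowledges a write as soon as the data lands in its RAM buffer,
and what is left after a power failure with and without something like
a super capacitor.

```python
class ToySSD:
    """Toy model of an SSD with a volatile RAM write buffer.

    Hypothetical illustration only; real drives are far more complex.
    """

    def __init__(self, power_loss_protection=False):
        self.flash = {}       # data that has actually reached flash
        self.ram_buffer = {}  # acknowledged writes still held in onboard RAM
        self.power_loss_protection = power_loss_protection

    def write(self, lba, data):
        # Buffer the write and report success immediately,
        # before anything has been committed to flash.
        self.ram_buffer[lba] = data
        return True

    def drain(self):
        # Move all buffered writes into flash.
        self.flash.update(self.ram_buffer)
        self.ram_buffer.clear()

    def power_fail(self):
        # With a super capacitor there is enough energy left to finish
        # in-flight writes; without one, the RAM contents simply vanish.
        if self.power_loss_protection:
            self.drain()
        self.ram_buffer.clear()

    def read(self, lba):
        # Prefer the newest (buffered) copy, else whatever is in flash.
        return self.ram_buffer.get(lba, self.flash.get(lba))


def zil_record_survives(ssd):
    """Write one record, lose power right after the ack, then read it back."""
    acked = ssd.write(0, b"zil record")
    assert acked  # the device claimed "written to disk complete"
    ssd.power_fail()
    return ssd.read(0) == b"zil record"


plain = zil_record_survives(ToySSD(power_loss_protection=False))
protected = zil_record_survives(ToySSD(power_loss_protection=True))
print(plain, protected)  # False True
```

In both cases the write was acknowledged, which is all ZFS can see.
Only the model with power-loss protection still has the record after
the power failure.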

Gary


