Date:      Sat, 21 Jan 2012 23:06:16 +0100
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        Willem Jan Withagen <wjw@digiware.nl>
Cc:        fs@freebsd.org
Subject:   Re: Question about  ZFS with log and cache on SSD with GPT
Message-ID:  <20120121230616.00006267@unknown>
In-Reply-To: <4F1B0177.8080909@digiware.nl>
References:  <4F193D90.9020703@digiware.nl> <20120121162906.0000518c@unknown> <4F1B0177.8080909@digiware.nl>

On Sat, 21 Jan 2012 19:18:31 +0100 Willem Jan Withagen
<wjw@digiware.nl> wrote:

> On 21-1-2012 16:29, Alexander Leidinger wrote:
> >> What I've currently done is partition all disks (also the SSDs)
> >> with GPT like below:
> >> batman# zpool iostat -v
> >>                   capacity     operations    bandwidth
> >> pool           alloc   free   read  write   read  write
> >> -------------  -----  -----  -----  -----  -----  -----
> >> zfsboot        50.0G  49.5G      1     13  46.0K   164K
> >>   mirror       50.0G  49.5G      1     13  46.0K   164K
> >>     gpt/boot4      -      -      0      5  23.0K   164K
> >>     gpt/boot6      -      -      0      5  22.9K   164K
> >> -------------  -----  -----  -----  -----  -----  -----
> >> zfsdata        59.4G   765G     12     62   250K  1.30M
> >>   mirror       59.4G   765G     12     62   250K  1.30M
> >>     gpt/data4      -      -      5     15   127K  1.30M
> >>     gpt/data6      -      -      5     15   127K  1.30M
> >>   gpt/log2       11M  1005M      0     22     12   653K
> >>   gpt/log3     11.1M  1005M      0     22     12   652K
> > 
> > Do you have two log devices in non-mirrored mode? If yes, it would
> > be better to have the ZIL mirrored on a pair.
> 
> So what you are saying is that logging is faster in mirrored mode?

No.

> Or are you more concerned about losing the log and thus possibly
> losing data.

Yes. If one piece of the involved hardware dies, you lose data.

> >> cache              -      -      -      -      -      -
> >>   gpt/cache2   9.99G  26.3G     27     53  1.20M  5.30M
> >>   gpt/cache3   9.85G  26.4G     28     54  1.24M  5.23M
> >> -------------  -----  -----  -----  -----  -----  -----
> ....
> 
> >> Now the question would be are the GPT partitions correctly aligned
> >> to give optimal performance?
> > 
> > I would assume that the native block size of the flash is more like
> > 4kb than 512b. As such, just creating the GPT partitions with the
> > default alignment will not be the best setup.
> 
> Corsair reports:
> Max Random 4k Write (using IOMeter 08): 50k IOPS (4k aligned)
> So I guess that suggests 4k aligned is required.

Sounds like it is.
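For reference, a sketch of how 4k-aligned GPT partitions could be
created with gpart (the device name ada2, label, and size are my
assumptions, not from this thread). The small helper below is just a
sanity check that a starting LBA, counted in 512-byte sectors, falls on
a 4096-byte boundary:

```shell
# Hypothetical SSD device; gpart's -a flag rounds the partition start
# and size to the requested alignment.
#   gpart create -s gpt ada2
#   gpart add -t freebsd-zfs -l cache2 -a 4k -s 36G ada2

# A 512-byte-sector LBA is 4k-aligned when it is divisible by 8,
# because 8 * 512 = 4096 bytes.
is_4k_aligned() {
    if [ $(( $1 % 8 )) -eq 0 ]; then
        echo "LBA $1: aligned"
    else
        echo "LBA $1: NOT aligned"
    fi
}

is_4k_aligned 34      # the default GPT first usable LBA
is_4k_aligned 2048
```

LBA 34 (the classic GPT default start) is not 4k-aligned, which is why
older partitioning defaults hurt on 4k-sector flash.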

> > See
> > http://www.leidinger.net/blog/2011/05/03/another-root-on-zfs-howto-optimized-for-4k-sector-drives/
> > for a description of how to align to 4k sectors. I do not know if
> > the main devices of the pool need to be set up with an emulated 4k
> > sector size (the gnop part in my description) or not, but I would
> > assume all disks in the pool need the temporary gnop setup.
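To make the gnop step concrete, this is roughly what the howto above
does (pool and device names taken from the iostat output in this
thread; treat it as a sketch, and note these commands destroy and
re-create the pool):

```shell
# Create a temporary provider that advertises 4096-byte sectors.
gnop create -S 4096 /dev/gpt/data4

# Create the pool through the .nop device; ZFS then records a 4k
# sector size (ashift=12) for the vdev.
zpool create zfsdata mirror /dev/gpt/data4.nop /dev/gpt/data6

# The .nop layer is only needed at creation time:
zpool export zfsdata
gnop destroy /dev/gpt/data4.nop
zpool import zfsdata

# Verify the result; "zdb zfsdata | grep ashift" should show ashift: 12.
```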
> 
> Well, one way of re-setting up the hard disks would be to remove them
> from the mirror in turn, repartition, and then rebuild the mirror,
> hoping that works, since I need some extra space to move the
> partitions up. :(

Already answered by someone else, but I want to point out again: if
the critical writes are 4k-aligned and mostly 4k or bigger in size,
you may be lucky.

You could compare the zpool iostat output with the gstat output of the
disks. If they more or less match, you are lucky. If the gstat numbers
are noticeably bigger, the drives are doing read-modify-write cycles
for misaligned writes, and you are in the unlucky case.
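A minimal way to put the two views side by side (the 10-second
interval and the gstat filter regex are my choices, not from the
thread):

```shell
# Pool-level view: what ZFS thinks it reads and writes, every 10s.
zpool iostat -v zfsdata 10

# Provider-level view: what the underlying GEOM devices actually do.
# -f takes a regex to restrict the output to the relevant providers.
gstat -f 'gpt/(data|log|cache)'
```

If gstat consistently reports noticeably more write bandwidth than
zpool iostat for the same devices, that difference is the
read-modify-write overhead of misaligned sectors.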

> >> The hard disks still use standard 512-byte sectors, so those should
> >> be alright? About the SSDs I have my doubts...
> > 
> > You could assume that the majority of cases are 4k or bigger writes
> > (tune your MySQL this way, and do not forget to change the
> > recordsize of the zfs dataset which contains the db files to match
> > what the DB writes) and just align the partitions of the SSDs for
> > 4k (do not use the gnop part in my description). I would assume
> > that this already gives good performance in most cases.
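As a concrete example of the recordsize remark (the dataset name
zfsdata/mysql is an assumption; 16k matches the default InnoDB page
size):

```shell
# InnoDB writes 16 KB pages by default, so let the dataset match:
zfs set recordsize=16k zfsdata/mysql
zfs get recordsize zfsdata/mysql

# Note: only files written after the change use the new recordsize,
# so copy the existing database files once afterwards.
```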
> 
> I'll redo the SSD's with the suggestions from your page.
> 
> >> Good thing is that v28 allows you to toy with log and cache
> >> without losing data. So I could redo the creation of cache and log
> >> relatively easily.
> > 
> > You can still lose data when a log SSD dies (if they are not
> > mirrored).
> 
> I was more referring to the fact that under v28, one is able to
> remove log and cache through zpool commands without losing data. Just
> pulling the disks is of course going to corrupt data.

If you can recreate the data and don't care about data loss, and if
you have verified that two independent ZIL devices give more
performance than one mirrored pair, why not.
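For completeness, the v28 commands in question (device names from the
iostat output above); on a healthy pool, log and cache devices can be
removed and re-added without touching the data:

```shell
# Remove the two non-mirrored log devices:
zpool remove zfsdata gpt/log2 gpt/log3

# Re-add them as a mirrored log vdev instead:
zpool add zfsdata log mirror gpt/log2 gpt/log3

# Cache (L2ARC) devices can be dropped and re-added the same way:
zpool remove zfsdata gpt/cache2
zpool add zfsdata cache gpt/cache2
```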

Bye,
Alexander.

-- 
http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org  : PGP ID = 72077137


