Date:      Sat, 21 Jan 2012 16:29:06 +0100
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        Willem Jan Withagen <wjw@digiware.nl>
Cc:        fs@freebsd.org
Subject:   Re: Question about  ZFS with log and cache on SSD with GPT
Message-ID:  <20120121162906.0000518c@unknown>
In-Reply-To: <4F193D90.9020703@digiware.nl>
References:  <4F193D90.9020703@digiware.nl>

On Fri, 20 Jan 2012 11:10:24 +0100 Willem Jan Withagen
<wjw@digiware.nl> wrote:

> Now my question is more about the SSD configuration.
> (BTW adding 1 SSD got the insert rate up from 100/sec to > 1000/sec,
> once the cache was loaded.)
> 
> The database is on a mirror of 2 1T disks:
> ada0: <ST1000NM0011 SN02> ATA-8 SATA 3.x device
> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
> 
> and there are 2 SSDs:
> ada2: <Corsair CSSD-F40GB2 2.0> ATA-8 SATA 2.x device
> ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada2: Command Queueing enabled
> 
> What I've currently done is partition all disks (also the SSDs) with
> GPT like below:
> batman# zpool iostat -v
>                   capacity     operations    bandwidth
> pool           alloc   free   read  write   read  write
> -------------  -----  -----  -----  -----  -----  -----
> zfsboot        50.0G  49.5G      1     13  46.0K   164K
>   mirror       50.0G  49.5G      1     13  46.0K   164K
>     gpt/boot4      -      -      0      5  23.0K   164K
>     gpt/boot6      -      -      0      5  22.9K   164K
> -------------  -----  -----  -----  -----  -----  -----
> zfsdata        59.4G   765G     12     62   250K  1.30M
>   mirror       59.4G   765G     12     62   250K  1.30M
>     gpt/data4      -      -      5     15   127K  1.30M
>     gpt/data6      -      -      5     15   127K  1.30M
>   gpt/log2       11M  1005M      0     22     12   653K
>   gpt/log3     11.1M  1005M      0     22     12   652K

Do you have the two log devices in non-mirrored mode? If so, it would
be better to mirror the ZIL across the pair.
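
If the pool version supports log device removal (v19 and later), the
two standalone logs can be converted to a mirror without rebuilding the
pool. A sketch only, using the device names from your iostat output:

	zpool remove zfsdata gpt/log2
	zpool remove zfsdata gpt/log3
	zpool add zfsdata log mirror gpt/log2 gpt/log3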

> cache              -      -      -      -      -      -
>   gpt/cache2   9.99G  26.3G     27     53  1.20M  5.30M
>   gpt/cache3   9.85G  26.4G     28     54  1.24M  5.23M
> -------------  -----  -----  -----  -----  -----  -----
> 
> disks 4 and 6 are naming remnants of pre-AHCI times; they are now
> ada0 and ada1. So the harddisks have the "std" ZFS setup: a boot-pool
> and a data-pool.
> 
> The SSDs are partitioned and assigned to zfsdata with:
> 	gpart create -s GPT ada2
> 	gpart create -s GPT ada3
> 	gpart add -t freebsd-zfs -l log2 -s 1G ada2
> 	gpart add -t freebsd-zfs -l log3 -s 1G ada3
> 	gpart add -t freebsd-zfs -l cache2 ada2
> 	gpart add -t freebsd-zfs -l cache3 ada3
> 	zpool add zfsdata log /dev/gpt/log*
> 	zpool add zfsdata cache /dev/gpt/cache*
> 
> Now the question would be are the GPT partitions correctly aligned to
> give optimal performance?

I would assume that the native block size of the flash is closer to 4k
than to 512b, so just creating the GPT partitions with the defaults
will not give the best setup. See
http://www.leidinger.net/blog/2011/05/03/another-root-on-zfs-howto-optimized-for-4k-sector-drives/
for a description of how to align partitions to 4k sectors. I do not
know whether the main devices of the pool also need to be set up with
an emulated 4k sector size (the gnop part in my description), but I
would assume all disks in the pool need the temporary gnop setup.
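
For reference, the gnop part forces ZFS to choose a 4k ashift when the
vdev is created. Applied to one of the log partitions it would look
roughly like this (a sketch only, device names as above, not tested on
this pool):

	gnop create -S 4096 /dev/gpt/log2
	zpool add zfsdata log /dev/gpt/log2.nop
	# After an export/import the .nop provider can be destroyed;
	# the vdev keeps its ashift of 12 ("zdb zfsdata | grep ashift").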

> The harddisks are still std 512-byte sectors, so that should be
> alright? About the SSDs I have my doubts...

You could assume that the majority of writes are 4k or bigger (tune
your MySQL this way, and do not forget to change the recordsize of the
ZFS dataset which contains the DB files to match what the DB writes)
and just align the partitions of the SSDs to 4k (without the gnop part
in my description). I would assume that this already gives good
performance in most cases.
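
As a sketch (the 1 MiB start offset and the dataset name zfsdata/db
are examples, and 16k matches the default InnoDB page size), the
repartitioning and recordsize change could look like:

	gpart add -b 2048 -s 1G -t freebsd-zfs -l log2 ada2
	gpart add -t freebsd-zfs -l cache2 ada2
	zfs set recordsize=16k zfsdata/db

2048 sectors * 512b = 1 MiB, a multiple of 4k, and since the 1G log
partition is also a multiple of 4k, the cache partition starts aligned
as well.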

> Good thing is that v28 allows you to toy with log and cache without
> losing data. So I could redo the creation of cache and log
> relatively easily.

You can still lose data when a log SSD dies (if they are not mirrored).

> I'd rather not redo the DB build since that takes a few days. :(
> But before loading the DB, I did use some of the tuning suggestions
> like using different recordsize for db-logs and innodb files.

Bye,
Alexander.

-- 
http://www.Leidinger.net    Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org       netchild @ FreeBSD.org  : PGP ID = 72077137


