Date:      Thu, 28 Apr 2011 18:08:29 -0700
From:      Jeremy Chadwick <freebsd@jdc.parodius.com>
To:        Daniel O'Connor <doconnor@gsoft.com.au>
Cc:        freebsd-stable List <freebsd-stable@freebsd.org>
Subject:   Re: ZFS vs OSX Time Machine
Message-ID:  <20110429010829.GA36744@icarus.home.lan>
In-Reply-To: <AF725CFF-86A4-4D65-A26E-496F6B9BD33E@gsoft.com.au>
References:  <537A8F4F-A302-40F9-92DF-403388D99B4B@gsoft.com.au> <20110428195601.GA31807@icarus.home.lan> <AF725CFF-86A4-4D65-A26E-496F6B9BD33E@gsoft.com.au>

On Fri, Apr 29, 2011 at 09:43:47AM +0930, Daniel O'Connor wrote:
> 
> On 29/04/2011, at 5:26, Jeremy Chadwick wrote:
> >> I have the following ZFS related tunables
> >> 
> >> vfs.zfs.arc_max="3072M"
> >> vfs.zfs.prefetch_disable="1" 
> >> vfs.zfs.txg.timeout=5
> >> vfs.zfs.cache_flush_disable=1
> > 
> > Are the last two actually *working* in /boot/loader.conf?  Can you
> > verify by looking at them via sysctl?  AFAIK they shouldn't work, since
> > they lack double-quotes around the values.  Parsing errors are supposed
> > to throw you back to the loader prompt.  See loader.conf(5) for the
> > syntax.
> 
> Yep, they're working :)
> 
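Huh, good to know.  The loader must be more lenient than loader.conf(5)
implies, then.  Quoting the values anyway costs nothing, e.g.:

  vfs.zfs.txg.timeout="5"
  vfs.zfs.cache_flush_disable="1"
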
> > I'm also not sure why you're setting cache_flush_disable at all.
> 
> I think I was wondering if it would help the abysmal write performance of these disks..
> 
> >> Any help appreciated, thanks :)
> > 
> > Others seem to be busy arguing that "NFS doesn't work for TM", but
> > that isn't what you're complaining about.  You're complaining that
> > FreeBSD with ZFS + NFS performs extremely poorly when trying to do
> > backups from an OS X client using TM (writing to the NFS mount).
> 
> Yes, and also TM is over AFP not NFS (I forgot to mention that..)
> 
> > I have absolutely no experience with TM or OS X, so if it's actually a
> > client-level problem (which I'm doubting) I can't help you there.
> > 
> > Just sort of a ramble here about different things...
> > 
> > It would be useful to provide ZFS ARC sysctl data from the FreeBSD
> > system where you're seeing performance issues.  "sysctl -a
> > kstat.zfs.misc.arcstats" should suffice.
> 
> kstat.zfs.misc.arcstats.hits: 236092077
> kstat.zfs.misc.arcstats.misses: 6451964
> kstat.zfs.misc.arcstats.demand_data_hits: 98087637
> kstat.zfs.misc.arcstats.demand_data_misses: 1220891
> kstat.zfs.misc.arcstats.demand_metadata_hits: 138004440
> kstat.zfs.misc.arcstats.demand_metadata_misses: 5231073
> kstat.zfs.misc.arcstats.prefetch_data_hits: 0
> kstat.zfs.misc.arcstats.prefetch_data_misses: 0
> kstat.zfs.misc.arcstats.prefetch_metadata_hits: 0
> kstat.zfs.misc.arcstats.prefetch_metadata_misses: 0
> kstat.zfs.misc.arcstats.mru_hits: 15041670
> kstat.zfs.misc.arcstats.mru_ghost_hits: 956048
> kstat.zfs.misc.arcstats.mfu_hits: 221050407
> kstat.zfs.misc.arcstats.mfu_ghost_hits: 3269042
> kstat.zfs.misc.arcstats.allocated: 15785717
> kstat.zfs.misc.arcstats.deleted: 4690878
> kstat.zfs.misc.arcstats.stolen: 4990300
> kstat.zfs.misc.arcstats.recycle_miss: 2142423
> kstat.zfs.misc.arcstats.mutex_miss: 518
> kstat.zfs.misc.arcstats.evict_skip: 2251705
> kstat.zfs.misc.arcstats.evict_l2_cached: 0
> kstat.zfs.misc.arcstats.evict_l2_eligible: 470396116480
> kstat.zfs.misc.arcstats.evict_l2_ineligible: 2048
> kstat.zfs.misc.arcstats.hash_elements: 482679
> kstat.zfs.misc.arcstats.hash_elements_max: 503063
> kstat.zfs.misc.arcstats.hash_collisions: 19593315
> kstat.zfs.misc.arcstats.hash_chains: 116103
> kstat.zfs.misc.arcstats.hash_chain_max: 16
> kstat.zfs.misc.arcstats.p: 1692798721
> kstat.zfs.misc.arcstats.c: 3221225472
> kstat.zfs.misc.arcstats.c_min: 402653184
> kstat.zfs.misc.arcstats.c_max: 3221225472
> kstat.zfs.misc.arcstats.size: 3221162968
> kstat.zfs.misc.arcstats.hdr_size: 103492088
> kstat.zfs.misc.arcstats.data_size: 2764591616
> kstat.zfs.misc.arcstats.other_size: 353079264
> kstat.zfs.misc.arcstats.l2_hits: 0
> kstat.zfs.misc.arcstats.l2_misses: 0
> kstat.zfs.misc.arcstats.l2_feeds: 0
> kstat.zfs.misc.arcstats.l2_rw_clash: 0
> kstat.zfs.misc.arcstats.l2_read_bytes: 0
> kstat.zfs.misc.arcstats.l2_write_bytes: 0
> kstat.zfs.misc.arcstats.l2_writes_sent: 0
> kstat.zfs.misc.arcstats.l2_writes_done: 0
> kstat.zfs.misc.arcstats.l2_writes_error: 0
> kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
> kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
> kstat.zfs.misc.arcstats.l2_evict_reading: 0
> kstat.zfs.misc.arcstats.l2_free_on_write: 0
> kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
> kstat.zfs.misc.arcstats.l2_cksum_bad: 0
> kstat.zfs.misc.arcstats.l2_io_error: 0
> kstat.zfs.misc.arcstats.l2_size: 0
> kstat.zfs.misc.arcstats.l2_hdr_size: 0
> kstat.zfs.misc.arcstats.memory_throttle_count: 19
> kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
> kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
> kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
> kstat.zfs.misc.arcstats.l2_write_in_l2: 0
> kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
> kstat.zfs.misc.arcstats.l2_write_not_cacheable: 1
> kstat.zfs.misc.arcstats.l2_write_full: 0
> kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
> kstat.zfs.misc.arcstats.l2_write_pios: 0
> kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
> kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
> kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0

Thanks.  I don't see anything indicative here of ARC problems.
memory_throttle_count being 19 is acceptable as well (though a very
large number could indicate issues of a different sort).  Otherwise
things look very good/normal.
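
For what it's worth, a rough hit-rate calculation from those counters
works out quite well:

  hits / (hits + misses) = 236092077 / (236092077 + 6451964) ~= 97%

so the vast majority of reads are being served straight from the ARC.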

> > You should also try executing "zpool iostat -v 1" during the TM backup
> > to see if there's a particular device which is behaving poorly.  There
> > have been reports of ZFS pools behaving poorly when a single device
> > within the pool has slow I/O (e.g. 5 hard disks, one of which has
> > internal issues, resulting in the entire pool performing horribly).  You
> > should let this run for probably 60-120 seconds to get an idea.  Given
> > your parameters above (assuming vfs.zfs.txg.timeout IS in fact 5!), you
> > should see "bursts" of writes every 5 seconds.
> 
> OK.
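
(For the record, you can hand it a count too, e.g. "zpool iostat -v 1 120"
to capture two minutes in one go.  Any single disk whose ops/bandwidth
numbers consistently stand out from its siblings is the thing to look for.)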
> 
> > I know that there are some things on ZFS that perform badly overall.
> > Anything that involves excessive/large numbers of files (not file sizes,
> > but actual files themselves) seems to perform not-so-great with ZFS.
> > For example, Maildir on ZFS = piss-poor performance.  There are ways to
> > work around this issue (if I remember correctly, by adding a dedicated
> > "log" device to your ZFS pool), but be aware that log devices need to
> > be reliable: if you have a single log device and it fails, the entire
> > pool can be damaged, if I remember right.  I don't consider that route
> > feasible anyway.  So if TM is creating tons of files on the NFS mount
> > (backed by ZFS), then I imagine the performance isn't so great.
> 
> Hmm, the sparse disk image does have ~80000 files in a single directory..

Have you tried looking at dirhash sysctls?  Oh foo, that's for UFS only.

Could you please provide output from "zfs get all poolname"?  Others and
I would like to review what settings you're using on the filesystem.  If
TM is writing to a child filesystem (e.g. pool/foobar), please also
provide output from "zfs get all pool" for the parent.
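
In other words, something like this (substitute your real pool/filesystem
names; "tank" and "tank/timemachine" here are just placeholders):

  zfs get all tank
  zfs get all tank/timemachine

The recordsize, compression, and atime properties are the ones I'm most
curious about.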

> > Could you please provide the following sysctl values?  Thanks.
> > 
> > kern.maxvnodes
> > kern.minvnodes
> > vfs.freevnodes
> > vfs.numvnodes
> 
> kern.maxvnodes: 204477
> kern.minvnodes: 51119
> vfs.freevnodes: 51118
> vfs.numvnodes: 66116

You look fine here -- no need to increase the kern.maxvnodes sysctl.
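
(If it ever did need bumping, it's just a runtime sysctl; for example
"sysctl kern.maxvnodes=400000" or the equivalent line in /etc/sysctl.conf.
The number there is arbitrary, purely for illustration.)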

> > If the FreeBSD machine has a wireless card in it, if at all possible
> > could you try ruling that out by hooking up wired Ethernet instead?
> > It's probably not the cause, but worth trying anyway.  If you have a
> > home router or something doing 802.11, don't bother with this idea.
> 
> The FreeBSD box is wired, although it's using an re card as the em card died(!!).
> 
> The OSX box is connected via an Airport Express (11n).

Can you connect something to it via Ethernet and attempt an FTP transfer
(both PUT (store on server) and GET (retrieve from server)) from a
client on the wired network?  Make sure whatever you're PUT'ing and
GET'ing are using the ZFS filesystem.  Don't forget "binary" mode too.

You should see very good performance on files that are already in the
ARC.  So for example, pick a 500MB ISO file that hasn't been accessed
previously (thus isn't in the ARC).  GET'ing it should result in a lot
of disk I/O (zpool iostat -v 1).  But a subsequent GET should show very
little disk I/O, as all the data should be coming from memory (ARC).  A
PUT would test the write path.
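
A rough sketch of the client side (hostname and filename here are made
up):

  ftp freebsdbox.example.org
  ftp> binary
  ftp> put test-500MB.iso        <- write test
  ftp> get test-500MB.iso        <- read test; run it twice to see the
                                    ARC effect

with "zpool iostat -v 1" running in another terminal on the FreeBSD box.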

Basically what I'm trying to figure out here is whether the network
layer is somehow causing these problems for you.  Wireless is simply too
unreliable and too erratic in packet loss and latency to be a good
medium for testing filesystem throughput.  Period.

> > Next, you COULD try using Samba/CIFS on the FreeBSD box to see if you
> > can narrow the issue down to bad NFS performance.  Please see this post
> > of mine about tuning Samba on FreeBSD (backed by ZFS) to get extremely
> > good performance.  Many people responded and said their performance
> > drastically improved (you can see the thread yourself).  The trick is
> > AIO.  You can ignore the part about setting vm.kmem_size in loader.conf;
> > that advice is now old/deprecated (does not pertain to you given the
> > date of your kernel), and vfs.zfs.txg.write_limit_override is something
> > you shouldn't mess with unless absolutely needed; leave it at its default:
> > 
> > http://lists.freebsd.org/pipermail/freebsd-stable/2011-February/061642.html
> 
> OK.  I don't think TM can use CIFS.  I will try iSCSI, as someone else suggested; perhaps it will help.
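
If you do end up revisiting Samba, the AIO trick from that post boils
down to roughly the following (values here are only examples; see the
linked message for the actual discussion), plus loading the aio kernel
module:

  # /boot/loader.conf
  aio_load="YES"

  # smb.conf, [global] section
  aio read size = 16384
  aio write size = 16384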

Be aware there are all sorts of caveats/complexities with iSCSI on
FreeBSD.  There are past threads on -stable and -fs talking about them
in great detail.  I personally wouldn't go this route.

Why can't OS X use CIFS?  It has the ability to mount an SMB filesystem,
right?  Is there some reason you can't mount that, then tell TM to write
its backups to /mountedcifs?
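
I haven't touched OS X myself, but from what I've read, something along
these lines should mount a share there (server, share, and mount point
names are made up), and there's reportedly a defaults key needed before
TM will accept a network volume it doesn't officially support:

  mkdir /Volumes/tmbackup
  mount_smbfs //user@freebsdbox/backups /Volumes/tmbackup
  defaults write com.apple.systempreferences TMShowUnsupportedNetworkVolumes 1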

> > Finally, when was the last time this FreeBSD machine was rebooted?  Some
> > people have seen horrible performance that goes away after a reboot.
> > There's some speculation that memory fragmentation has something to do
> > with it.  I simply don't know.  I'm not telling you to reboot the box
> > (please don't; it would be more useful if it could be kept up in case
> > folks want to do analysis of it).
> 
> I think performance does improve after a reboot :(

This could be a memory performance or fragmentation problem, then.  Gosh,
it's been a long time since I've read about that.  Some FreeBSD folks
were aware of it, and I think someone came up with a patch for it, but
I'm not sure.  I wish I could remember the name of the developer who was
talking about it.  Artem Belevich, maybe?

The other problem a user had pertaining to ZFS and memory performance
was even more odd, but was eventually tracked down.  He had installed
two extra DIMMs in his machine (making a total of 4) and suddenly memory
performance was abysmal.  Remove the new DIMMs and performance was
restored.  Put the two new DIMMs back in and take the previously-working
ones out, and performance was also fine.  If I remember right, the issue
turned out to be a bug in the BIOS, and a BIOS upgrade from Intel (it
was an Intel motherboard) fixed it.  Intel is one of the few companies
that release *very* concise and decent changelogs for their BIOSes,
which is wonderful.

> last pid: 16112;  load averages:  0.24,  0.22,  0.23                     up 8+16:11:50  09:43:19
> 653 processes: 1 running, 652 sleeping
> CPU:  3.6% user,  0.0% nice,  3.4% system,  0.6% interrupt, 92.5% idle
> Mem: 1401M Active, 578M Inact, 4143M Wired, 4904K Cache, 16M Buf, 1658M Free
> Swap: 4096M Total, 160M Used, 3936M Free, 3% Inuse
> 
> although free does go down very low (~250MB) at times.

This is normal.  The ZFS ARC has most of your memory (shown as "Wired"
in the above top output).  If something needs memory, parts of the ARC
will be released/freed given memory pressure.

I will note something, however: your ARC max is set to 3072MB, yet Wired
is around 4143MB.  Do you have something running on this box that takes
up a lot of RAM?  mysqld, etc.?  I'm trying to account for the "extra
gigabyte" in Wired.  "top -o res" might help here, but we'd need to see
the process list.

I'm thinking something else on your machine is also taking up Wired,
because your arcstats shows:

> kstat.zfs.misc.arcstats.c: 3221225472
> kstat.zfs.misc.arcstats.c_min: 402653184
> kstat.zfs.misc.arcstats.c_max: 3221225472
> kstat.zfs.misc.arcstats.size: 3221162968

That's about 3072MB (there is always some degree of variance), so the
ARC itself is staying within its configured limit.
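
A quick way to eyeball the gap (v_wire_count is in pages, 4096 bytes
each on i386/amd64):

  sysctl vm.stats.vm.v_wire_count kstat.zfs.misc.arcstats.size

Wired bytes ~= v_wire_count * 4096; subtract arcstats.size and whatever
remains is wired memory that isn't the ARC.  In your case that's roughly
4143MB - 3072MB ~= 1GB to account for.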

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |



