Date:      Fri, 29 Apr 2011 09:43:47 +0930
From:      "Daniel O'Connor" <doconnor@gsoft.com.au>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        freebsd-stable List <freebsd-stable@freebsd.org>
Subject:   Re: ZFS vs OSX Time Machine
Message-ID:  <AF725CFF-86A4-4D65-A26E-496F6B9BD33E@gsoft.com.au>
In-Reply-To: <20110428195601.GA31807@icarus.home.lan>
References:  <537A8F4F-A302-40F9-92DF-403388D99B4B@gsoft.com.au> <20110428195601.GA31807@icarus.home.lan>

On 29/04/2011, at 5:26, Jeremy Chadwick wrote:
>> I have the following ZFS related tunables
>>
>> vfs.zfs.arc_max="3072M"
>> vfs.zfs.prefetch_disable="1"
>> vfs.zfs.txg.timeout=5
>> vfs.zfs.cache_flush_disable=1
>
> Are the last two actually *working* in /boot/loader.conf?  Can you
> verify by looking at them via sysctl?  AFAIK they shouldn't work, since
> they lack double-quotes around the values.  Parsing errors are supposed
> to throw you back to the loader prompt.  See loader.conf(5) for the
> syntax.

Yep, they're working :)
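
For completeness, reading them back is just this (and the values do match
what's in loader.conf):

% sysctl vfs.zfs.txg.timeout vfs.zfs.cache_flush_disable
vfs.zfs.txg.timeout: 5
vfs.zfs.cache_flush_disable: 1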

> I'm also not sure why you're setting cache_flush_disable at all.

I think I was wondering if it would help the abysmal write performance of these disks...

>> Any help appreciated, thanks :)
>
> Others seem to be battling over the claim that "NFS doesn't work for
> TM", but that isn't what you're complaining about.  You're complaining
> that FreeBSD with ZFS + NFS performs extremely poorly when trying to
> do backups from an OS X client using TM (writing to the NFS mount).

Yes, and also TM is over AFP, not NFS (I forgot to mention that...)

> I have absolutely no experience with TM or OS X, so if it's actually a
> client-level problem (which I'm doubting) I can't help you there.
>
> Just sort of a ramble here about different things...
>
> It would be useful to provide ZFS ARC sysctl data from the FreeBSD
> system where you're seeing performance issues.  "sysctl -a
> kstat.zfs.misc.arcstats" should suffice.

kstat.zfs.misc.arcstats.hits: 236092077
kstat.zfs.misc.arcstats.misses: 6451964
kstat.zfs.misc.arcstats.demand_data_hits: 98087637
kstat.zfs.misc.arcstats.demand_data_misses: 1220891
kstat.zfs.misc.arcstats.demand_metadata_hits: 138004440
kstat.zfs.misc.arcstats.demand_metadata_misses: 5231073
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 0
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 0
kstat.zfs.misc.arcstats.mru_hits: 15041670
kstat.zfs.misc.arcstats.mru_ghost_hits: 956048
kstat.zfs.misc.arcstats.mfu_hits: 221050407
kstat.zfs.misc.arcstats.mfu_ghost_hits: 3269042
kstat.zfs.misc.arcstats.allocated: 15785717
kstat.zfs.misc.arcstats.deleted: 4690878
kstat.zfs.misc.arcstats.stolen: 4990300
kstat.zfs.misc.arcstats.recycle_miss: 2142423
kstat.zfs.misc.arcstats.mutex_miss: 518
kstat.zfs.misc.arcstats.evict_skip: 2251705
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_l2_eligible: 470396116480
kstat.zfs.misc.arcstats.evict_l2_ineligible: 2048
kstat.zfs.misc.arcstats.hash_elements: 482679
kstat.zfs.misc.arcstats.hash_elements_max: 503063
kstat.zfs.misc.arcstats.hash_collisions: 19593315
kstat.zfs.misc.arcstats.hash_chains: 116103
kstat.zfs.misc.arcstats.hash_chain_max: 16
kstat.zfs.misc.arcstats.p: 1692798721
kstat.zfs.misc.arcstats.c: 3221225472
kstat.zfs.misc.arcstats.c_min: 402653184
kstat.zfs.misc.arcstats.c_max: 3221225472
kstat.zfs.misc.arcstats.size: 3221162968
kstat.zfs.misc.arcstats.hdr_size: 103492088
kstat.zfs.misc.arcstats.data_size: 2764591616
kstat.zfs.misc.arcstats.other_size: 353079264
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 19
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 1
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
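
(For what it's worth, that works out to an overall ARC hit rate of

  hits / (hits + misses) = 236092077 / 242544041 ~= 97.3%

so the cache itself looks healthy enough.)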

> You should also try executing "zpool iostat -v 1" during the TM backup
> to see if there's a particular device which is behaving poorly.  There
> have been reports of ZFS pools behaving poorly when a single device
> within the pool has slow I/O (e.g. 5 hard disks, one of which has
> internal issues, resulting in the entire pool performing horribly).
> You should let this run for probably 60-120 seconds to get an idea.
> Given your parameters above (assuming vfs.zfs.txg.timeout IS in fact
> 5!), you should see "bursts" of writes every 5 seconds.

OK.
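
For reference, the per-vdev output of "zpool iostat -v 1" looks roughly
like this ("tank" and the raidz/ada names here are placeholders, not my
actual pool):

               capacity     operations    bandwidth
pool         alloc   free   read  write   read  write
-----------  -----  -----  -----  -----  -----  -----
tank          ...    ...    ...    ...    ...    ...
  raidz1      ...    ...    ...    ...    ...    ...
    ada0      ...    ...    ...    ...    ...    ...
    ada1      ...    ...    ...    ...    ...    ...

The tell-tale would be one device whose write ops/bandwidth sit well
below its siblings' while the pool stalls.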

> I know that there are some things on ZFS that perform badly overall.
> Anything that involves excessive/large numbers of files (not file
> sizes, but actual files themselves) seems to perform not-so-great with
> ZFS.  For example, Maildir on ZFS = piss-poor performance.  There are
> ways to work around this issue (if I remember correctly, by adding a
> dedicated "log" device to your ZFS pool, but be aware your log devices
> need to be reliable (if you have a single log device and it fails the
> entire pool is damaged, if I remember right)), but I don't consider it
> feasible.  So if TM is creating tons of files on the NFS mount (backed
> by ZFS), then I imagine the performance isn't so great.

Hmm, the sparse disk image does have ~80000 files in a single directory...
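
If I do end up trying a dedicated log device, my understanding is it's a
one-liner (pool and device names hypothetical, and mirrored because of
the failure caveat above):

# zpool add tank log mirror ada4 ada5

although since TM here goes over AFP rather than sync-heavy NFS, I'm not
sure how much a separate log would actually buy.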

> Could you please provide the following sysctl values?  Thanks.
>
> kern.maxvnodes
> kern.minvnodes
> vfs.freevnodes
> vfs.numvnodes

kern.maxvnodes: 204477
kern.minvnodes: 51119
vfs.freevnodes: 51118
vfs.numvnodes: 66116
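
(numvnodes is comfortably under maxvnodes, so the vnode cache doesn't
look like the ceiling; if it were, I gather it can be raised on the fly,
e.g. "sysctl kern.maxvnodes=400000".)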

> If the FreeBSD machine has a wireless card in it, if at all possible
> could you try ruling that out by hooking up wired Ethernet instead?
> It's probably not the cause, but worth trying anyway.  If you have a
> home router or something doing 802.11, don't bother with this idea.

The FreeBSD box is wired, although it's using an re(4) card as the em(4) card died (!!).

The OSX box is connected via an Airport Express (11n).

> Next, you COULD try using Samba/CIFS on the FreeBSD box to see if you
> can narrow the issue down to bad NFS performance.  Please see this
> post of mine about tuning Samba on FreeBSD (backed by ZFS) to get
> extremely good performance.  Many people responded and said their
> performance drastically improved (you can see the thread yourself).
> The trick is AIO.  You can ignore the part about setting vm.kmem_size
> in loader.conf; that advice is now old/deprecated (does not pertain to
> you given the date of your kernel), and
> vfs.zfs.txg.write_limit_override is something you shouldn't mess with
> unless absolutely needed; leave it at the default:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2011-February/061642.html

OK. I don't think TM can use CIFS; I will try iSCSI as someone else suggested, perhaps it will help.
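
For anyone who can use CIFS, my reading of that thread is that the AIO
trick amounts to loading the aio kernel module and setting the AIO
thresholds in smb.conf, something like the below (the 16384 byte values
are ones I've seen suggested, not tested here):

# kldload aio        (or put aio_load="YES" in /boot/loader.conf)

[global]
    aio read size = 16384
    aio write size = 16384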

> Finally, when was the last time this FreeBSD machine was rebooted?
> Some people have seen horrible performance that goes away after a
> reboot.  There's some speculation that memory fragmentation has
> something to do with it.  I simply don't know.  I'm not telling you to
> reboot the box (please don't; it would be more useful if it could be
> kept up in case folks want to do analysis of it).

I think performance does improve after a reboot :(

top looks like...
last pid: 16112;  load averages:  0.24,  0.22,  0.23    up 8+16:11:50  09:43:19
653 processes: 1 running, 652 sleeping
CPU:  3.6% user,  0.0% nice,  3.4% system,  0.6% interrupt, 92.5% idle
Mem: 1401M Active, 578M Inact, 4143M Wired, 4904K Cache, 16M Buf, 1658M Free
Swap: 4096M Total, 160M Used, 3936M Free, 3% Inuse

although free memory does drop very low (~250MB) at times.

--
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C