Date:      Tue, 22 Jul 2008 13:57:27 -0700
From:      Matt Simerson <matt@corp.spry.com>
To:        freebsd-fs@freebsd.org
Cc:        pjd@freebsd.org
Subject:   ZFS hang issue and prefetch_disable
Message-ID:  <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com>

Symptoms

Deadlocks under heavy I/O load on the ZFS file system with
vfs.zfs.prefetch_disable=0. Setting vfs.zfs.prefetch_disable=1 results
in a stable system.

Configuration

Two machines. Identically built. Both exhibit identical behavior.
8 cores (2 x E5420) x 2.5GHz, 16 GB RAM, 24 x 1TB disks.
FreeBSD 7.0 amd64
dmesg: http://matt.simerson.net/computing/zfs/dmesg.txt

Boot disk is a read-only 1GB compact flash card
# cat /etc/fstab
/dev/ad0s1a  / ufs  ro,noatime  2 2

# df -h /
Filesystem  1K-blocks   Used  Avail Capacity  Mounted on
/dev/ad0s1a    939M    555M    309M    64%    /

Kernel memory has been boosted as suggested in the ZFS Tuning Guide
# cat /boot/loader.conf
vm.kmem_size= 1610612736
vm.kmem_size_max= 1610612736
vfs.zfs.prefetch_disable=1

I haven't mucked much with the other memory settings as I'm using
amd64 and, according to the FreeBSD ZFS wiki, that isn't necessary.
I've tried higher settings for kmem but that resulted in a failed
boot. I have ample RAM and would love to use as much as possible for
network and disk I/O buffers, as that's principally all this system does.
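
As a quick sanity check on the loader.conf value above, 1610612736 bytes works out to 1.5 GiB, a small fraction of the 16 GB installed. A minimal sketch of the arithmetic in Python (the helper name bytes_to_gib is made up for illustration):

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


KMEM_SIZE = 1610612736  # vm.kmem_size from /boot/loader.conf above

print(bytes_to_gib(KMEM_SIZE))  # 1.5
```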

Disks & ZFS options

Sun's "Best Practices" suggests limiting the number of disks in a
raidz pool to no more than 6-10, IIRC. ZFS is configured as shown:
http://matt.simerson.net/computing/zfs/zpool.txt

I'm using all of the ZFS default properties except: atime=off,  
compression=on.

Environment

I'm using these machines as backup servers. I wrote an application
that generates a list of the thousands of VPS accounts we host. For
each host, it generates an rsnapshot configuration file and backs up
their VPS to these systems via rsync. The application manages
concurrency and will spawn additional rsync processes if system I/O
load is below a defined threshold. Which is to say, I can crank up
or down the amount of network and disk I/O the system sees.
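
The throttling described above can be sketched roughly as follows. This is an illustration only, not the actual application: the function name, its parameters, and the one-job-at-a-time ramp-up are all assumptions, and the I/O measurement is assumed to come from elsewhere (for example, a figure read from iostat).

```python
def jobs_to_start(pending, running, current_iops, iops_threshold, max_procs):
    """Return how many new backup jobs to launch right now (0 or 1).

    Hypothetical policy: start one more job only while there is work
    queued, the process cap has not been reached, and the measured
    I/O load is below the threshold.
    """
    if pending <= 0 or running >= max_procs:
        return 0
    if current_iops >= iops_threshold:
        return 0
    return 1  # ramp up one process at a time, then re-measure


# Load below threshold and room for more processes: start one job.
print(jobs_to_start(pending=10, running=3, current_iops=150,
                    iops_threshold=200, max_procs=8))  # 1

# Load at or above threshold: hold off.
print(jobs_to_start(pending=10, running=3, current_iops=250,
                    iops_threshold=200, max_procs=8))  # 0
```

Raising or lowering iops_threshold and max_procs in a scheme like this is what turns the load on the system up or down.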

With vfs.zfs.prefetch_disable=0, a hang will occur within a few hours
(no more than a day). If I keep the I/O load (measured via iostat)
down to a low level (< 200 IOPS), I still get hangs, but less
frequently (every 1-6 days). The only way I have found to prevent the
hangs is by setting vfs.zfs.prefetch_disable=1.

Matt Simerson



