Date: Fri, 24 Oct 2008 10:09:16 -0500
From: Dan Nelson <dnelson@allantgroup.com>
To: Danny Braniss <danny@cs.huji.ac.il>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: zfs & waiting on zio->io_cv
Message-ID: <20081024150916.GB41283@dan.emsphone.com>
In-Reply-To: <E1KtIbt-000PhA-HW@cs1.cs.huji.ac.il>
References: <E1KtIbt-000PhA-HW@cs1.cs.huji.ac.il>

In the last episode (Oct 24), Danny Braniss said:
> there is a big delay (probably more than 1 sec.) when doing simple tasks
> on this zfs, like ls(1), or 'zfs list', long enough to hit ^T
> and get the same [zio->io_cv)], any hints?
>
> store-01# zfs list
> (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1672k
> (hitting ^T)load: 0.00  cmd: zfs 88376 [zio->io_cv)] 0.00u 0.00s 0% 1684k
> NAME              USED  AVAIL  REFER  MOUNTPOINT
> h                 472G  11.2T    23K  /h
> h/home            466G  11.2T   466G  /h/home
> h/home@23-10-08    54K      -   466G  -
> h/root             18K  11.2T    18K  /h/root
> h/src              18K  11.2T    18K  /h/src
> h/system         5.64G  11.2T  5.64G  /h/system

That's roughly the equivalent of waiting in "biord" on a UFS
filesystem, I think.  ZFS is just waiting for the disk to return a
block.  If you happen to do something during the window where ZFS is
committing its transaction group, it has to wait until the sync
finishes.  If some other process is doing a lot of writes, or you only
have one disk in your zpool, or your pool is close to full, it may take
a couple of seconds to sync.

There are a couple of things you can try to improve interactive
performance.

Raising zfs's arc_max is the easiest to do, and will let ZFS cache more
data, increasing the likelihood that an "ls" will be able to read from
cache instead of having to go to disk.  Setting it to 1/4 of your
physical RAM is probably as high as you can go without causing panics.

Raising txg_time ( in /sys/cddl/.../zfs/txg.c ) from 5 to, say, 30 will
tell zfs to sync less often, which can be a win if you don't actually
do that much writing.  With a single spindle, it may take a substantial
fraction of a second just to sync a tiny txg, due to the number of
copies of metadata ZFS writes for redundancy.

If you do a lot of writing, lowering zfs_vdev_max_pending ( in
/sys/cddl/.../zfs/vdev_queue.c ) from 35 down to 16 or less will reduce
the number of simultaneous I/Os ZFS will try to send to each disk,
which will let your reads compete a little better with other I/O.  On
ATA or SATA disks, you might want to set it to 2.

-- 
	Dan Nelson
	dnelson@allantgroup.com
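
As a rough illustration of the arc_max suggestion above, the following Python sketch computes one quarter of a given amount of physical RAM and formats it as a loader tunable line. The tunable name vfs.zfs.arc_max, the idea of placing the line in /boot/loader.conf, and the function names are assumptions made for this example; the message itself only says "arc_max" and "1/4 your physical RAM". The RAM size is passed in by hand here; on a real system it would have to come from something like the hw.physmem sysctl.

```python
# Sketch only: turns the "1/4 of physical RAM" rule of thumb into a number.
# The tunable name and the loader.conf format are assumptions for
# illustration; they are not stated in the original message.

def suggested_arc_max(physmem_bytes: int, divisor: int = 4) -> int:
    """Return physmem_bytes / divisor, rounded down, as a byte count."""
    if physmem_bytes <= 0:
        raise ValueError("physmem_bytes must be positive")
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return physmem_bytes // divisor


def loader_conf_line(physmem_bytes: int) -> str:
    """Format the suggested value as a loader.conf style assignment."""
    return 'vfs.zfs.arc_max="{}"'.format(suggested_arc_max(physmem_bytes))


if __name__ == "__main__":
    # Hypothetical machine with 8 GiB of RAM.
    print(loader_conf_line(8 * 1024 ** 3))
```

The 1/4 figure is the upper bound suggested in the message, not a recommendation to always go that high; a smaller value can be tried by passing a larger divisor.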