From: Kevin <kevin@your.org>
To: freebsd-fs@freebsd.org
Date: Wed, 2 Dec 2009 14:55:23 -0600
Subject: "zfs receive" lock time

I have two very fast systems (12-disk 15krpm RAID array, 16 cores, etc.). I'm using zfs send/receive to replicate a ZFS volume from the "master" box to the "slave" box. Every minute, the master takes a new snapshot, then uses "send -i" to send an incremental stream to the slave (the cycle is sketched below). Normally, no files have changed during the minute, so the operation is very fast (<1 second, and most of that is ssh negotiation time).
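Roughly, each minute's cycle looks like the following. This is a sketch, not the exact script: the dataset name, host name, and snapshot labels are placeholders; only the snapshot / "send -i" / ssh / "zfs receive" shape comes from the setup described above.

#!/bin/sh
# Sketch of the per-minute replication cycle. DATASET, SLAVE, and the
# timestamp snapshot labels are placeholders, not the real names.
DATASET="tank/data"
SLAVE="slave.example.org"

PREV=$(date -v-1M "+%Y%m%d%H%M")   # previous minute's snapshot label
NOW=$(date "+%Y%m%d%H%M")

# Take the new snapshot on the master...
zfs snapshot "${DATASET}@${NOW}"

# ...then send only the delta since the previous snapshot. The slave
# applies it with "zfs receive", which is where the long lock shows up.
zfs send -i "${DATASET}@${PREV}" "${DATASET}@${NOW}" | \
    ssh "${SLAVE}" zfs receive "${DATASET}"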
If the slave is completely idle, "zfs receive" takes a fraction of a second. If the slave has been very busy (lots of read activity, no writes; the slave has everything mounted read-only), "zfs receive" can suddenly take 30 seconds or more to complete, and it holds the filesystem locked the whole time. For example, I'd see:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
49345 root        1  76    0 13600K  1956K zio->i 9   0:01  1.37% zfs
48910 www         1  46    0 36700K 21932K rrl->r 3   0:24  0.00% lighttpd
48913 www         1  46    0 41820K 26108K rrl->r 2   0:24  0.00% lighttpd
48912 www         1  46    0 37724K 23484K rrl->r 0   0:24  0.00% lighttpd
48911 www         1  46    0 41820K 26460K rrl->r 10  0:23  0.00% lighttpd
48909 www         1  46    0 39772K 24488K rrl->r 5   0:22  0.00% lighttpd
48908 www         1  46    0 36700K 21460K rrl->r 14  0:19  0.00% lighttpd
48907 www         1  45    0 30556K 16216K rrl->r 13  0:14  0.00% lighttpd
48906 www         1  44    0 26460K 11452K rrl->r 6   0:06  0.00% lighttpd

At first, I thought it was possibly cache pressure: when the system was busy, whatever data was necessary to create the new snapshot was getting pushed out of the cache, so it had to be re-read. I increased arc_max and arc_meta_limit to very high values, and it seemed to have no effect, even when arc_meta_used was far below arc_meta_limit. Disabling cache flushes had no impact. Disabling the ZIL cut the time in half, but it's still too long for this application. (The tunables involved are sketched at the end of this message.)

ktrace on the "zfs receive" shows:

1062 zfs 0.000024 CALL  ioctl(0x3,0xcc285a11,0x7fffffffa320)
1062 zfs 0.000081 RET   ioctl 0
1062 zfs 0.000058 CALL  ioctl(0x3,0xcc285a05,0x7fffffffa2f0)
1062 zfs 0.000037 RET   ioctl 0
1062 zfs 0.000019 CALL  ioctl(0x3,0xcc285a11,0x7fffffffa320)
1062 zfs 0.000055 RET   ioctl 0
1062 zfs 0.000031 CALL  ioctl(0x3,0xcc285a11,0x7fffffff9f00)
1062 zfs 0.000053 RET   ioctl 0
1062 zfs 0.000020 CALL  ioctl(0x3,0xcc285a1c,0x7fffffffc930)
1062 zfs 24.837084 RET  ioctl 0
1062 zfs 0.000028 CALL  ioctl(0x3,0xcc285a11,0x7fffffff9f00)
1062 zfs 0.000074 RET   ioctl 0
1062 zfs 0.000037 CALL  close(0x6)
1062 zfs 0.000006 RET   close 0
1062 zfs 0.000007 CALL  close(0x3)
1062 zfs 0.000005 RET   close 0

The 24-second call to 0xcc285a1c is ZFS_IOC_RECV (decoded in the sketch below), so whatever is going on is happening in the kernel; it's not a delay in getting the kernel any data. "systat" shows the drives are 100% busy during the operation, so it's obviously doing something. :)

Does anyone know what "zfs receive" is doing while it has everything locked like this, and why a lot of read activity beforehand would so drastically affect its performance?

-- Kevin
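For reference, the tuning described above maps to the FreeBSD 8.x-era knobs below. A sketch only: the sizes are illustrative placeholders rather than the values actually used, and exact names vary by release.

# /boot/loader.conf (sizes are placeholders, not the values actually used)
vfs.zfs.arc_max="12G"            # raise the overall ARC cap
vfs.zfs.arc_meta_limit="4G"      # raise the metadata limit within the ARC
vfs.zfs.cache_flush_disable=1    # "disabling cache flushes"
vfs.zfs.zil_disable=1            # "disabling zil" (removed in later releases)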
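For reference, how 0xcc285a1c relates to ZFS_IOC_RECV: FreeBSD packs the direction, argument length, ioctl group, and command index into that one 32-bit word, so it can be unpacked with plain sh arithmetic. The ZFS_IOC_RECV identification is from the message above; the sketch only shows how the word is put together.

#!/bin/sh
# Unpack the slow ioctl word from the ktrace output above.
cmd=0xcc285a1c
echo "command index: $(( cmd & 0xff ))"            # 28 -> ZFS_IOC_RECV here
echo "ioctl group:   $(( (cmd >> 8) & 0xff ))"     # 90 == 'Z', the ZFS group
echo "arg length:    $(( (cmd >> 16) & 0x1fff ))"  # sizeof(zfs_cmd_t) bytes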