From: Royce Williams
Date: Sun, 27 Feb 2011 09:30:38 -0900
To: freebsd-fs@freebsd.org
Subject: slow "zfs destroy snapshot" with predictable time pattern

I thought that 'zfs destroy snapshot' should be fast (on the order of
a few seconds), but mine are taking a predictably long time on a pretty
modest filesystem (details below).

I discovered this when a typo caused many more snapshots than I intended
(every minute!); I had about 12,000 of them before I noticed.

Destroying the first snapshot took about 39 wallclock seconds on an
otherwise idle system. A few more destroys took almost exactly the same
amount of time.

I know little about ZFS under the hood, but I wanted to investigate a
little bit. I scripted a loop of 'time zfs destroy snapshot' and let it
run overnight. Each destroy was consistently taking 37-40 seconds, but
then, after hundreds of deletions in that time range, I saw a jagged
spike, followed by a consistent drop that has stayed in the 23-25s range:

[hours of 38-39s destroys snipped]
real    0m38.205s
real    0m38.455s
real    0m38.580s
real    0m37.414s
real    0m35.330s  <-- small drop here
real    0m35.347s
real    0m35.380s
real    0m35.355s
real    0m35.255s
real    0m35.514s
real    0m35.422s
real    0m35.464s
real    0m46.121s  <-- small spike here
real    0m44.630s
real    0m46.021s
real    1m19.443s  <-- big spike here
real    0m40.896s
real    0m22.848s  <-- drop into the 20s range
real    0m29.039s
real    0m29.831s
real    0m26.348s
real    0m22.623s
real    0m29.314s
real    0m29.589s
real    0m26.573s
real    0m22.773s
[hours of 23-25s destroys snipped]

With that caveat about my ZFS knowledge, this model might fit the facts:

* Normally, 'zfs destroy snapshot' is fast (on the order of a few
  seconds);

* 'zfs destroy snapshot' has to briefly analyze all snapshots prior to
  destruction;

* A particular 'problem' snapshot can slow that full analysis by a
  consistent amount of time;

* Destroying that 'problem' snapshot drops the analysis time by that
  amount.

If my model is correct, I'm going to see one or more spikes, followed by
corresponding drops, until the destroys return to a reasonable rate.
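
One way to check a timing log against this model is to parse the 'real' lines printed by 'time' and group consecutive destroys into plateaus of similar duration; each new plateau then marks a spike or a drop. What follows is only a minimal Python sketch, not the script used for the overnight run. The function names parse_real_times and plateaus, the 2-second tolerance, and the short sample log are assumptions chosen for illustration.

```python
import re
from statistics import median

# Matches the "real XmY.YYYs" lines printed by the shell's `time` builtin.
REAL_RE = re.compile(r"real\s+(\d+)m([\d.]+)s")


def parse_real_times(text):
    """Return wallclock seconds for every 'real XmY.YYYs' entry in text."""
    return [int(minutes) * 60 + float(seconds)
            for minutes, seconds in REAL_RE.findall(text)]


def plateaus(times, tolerance=2.0):
    """Group consecutive samples into runs of similar duration.

    A sample joins the current run if it is within `tolerance` seconds of
    that run's median; otherwise it starts a new run. The tolerance value
    is an arbitrary assumption, not something derived from ZFS behaviour.
    """
    groups = []
    for t in times:
        if groups and abs(t - median(groups[-1])) <= tolerance:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups


if __name__ == "__main__":
    # A few values copied from the log above, as a stand-in for the full file.
    sample_log = """
    real 0m38.205s
    real 0m38.455s
    real 0m38.580s
    real 0m35.330s
    real 0m35.347s
    real 1m19.443s
    real 0m22.848s
    real 0m22.623s
    """
    for run in plateaus(parse_real_times(sample_log)):
        print(f"{len(run):4d} destroys around {median(run):.1f}s")
```

A long run followed by a one-off outlier and then a lower run would be consistent with the "problem snapshot" model; many short, irregular runs would argue against it.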

The poster in the following thread had a problem that might also fit
that model -- that particular snapshots can be very slow to destroy, and
removing them removes the time delay:

http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg07647.html

That thread notes that it was due to a low-memory condition, and
OpenSolaris bug 6542681 was filed for it. I do not think that my problem
is caused by low memory.

I have stopped the destroys in case the remaining 'problem' snapshot is
useful.

The system is 8.1-SECURITY, amd64, 4GB RAM, no sysctl or loader tweaks,
ZFS v3, zpool v14, single 58GB ZFS pool.

# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
atoz-backup      15.4G  58.0G  25.5K  /atoz-backup
atoz-backup/usr  15.3G  58.0G  14.8G  /atoz-backup/usr

# df -ki | egrep 'atoz|Filesystem'
Filesystem       1024-blocks     Used    Avail Capacity  iused     ifree %iused  Mounted on
atoz-backup         60789979       25 60789953     0%        6 121579907    0%   /atoz-backup
atoz-backup/usr     76281655 15491701 60789953    20%   714124 121579907    1%   /atoz-backup/usr

Royce