From: Steven Hartland
To: freebsd-fs@freebsd.org
Subject: Re: zfs_trim_enabled destroys zio_free() performance
Date: Sun, 13 Sep 2015 14:03:53 +0100
Message-ID: <55F57439.8060000@multiplay.co.uk>
In-Reply-To: <55F308B7.3020302@FreeBSD.org>
References: <55F308B7.3020302@FreeBSD.org>

Do you remember if this was causing a deadlock or something similar
that's easy to provoke?

    Regards
    Steve

On 11/09/2015 18:00, Alexander Motin wrote:
> Hi.
>
> The code in question was added by me at r253992. Commit message tells it
> was made to decouple locks. I don't remember much more details, but
> maybe it can be redone somehow else.
>
> On 11.09.2015 19:07, Matthew Ahrens wrote:
>> I discovered that when destroying a ZFS snapshot, we can end up using
>> several seconds of CPU via this stack trace:
>>
>>               kernel`spinlock_exit+0x2d
>>               kernel`taskqueue_enqueue+0x12c
>>               zfs.ko`zio_issue_async+0x7c
>>               zfs.ko`zio_execute+0x162
>>               zfs.ko`dsl_scan_free_block_cb+0x15f
>>               zfs.ko`bpobj_iterate_impl+0x25d
>>               zfs.ko`bpobj_iterate_impl+0x46e
>>               zfs.ko`dsl_scan_sync+0x152
>>               zfs.ko`spa_sync+0x5c1
>>               zfs.ko`txg_sync_thread+0x3a6
>>               kernel`fork_exit+0x9a
>>               kernel`0xffffffff80d0acbe
>>              6558 ms
>>
>> This is not good for performance since, in addition to the CPU cost, it
>> doesn't allow the sync thread to do anything else, and this is
>> observable as periods where we don't do any write i/o to disk for
>> several seconds.
>>
>> The problem is that when zfs_trim_enabled is set (which it is by
>> default), zio_free_sync() always sets ZIO_STAGE_ISSUE_ASYNC, causing the
>> free to be dispatched to a taskq.  Since each task completes very
>> quickly, there is a large locking and context switching overhead -- we
>> would be better off just processing the free in the caller's context.
>>
>> I'm not sure exactly why we need to go async when trim is enabled, but
>> it seems like at least we should not bother going async if trim is not
>> actually being used (e.g. with an all-spinning-disk pool).  It would
>> also be worth investigating not going async even when trim is useful
>> (e.g. on SSD-based pools).
>>
>> Here is the relevant code:
>>
>> zio_free_sync():
>>     if (zfs_trim_enabled)
>>         stage |= ZIO_STAGE_ISSUE_ASYNC | ZIO_STAGE_VDEV_IO_START |
>>             ZIO_STAGE_VDEV_IO_ASSESS;
>>     /*
>>      * GANG and DEDUP blocks can induce a read (for the gang block header,
>>      * or the DDT), so issue them asynchronously so that this thread is
>>      * not tied up.
>>      */
>>     else if (BP_IS_GANG(bp) || BP_GET_DEDUP(bp))
>>         stage |= ZIO_STAGE_ISSUE_ASYNC;
>>
>> --matt
>
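
For anyone following along, here is a rough, self-contained sketch of the idea floated in the quoted message: only take the async path when trim is both enabled and actually in use by the pool. This is not the FreeBSD code and not a proposed patch. The stage bits, struct free_req, the pool_uses_trim flag and free_stage() are all hypothetical names made up for illustration; how the real code would learn whether a pool uses trim is left open, as it is in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ZIO stage bits; values are illustrative. */
enum {
	STAGE_FREE_BASE      = 1u << 0,	/* whatever a free always needs */
	STAGE_ISSUE_ASYNC    = 1u << 1,	/* hand the zio to a taskq */
	STAGE_VDEV_IO_START  = 1u << 2,
	STAGE_VDEV_IO_ASSESS = 1u << 3
};

/* Hypothetical summary of the block pointer properties used below. */
struct free_req {
	bool is_gang;	/* stands in for BP_IS_GANG(bp) */
	bool is_dedup;	/* stands in for BP_GET_DEDUP(bp) */
};

/*
 * Choose the stages for a free.  The only change from the quoted logic
 * is the extra pool_uses_trim test (an assumption, not an existing
 * variable): without it, trim_enabled alone forces every free async.
 */
static uint32_t
free_stage(bool trim_enabled, bool pool_uses_trim, const struct free_req *req)
{
	uint32_t stage = STAGE_FREE_BASE;

	assert(req != NULL);

	if (trim_enabled && pool_uses_trim) {
		stage |= STAGE_ISSUE_ASYNC | STAGE_VDEV_IO_START |
		    STAGE_VDEV_IO_ASSESS;
	} else if (req->is_gang || req->is_dedup) {
		/* May induce a read, so keep the caller's thread free. */
		stage |= STAGE_ISSUE_ASYNC;
	}
	return (stage);
}
```

With this shape, a plain block on a pool that does not use trim is processed in the caller's context even when the trim tunable is on, while gang and dedup blocks still go async as before. Whether dropping the async dispatch is safe with respect to the lock decoupling mentioned for r253992 is exactly the open question in this thread; the sketch does not answer it.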