From owner-freebsd-fs@FreeBSD.ORG Mon Oct 28 21:56:18 2013
Message-ID: <526EDD81.8000109@delphij.net>
Date: Mon, 28 Oct 2013 14:56:17 -0700
From: Xin Li <delphij@delphij.net>
Reply-To: d@delphij.net
Organization: The FreeBSD Project
To: Slawa Olhovchenkov, d@delphij.net
Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org, Jordan Hubbard
Subject: Re: ZFS txg implementation flaw
In-Reply-To: <20131028214552.GY63359@zxy.spb.ru>
References: <20131028092844.GA24997@zxy.spb.ru> <9A00B135-7D28-47EB-ADB3-E87C38BAC6B6@ixsystems.com> <20131028213204.GX63359@zxy.spb.ru> <526ED956.10202@delphij.net> <20131028214552.GY63359@zxy.spb.ru>
Content-Type: text/plain; charset=ISO-8859-1
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 10/28/13 14:45, Slawa Olhovchenkov wrote:
> On Mon, Oct 28, 2013 at 02:38:30PM -0700, Xin Li wrote:
>>
>> On 10/28/13 14:32, Slawa Olhovchenkov wrote:
>>> On Mon, Oct 28, 2013 at 02:22:16PM -0700, Jordan Hubbard wrote:
>>>>
>>>> On Oct 28, 2013, at 2:28 AM, Slawa Olhovchenkov wrote:
>>>>
>>>>> As I see it, ZFS creates a separate thread for each txg write,
>>>>> and also for writing to L2ARC. As a result, up to several
>>>>> thousand threads are created and destroyed per second, plus
>>>>> hundreds of thousands of page allocations, zeroings, mappings,
>>>>> unmappings, and frees per second. Very high overhead.
>>>>
>>>> How are you measuring the number of threads being created /
>>>> destroyed? This claim seems erroneous given how the ZFS
>>>> thread pool mechanism actually works (and yes, there are
>>>> thread pools already).
>>>>
>>>> It would be helpful to see both your measurement methodology
>>>> and the workload you are using in your tests.
>>>
>>> Semi-indirect:
>>>
>>> dtrace -n 'fbt:kernel:vm_object_terminate:entry { @traces[stack()] = count(); }'
>>>
>>> After some (2-3) seconds:
>>>
>>>   kernel`vnode_destroy_vobject+0xb9
>>>   zfs.ko`zfs_freebsd_reclaim+0x2e
>>>   kernel`VOP_RECLAIM_APV+0x78
>>>   kernel`vgonel+0x134
>>>   kernel`vnlru_free+0x362
>>>   kernel`vnlru_proc+0x61e
>>>   kernel`fork_exit+0x11f
>>>   kernel`0xffffffff80cdbfde
>>>   2490
>
> 0xffffffff80cdbfd0:  mov    %r12,%rdi
> 0xffffffff80cdbfd3:  mov    %rbx,%rsi
> 0xffffffff80cdbfd6:  mov    %rsp,%rdx
> 0xffffffff80cdbfd9:  callq  0xffffffff808db560
> 0xffffffff80cdbfde:  jmpq   0xffffffff80cdca80
> 0xffffffff80cdbfe3:  nopw   0x0(%rax,%rax,1)
> 0xffffffff80cdbfe9:  nopl   0x0(%rax)
>
>>> I don't have user-process-created threads, nor fork/exit.
>>
>> This has nothing to do with fork/exit, but it does suggest that you
>> are running out of vnodes. What does sysctl -a | grep vnode say?
>
> kern.maxvnodes: 1095872
> kern.minvnodes: 273968
> vm.stats.vm.v_vnodepgsout: 0
> vm.stats.vm.v_vnodepgsin: 62399
> vm.stats.vm.v_vnodeout: 0
> vm.stats.vm.v_vnodein: 10680
> vfs.freevnodes: 275107
> vfs.wantfreevnodes: 273968
> vfs.numvnodes: 316321
> debug.sizeof.vnode: 504

Try setting vfs.wantfreevnodes to 547936 (double it).

Cheers,
- -- 
Xin LI <delphij@delphij.net>    https://www.delphij.net/
FreeBSD - The Power to Serve!   Live free or die
-----BEGIN PGP SIGNATURE-----

iQIcBAEBCgAGBQJSbt2BAAoJEJW2GBstM+nsknMP/1QQQ0BHJOu//nG2M2HnYGsQ
bS0he2xdom/GpPuMS3AwGYYwZTWwauGwr3c2K4czW5AzghNDxpVfycobuGeWVvcB
mvyBgkGhxy33nxVuw9hH4FJW62vJc9sJKlgg5QNQhER81OpCBS2AcVv7qNNtj9f6
svZrhu6X28maas+JnwSr5U82gudC1uhHD3h1pZqc+ogFiEgHlQOoL3Pl6SrpTKUZ
WNFnKd9xWQ/28n26r+jzQu9SlTSStKNQcZiCsMO/5TcGs6Ul8Ft2pS0EKYvVMdVF
poPLItT7qa38nM9BXZYNiESIoZpe1coYXX0en6NMTa0q7JerN05tk3d8q31Rn/Hp
toodJuZB8zA+ZN732s295G06j9gDbSj/iFLumV/0s9OHMVT5lgqVjxmPurmjE+ay
nnPrTDpO3Ef45nC6Gb87yN2ML2GG40de5kYWtieLFt5aSJhQjvmDA+zOxdC9orrh
raspOHfgysvSh8ykaS9SsNdzgEJr5TTzbxh91Ft06e65TEdIzX9HhnqxOLBT+lC1
E6OKYVuU1rLjZPPTplCFI922JbyKEhSc73Gu03zPma8cJEzP/ztCxm/Jv0PrV+4b
SzphVQdMbUr2TMKAUIJXcCwHSWhCCCCmqmODoDcHoTbC0kBAqyAbaTCZ8PJaR/A8
jxbZvQV8dGjSYu0LVhnT
=3Xt/
-----END PGP SIGNATURE-----
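[List follow-up] Since the vm_object_terminate stacks above only show vnlru reclaiming vnodes, a more direct way to test the "thousands of threads created per second" claim would be to trace thread creation itself. A sketch, assuming the running kernel's fbt provider exposes fork_exit (it appears in the stack trace above); the probe and aggregation-key names here are assumptions to verify against your kernel, and the one-liner needs root on a FreeBSD box with DTrace loaded:

```
# Count new kernel threads over 10 seconds, keyed by the function
# each thread starts in (arg0 of fork_exit is the start callout).
dtrace -n 'fbt:kernel:fork_exit:entry { @starts[func(arg0)] = count(); }
           tick-10s { exit(0); }'
```

If ZFS really were spawning a thread per txg, its worker entry points would dominate this aggregation; a flat result would instead point at vnode reclamation, as suggested above.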
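[List follow-up] For clarity, the suggested value is just the reported kern.minvnodes doubled; a minimal sketch of the arithmetic, using the sysctl output quoted above (the final sysctl write is shown commented out because it requires root on the FreeBSD box):

```shell
# Derive the suggested vfs.wantfreevnodes from kern.minvnodes.
minvnodes=273968                    # from the sysctl output above
wantfreevnodes=$((minvnodes * 2))
echo "$wantfreevnodes"              # 547936
# Apply on the affected machine (root required):
#   sysctl vfs.wantfreevnodes=547936
```

Raising vfs.wantfreevnodes gives vnlru a larger pool of free vnodes to work with, which should reduce the reclaim pressure visible in the vm_object_terminate stacks.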