From: Vitalij Satanivskij
To: current@freebsd.org
Date: Mon, 21 Oct 2013 15:59:49 +0300
Subject: How to debug whats cause to much __mtx_lock_sleep in system
Message-ID: <20131021125949.GB13109@hell.ukr.net>

Hello.

I have a 10.0-BETA1 #7 r256765 system with a terrible load ("load averages: 23.31, 30.53, 31"), and it degrades more and more over time.
The kernel is compiled with DTrace support. Using the hotkernel script from DTraceToolkit-0.99, I found some strange statistics:

zfs.ko`lz4_compress               5045   0.2%
kernel`0xffffffff80               5185   0.2%
kernel`uma_zalloc_arg             5302   0.2%
kernel`bcopy                      5322   0.2%
kernel`_sx_xlock                  7310   0.3%
kernel`_sx_xunlock                7434   0.3%
zfs.ko`l2arc_feed_thread          9797   0.4%
zfs.ko`lzjb_compress              9912   0.4%
zfs.ko`list_prev                 17894   0.7%
kernel`__rw_wlock_hard           30522   1.2%
kernel`spinlock_exit             31310   1.3%
kernel`acpi_cpu_c1              103495   4.1%
kernel`_sx_xlock_hard           138743   5.5%
kernel`vmem_xalloc              175869   7.0%
kernel`cpu_idle                 371159  14.8%
kernel`__mtx_lock_sleep        1345815  53.8%

There is another identical machine with similar data and usage, but running an older -CURRENT (r245701), which has no load problems:

zfs.ko`fletcher_4_native          2366   0.1%
kernel`uma_zfree_arg              2387   0.1%
zfs.ko`lzjb_decompress            2392   0.1%
kernel`__rw_rlock                 2477   0.1%
zfs.ko`dmu_zfetch                 2553   0.1%
kernel`bcopy                      3035   0.1%
kernel`vm_page_splay              3089   0.1%
kernel`_mtx_trylock_flags_        3346   0.2%
kernel`bzero                      3411   0.2%
kernel`0xffffffff80               3665   0.2%
kernel`_sx_xunlock                3818   0.2%
kernel`uma_zalloc_arg             4216   0.2%
kernel`vmtotal                    4702   0.2%
kernel`_sx_xlock                  5117   0.2%
kernel`free                       5476   0.2%
zfs.ko`lzjb_compress              6674   0.3%
kernel`spinlock_exit             21590   1.0%
kernel`__mtx_lock_sleep          40819   1.9%
kernel`acpi_cpu_c1              311077  14.1%
kernel`cpu_idle                1639418  74.6%

Both servers have the same hardware and the same software, except of course for the system version.

So what is the right way to investigate the problem and find a resolution?
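
To put the two profiles side by side, here is a minimal Python sketch. It is not part of the measurements above, and the helper names parse_hotkernel and compare are hypothetical. It assumes each hotkernel row has the form module`function, sample count, percentage. It also treats a function missing from one listing as 0%, which is only an approximation because both listings show just the top entries.

```python
import re

# One hotkernel row: "module`function  count  percent%" (assumed layout).
ROW = re.compile(r"(\S+)\s+(\d+)\s+(\d+(?:\.\d+)?)%")

def parse_hotkernel(text):
    """Return {function: (sample_count, percent)} parsed from hotkernel output."""
    return {m.group(1): (int(m.group(2)), float(m.group(3)))
            for m in ROW.finditer(text)}

def compare(slow, fast, min_delta=1.0):
    """List (name, slow%, fast%, delta) where the share differs by >= min_delta points.

    A function absent from one profile is counted as 0.0% there (approximation).
    Rows are sorted from the largest increase on the slow host to the largest decrease.
    """
    rows = []
    for name in set(slow) | set(fast):
        s = slow.get(name, (0, 0.0))[1]
        f = fast.get(name, (0, 0.0))[1]
        if abs(s - f) >= min_delta:
            rows.append((name, s, f, s - f))
    return sorted(rows, key=lambda r: r[3], reverse=True)

# Top entries copied from the two listings (r256765 = loaded, r245701 = healthy).
slow = parse_hotkernel("""
kernel`acpi_cpu_c1 103495 4.1%
kernel`_sx_xlock_hard 138743 5.5%
kernel`vmem_xalloc 175869 7.0%
kernel`cpu_idle 371159 14.8%
kernel`__mtx_lock_sleep 1345815 53.8%
""")
fast = parse_hotkernel("""
kernel`__mtx_lock_sleep 40819 1.9%
kernel`acpi_cpu_c1 311077 14.1%
kernel`cpu_idle 1639418 74.6%
""")

for name, s, f, delta in compare(slow, fast):
    print(f"{name:28s} {s:6.1f}% {f:6.1f}% {delta:+6.1f}")
```

With the excerpt above, kernel`__mtx_lock_sleep sorts first (the largest increase on the loaded machine) and kernel`cpu_idle sorts last (the largest decrease). Feeding in the complete hotkernel output from both hosts would avoid the 0% approximation.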