From owner-freebsd-fs@FreeBSD.ORG Tue Sep 3 09:28:36 2013
Message-ID: <5225AB77.9020208@FreeBSD.org>
Date: Tue, 03 Sep 2013 12:27:19 +0300
From: Andriy Gapon
To: Grant Gray
Cc: freebsd-fs@FreeBSD.org
Subject: Re: ZFS livelock / deadlock on pure SSD pool
References: <522599A9.9070107@grantgray.id.au>
In-Reply-To: <522599A9.9070107@grantgray.id.au>

on 03/09/2013 11:11 Grant Gray said the following:
> I haven't yet enabled the kernel debugger to get a stack trace/lock status, but
> procstat -kk -a is here:
> http://pastebin.com/raw.php?i=SYhmyhGj

I believe that this is another ARC deadlock triggered by a low memory condition.
This time it seems to be FreeBSD-specific too:

    6 100059 zfskern          arc_reclaim_thre
        mi_switch+0x194 sleepq_wait+0x42 _sx_xlock_hard+0x4d6 _sx_xlock+0x75
        arc_buf_remove_ref+0x8a dbuf_rele_and_unlock+0x132 dbuf_evict+0x11
        dbuf_do_evict+0x53 arc_do_user_evicts+0xe2 arc_reclaim_thread+0x264
        fork_exit+0x11f fork_trampoline+0xe

 5338 102410 vorbisgain       -
        mi_switch+0x194 sleepq_wait+0x42 _sx_xlock_hard+0x4d6 _sx_xlock+0x75
        arc_lowmem+0x38 kmem_malloc+0xb0 uma_large_malloc+0x4a malloc+0xd9
        arc_get_data_buf+0x1f4 arc_read+0x225 dbuf_read+0x445
        dmu_buf_hold_array_by_dnode+0x168 dmu_buf_hold_array+0x67
        dmu_read_uio+0x3f zfs_freebsd_read+0x483 VOP_READ_APV+0x6e
        vn_read+0xed vn_io_fault+0x90

Thread 100059 acquired arc_reclaim_thr_lock before calling arc_do_user_evicts
and now it wants to take a buf header hash lock.

Thread 102410 acquired the hash lock in arc_read, then it got into arc_lowmem
because of a memory allocation problem (and M_WAIT flag) and now it wants to
take arc_reclaim_thr_lock.

A classic deadlock.

-- 
Andriy Gapon